Flow 1.0.1
Flow project: Full implementation reference.
cong_ctl_classic.cpp
/* Flow
 * Copyright 2023 Akamai Technologies, Inc.
 *
 * Licensed under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in
 * compliance with the License. You may obtain a copy
 * of the License at
 *
 * https://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in
 * writing, software distributed under the License is
 * distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
 * CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing
 * permissions and limitations under the License. */

/// @file

namespace flow::net_flow
{
// Implementations.

Congestion_control_classic::Congestion_control_classic(log::Logger* logger_ptr, Peer_socket::Const_ptr sock) :
  Congestion_control_strategy(logger_ptr, sock),
  // Initialize the basics (CWND, SSTHRESH, etc.) in the usual way.
  m_classic_data(logger_ptr, sock)
{
  // Nothing.
}

size_t Congestion_control_classic::congestion_window_bytes() const // Virtual.
{
  return m_classic_data.congestion_window_bytes();
}

void Congestion_control_classic::on_acks(size_t bytes, [[maybe_unused]] size_t packets) // Virtual.
{
  /* We have received new (non-duplicate) individual acknowledgments for data we'd sent. This may
   * affect (grow) CWND. This is a standard Reno algorithm, so just forward it to m_classic_data
   * which takes care of all non-congestion-event-related CWND changing. */
  m_classic_data.on_acks(bytes); // Will log.
}

void Congestion_control_classic::on_loss_event(size_t bytes, [[maybe_unused]] size_t packets)
  // Virtual.
{
  using std::max;

  /* Node has detected at least some lost bytes (at least one dropped packet), or at least so it
   * thinks, but it's not a "major" (Drop Timeout-based) loss. Follow classic Reno, RFC 5681:
   * guess that the available pipe (CWND) is half of what we thought it was; and set SSTHRESH to
   * that same value, so that we immediately begin congestion avoidance (slow linear increase in
   * CWND), and so that if we later enter slow start for whatever reason, it will end at this CWND
   * value (unless something else changes SSTHRESH before then of course). In addition, we make the
   * decay configurable: instead of it being a 1/2 decay, it is settable.
   *
   * More accurately, set it to max(m_flying_bytes / 2, 2 * max-block-size), where m_flying_bytes is
   * the # of bytes we think are currently in the network (sent but not received by the other side
   * or dropped somewhere).
   *
   * Why m_flying_bytes and not CWND? It used to be the latter until getting corrected at least
   * in RFC 5681 formula (4). The explanation for the change given in the RFC is cryptic, but
   * basically I surmised it's based on the intuition that we may not be filling the available
   * pipe, yet congestion has been caused, so we shouldn't just change based on CWND, as that may
   * not even affect how many bytes we allow to be In-flight in the near future (e.g., say pipe is
   * only 10% full). One of the writers of the RFC explains here:
   *
   * http://www.postel.org/pipermail/end2end-interest/attachments/20060514/20ead524/attachment.ksh
   *
   * Note that the DCCP CCID 2 (similar to our protocol) RFC 4341 says "cwnd is halved," but there
   * is no formula, and they're probably just writing in semi-short-hand; let's follow RFC 5681.
   * In most cases m_flying_bytes and m_cong_wnd_bytes are equal or nearly equal. */

  const Peer_socket::Const_ptr sock = socket();
  const size_t& snd_flying_bytes = sock->m_snd_flying_bytes;

  /* RFC 5681-3.2.1 and formula (4). The halving (or other reduction) of the window is computed in
   * congestion_window_decay().
   *
   * Subtlety: FlightSize in the RFC represents the # of bytes in the pipe *before* the loss event.
   * Well, in the RFC it's just (SND_NXT - SND_UNA), which is not changed by a dupe-ACK-exposed
   * loss. However intuitively it's also correct: we want to reduce by N% the number of bytes which
   * "caused" the loss, and that # was in effect before we marked those packets Dropped. Anyway,
   * since we must be called AFTER the Drop, and bytes contains the # dropped, we can compute the
   * pre-Drop total easily. */
  const unsigned int window_decay_percent = congestion_window_decay();
  const size_t new_wnd_bytes = max((snd_flying_bytes + bytes) * window_decay_percent / 100,
                                   2 * sock->max_block_size());

  FLOW_LOG_TRACE("cong_ctl [" << sock << "] update: loss event; "
                 "set sl_st_thresh=cong_wnd to [" << window_decay_percent << "%] of "
                 "In-flight [" << sock->bytes_blocks_str(snd_flying_bytes + bytes) << "].");

  m_classic_data.on_congestion_event(new_wnd_bytes, new_wnd_bytes); // Will log but not enough (so log the above).

  /* At this point m_classic_data will just start congestion avoidance. Classic Reno here would do
   * a special phase called Fast Recovery, first. To see why no such special phase is necessary
   * in our case, one needs an intuitive understanding of Fast Recovery in context. Fast Recovery
   * (following a retransmit) is basically trying to inflate CWND with each duplicate cumulative ACK
   * to compensate for the fact there is no separate "pipe" (bytes In-flight) variable, and that the
   * normal measure of "pipe" (SND_NXT - SND_UNA) is inaccurate when packets in the [SND_UNA,
   * SND_NXT) range are lost, which is the case after a loss event. So to allow more data (including
   * the retransmitted segment) to enter the pipe, instead of increasing "pipe," Fast Recovery
   * increases CWND. Once all of the lost packets have been ACKed (after retransmission, as
   * needed), Fast Recovery ends, and regular congestion avoidance begins; at this point (SND_NXT -
   * SND_UNA) again accurately represents "pipe," so CWND is deflated to what it "should" be ("pipe"
   * / 2).
   *
   * So why don't we do this? Because we have full selective ACK information. Therefore we have an
   * explicit pipe variable (m_snd_flying_bytes), which fully accounts for any dropped packets, and
   * we needn't mess with CWND to compensate for anything. There is also no ambiguity as to what
   * the duplicate ACK means, since we have acknowledgments for each individual packet. So we
   * basically do what the SACK RFC 3517 recommends. The only real difference is that while RFC 3517
   * still is written in the context of Fast Recovery being a separate phase (between loss and
   * congestion avoidance) -- which the RFC 3517 algorithm replaces -- which ends once the gaps in
   * the pipe are filled, we use the scoreboard principle (with "pipe", etc.) for our entire
   * operation. Therefore the phase between loss and congestion avoidance is not special, or even a
   * different phase, and is just regular congestion avoidance.
   *
   * This reasoning is validated by DCCP CCID 2 RFC 4341-5, which specifies just this course of
   * action. */
} // Congestion_control_classic::on_loss_event()

void Congestion_control_classic::on_drop_timeout(size_t bytes, [[maybe_unused]] size_t packets) // Virtual.
{
  using std::max;

  /* Node has detected a major loss event in the form of a Drop Timeout. Follow classic Reno, RFC
   * 5681: guess that the available pipe (CWND) is half of what we thought it was, but only set
   * SSTHRESH to this halved value, and begin slow start at a low CWND, so that it can slow-start
   * until reaching this new SSTHRESH and then enter congestion avoidance. Since CWND is set
   * basically to its initial value, the subtleties of on_loss_event() (Fast Recovery and all that)
   * don't apply; we basically just start over almost as if it's a new connection, except SSTHRESH
   * is not infinity. */

  // Determine SSTHRESH.

  const Peer_socket::Const_ptr sock = socket();
  const size_t& snd_flying_bytes = sock->m_snd_flying_bytes;

  /* RFC 5681-3.1 and formula (4). The halving (or other reduction) of the window is computed in
   * congestion_window_decay().
   *
   * See also subtlety in on_loss_event() at this point. It matters even more here, as a Drop
   * Timeout may well set snd_flying_bytes to zero (and we're called after registering the Drop in
   * snd_flying_bytes)! */
  const unsigned int window_decay_percent = congestion_window_decay();
  const size_t new_slow_start_thresh_bytes = max((snd_flying_bytes + bytes) * window_decay_percent / 100,
                                                 2 * sock->max_block_size());

  FLOW_LOG_TRACE("cong_ctl [" << sock << "] update: DTO event; set sl_st_thresh "
                 "to [" << window_decay_percent << "%] of "
                 "cong_wnd [" << sock->bytes_blocks_str(snd_flying_bytes + bytes) << "]; "
                 "cong_wnd to minimal value.");

  // Now set SSTHRESH and let m_classic_data set CWND to a low value (common to most congestion control strategies).

  // Will log but not enough (so log the above).
  m_classic_data.on_drop_timeout(new_slow_start_thresh_bytes);
} // Congestion_control_classic::on_drop_timeout()

unsigned int Congestion_control_classic::congestion_window_decay() const
{
  const Peer_socket::Const_ptr sock = socket();
  unsigned int window_decay_percent = sock->opt(sock->m_opts.m_st_cong_ctl_classic_wnd_decay_percent);
  assert(window_decay_percent <= 100); // Should have checked this during option application.
  if (window_decay_percent == 0) // 0 is special value meaning "use RFC 5681 classic value," which is 1/2 (50%).
  {
    window_decay_percent = 50;
  }
  return window_decay_percent;
}

void Congestion_control_classic::on_idle_timeout() // Virtual.
{
  /* Node has detected that nothing has been sent out by us for a while. (Note that this is
   * different from a Drop Timeout. A Drop Timeout causes data to be sent (either retransmitted [if
   * used] or new data), which resets the state back to non-idle for a while.)
   *
   * This is handled similarly in most congestion control strategies; let m_classic_data handle it
   * (will set CWND to initial window and not touch SSTHRESH). */

  m_classic_data.on_idle_timeout();
} // Congestion_control_classic::on_idle_timeout()

} // namespace flow::net_flow
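
The window arithmetic shared by on_loss_event() and on_drop_timeout() can be tried in isolation. The following is a minimal standalone sketch, not part of Flow: the free functions effective_decay_percent() and reduced_window_bytes() and all of their parameter names are hypothetical, chosen only to mirror the comments above. It assumes the caller passes the In-flight byte count as it stands after the Drop was registered, plus the number of bytes just dropped, so that the pre-Drop total can be rebuilt.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a Flow API): resolve the configured decay percentage.
// As in congestion_window_decay(), 0 is treated as "use the classic RFC 5681 value," i.e., 50%.
unsigned int effective_decay_percent(unsigned int configured_percent)
{
  assert(configured_percent <= 100);
  return (configured_percent == 0) ? 50u : configured_percent;
}

// Hypothetical helper (not a Flow API): compute the reduced window in bytes.
// flying_bytes_after_drop: bytes still In-flight once the Drop has been registered.
// dropped_bytes: bytes just marked Dropped; adding them back yields the pre-Drop total.
// The result never goes below 2 max-size blocks, mirroring the max(..., 2 * max_block_size) above.
std::size_t reduced_window_bytes(std::size_t flying_bytes_after_drop, std::size_t dropped_bytes,
                                 unsigned int configured_percent, std::size_t max_block_size)
{
  const std::size_t pre_drop_flying_bytes = flying_bytes_after_drop + dropped_bytes;
  const std::size_t decayed_bytes
    = pre_drop_flying_bytes * effective_decay_percent(configured_percent) / 100;
  return std::max<std::size_t>(decayed_bytes, 2 * max_block_size);
}
```

For instance, with 9000 bytes still In-flight, 1000 bytes just dropped, the default decay, and a 1024-byte max block size, reduced_window_bytes(9000, 1000, 0, 1024) evaluates to 5000. After a Drop Timeout that empties the pipe, reduced_window_bytes(0, 1000, 0, 1024) hits the floor of 2048. In the code above, a loss event applies the result to both SSTHRESH and CWND, while a Drop Timeout applies it to SSTHRESH only and leaves CWND to be reset to a low value.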