Flow 1.0.1
Flow project: Full implementation reference.
bandwidth.cpp
Go to the documentation of this file.
1/* Flow
2 * Copyright 2023 Akamai Technologies, Inc.
3 *
4 * Licensed under the Apache License, Version 2.0 (the
5 * "License"); you may not use this file except in
6 * compliance with the License. You may obtain a copy
7 * of the License at
8 *
9 * https://www.apache.org/licenses/LICENSE-2.0
10 *
11 * Unless required by applicable law or agreed to in
12 * writing, software distributed under the License is
13 * distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
14 * CONDITIONS OF ANY KIND, either express or implied.
15 * See the License for the specific language governing
16 * permissions and limitations under the License. */
17
18/// @file
20#include "flow/util/util.hpp"
21
22namespace flow::net_flow
23{
24// Implementations.
25
27 Log_context(logger_ptr, Flow_log_component::S_NET_FLOW),
28 m_sock(sock), // Get a weak_ptr from a shared_ptr.
29 m_bytes_per_time_smoothed(0), // Just to get the advertised answer in bandwidth_estimate_per_time() immediately.
30 m_bytes_per_time_less_smoothed(0), // For cleanliness.
31 m_bytes_this_sample(0), // on_acks() does assume this is already zero before it is first called.
32 m_no_acks_yet(true), // As documented.
33 m_no_samples_yet(true) // As documented.
34{
35 // Nothing.
36}
37
39{
41}
42
44{
45 using boost::chrono::microseconds;
46 using boost::chrono::round;
47 using boost::chrono::ceil;
48 using std::max;
49
50 /* Received "bytes" bytes of clean acknowledgment(s). Assuming both user and Node are sending
51 * data at the maximum possible rate, try to estimate the bandwidth as the rate of sending. Keep
52 * passing the bandwidth measurements into a pair of filters to get the result. Each sample is an
53 * accumulation of acked bytes over the time period > SRTT.
54 *
55 * This implementation is a reasonably informed rip-off of tcp_westwood.c from the Linux kernel.
56 * That implementation is by the creators of the Westwood+ algorithm, so it's probably
57 * trustworthy (also we have tested it).
58 *
59 * The basic pattern is:
60 *
61 * - on_acks(N) called the first time:
62 * - m_bytes_this_sample = 0; // True after construction.
63 * - m_this_sample_start_time = now();
64 * - m_bytes_this_sample += N; // Equivalent to m_bytes_this_sample = N;
65 * - ...Time passes....
66 * - on_acks(N) called subsequently:
67 * - since_sample_start = now() - m_this_sample_start_time;
68 * - if (since_sample_start > SRTT):
69 * - // Sample is long enough; update bandwidth estimate; start new sample.
70 * - bytes_per_time_this_sample = m_bytes_this_sample / since_sample_start
71 * - m_bytes_per_time_smoothed = filtered mix of m_bytes_per_time_smoothed and
72 * bytes_per_time_this_sample;
73 * - m_bytes_this_sample = 0; m_this_sample_start_time = now();
74 * - else:
75 * - // Sample not long enough. Can't update bandwidth estimate yet.
76 * - m_bytes_this_sample += N;
77 * - ...Time passes....
78 * - on_acks(N) called subsequently (see above).
79 * - Etc.
80 *
81 * Some interesting observations about this:
82 *
83 * Note that m_bytes_this_sample is incremented AFTER the bandwidth estimation is recomputed (if
84 * it is recomputed). So the B/W computation does not include the newest acked data; the newest
85 * acked data instead counts in the NEXT sample. I am not exactly sure why they did it this way
86 * and not in the opposite ("greedier") order, but it doesn't seem like it's worse.
87 *
88 * The start of the pattern has some subtlety to it. m_bytes_this_sample and
89 * m_this_sample_start_time are initialized to zero, as seems reasonable: we don't try to set
90 * m_this_sample_start_time at construction or something, as that can be a long time ago and
91 * bogus. (tcp_westwood.c agrees with this. Things are initialized in the .init callback, which
92 * is called upon first TCP ACK receipt, not earlier during connection setup.) Then
93 * m_bytes_this_sample is incremented by N to N (0 + N == N). The weird thing is that
94 * tcp_westwood.c for some reason forces this increment in the first on_acks() equivalent to be 0.
95 * If one traces the code, if (first_ack) they set "w->snd_una = tcp_sk(sk)->snd_una;", the
96 * following if () probably does not trigger, and then the likeliest caller of that function
97 * (westwood_fast_bw()) executes "w->bk += tp->snd_una - w->snd_una;", which increments w->bk by
98 * 0. w->bk is the exact equivalent of our m_bytes_this_sample.
99 *
100 * The comments in tcp_westwood.c about why the snd_una assignment happens there are not entirely
101 * clear to me. I think basically tp->snd_una may be incorrect for some reason (?) at connection
102 * open, though I don't know why, so they just force the first part of the sample to be 0. I am
103 * not really sure. I have chosen to deviate slightly and in fact perform that first increment.
104 * Since it's just the first measurement, it is very very unlikely to matter, so I chose
105 * simplicity... the "bytes" argument for us is given directly (no need to perform snd_una math),
106 * and we have no reason to distrust it. */
107
108 const Peer_socket::Const_ptr sock = socket();
109
110 if (m_no_acks_yet)
111 {
112 // First on_acks() call (first acked bytes). Begin first sample by marking the time.
113
114 FLOW_LOG_TRACE("bw_est [" << sock << "] update: start first sample at 0 bytes.");
115
116 m_no_acks_yet = false;
117 m_this_sample_start_time = Fine_clock::now();
118 }
119 else // if (2nd on_acks() and later)
120 {
121 /* Standard on_acks() call. Some acks were just received from the other side. See if enough
122 * time has passed since the sample was first started. "Enough time" is defined as at least
123 * SRTT (which is the estimate for the round trip time maintained elsewhere by measuring how
124 * quickly packets are acked). */
125
126 const Fine_duration since_sample_start = Fine_clock::now() - m_this_sample_start_time;
127 assert(since_sample_start.count() >= 0);
128
129 const Fine_duration& srtt = sock->m_snd_smoothed_round_trip_time;
130
131 if (srtt == Fine_duration::zero())
132 {
133 /* A documented on_acks() requirement is that the bytes that were acked have already
134 * affected SRTT. Therefore srtt should not be zero. We needn't overreact by assert()ing
135 * though -- let's just keep building the sample then.... */
136
137 FLOW_LOG_WARNING("bw_est [" << sock << "] update: received acks but have no SRTT yet! This may be "
138 "a bug. Adding [" << sock->bytes_blocks_str(bytes) << "] to current sample "
139 "of [" << sock->bytes_blocks_str(size_t(m_bytes_this_sample)) << "] bytes "
140 "started [" << since_sample_start << "] ago.");
141 }
142 else // if (srtt is valid)
143 {
144 /* As in tcp_westwood.cpp, if the SRTT is quite small (as in LAN), bandwidth calculations
145 * may get crazy; so put a floor on the sample length, as specified in this socket option. */
146 const Fine_duration sample_period_floor = sock->opt(sock->m_opts.m_st_snd_bandwidth_est_sample_period_floor);
147 const Fine_duration min_sample_period = max(srtt, sample_period_floor);
148
149 FLOW_LOG_TRACE("bw_est [" << sock << "] update: received acks; current sample "
150 "total [" << sock->bytes_blocks_str(size_t(m_bytes_this_sample)) << "]; "
151 "started [" << round<microseconds>(since_sample_start) << "] ago; "
152 "SRTT [" << round<microseconds>(srtt) << "]; "
153 "sample period floor [" << round<microseconds>(sample_period_floor) << "].");
154
155 if (since_sample_start > min_sample_period)
156 {
157 /* Cool, enough time has passed since this sample was started; take a bandwidth sample over
158 * that time period (just bytes/time). As promised, our units are bytes per Time_unit(1),
159 * so convert from Fine_duration to (the likely less fine) Time_unit before dividing. We
160 * use ceil() instead of truncation or round() to avoid rounding down to zero and
161 * resulting in division by zero. (Shouldn't really happen due to sample_period_floor,
162 * but who knows what they've set it to? Better safe than sorry.) */
163
164 const n_bytes_t bytes_per_time_this_sample
165 = m_bytes_this_sample / round<Time_unit>(since_sample_start).count();
166
167 // B/W sample computed. Shove into the filter(s), ultimately updating the main B/W estimate.
168
170 {
171 // First sample completed; simply set the estimates to that first sample.
172
173 m_no_samples_yet = false;
174
175 FLOW_LOG_TRACE("bw_est [" << sock << "] update: first complete sample; bw_est = bw_est_less_smoothed "
176 "= [" << bytes_per_time_this_sample << "] bytes per [" << Time_unit(1) << "]"
177 "= [" << util::to_mbit_per_sec<Time_unit>(bytes_per_time_this_sample) << " Mbit/s]; "
178 "start new sample at 0 bytes.");
179
180 m_bytes_per_time_less_smoothed = bytes_per_time_this_sample;
182 }
183 else
184 {
185 /* 2nd or later sample collected. Blend the existing value of the filters with the new
186 * sample.
187 *
188 * I am not sure why there are two quantities maintained. _less_smoothed I get; but
189 * smoothing further to get _smoothed (which is the publicly available B/W estimate) I
190 * don't. I didn't see that in the Westwood/Westwood+ papers, nor is it really explained
191 * in tcp_westwood.c except for one cryptic comment on the westwood->bw_ns_est struct
192 * member. Just do it.... */
193
194 const n_bytes_t prev_bytes_per_time_less_smoothed = m_bytes_per_time_less_smoothed;
195 const n_bytes_t prev_bytes_per_time_smoothed = m_bytes_per_time_smoothed;
196
198 = apply_filter(m_bytes_per_time_less_smoothed, bytes_per_time_this_sample);
201
202 FLOW_LOG_TRACE("bw_est [" << sock << "] update: complete sample; "
203 "bw_est_less_smoothed "
204 "= filter[" << prev_bytes_per_time_less_smoothed << ", " << bytes_per_time_this_sample << "] "
205 "= [" << m_bytes_per_time_less_smoothed << "] units "
206 "= [" << util::to_mbit_per_sec<Time_unit>(m_bytes_per_time_less_smoothed) << " Mbit/s]; "
207 "bw_est "
208 "= filter[" << prev_bytes_per_time_smoothed << ", " << m_bytes_per_time_less_smoothed << "] "
209 "= [" << m_bytes_per_time_smoothed << "] units "
210 "= [" << util::to_mbit_per_sec<Time_unit>(m_bytes_per_time_smoothed) << " Mbit/s]; "
211 "units = bytes per [" << Time_unit(1) << "].");
212 } // if (since_sample_start > min_sample_period)
213
214 /* Start new sample. Note that m_bytes_this_sample is about to get immediately incremented, as explained
215 * at the top. */
216
218 m_this_sample_start_time = Fine_clock::now();
219 } // if (since_sample_start > min_sample_period)
220 } // if (srtt is valid)
221 } // if (2nd on_acks() and later)
222
223 // As explained earlier, the actual acked bytes contribute to the next sample (if applicable), not preceding one.
224
225 m_bytes_this_sample += bytes;
226 FLOW_LOG_TRACE("bw_est [" << sock << "] update: received acks; sample "
227 "total now [" << sock->bytes_blocks_str(size_t(m_bytes_this_sample)) << "]; "
228 "after increment by [" << sock->bytes_blocks_str(bytes) << "].");
229} // Send_bandwidth_estimator::on_acks()
230
232 n_bytes_t new_sample_val) // Static.
233{
234 // Westwood filter: 7/8 * a + 1/8 * b.
235 return ((7 * prev_val) + new_sample_val) / 8;
236} // Send_bandwidth_estimator::apply_filter()
237
239{
240 const Peer_socket::Const_ptr sock = m_sock.lock();
241
242 // See comment in same spot in Congestion_control_strategy::socket().
243 assert(sock);
244
245 return sock;
246}
247
248} // namespace flow::net_flow
249
Interface that the user should implement, passing the implementing Logger into logging classes (Flow'...
Definition: log.hpp:1291
n_bytes_t m_bytes_this_sample
The number of bytes acknowledged by receiver since m_this_sample_start_time (the time when the curren...
Definition: bandwidth.hpp:295
n_bytes_t bandwidth_bytes_per_time() const
Returns the current estimate of the available outgoing bandwidth per unit time for the containing soc...
Definition: bandwidth.cpp:38
boost::chrono::milliseconds Time_unit
The primary time unit over which this class reports bandwidth.
Definition: bandwidth.hpp:167
static n_bytes_t apply_filter(n_bytes_t prev_val_per_time, n_bytes_t new_sample_val_per_time)
Applies the low-pass filter that takes the given previous result of the filter and blends in the give...
Definition: bandwidth.cpp:231
boost::weak_ptr< Peer_socket::Const_ptr::element_type > m_sock
The containing socket (read-only access).
Definition: bandwidth.hpp:269
uint64_t n_bytes_t
Type that represents the number of bytes, either as an absolute amount or over some time period.
Definition: bandwidth.hpp:134
Peer_socket::Const_ptr socket() const
Utility that returns a handle to the containing Peer_socket.
Definition: bandwidth.cpp:238
Fine_time_pt m_this_sample_start_time
The time at which the currently ongoing bandwidth sample began to accumulate.
Definition: bandwidth.hpp:298
bool m_no_samples_yet
true until a sample has been completed and fed into the filter; false forever thereafter.
Definition: bandwidth.hpp:307
Send_bandwidth_estimator(log::Logger *logger_ptr, Peer_socket::Const_ptr sock)
Constructs object by setting up logging and saving a pointer to the containing Peer_socket.
Definition: bandwidth.cpp:26
bool m_no_acks_yet
true until on_acks() called for the first time; false forever thereafter. Used to begin the sample se...
Definition: bandwidth.hpp:301
n_bytes_t m_bytes_per_time_less_smoothed
In the same units as m_bytes_per_time_smoothed, the less smoothed bandwidth estimate,...
Definition: bandwidth.hpp:288
n_bytes_t m_bytes_per_time_smoothed
The current smoothed bandwidth estimate, to be returned by bandwidth_bytes_per_time(),...
Definition: bandwidth.hpp:275
void on_acks(size_t bytes)
Informs the bandwidth estimator strategy that 1 or more previously sent packets whose status was In-f...
Definition: bandwidth.cpp:43
Const_target_ptr Const_ptr
Short-hand for ref-counted pointer to immutable values of type Target_type::element_type (a-la T cons...
#define FLOW_LOG_WARNING(ARG_stream_fragment)
Logs a WARNING message into flow::log::Logger *get_logger() with flow::log::Component get_log_compone...
Definition: log.hpp:152
#define FLOW_LOG_TRACE(ARG_stream_fragment)
Logs a TRACE message into flow::log::Logger *get_logger() with flow::log::Component get_log_component...
Definition: log.hpp:227
Flow module containing the API and implementation of the Flow network protocol, a TCP-inspired stream...
Definition: node.cpp:25
Flow_log_component
The flow::log::Component payload enumeration comprising various log components used by Flow's own int...
Definition: common.hpp:632
Fine_clock::duration Fine_duration
A high-res time duration as computed from two Fine_time_pts.
Definition: common.hpp:410