Flow 1.0.2
peer_socket.hpp
/* Flow
 * Copyright 2023 Akamai Technologies, Inc.
 *
 * Licensed under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in
 * compliance with the License. You may obtain a copy
 * of the License at
 *
 *   https://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in
 * writing, software distributed under the License is
 * distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
 * CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing
 * permissions and limitations under the License. */

/// @file
#pragma once

#include "flow/error/error.hpp"
#include "flow/util/util.hpp"
#include <boost/utility.hpp>
#include <boost/shared_ptr.hpp>
#include <boost/enable_shared_from_this.hpp>
#include <boost/move/unique_ptr.hpp>

namespace flow::net_flow
{
// Types.

/**
 * A peer (non-server) socket operating over the Flow network protocol, with optional stream-of-bytes and
 * reliability support.
 *
 * Reliability is enabled or disabled via a socket option, Peer_socket_options::m_st_rexmit_on,
 * at socket creation. Use unreliable mode with care -- see send() method doc header for details.
 *
 * ### Life cycle of a Peer_socket ###
 * A given Peer_socket can arise either by connecting to
 * Server_socket on a Node (Node::connect() or Node::sync_connect()), or by listening on a
 * Node's Server_socket and accepting such a connection (Server_socket::accept() or
 * Server_socket::sync_accept()). In all cases, Node or Server_socket generates a new Peer_socket
 * and returns it (factory pattern). Peer_socket is not instantiable otherwise. A Peer_socket
 * cannot be deleted explicitly by the user and will only be returned via `boost::shared_ptr<>`; when
 * both the Node and all user code no longer refer to it, the Peer_socket will be destroyed.
 *
 * Once a `net_flow` user has a Peer_socket object, that object represents a socket in one of the
 * following basic states:
 *
 *   - Open.
 *     - Sub-states:
 *       - Connecting. (Never Writable, never Readable.)
 *       - Connected. (May be Writable, may be Readable.)
 *       - Disconnecting. (May be Readable, never Writable.)
 *   - Closed.
 *     - Socket can neither read nor write.
 * Open.Connecting means Node initiated a connect to the given server, and this is in
 * progress. Open.Connected means the connection to the other Node is fully functional.
 * Open.Disconnecting means either our side or the other side has initiated a clean or abrupt
 * disconnect, but it is not yet entirely finished (background handshaking is happening, you have
 * not read all available data or sent all queued data, etc.).
 *
 * In any of these sub-states, reading and writing may or may not be possible at a given time, depending on the
 * state of the internal buffers and the data having arrived on the logical connection. Thus all
 * Open sub-states can and often should be treated the same way in a typical Flow-protocol-using algorithm:
 * simply determine when the Peer_socket is Readable, and read; and similarly for Writable and
 * write. Thus the sub-states are distinguished for informational/diagnostic purposes only, as user
 * reading/writing logic in these states should usually be identical.
 *
 * @todo Closing connection considerations. May implement closing only via timeouts at first (as
 * opposed to explicit closing). Below text refers to `close_final()` and `close_start()`, but those
 * are just ideas and may be replaced with timeout, or nothing. At this time, the only closing
 * supported is abrupt close due to error or abrupt close via close_abruptly().
 *
 * Closed means that the Peer_socket has become disconnected, and no data can possibly be
 * received or sent, AND that Node has no more background internal operations to perform and has
 * disowned the Peer_socket. In other words, a Closed Peer_socket is entirely dead.
 *
 * Exactly the following state transitions are possible for a given Peer_socket returned by Node:
 *
 *   - start => Closed
 *   - start => Open
 *   - Open => Closed
 *
 * Note, in particular, that Closed is final; a socket cannot move from Closed to
 * Open. If after an error or valid disconnection you want to reestablish a
 * connection, obtain a new Peer_socket from Node's factories. Rationale (subject to change):
 * this cuts down on the state that must be tracked inside a Peer_socket, while the interface becomes
 * simpler without much impact on usability. Anti-rationale: contradicts BSD socket and boost.asio
 * established practices; potentially more resource-intensive/slower in the event of errors and
 * disconnects. Why IMO rationale > anti-rationale: it's simpler, and the potential problems do not
 * appear immediately serious; statefulness can be added later if found desirable.
 *
 * ### Receiving, sending, and buffers ###
 * Peer_socket, like a TCP socket, has a Receive buffer (a/k/a
 * FIFO queue of bytes) of some maximum size and a Send buffer (a/k/a FIFO queue of bytes) of some
 * maximum size. They are typically not directly exposed via the interface, but their existence
 * affects documented behavior. I formally describe them here, but generally they work similarly to
 * TCP socket Send/Receive buffers.
 *
 * The Receive buffer: Contains bytes asynchronously received on the connection that have not yet
 * been removed with a `*receive()` method. Any bytes that asynchronously arrive on the connection are
 * asynchronously appended at the back of the buffer, in FIFO fashion, while `*receive()` removes
 * them from the front.
 *
 * The Send buffer: Contains bytes intended to be asynchronously sent on the connection that have
 * been placed there by a `*send()` method but not yet sent on the connection. Any bytes that are
 * asynchronously sent on the connection are asynchronously removed from the front of the buffer,
 * in FIFO fashion, while `*send()` appends at the back.
 *
 * With that in mind, here are the definitions of Readable and Writable while state is Open:
 *
 *   - Readable <=> Data available in internal Receive buffer, and user has not explicitly announced
 *     via `close_final()` they're not interested in reading further.
 *   - Writable <=> Space for data available in internal Send buffer, and the state is Open.Connected.
 *
 * Note that neither definition really cares about the state of the network connection (e.g., could
 * bytes actually be sent over the network at the moment?). There is one caveat: A
 * socket is not Writable until Open.Connecting state is transitioned away from; this prevents the user
 * from buffering up send data before the connection is ready. (Allowing that would not necessarily
 * be wrong, but I'm taking a cue from BSD socket semantics on this, as they seem to be convenient.)
 *
 * In Open, the following archetypal operations are provided. (In Closed all
 * immediately fail; in Open.Disconnecting some immediately fail if `close*()` has been called.) Let
 * R be the current size of data in the Receive buffer, and S be the available space for data in the
 * Send buffer.
 *
 *   - `receive(N)`. If Readable, return to caller the `min(N, R)` oldest bytes to have been received from
 *     the other side, and remove them from the Receive buffer. Otherwise do nothing.
 *   - `send(N)`. If Writable, take from caller `min(N, S)` bytes to be appended to the Send
 *     buffer and, when possible, sent to the other side. Otherwise do nothing.
 *   - `sync_receive(N)`. If Readable, `receive(N)`. Otherwise sleep until Readable, then `receive(N)`.
 *   - `sync_send(N)`. If Writable, `send(N)`. Otherwise sleep until Writable, then `send(N)`.
 *
 * These are similar to TCP Receive and Send APIs in non-blocking mode, and TCP Receive and Send APIs in
 * blocking mode, respectively. There may be other similarly themed methods, but all use these as
 * semantic building blocks.
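 *
 * For illustration, a typical non-blocking exchange might look as follows. (This is only a sketch;
 * `sock` is assumed to be a connected `Peer_socket::Ptr`, `msg` an illustrative local name, and
 * error handling is elided.)
 *
 *   @code
 *   const std::string msg("hello");
 *   Error_code err_code;
 *   // send(): append up to msg.size() bytes to the Send buffer; returns how many were accepted.
 *   const size_t n_sent = sock->send(boost::asio::buffer(msg), &err_code);
 *
 *   std::array<char, 1024> buf;
 *   // receive(): consume up to buf.size() bytes from the Receive buffer; never blocks.
 *   const size_t n_rcvd = sock->receive(boost::asio::buffer(buf), &err_code);
 *   @endcode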
 *
 * To understand the order of events, one can think of a disconnect-causing event (like a graceful
 * close initiation from the remote socket) as a piece of data itself. Thus, for example, if 5
 * bytes are received and placed into the Receive buffer without being read by the user, and then a
 * connection close is detected, the socket will be Readable until the 5 bytes have been
 * receive()ed, and the next receive() (or send()) would yield the error, since that's the order in which
 * things happened. Similarly, suppose you've sent 5 bytes, but they haven't yet been
 * sent over the wire and are sitting in the Send buffer. Then you trigger a graceful connection close.
 * First the 5 bytes will be sent if possible, and then the closing procedure will actually begin.
 *
 * Abrupt closes such as connection resets may force both buffers to be immediately emptied without
 * giving the contents to the user or writing them to the other side, so that the above rule does not have to
 * apply. Typically a connection reset means the socket is immediately unusable no matter what was in the
 * buffers at the time, per BSD socket semantics.
 *
 * ### Efficiently reading/writing ###
 * The `sync_*()` methods are efficient, in that they use no processor
 * cycles until Readable or Writable is achieved (i.e., they sleep until that point). The
 * non-blocking versions don't sleep/block, however. For a program using them to be efficient it
 * should sleep until Readable or Writable and only then call receive()/send(), when data are
 * certainly available for immediate reading or writing. Moreover, a complex program is likely to
 * want to perform this sleep-and-conditional-wake on a set of several Peer_socket objects simultaneously
 * (similarly to `select()`, `epoll*()`, etc.). Use class Event_set for this purpose.
 *
 * ### Thread safety ###
 * Same as for Node. (Briefly: all operations are safe for simultaneous execution on
 * separate objects or the same object.)
 *
 * @internal
 *
 * Implementation notes
 * --------------------
 *
 * While to a user a Peer_socket appears as a nearly self-sufficient object (i.e., you can do things
 * like `s->send()`, which means "socket `s`, send some data!"), the most reasonable way to internally
 * implement this is to have Node contain the logic behind a Peer_socket (and how it works together
 * with other Peer_socket objects and other internal infrastructure). Thus Node is the class with all of
 * the logic behind (for example) `s->send()`. Peer_socket then, privately, is not much more than a
 * collection of data (almost like a `struct`) to help Node.
 *
 * Therefore Peer_socket provides a clean object-oriented public interface to the user but, on the
 * implementation side, is basically a data store (with Node as `friend`) that forwards the logic to
 * the originating Node. One idea to make this dichotomy more cleanly expressed (e.g., without
 * `friend`) was to make Peer_socket a pure interface and have Node produce `Peer_socket_impl`
 * objects, where `Peer_socket_impl` implements Peer_socket and is itself invisible to the user (a
 * classic factory pattern). Unfortunately defining function templates such as `send<Buffers>()`
 * (where `Buffers` is an arbitrary `Buffers` concept model) as pure `virtual` functions is not really
 * possible in C++. Since such a templated interface can be highly convenient (see boost.asio with
 * its seamless support for buffers and buffer sequences of most types, including scatter-gather),
 * the usefulness of the interface trumps implementation beauty.
 *
 * To prevent node.cpp from being unmanageably large (and also because it makes sense),
 * implementations of Node methods that deal only with an individual Peer_socket reside in
 * peer_socket.cpp (even though they are members of Node, since, again, the logic is all forwarded to Node).
 *
 * @todo Rename `State` and `Open_sub_state` to `Phase` and `Open_sub_phase` respectively; and
 * `Int_state` to `State`. Explain the difference between phases (application-layer, user-visible and used
 * close to the application layer) and states (transport layer, internal).
 *
 * @todo Look into a way to defeat the need for boiler-plate trickery -- with low but non-zero perf cost --
 * involving `*_socket`-vs-`Node` circular references in method templates, such as the way
 * Peer_socket::send() and Peer_socket::receive() internally make `Function<>`s before forwarding to the core
 * in Node. Can this be done with `.inl` files? Look into how Boost internally uses `.inl` files; this could
 * inspire a solution... or not.
 */
class Peer_socket :
  // Endow us with shared_ptr<>s ::Ptr and ::Const_ptr (syntactic sugar).
  public util::Shared_ptr_alias_holder<boost::shared_ptr<Peer_socket>>,
  // Allow access to Ptr(this) from inside Peer_socket methods. Just call shared_from_this().
  public boost::enable_shared_from_this<Peer_socket>,
  public log::Log_context,
  private boost::noncopyable
{
public:
  // Types.

  /// State of a Peer_socket.
  enum class State
  {
    /// Future reads or writes may be possible. A socket in this state may be Writable or Readable.
    S_OPEN,
    /// Neither future reads nor writes are possible, AND Node has disowned the Peer_socket.
    S_CLOSED
  };

  /// The sub-state of a Peer_socket when state is State::S_OPEN.
  enum class Open_sub_state
  {
    /**
     * This Peer_socket was created through an active connect (Node::connect() and the like), and
     * the connection to the remote Node is currently being negotiated by this socket's Node.
     * A socket in this state cannot be Writable or Readable. However, except for
     * diagnostic purposes, this state should generally be treated the same as S_CONNECTED.
     */
    S_CONNECTING,

    /**
     * This Peer_socket was created through a passive connect (Node::accept() and the like) or an
     * active connect (Node::connect() and the like), and the connection is (as far as this socket's
     * Node knows) set up and functioning. A socket in this state may be Writable or Readable.
     */
    S_CONNECTED,

    /**
     * This Peer_socket was created through a passive connect (Node::accept() and the like) or an
     * active connect (Node::connect() and the like), but since then either an active close,
     * passive close, or an error has begun to close the connection; data may still possibly
     * arrive and be Readable; also data may have been "sent" but still be sitting in the Send buffer
     * and need to be sent over the network. A socket in this state may be Readable but cannot
     * be Writable.
     *
     * This implies that a non-S_CLOSED socket may be, at a lower level, disconnected. For
     * example, say there are 5 bytes in the Receive buffer, and the other side sends a graceful
     * disconnect packet to this socket. This means the connection is finished, but the user can
     * still receive() the 5 bytes (without blocking). Then state will remain
     * S_OPEN.S_DISCONNECTING until the last of the 5 bytes is received (gone from the buffer); at
     * that point state may change to S_CLOSED (pending any other work Node must do to be able to
     * disown the socket).
     */
    S_DISCONNECTING
  }; // enum class Open_sub_state

  // Constructors/destructor.

  /// Boring `virtual` destructor. Note that deletion is to be handled exclusively via `shared_ptr`, never explicitly.
  ~Peer_socket() override;

  // Methods.

  /**
   * Current State of the socket.
   *
   * @param open_sub_state
   *        Ignored if null. Otherwise, if and only if State::S_OPEN is returned, `*open_sub_state` is set to
   *        the current sub-state of `S_OPEN`.
   * @return Current main state of the socket.
   */
  State state(Open_sub_state* open_sub_state = 0) const;

  /**
   * Node that produced this Peer_socket.
   *
   * @return Pointer to (guaranteed valid) Node; null if state is State::S_CLOSED.
   */
  Node* node() const;

  /**
   * Intended other side of the connection (regardless of success, failure, or current State).
   * For a given Peer_socket, this will always return the same value, even if state is
   * State::S_CLOSED.
   *
   * @return See above.
   */
  const Remote_endpoint& remote_endpoint() const;

  /**
   * The local Flow-protocol port chosen by the Node (if active open) or the user (if passive open) for
   * this side of the connection. For a given Peer_socket, this will always return the same value,
   * even if state is State::S_CLOSED. However, when state is State::S_CLOSED, the port may be unused or
   * taken by another socket.
   *
   * @return See above.
   */
  flow_port_t local_port() const;

  /**
   * Obtains the serialized connect metadata, as supplied by the user during the connection handshake.
   * If this side initiated the connection (Node::connect() and friends), then this will equal what
   * was passed to the connect_with_metadata() (or similar) method. More likely, if this side
   * accepted the connection (Server_socket::accept() and friends), then this will equal what the
   * user on the OTHER side passed to connect_with_metadata() or similar.
   *
   * @note It is up to the user to deserialize the metadata portably. One recommended convention is to
   *       use `boost::endian::native_to_little()` (and similar) before connecting; and
   *       on the other side use the reverse (`boost::endian::little_to_native()`) before using the value.
   *       Packet dumps will show a flipped (little-endian) representation, while on most platforms the conversion
   *       will be a no-op at compile time. Alternatively use `native_to_big()` and vice versa.
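   *
   * For example, a 4-byte integer might be conveyed portably as follows. (A sketch only; the
   * surrounding connect/accept calls are elided, and `md` is an illustrative local name.)
   *
   *   @code
   *   // Connecting side: store the value in little-endian form before passing it as metadata.
   *   uint32_t md = boost::endian::native_to_little(uint32_t(42));
   *   // ...pass boost::asio::buffer(&md, sizeof md) to connect_with_metadata()....
   *
   *   // Accepting side: after get_connect_metadata() fills the same 4 bytes:
   *   md = boost::endian::little_to_native(md);
   *   @endcode
   *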
   * @note If a connect() variant without `_with_metadata` in the name was used, then the metadata are
   *       composed of a single byte with the zero value.
   * @param buffer
   *        A buffer to copy the metadata into.
   * @param err_code
   *        See flow::Error_code docs for error reporting semantics.
   * @return The size of the copied metadata.
   */
  size_t get_connect_metadata(const boost::asio::mutable_buffer& buffer,
                              Error_code* err_code = 0) const;

  /**
   * Sends (adds to the Send buffer) the given bytes of data, up to a maximum internal buffer size,
   * and asynchronously sends them to the other side. The data given are copied into `*this`, in the order
   * given. Only as many bytes as possible without the Send buffer size exceeding a certain max are
   * copied.
   *
   * The method does not block. Data are then sent asynchronously (in the background).
   *
   * The method does nothing, except possibly logging, if there are no bytes in `data`.
   *
   * ### Error handling ###
   * These are the possible outcomes.
   *   1. There is no space in the Send buffer (usually due to network congestion). Socket not
   *      Writable. 0 is returned; `*err_code` is set to success unless null; no data buffered.
   *   2. The socket is not yet fully connected (`S_OPEN+S_CONNECTING` state). Socket not
   *      Writable. 0 is returned; `*err_code` is set to success unless null; no data buffered.
   *   3. There is space in the Send buffer, and the socket connection is open (`S_OPEN+S_CONNECTED`).
   *      Socket Writable. >= 1 is returned; `*err_code` is set to success; data buffered.
   *   4. The operation cannot proceed due to an error. 0 is returned; `*err_code` is set to the
   *      specific error unless null; no data buffered. (If `err_code` null, Runtime_error thrown.)
   *
   * The semantics of -3- (the success case) are as follows. N bytes will be copied into the Send
   * buffer from the start of the Const_buffer_sequence `data`. These N bytes may be spread across 1
   * or more buffers in that sequence; the subdivision structure of the sequence of bytes into
   * buffers has no effect on what will be buffered in the Send buffer (e.g., `data` could be N+ 1-byte
   * buffers, or one N+-byte buffer -- the result would be the same). N equals the smaller of: the
   * available space in the Send buffer; and `buffer_size(data)`. We return N.
   *
   * ### Reliability and ordering guarantees: if the socket option rexmit-on is enabled ###
   * Reliability and ordering are guaranteed, and there is no notion of message boundaries. There is no possibility
   * of data duplication. In other words full stream-of-bytes functionality is provided, as in TCP.
   *
   * ### Reliability and ordering guarantees: if the socket option rexmit-on is NOT enabled ###
   * NO reliability guarantees are given, UNLESS *ALL* calls to send() (and other `*send()` methods)
   * satisfy the condition: '`buffer_size(data)` is a multiple of `sock->max_block_size()`'; AND all
   * calls to receive() (and other `*receive()` methods) on the OTHER side satisfy the condition:
   * '`buffer_size(target)` is a multiple of `sock->max_block_size()`.' If and only if these guidelines
   * are followed, and there is no connection closure, the following reliability guarantee is made:
   *
   * Let a "block" be a contiguous chunk of bytes in a "data" buffer sequence immediately following
   * another "block," except the first "block" in a connection, which begins with the first byte of
   * the "data" buffer sequence passed to the first `*send()` call on that connection. Then: Each
   * given block will either be available to `*receive()` on the other side exactly once and without
   * corruption; or not available to `*receive()` at all. Blocks may arrive in a different order than
   * specified here, including with respect to other `*send()` calls performed before or after this
   * one. In other words, these are guaranteed: block boundary preservation, protection against
   * corruption, protection against duplication. These are not guaranteed: order preservation,
   * delivery. Informally, the latter factors are more likely to be present on higher-quality
   * network paths.
   *
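   * For example, a scatter-gather call over two buffers might look as follows. (A sketch only;
   * `sock`, `header`, and `payload` are illustrative names, and error handling is elided.)
   *
   *   @code
   *   std::vector<boost::asio::const_buffer> data;
   *   data.push_back(boost::asio::buffer(header));  // First the header bytes...
   *   data.push_back(boost::asio::buffer(payload)); // ...then the payload bytes, as one stream.
   *   // At most min(free Send buffer space, buffer_size(data)) bytes are accepted.
   *   const size_t n_sent = sock->send(data, &err_code);
   *   @endcode
   *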
   * @tparam Const_buffer_sequence
   *         Type that models the boost.asio `ConstBufferSequence` concept (see Boost docs).
   *         Basically, it's any container with elements convertible to `boost::asio::const_buffer`,
   *         and with bidirectional iterator support. Examples: `vector<const_buffer>`, `list<const_buffer>`.
   *         Why allow `const_buffer` instead of, say, a `Sequence` of bytes? Same reason as boost.asio's
   *         send functions: it allows a great amount of flexibility without sacrificing performance,
   *         since the `boost::asio::buffer()` function can adapt lots of different objects (arrays,
   *         `vector`s, `string`s, and more -- composed of bytes, integers, and more).
   * @param data
   *        Buffer sequence from which a stream of bytes to add to the Send buffer will be obtained.
   * @param err_code
   *        See flow::Error_code docs for error reporting semantics.
   *        Error implies that neither this send() nor any subsequent `*send()` on this socket
   *        will succeed. (In particular a clean disconnect is an error.)
   * @return Number of bytes (possibly zero) added to the buffer. Always 0 if `bool(*err_code) == true` when
   *         send() returns.
   */
  template<typename Const_buffer_sequence>
  size_t send(const Const_buffer_sequence& data, Error_code* err_code = 0);

  /**
   * Blocking (synchronous) version of send(). Acts just like send(), except that if the socket is not
   * immediately Writable (i.e., send() would return 0 and no error), waits until it is Writable
   * (send() would return either >0, or 0 and an error) and returns `send(data, err_code)`. If a
   * timeout is specified, and this timeout expires before the socket is Writable, acts like send()
   * executed on an un-Writable socket.
   *
   * ### Error handling ###
   * These are the possible outcomes (assuming there are data in the argument `data`).
   *   1. There is space in the Send buffer, and the socket connection
   *      is open (`S_OPEN+S_CONNECTED`). Socket Writable. >= 1 is returned; `*err_code` is set to
   *      success unless null; data buffered.
   *   2. The operation cannot proceed due to an error. 0 is returned; `*err_code` is set to the
   *      specific error unless null; no data buffered. (If `err_code` null, Runtime_error thrown.)
   *      The code error::Code::S_WAIT_INTERRUPTED means the wait was interrupted
   *      (similarly to POSIX's `EINTR`).
   *   3. Neither condition above is detected before the timeout expires (if provided).
   *      Output semantics are the same as in 2, with the specific code error::Code::S_WAIT_USER_TIMEOUT.
   *
   * The semantics of -1- (the success case) equal those of send().
   *
   * Note that it is NOT possible to return 0 and no error.
   *
   * Tip: Typical types you might use for `max_wait`: `boost::chrono::milliseconds`,
   * `boost::chrono::seconds`, `boost::chrono::high_resolution_clock::duration`.
   *
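   * For example, a 500 ms bounded send might look as follows. (A sketch; `sock` and `msg` are
   * illustrative names.)
   *
   *   @code
   *   Error_code err_code;
   *   // Wait up to 500 ms for Writable; then buffer as much of msg as fits.
   *   const size_t n_sent = sock->sync_send(boost::asio::buffer(msg),
   *                                         boost::chrono::milliseconds(500), &err_code);
   *   if (err_code == error::Code::S_WAIT_USER_TIMEOUT)
   *   {
   *     // Timed out: socket never became Writable within 500 ms; nothing was buffered.
   *   }
   *   @endcode
   *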
   * @see The version of sync_send() with no timeout.
   * @tparam Rep
   *         See boost::chrono::duration documentation (and see above tip).
   * @tparam Period
   *         See boost::chrono::duration documentation (and see above tip).
   * @tparam Const_buffer_sequence
   *         See send().
   * @param data
   *        See send().
   * @param max_wait
   *        The maximum amount of time from now to wait before giving up on the wait and returning.
   *        `duration<Rep, Period>::max()` will eliminate the time limit and cause an indefinite wait
   *        (i.e., no timeout).
   * @param err_code
   *        See flow::Error_code docs for error reporting semantics.
   *        Error, except `WAIT_INTERRUPTED` or `WAIT_USER_TIMEOUT`, implies that
   *        neither this sync_send() nor any subsequent `*send()` on this socket
   *        will succeed. (In particular a clean disconnect is an error.)
   * @return Number of bytes (possibly zero) added to the Send buffer. Always 0 if `bool(*err_code) == true`
   *         when sync_send() returns.
   */
  template<typename Rep, typename Period, typename Const_buffer_sequence>
  size_t sync_send(const Const_buffer_sequence& data,
                   const boost::chrono::duration<Rep, Period>& max_wait, Error_code* err_code = 0);

  /**
   * sync_send() operating in `null_buffers` mode, wherein -- if Writable state is reached -- the actual data
   * are not moved out of any buffer, leaving that to the caller to do if desired. Hence, this is a way of waiting
   * for Writable state that can be more concise in some situations than Event_set::sync_wait().
   *
   * ### Error handling ###
   * These are the possible outcomes:
   *   1. There is space in the Send buffer, and the socket is fully connected
   *      (`S_OPEN+S_CONNECTED`). Socket Writable. `true` is returned; `*err_code` is set to success
   *      unless null.
   *   2. The operation cannot proceed due to an error. `false` is returned; `*err_code` is set to the
   *      specific error unless null. `*err_code == S_WAIT_INTERRUPTED` means the wait was
   *      interrupted (similarly to POSIX's `EINTR`). (If `err_code` null, Runtime_error thrown.)
   *   3. Neither condition above is detected before the timeout expires (if provided).
   *      Output semantics are the same as in 2, with the specific code error::Code::S_WAIT_USER_TIMEOUT.
   *
   * Note that it is NOT possible to return `false` and no error.
   *
   * Tip: Typical types you might use for `max_wait`: `boost::chrono::milliseconds`,
   * `boost::chrono::seconds`, `boost::chrono::high_resolution_clock::duration`.
   *
   * @tparam Rep
   *         See other sync_send().
   * @tparam Period
   *         See other sync_send().
   * @param max_wait
   *        See other sync_send().
   * @param err_code
   *        See flow::Error_code docs for error reporting semantics.
   *        Error, except `WAIT_INTERRUPTED` or `WAIT_USER_TIMEOUT`, implies that
   *        neither this nor any subsequent send() on this socket
   *        will succeed. (In particular a clean disconnect is an error.)
   * @return `true` if 1+ bytes can be added to the Send buffer; `false` if either a timeout has occurred (bytes
   *         not writable), or another error has occurred.
   */
  template<typename Rep, typename Period>
  bool sync_send(const boost::asio::null_buffers&,
                 const boost::chrono::duration<Rep, Period>& max_wait, Error_code* err_code = 0);

  /**
   * Equivalent to `sync_send(data, duration::max(), err_code)`; i.e., sync_send() with no timeout.
   *
   * @tparam Const_buffer_sequence
   *         See other sync_send().
   * @param data
   *        See other sync_send().
   * @param err_code
   *        See other sync_send().
   * @return See other sync_send().
   */
  template<typename Const_buffer_sequence>
  size_t sync_send(const Const_buffer_sequence& data, Error_code* err_code = 0);

  /**
   * Equivalent to `sync_send(null_buffers(), duration::max(), err_code)`; i.e., `sync_send(null_buffers)`
   * with no timeout.
   *
   * @param err_code
   *        See other sync_send().
   * @return See other sync_send().
   */
  bool sync_send(const boost::asio::null_buffers&, Error_code* err_code = 0);

  /**
   * Receives (consumes from the Receive buffer) bytes of data, up to a given maximum
   * cumulative number of bytes as inferred from the size of the provided target buffer sequence. The data
   * are copied into the user's structure and then removed from the Receive buffer.
   *
   * The method does not block. In particular, if there are no data already received from the other
   * side, we return no data.
   *
   * If the provided buffer has size zero, the method is a no-op other than possibly logging.
   *
   * ### Error handling ###
   * These are the possible outcomes.
   *   1. There are no data in the Receive buffer. Socket not Readable. 0 is returned;
   *      `*err_code` is set to success unless null; no data returned.
   *   2. The socket is not yet fully connected (`S_OPEN+S_CONNECTING`). Socket not
   *      Readable. 0 is returned; `*err_code` is set to success unless null; no data returned.
   *   3. There are data in the Receive buffer, and the socket is fully connected (`S_OPEN+S_CONNECTED`)
   *      or gracefully shutting down (`S_OPEN+S_DISCONNECTING`). Socket Readable. >= 1 is returned;
   *      `*err_code` is set to success; data returned.
   *   4. The operation cannot proceed due to an error. 0 is returned; `*err_code` is set to the
   *      specific error; no data returned. (If `err_code` null, Runtime_error thrown.)
   *
   * The semantics of -3- (the success case) are as follows. N bytes will be copied from the Receive
   * buffer beginning at the start of the Mutable_buffer_sequence `target`. These N bytes may be
   * spread across 1 or more buffers in that sequence; the subdivision structure of the sequence of
   * bytes into buffers has no effect on the bytes, or order thereof, that will be moved from the
   * Receive buffer (e.g., `target` could be N+ 1-byte buffers, or one N+-byte buffer
   * -- the popped Receive buffer would be the same, as would be the extracted bytes). N equals the
   * smaller of: the available bytes in the Receive buffer; and `buffer_size(target)`. We return N.
   *
   * ### Reliability and ordering guarantees ###
   * See the send() doc header.
   *
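   * For example, receiving into two separate arrays as one sequence might look as follows. (A sketch;
   * `sock`, `head`, and `body` are illustrative names, and error handling is elided.)
   *
   *   @code
   *   std::array<uint8_t, 16> head;
   *   std::array<uint8_t, 4096> body;
   *   std::vector<boost::asio::mutable_buffer> target;
   *   target.push_back(boost::asio::buffer(head)); // The first 16 bytes land here...
   *   target.push_back(boost::asio::buffer(body)); // ...and any further bytes here.
   *   // At most min(bytes in Receive buffer, buffer_size(target)) bytes are consumed.
   *   const size_t n_rcvd = sock->receive(target, &err_code);
   *   @endcode
   *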
558 * @tparam Mutable_buffer_sequence
559 * Type that models the boost.asio `MutableBufferSequence` concept (see Boost docs).
560 * Basically, it's any container with elements convertible to `boost::asio::mutable_buffer`;
561 * and bidirectional iterator support. Examples: `vector<mutable_buffer>`,
562 * `list<mutable_buffer>`. Why allow `mutable_buffer` instead of, say, `Sequence` of bytes?
563 * Same reason as boost.asio's receive functions: it allows a great amount of flexibility
564 * without sacrificing performance, since `boost::asio::buffer()` function can adapt lots of
565 * different objects (arrays, `vector`s, `string`s, and more of bytes, integers, and more).
566 * @param target
567 * Buffer sequence to which a stream of bytes to consume from Receive buffer will be
568 * written.
569 * @param err_code
570 * See flow::Error_code docs for error reporting semantics.
571 * Error implies that neither this receive() nor any subsequent receive() on this socket
572 * will succeed. (In particular a clean disconnect is an error.)
573 * @return The number of bytes consumed (placed into `target`). Always 0 if `bool(*err_code) == true`
574 * when receive() returns.
575 */
576 template<typename Mutable_buffer_sequence>
577 size_t receive(const Mutable_buffer_sequence& target, Error_code* err_code = 0);
578
579 /**
580 * Blocking (synchronous) version of receive(). Acts just like receive(), except that if socket
581 * is not immediately Readable (i.e., receive() would return 0 and no error), waits until it is
582 * Readable (receive() would return either >0, or 0 and an error) and returns
583 * `receive(target, err_code)`. If a timeout is specified, and this timeout expires before socket is
584 * Readable, it acts as if receive() produced error::Code::S_WAIT_USER_TIMEOUT.
585 *
586 * ### Error handling ###
587 * These are the possible outcomes:
588 * 1. There are data in the Receive buffer; and socket is fully connected
589 * (`S_OPEN+S_CONNECTED`) or gracefully shutting down (`S_OPEN+S_DISCONNECTING`). Socket
590 * Readable. >= 1 is returned; `*err_code` is set to success unless null; data returned.
591 * 2. The operation cannot proceed due to an error. 0 is returned; `*err_code` is set to the
592 * specific error unless null; no data buffered. `*err_code == S_WAIT_INTERRUPTED` means the wait was
593 * interrupted (similarly to POSIX's `EINTR`). (If `err_code` null, Runtime_error thrown.)
594 * 3. Neither condition above is detected before the timeout expires (if provided).
595 * Output semantics are the same as in 2, with the specific code error::Code::S_WAIT_USER_TIMEOUT.
596 *
597 * The semantics of -1- (the success case) equal those of receive().
598 *
599 * Note that it is NOT possible to return 0 and no error.
600 *
601 * Tip: Typical types you might use for `max_wait`: `boost::chrono::milliseconds`,
602 * `boost::chrono::seconds`, `boost::chrono::high_resolution_clock::duration`.
603 *
604 * @see The version of sync_receive() with no timeout.
605 * @tparam Rep
606 * See `boost::chrono::duration` documentation (and see above tip).
607 * @tparam Period
608 * See `boost::chrono::duration` documentation (and see above tip).
609 * @tparam Mutable_buffer_sequence
610 * See receive().
611 * @param target
612 * See receive().
613 * @param max_wait
614 * The maximum amount of time from now to wait before giving up on the wait and returning.
615 * `"duration<Rep, Period>::max()"` will eliminate the time limit and cause indefinite wait
616 * (i.e., no timeout).
617 * @param err_code
618 * See flow::Error_code docs for error reporting semantics.
619 * Error, except `WAIT_INTERRUPTED` or `WAIT_USER_TIMEOUT`, implies that
620 * neither this receive() nor any subsequent receive() on this socket
621 * will succeed. (In particular a clean disconnect is an error.)
622 * @return Number of bytes (possibly zero) added to target. Always 0 if `bool(*err_code) == true` when
623 * sync_receive() returns.
624 */
625 template<typename Rep, typename Period, typename Mutable_buffer_sequence>
626 size_t sync_receive(const Mutable_buffer_sequence& target,
627 const boost::chrono::duration<Rep, Period>& max_wait, Error_code* err_code = 0);
628
629 /**
630 * `sync_receive()` operating in `null_buffers` mode, wherein -- if Readable state is reached -- the actual data
631 * are not moved into any buffer, leaving that to the caller to do if desired. Hence, this is a way of waiting
632 * for Readable state that could be more concise in some situations than Event_set::sync_wait().
633 *
634 * ### Error handling ###
635 * These are the possible outcomes:
636 * 1. There are data in the Receive buffer; and socket is fully connected
637 * (`S_OPEN+S_CONNECTED`) or gracefully shutting down (`S_OPEN+S_DISCONNECTING`). Socket
638 * Readable. `true` is returned; `*err_code` is set to success unless null.
639 * 2. The operation cannot proceed due to an error. `false` is returned; `*err_code` is set to the
640 * specific error unless null. `*err_code == S_WAIT_INTERRUPTED` means the wait was
641 * interrupted (similarly to POSIX's `EINTR`). (If `err_code` null, Runtime_error thrown.)
642 * 3. Neither condition above is detected before the timeout expires (if provided).
643 * Output semantics are the same as in 2, with the specific code error::Code::S_WAIT_USER_TIMEOUT.
644 *
645 * Note that it is NOT possible to return `false` and no error.
646 *
647 * Tip: Typical types you might use for `max_wait`: `boost::chrono::milliseconds`,
648 * `boost::chrono::seconds`, `boost::chrono::high_resolution_clock::duration`.
649 *
650 * @tparam Rep
651 * See other sync_receive().
652 * @tparam Period
653 * See other sync_receive().
654 * @param max_wait
655 * See other sync_receive().
656 * @param err_code
657 * See flow::Error_code docs for error reporting semantics.
658 * Error, except `WAIT_INTERRUPTED` or `WAIT_USER_TIMEOUT`, implies that
659 * neither this nor any subsequent receive() on this socket
660 * will succeed. (In particular a clean disconnect is an error.)
661 * @return `true` if there are 1+ bytes ready to read; `false` if either a timeout has occurred (no bytes ready), or
662 * another error has occurred.
663 */
664 template<typename Rep, typename Period>
665 bool sync_receive(const boost::asio::null_buffers&,
666 const boost::chrono::duration<Rep, Period>& max_wait, Error_code* err_code = 0);
667
668 /**
669 * Equivalent to `sync_receive(target, duration::max(), err_code)`; i.e., sync_receive()
670 * with no timeout.
671 *
672 * @tparam Mutable_buffer_sequence
673 * See other sync_receive().
674 * @param target
675 * See other sync_receive().
676 * @param err_code
677 * See other sync_receive().
678 * @return See other sync_receive().
679 */
680 template<typename Mutable_buffer_sequence>
681 size_t sync_receive(const Mutable_buffer_sequence& target, Error_code* err_code = 0);
682
683 /**
684 * Equivalent to `sync_receive(null_buffers(), duration::max(), err_code)`; i.e., `sync_receive(null_buffers)`
685 * with no timeout.
686 *
687 * @param err_code
688 * See other sync_receive().
689 * @param tag
690 * Tag argument.
691 * @return See other sync_receive().
692 */
693 bool sync_receive(const boost::asio::null_buffers&, Error_code* err_code = 0);
694
695 /**
696 * Acts as if fatal error error::Code::S_USER_CLOSED_ABRUPTLY has been discovered on the
697 * connection. Does not block.
698 *
699 * Post-condition: `state() == State::S_CLOSED`. Additionally, assuming no loss on the
700 * network, the other side will close the connection with error
701 * error::Code::S_CONN_RESET_BY_OTHER_SIDE.
702 *
703 * Note: Discovering a fatal error on the connection would trigger all event waits on this socket
704 * (sync_send(), sync_receive(), Event_set::sync_wait(), Event_set::async_wait()) to execute on-event
705 * behavior (return, return, return, invoke handler, respectively). Therefore this method will cause
706 * just that, if applicable.
707 *
708 * Note: As a corollary, a socket closing this way (or any other way) does NOT cause that socket's
709 * events (if any) to be removed from any Event_set objects. Clearing an Event_set of all or some
710 * sockets is the Event_set user's responsibility (the classic way being Event_set::close()).
711 *
712 * @warning The moment the other side is informed we have abruptly closed the connection, they
713 * will no longer be able to receive() any of it (even if data had been queued up in
714 * their Receive buffer).
715 *
716 * @todo Currently this close_abruptly() is the only way for the user to explicitly close one specified socket.
717 * All other ways are due to error (or other side starting graceful shutdown, once we
718 * implement that). Once we implement graceful close, via `close_start()` and `close_final()`,
719 * use of close_abruptly() should be discouraged, or it may even be deprecated (e.g.,
720 * `Node`s lack a way to initiate an abrupt close for a specific socket).
721 *
722 * @todo close_abruptly() return `bool` (`false` on failure)?
723 *
724 * @param err_code
725 * See flow::Error_code docs for error reporting semantics. Generated codes:
726 * error::Code::S_NODE_NOT_RUNNING, or -- if socket already closed (`state() == State::S_CLOSED`) --
727 * then the error that caused the closure.
728 */
729 void close_abruptly(Error_code* err_code = 0);
730
731 /**
732 * Dynamically replaces the current options set (options()) with the given options set.
733 * Only those members of `opts` designated as dynamic (as opposed to static) may be different
734 * between options() and `opts`. If this is violated, it is an error, and no options are changed.
735 *
736 * Typically one would acquire a copy of the existing options set via options(), modify the
737 * desired dynamic data members of that copy, and then apply that copy back by calling
738 * set_options().
739 *
740 * @param opts
741 * The new options to apply to this socket. It is copied; no reference is saved.
742 * @param err_code
743 * See flow::Error_code docs for error reporting semantics. Generated codes:
744 * error::Code::S_STATIC_OPTION_CHANGED, error::Code::S_OPTION_CHECK_FAILED,
745 * error::Code::S_NODE_NOT_RUNNING.
746 * @return `true` on success, `false` on error.
747 */
748 bool set_options(const Peer_socket_options& opts, Error_code* err_code = 0);
749
750 /**
751 * Copies this socket's option set and returns that copy. If you intend to use set_options() to
752 * modify a socket's options, we recommend you make the modifications on the copy returned by
753 * options().
754 *
755 * @todo Provide a similar options() method that loads an existing structure (for structure
756 * reuse).
757 *
758 * @return See above.
759 */
760 Peer_socket_options options() const;
761
762 /**
763 * Returns a structure containing the most up-to-date stats about this connection.
764 *
765 * @note At the cost of reducing locking overhead in 99.999999% of the Peer_socket's operation,
766 * this method may take a bit of time to run. It's still probably only 10 times or so slower than
767 * a simple lock, work, unlock -- there is a condition variable and stuff involved -- but this may
768 * matter if done very frequently. So you probably should not. (Hmmm... where did I get these estimates,
769 * namely "10 times or so"?)
770 *
771 * @todo Provide a similar info() method that loads an existing structure (for structure
772 * reuse).
773 *
774 * @return See above.
775 */
776 Peer_socket_info info() const;
777
778 /**
779 * The maximum number of bytes of user data per received or sent packet on this connection. See
780 * Peer_socket_options::m_st_max_block_size. Note that this method is ESSENTIAL when using the
781 * socket in unreliable mode (assuming you want to implement reliability outside of `net_flow`).
782 *
783 * @return Ditto.
784 */
785 size_t max_block_size() const;
786
787 /**
788 * The error code that previously caused state() to become State::S_CLOSED, or success code if state
789 * is not CLOSED. For example, error::Code::S_CONN_RESET_BY_OTHER_SIDE (if was connected) or
790 * error::Code::S_CONN_TIMEOUT (if was connecting).
791 *
792 * @return Ditto.
793 */
794 const Error_code& disconnect_cause() const;
795
796protected:
797 // Constructors/destructor.
798
799 /**
800 * Constructs object; initializes most values to well-defined (0, empty, etc.) but not necessarily
801 * meaningful values.
802 *
803 * @param logger_ptr
804 * The Logger implementation to use subsequently.
805 * @param task_engine
806 * IO service for the timer(s) stored as data member(s).
807 * @param opts
808 * The options set to copy into this Peer_socket and use subsequently.
809 */
810 explicit Peer_socket(log::Logger* logger_ptr,
811 util::Task_engine* task_engine,
812 const Peer_socket_options& opts);
813
814private:
815 // Friends.
816
817 /**
818 * See rationale for `friend`ing Node in class Peer_socket documentation header.
819 * @see Node.
820 */
821 friend class Node;
822 /**
823 * See rationale for `friend`ing Server_socket in class Peer_socket documentation header.
824 * @see Server_socket.
825 */
826 friend class Server_socket;
827 /**
828 * For access to `Sent_pkt_by_sent_when_map` and Sent_packet types, at least.
829 * (Drop_timer has no actual Peer_socket instance to mess with.)
830 */
831 friend class Drop_timer;
832 /**
833 * Stats modules have const access to all socket internals.
834 * @see Send_bandwidth_estimator.
835 */
836 friend class Send_bandwidth_estimator;
837 /**
838 * Congestion control modules have const access to all socket internals.
839 * @see Congestion_control_classic_data.
840 */
841 friend class Congestion_control_classic_data;
842 /**
843 * Congestion control modules have const access to all socket internals.
844 * @see Congestion_control_classic.
845 */
846 friend class Congestion_control_classic;
847 /**
848 * Congestion control modules have const access to all socket internals.
849 * @see Congestion_control_classic_with_bandwidth_est.
850 */
851 friend class Congestion_control_classic_with_bandwidth_est;
852
853 // Types.
854
855 /// Short-hand for `shared_ptr` to immutable Drop_timer (can't use Drop_timer::Ptr due to C++ and circular reference).
856 using Drop_timer_ptr = boost::shared_ptr<Drop_timer>;
857
858 /**
859 * Short-hand for high-performance, non-reentrant, exclusive mutex used to lock #m_opts.
860 *
861 * ### Rationale ###
862 * You might notice this seems tailor-made for shared/exclusive (a/k/a multiple-readers-single-writer) mutex.
863 * Why a 2-level mutex instead of a normal exclusive mutex? Because options can be accessed by
864 * thread W and various user threads, in the vast majority of the time to read option values. On
865 * the other hand, rarely, #m_opts may be modified via set_options(). To avoid thread contention
866 * when no one is writing (which is usual), we could use that 2-level type of mutex and apply the appropriate
867 * (shared or unique) lock depending on the situation. So why not? Answer:
868 * While a shared/exclusive mutex sounds lovely in theory -- and perhaps
869 * if its implementation were closer to the hardware it would be lovely indeed -- in practice it seems its
870 * implementation just causes performance problems rather than solving them. Apparently that's why
871 * it was rejected by C++11 standards people w/r/t inclusion in that standard. The people involved
872 * explained their decision here: http://permalink.gmane.org/gmane.comp.lib.boost.devel/211180.
873 * So until that is improved, just do this. I'm not even adding a to-do for fixing this, as that seems
874 * unlikely anytime soon. Update: C++17 added `std::shared_mutex`, and C++14 added the similar
875 * `std::shared_timed_mutex`. Seems like a good time to revisit this -- if not to materially improve #Options_mutex
876 * performance then to gain up-to-date knowledge on the topic, specifically whether `shared_mutex` is fast now.
877 * Update: Apparently as of Boost-1.80 the Boost.thread impl of `shared_mutex` is lacking in perf, and there
878 * is a ticket filed for many years for this. Perhaps gcc `std::shared_mutex` is fine. However research
879 * suggests it's less about this nitty-gritty of various impls and more the following bottom line:
880 * A simple mutex is *very* fast to lock/unlock, and perf problems occur only if one must wait for a lock.
881 * Experts say that it is possible but quite rare that there is enough lock contention to make it "worth it":
882 * a shared mutex is *much* slower to lock/unlock sans contention. Only when the read critical sections are
883 * long and very frequently accessed does it become "worth it."
884 */
886
887 /// Short-hand for lock that acquires exclusive access to an #Options_mutex.
889
890 /**
891 * Short-hand for reentrant mutex type. We explicitly rely on reentrant behavior, so this isn't "just in case."
892 *
893 * @todo This doc header for Peer_socket::Mutex should specify what specific behavior requires mutex reentrance, so
894 * that for example one could reevaluate whether there's a sleeker code pattern that would avoid it.
895 */
897
898 /// Short-hand for RAII lock guard of #Mutex.
900
901 /// Type used for #m_security_token.
902 using security_token_t = uint64_t;
903
904 /// Short-hand for order number type. 0 is reserved. Caution: Keep in sync with Drop_timer::packet_id_t.
906
907 /**
908 * The state of the socket (and the connection from this end's point of view) for the internal state
909 * machine governing the operation of the socket.
910 *
911 * @todo Peer_socket::Int_state will also include various states on way to a graceful close, once we implement that.
912 */
913 enum class Int_state
914 {
915 /// Closed (dead or new) socket.
916 S_CLOSED,
917
918 /**
919 * Public state is OPEN+CONNECTING; user requested active connect; we sent SYN and are
920 * awaiting response.
921 */
922 S_SYN_SENT,
923
924 /**
925 * Public state is OPEN+CONNECTING; other side requested passive connect via SYN; we sent
926 * SYN_ACK and are awaiting response.
927 */
928 S_SYN_RCVD,
929
930 /// Public state is OPEN+CONNECTED; in our opinion the connection is established.
931 S_ESTABLISHED
932 }; // enum class Int_state
933
934 // Friend of Peer_socket: For access to private alias Int_state.
935 friend std::ostream& operator<<(std::ostream& os, Int_state state);
936
937 struct Sent_packet;
938
939 /// Short-hand for #m_snd_flying_pkts_by_sent_when type; see that data member.
941
942 /// Short-hand for #m_snd_flying_pkts_by_sent_when `const` iterator type.
944
945 /// Short-hand for #m_snd_flying_pkts_by_sent_when iterator type.
947
948 /// Short-hand for #m_snd_flying_pkts_by_seq_num type; see that data member.
949 using Sent_pkt_by_seq_num_map = std::map<Sequence_number, Sent_pkt_ordered_by_when_iter>;
950
951 /// Short-hand for #m_snd_flying_pkts_by_seq_num `const` iterator type.
952 using Sent_pkt_ordered_by_seq_const_iter = Sent_pkt_by_seq_num_map::const_iterator;
953
954 /// Short-hand for #m_snd_flying_pkts_by_seq_num iterator type.
955 using Sent_pkt_ordered_by_seq_iter = Sent_pkt_by_seq_num_map::iterator;
956
957 struct Received_packet;
958
959 /**
960 * Short-hand for #m_rcv_packets_with_gaps type; see that data member. `struct`s are stored via
961 * shared pointers instead of as direct objects to minimize copying of potentially heavy-weight
962 * data. They are stored as shared pointers instead of as raw pointers to avoid having to
963 * worry about `delete`.
964 */
965 using Recvd_pkt_map = std::map<Sequence_number, boost::shared_ptr<Received_packet>>;
966
967 /// Short-hand for #m_rcv_packets_with_gaps `const` iterator type.
968 using Recvd_pkt_const_iter = Recvd_pkt_map::const_iterator;
969
970 /// Short-hand for #m_rcv_packets_with_gaps iterator type.
971 using Recvd_pkt_iter = Recvd_pkt_map::iterator;
972
973 struct Individual_ack;
974
975 /**
976 * Type used for #m_rcv_syn_rcvd_data_q. Using `vector` because we only need `push_back()` and
977 * iteration at the moment. Using pointer to non-`const` instead of `const` because when we actually handle the
978 * packet as received we will need to be able to modify the packet for performance (see
979 * Node::handle_data_to_established(), when it transfers data to Receive buffer).
980 */
981 using Rcv_syn_rcvd_data_q = std::vector<boost::shared_ptr<Data_packet>>;
982
983 /* Methods: as few as possible. Logic should reside in class Node as discussed (though
984 * implementation may be in peer_socket.cpp). See rationale in class Peer_socket
985 * documentation header. */
986
987 /**
988 * Non-template helper for template send() that forwards the send() logic to Node::send(). Would
989 * be pointless to try to explain more here; see code and how it's used. Anyway, this has to be
990 * in this class.
991 *
992 * @param snd_buf_feed_func
993 * Function that will perform and return `m_snd_buf->feed(...)`. See send().
994 * @param err_code
995 * See send().
996 * @return Value to be returned by calling Node::send().
997 */
998 size_t node_send(const Function<size_t (size_t max_data_size)>& snd_buf_feed_func,
999 Error_code* err_code);
1000
1001 /**
1002 * Same as sync_send() but uses a #Fine_clock-based #Fine_duration non-template type for
1003 * implementation convenience and to avoid code bloat to specify timeout.
1004 *
1005 * @tparam Const_buffer_sequence
1006 * See sync_send().
1007 * @param data
1008 * See sync_send().
1009 * @param wait_until
1010 * See `sync_send(timeout)`. This is the absolute time point corresponding to that.
1011 * `"duration<Rep, Period>::max()"` maps to the value `Fine_time_pt()` (Epoch) for this argument.
1012 * @param err_code
1013 * See sync_send().
1014 * @return See sync_send().
1015 */
1016 template<typename Const_buffer_sequence>
1017 size_t sync_send_impl(const Const_buffer_sequence& data, const Fine_time_pt& wait_until,
1018 Error_code* err_code);
1019
1020 /**
1021 * Helper similar to sync_send_impl() but for the `null_buffers` versions of `sync_send()`.
1022 *
1023 * @param wait_until
1024 * See sync_send_impl().
1025 * @param err_code
1026 * See sync_send_impl().
1027 * @return See `sync_send(null_buffers)`. `true` if and only if Writable status successfully reached in time.
1028 */
1029 bool sync_send_reactor_pattern_impl(const Fine_time_pt& wait_until, Error_code* err_code);
1030
1031 /**
1032 * This is to sync_send() as node_send() is to send().
1033 *
1034 * @param snd_buf_feed_func_or_empty
1035 * See node_send(). Additionally, if this is `.empty()` then `null_buffers` a/k/a "reactor pattern" mode is
1036 * engaged.
1037 * @param wait_until
1038 * See sync_send_impl().
1039 * @param err_code
1040 * See sync_send().
1041 * @return See sync_send().
1042 */
1043 size_t node_sync_send(const Function<size_t (size_t max_data_size)>& snd_buf_feed_func_or_empty,
1044 const Fine_time_pt& wait_until,
1045 Error_code* err_code);
1046
1047 /**
1048 * Non-template helper for template receive() that forwards the receive() logic to
1049 * Node::receive(). Would be pointless to try to explain more here; see code and how it's used.
1050 * Anyway, this has to be in this class.
1051 *
1052 * @param rcv_buf_consume_func
1053 * Function that will perform and return `m_rcv_buf->consume(...)`. See receive().
1054 * @param err_code
1055 * See receive().
1056 * @return Value to be returned by calling Node::receive().
1057 */
1058 size_t node_receive(const Function<size_t ()>& rcv_buf_consume_func,
1059 Error_code* err_code);
1060
1061 /**
1062 * Same as sync_receive() but uses a #Fine_clock-based #Fine_duration non-template type
1063 * for implementation convenience and to avoid code bloat to specify timeout.
1064 *
1065 * @tparam Mutable_buffer_sequence
1066 * See sync_receive().
1067 * @param target
1068 * See sync_receive().
1069 * @param wait_until
1070 * See `sync_receive(timeout)`. This is the absolute time point corresponding to that.
1071 * `"duration<Rep, Period>::max()"` maps to the value `Fine_time_pt()` (Epoch) for this argument.
1072 * @param err_code
1073 * See sync_receive().
1074 * @return See sync_receive().
1075 */
1076 template<typename Mutable_buffer_sequence>
1077 size_t sync_receive_impl(const Mutable_buffer_sequence& target,
1078 const Fine_time_pt& wait_until, Error_code* err_code);
1079
1080 /**
1081 * Helper similar to sync_receive_impl() but for the `null_buffers` versions of `sync_receive()`.
1082 *
1083 * @param wait_until
1084 * See sync_receive_impl().
1085 * @param err_code
1086 * See sync_receive_impl().
1087 * @return See `sync_receive(null_buffers)`. `true` if and only if Readable status successfully reached in time.
1088 */
1089 bool sync_receive_reactor_pattern_impl(const Fine_time_pt& wait_until, Error_code* err_code);
1090
1091 /**
1092 * This is to sync_receive() as node_receive() is to receive().
1093 *
1094 * @param rcv_buf_consume_func_or_empty
1095 * See node_receive(). Additionally, if this is `.empty()` then `null_buffers` a/k/a "reactor pattern" mode is
1096 * engaged.
1097 * @param wait_until
1098 * See sync_receive_impl().
1099 * @param err_code
1100 * See sync_receive().
1101 * @return See sync_receive().
1102 */
1103 size_t node_sync_receive(const Function<size_t ()>& rcv_buf_consume_func_or_empty,
1104 const Fine_time_pt& wait_until,
1105 Error_code* err_code);
1106
1107 /**
1108 * Analogous to Node::opt() but for per-socket options. See that method.
1109 *
1110 * Do NOT read option values without opt().
1111 *
1112 * @tparam Opt_type
1113 * The type of the option data member.
1114 * @param opt_val_ref
1115 * A reference (important!) to the value you want; this may be either a data member of
1116 * this->m_opts or the entire this->m_opts itself.
1117 * @return A copy of the value at the given reference.
1118 */
1119 template<typename Opt_type>
1120 Opt_type opt(const Opt_type& opt_val_ref) const;
1121
1122 /**
1123 * Returns the smallest multiple of max_block_size() that is >= the given option value, optionally
1124 * first inflated by a certain %. The intended use case is to obtain a Send or Receive buffer max
1125 * size that is about equal to the user-specified (or otherwise obtained) value, in bytes, but is
1126 * a multiple of max-block-size -- to prevent fragmenting max-block-size-sized chunks of data unnecessarily -- and
1127 * to possibly inflate that value by a certain percentage for subtle flow control reasons.
1128 *
1129 * @param opt_val_ref
1130 * A reference to a `size_t`-sized socket option, as would be passed to opt(). See opt().
1131 * This is the starting value.
1132 * @param inflate_pct_val_ptr
1133 * A pointer to an `unsigned int`-sized socket option, as would be passed to opt(). See
1134 * opt(). This is the % by which to inflate opt_val_ref before rounding up to nearest
1135 * max_block_size() multiple. If null, the % is assumed to be 0.
1136 * @return See above.
1137 */
1138 size_t max_block_size_multiple(const size_t& opt_val_ref,
1139 const unsigned int* inflate_pct_val_ptr = 0) const;
1140
1141 /**
1142 * Whether retransmission is enabled on this connection. Short-hand for appropriate opt() call.
1143 * Note this always returns the same value for a given object.
1144 *
1145 * @return Ditto.
1146 */
1147 bool rexmit_on() const;
1148
1149 /**
1150 * Helper that is equivalent to Node::ensure_sock_open(this, err_code). Used by templated
1151 * methods which must be defined in this header file, which means they cannot access Node members
1152 * directly, as Node is an incomplete type.
1153 *
1154 * @param err_code
1155 * See Node::ensure_sock_open().
1156 * @return See Node::ensure_sock_open().
1157 */
1158 bool ensure_open(Error_code* err_code) const;
1159
1160 /**
1161 * Helper that, given a byte count, returns a string with that byte count and the number of
1162 * max_block_size()-size blocks that can fit within it (rounded down).
1163 *
1164 * @param bytes The byte count to express.
1165 * @return See above.
1166 */
1167 std::string bytes_blocks_str(size_t bytes) const;
1168
1169 // Data.
1170
1171 /**
1172 * This socket's per-socket set of options. Initialized at construction; can be subsequently
1173 * modified by set_options(), although only the dynamic members of this may be modified.
1174 *
1175 * Accessed from thread W and user thread U != W. Protected by #m_opts_mutex. When reading, do
1176 * NOT access without locking (which is encapsulated in opt()).
1177 */
1178 Peer_socket_options m_opts;
1179
1180 /// The mutex protecting #m_opts.
1181 mutable Options_mutex m_opts_mutex;
1182
1183 /// `true` if we connect() to server; `false` if we are to be/are `accept()`ed. Should be set once and not modified.
1185
1186 /**
1187 * See state(). Should be set before user gets access to `*this`. Must not be modified by non-W threads after that.
1188 *
1189 * Accessed from thread W and user threads U != W (in state() and others). Must be protected
1190 * by mutex.
1191 */
1193
1194 /**
1195 * See state(). Should be set before user gets access to `*this`. Must not be modified by non-W
1196 * threads after that.
1197 *
1198 * Accessed from thread W and user threads U != W (in state() and others). Must be protected by
1199 * mutex.
1200 */
1202
1203 /**
1204 * See node(). Should be set before user gets access to `*this` and not changed, except to null when
1205 * state is State::S_CLOSED. Must not be modified by non-W threads.
1206 *
1207 * Invariant: `x->node() == y` if and only if `y->m_socks` contains `x`; otherwise `!x->node()`.
1208 * The invariant must hold by the end of the execution of any thread W boost.asio handler (but
1209 * not necessarily at all points within that handler, or generally).
1210 *
1211 * Accessed from thread W and user threads U != W (in node() and others). Must be protected by
1212 * mutex.
1213 *
1214 * @todo `boost::weak_ptr<Node>` would be ideal for this, but of course then Node would have to
1215 * (only?) be available via shared_ptr<>.
1216 */
1218
1219 /**
1220 * For sockets that come from a Server_socket, this is the inverse of Server_socket::m_connecting_socks: it is
1221 * the Server_socket from which this Peer_socket will be Server_socket::accept()ed (if that succeeds); or null if
1222 * this is an actively-connecting Peer_socket or has already been `accept()`ed.
1223 *
1224 * More formally, this is null if #m_active_connect; null if not the case but already accept()ed; and otherwise:
1225 * `((y->m_connecting_socks contains x) || (y->m_unaccepted_socks contains x))` if and only if
1226 * `x->m_originating_serv == y`. That is, for a socket in state Int_state::S_SYN_RCVD, or in state
1227 * Int_state::S_ESTABLISHED, but before being accept()ed by the user, this is the server socket that spawned this
1228 * peer socket.
1229 *
1230 * ### Thread safety ###
1231 * This can be write-accessed simultaneously by thread W (e.g., when closing a
1232 * socket before it is accepted) and a user thread U != W (in Server_socket::accept()). It is
1233 * thus protected by a mutex -- but it's Server_socket::m_mutex, not Peer_socket::m_mutex. I
1234 * know it's weird, but it makes sense. Basically Server_socket::m_unaccepted_socks and
1235 * Server_socket::m_originating_serv -- for each element of `m_unaccepted_socks` -- are modified together
1236 * in a synchronized way.
1237 *
1238 * @see Server_socket::m_connecting_socks and Server_socket::m_unaccepted_socks for the closely
1239 * related inverse.
1240 */
1242
1243 /**
1244 * The Receive buffer; Node feeds data at the back; user consumes data at the front. Contains
1245 * application-layer data received from the other side, to be read by user via receive() and
1246 * similar.
1247 *
1248 * A maximum cumulative byte count is maintained. If data are received that would exceed this max
1249 * (i.e., the user is not retrieving the data fast enough to keep up), these data are dropped (and
1250 * if we use ACKs would be eventually treated as dropped by the other side).
1251 *
1252 * Note that this is a high-level structure, near the application layer. This does not store any
1253 * metadata, like sequence numbers, or data not ready to be consumed by the user (such as
1254 * out-of-order packets, if we implement that). Such packets and data should be stored elsewhere.
1255 *
1256 * ### Thread safety ###
1257 * This can be write-accessed simultaneously by thread W (when receiving by Node)
1258 * and a user thread U != W (in receive(), etc.). It is thus protected by a mutex.
1259 */
1261
1262 /**
1263 * The Send buffer; user feeds data at the back; Node consumes data at the front. Contains
1264 * application-layer data to be sent to the other side as supplied by user via send() and friends.
1265 *
1266 * A maximum cumulative byte count is maintained. If data are supplied that would exceed
1267 * this max (i.e., the Node is not sending the data fast enough to keep up), send() will
1268 * inform the caller that fewer bytes than intended have been buffered. Typically this happens if
1269 * the congestion control window is full, so data are getting buffered in #m_snd_buf instead of
1270 * being immediately consumed and sent.
1271 *
1272 * Note that this is a high-level structure, near the application layer. This does not store any
1273 * metadata, like sequence numbers, or data not ready to be consumed by the user (such as
1274 * out-of-order packets, if we implement that). Such packets and data should be stored elsewhere.
1275 *
1276 * Thread safety: Analogous to #m_rcv_buf.
1277 */
1279
1280 /**
1281 * The #Error_code causing disconnection (if one has occurred or is occurring)
1282 * on this socket; otherwise a clear (success) #Error_code. This starts as success
1283 * and may move to one non-success value and then never change after that. Graceful connection
1284 * termination is (unlike in BSD sockets, where this is indicated with receive() returning 0, not an
1285 * error) indeed counted as a non-success value for #m_disconnect_cause.
1286 *
1287 * Exception: if, during graceful close, the connection must be closed abruptly (due to error,
1288 * including error::Code::S_USER_CLOSED_ABRUPTLY), #m_disconnect_cause may change a second time (from "graceful close"
1289 * to "abrupt closure").
1290 *
1291 * As in TCP net-stacks, one cannot recover from a transmission error or termination on the socket
1292 * (fake "error" `EWOULDBLOCK`/`EAGAIN` excepted), which is why this can only go success ->
1293 * non-success and never change after that.
1294 *
1295 * How to report this to the user: attempting to `*receive()` when not Readable while
1296 * #m_disconnect_cause is not success => `*receive()` returns this #Error_code to the user; and
1297 * similarly for `*send()` and Writable.
1298 *
1299 * I emphasize that this should happen only after Receive buffer has been emptied; otherwise user
1300 * will not be able to read queued up received data after the Node internally detects connection
1301 * termination. By the same token, if the Node can still reasonably send data to the other side,
1302 * and Send buffer is not empty, and #m_disconnect_cause is not success, the Node should only halt
1303 * the packet sending once Send buffer has been emptied.
1304 *
1305 * This should be success in all states except State::S_CLOSED and
1306 * State::S_OPEN + Open_sub_state::S_DISCONNECTING.
1307 *
1308 * ### Thread safety ###
1309 * Since user threads will access this at least via receive() and send(), while
1310 * thread W may set it having detected disconnection, this must be protected by a mutex.
1311 */
1313
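The success -> non-success transition rules above (including the lone graceful-close -> abrupt-closure exception) can be pinned down with a small sketch; `Disconnect_cause` and its `set()` API are hypothetical, shown only to make the allowed transitions explicit:

```cpp
#include <system_error>

// Sketch of #m_disconnect_cause transition rules: starts as success; may move
// once to non-success; the sole second transition allowed is
// "graceful close" -> "abrupt closure."
class Disconnect_cause // Hypothetical wrapper; not the real member's type.
{
public:
  // Returns true if the transition was applied.
  bool set(const std::error_code& err, bool err_is_graceful_close)
  {
    if (!err)
    {
      return false; // Never transition back to success.
    }
    if (!m_code) // success -> non-success: always allowed.
    {
      m_code = err;
      m_graceful = err_is_graceful_close;
      return true;
    }
    // Already disconnected: only graceful -> abrupt may still occur.
    if (m_graceful && (!err_is_graceful_close))
    {
      m_code = err;
      m_graceful = false;
      return true;
    }
    return false;
  }

  const std::error_code& get() const { return m_code; }

private:
  std::error_code m_code; // Clear (success) initially.
  bool m_graceful = false;
};
```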
1314 /**
1315 * If `!m_active_connect`, this contains the serialized metadata that the user supplied on
1316 * the other side when initiating the connect; otherwise this is the serialized metadata that the user
1317 * supplied on this side when initiating the connect. In either case (though obviously more
1318 * useful in the `!m_active_connect` case) it can be obtained via get_connect_metadata().
1319 * In the `m_active_connect` case, this is also needed if we must re-send the original SYN
1320 * (retransmission).
1321 *
1322 * ### Thread safety ###
1323 * Same as #m_snd_buf and #m_rcv_buf (protected by #m_mutex). This locking would not be
1324 * necessary (since this value is immutable once the user gets access to `*this`, threads other than
1325 * W could safely read it), except that sock_free_memory() does clear it while the user may be
1326 * accessing it. Due to that caveat, we have to lock it.
1327 */
1329
1330 /**
1331 * This object's mutex. The protected items are #m_state, #m_open_sub_state, #m_disconnect_cause,
1332 * #m_node, #m_rcv_buf, #m_snd_buf, #m_serialized_metadata.
1333 *
1334 * Generally speaking, if 2 or more of the protected variables must be changed in the same
1335 * non-blocking "operation" (for some reasonable definition of "operation"), they should probably
1336 * be changed within the same #m_mutex-locking critical section. For example, if closing the
1337 * socket in thread W due to an incoming RST, one should lock #m_mutex, clear both buffers, set
1338 * #m_disconnect_cause, change `m_state = State::S_CLOSED`, and then unlock #m_mutex. Then thread U != W
1339 * will observe all this state changed at the "same time," which is desirable.
1340 */
1342
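The incoming-RST example above, with all related state changed within one critical section, might look roughly like this; `Fake_socket` and its fields are simplified stand-ins for the real Peer_socket members:

```cpp
#include <mutex>
#include <string>
#include <system_error>

// Sketch: close a socket under a single critical section, so that a user
// thread U != W observes all the related state changed "at the same time."
struct Fake_socket // Hypothetical stand-in for Peer_socket.
{
  enum class State { S_OPEN, S_CLOSED };

  std::mutex m_mutex;
  State m_state = State::S_OPEN;
  std::error_code m_disconnect_cause;
  std::string m_rcv_buf; // Strings stand in for the real buffers.
  std::string m_snd_buf;

  // Thread W, on incoming RST: one lock, all changes inside.
  void close_abruptly(const std::error_code& err)
  {
    std::lock_guard<std::mutex> lock(m_mutex);
    m_rcv_buf.clear();
    m_snd_buf.clear();
    m_disconnect_cause = err;
    m_state = State::S_CLOSED;
  } // Unlock: a user thread now sees one consistent CLOSED snapshot.
};
```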
1343 /// See remote_endpoint(). Should be set before user gets access to `*this` and not changed afterwards.
1345
1346 /// See local_port(). Should be set before user gets access to `*this` and not changed afterwards.
1348
1349 /**
1350 * Current internal state of the socket. Note this is a very central piece of information and is analogous
1351 * to TCP's "state" (ESTABLISHED, etc. etc.).
1352 *
1353 * This gains meaning only in thread W. This should NOT be
1354 * accessed outside of thread W and is not protected by a mutex.
1355 */
1357
1358 /**
1359 * The queue of DATA packets received while in Int_state::S_SYN_RCVD state before the
1360 * Syn_ack_ack_packet arrives to move us to Int_state::S_ESTABLISHED
1361 * state, at which point these packets can be processed normally. Such
1362 * DATA packets would not normally exist, but they can exist if the SYN_ACK_ACK is lost or DATA
1363 * packets are re-ordered to go ahead of it. See Node::handle_data_to_syn_rcvd() for more
1364 * detail.
1365 *
1366 * This gains meaning only in thread W. This should NOT be accessed outside of thread W
1367 * and is not protected by a mutex.
1368 */
1370
1371 /**
1372 * The running total count of bytes in the `m_data` fields of #m_rcv_syn_rcvd_data_q. Undefined
1373 * when the latter is empty. Used to limit its size. This gains meaning only in thread W. This
1374 * should NOT be accessed outside of thread W and is not protected by a mutex.
1375 */
1377
1378 /**
1379 * The Initial Sequence Number (ISN) contained in the original Syn_packet or
1380 * Syn_ack_packet we received.
1381 *
1382 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
1383 * not protected by a mutex. Useful at least in verifying the validity of duplicate SYNs and
1384 * SYN_ACKs.
1385 */
1387
1388 /**
1389 * The maximal sequence number R from the remote side such that all data with sequence numbers
1390 * strictly less than R in this connection have been received by us and placed into the Receive
1391 * buffer. This first gains meaning upon receiving SYN and is the sequence number of that SYN,
1392 * plus one (as in TCP); or upon receiving SYN_ACK (similarly). Note that received packets past
1393 * this sequence number may exist, but if so there is at least one missing packet (the one at
1394 * #m_rcv_next_seq_num) preceding all of them.
1395 *
1396 * @see #m_rcv_packets_with_gaps.
1397 *
1398 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
1399 * not protected by a mutex.
1400 */
1402
1403 /**
1404 * The sequence-number-ordered collection of all
1405 * received-and-not-dropped-due-to-buffer-overflow packets such that at least
1406 * one unreceived-or-otherwise-unknown datum precedes all sequence numbers in this collection;
1407 * a/k/a the reassembly queue if retransmission is enabled.
1408 * With retransmission off, the only purpose of keeping this structure at all is to detect any
1409 * already-received-and-given-to-Receive-buffer packet coming in again; such a packet should be
1410 * ACKed but NOT given to the Receive buffer again (avoid data duplication). With retransmission
1411 * on, this is additionally used as the reassembly queue (storing the non-contiguous data until
1412 * the gaps are filled in).
1413 *
1414 * The structure is best explained by breaking down the sequence number space. I list the
1415 * sequence number ranges in increasing order starting with the ISN. Let `last_rcv_seq_num` be the
1416 * sequence number of the last datum to have been received (and not dropped due to insufficient
1417 * Receive buffer space), for exposition purposes.
1418 *
1419 * - #m_rcv_init_seq_num =
1420 * - SYN or SYN_ACK
1421 *
1422 * - [`m_rcv_init_seq_num + 1`, `m_rcv_next_seq_num - 1`] =
1423 * - Largest possible range of sequence numbers such that each datum represented by this range
1424 * has been received (and not dropped due to insufficient Receive buffer space) and copied to
1425 * the Receive buffer for user retrieval.
1426 *
1427 * - [#m_rcv_next_seq_num, `m_rcv_next_seq_num + N - 1`] =
1428 * - The first packet after the ISN that has not yet been received (or has been received but has
1429 * been dropped due to insufficient Receive buffer space). `N` is the (unknown to us) length of
1430 * that packet. `N` > 0. This can be seen as the first "gap" in the received sequence number
1431 * space.
1432 *
1433 * - [`m_rcv_next_seq_num + N`, `last_rcv_seq_num`] =
1434 * - The remaining packets up to and including the last byte that has been received (and not
1435 * dropped due to insufficient Receive buffer space). Each packet in this range is one of the
1436 * following:
1437 * - received (and not dropped due to insufficient Receive buffer space);
1438 * - not received (or received and dropped due to insufficient Receive buffer space).
1439 *
1440 * - [`last_rcv_seq_num + 1`, ...] =
1441 * - All remaining not-yet-received (or received but dropped due to insufficient Receive buffer
1442 * space) packets.
1443 *
1444 * #m_rcv_packets_with_gaps contains all Received_packets in the range [`m_rcv_next_seq_num + N`,
1445 * `last_rcv_seq_num`], with each particular Received_packet's first sequence number as its key. If
1446 * there are no gaps -- all received sequence numbers are followed by unreceived sequence numbers
1447 * -- then that range is empty and so is #m_rcv_packets_with_gaps. All the other ranges can be
1448 * null (empty) as well. If there are no received-and-undropped packets, then `m_rcv_init_seq_num
1449 * == m_rcv_next_seq_num`, which is the initial situation.
1450 *
1451 * The above is an invariant, to be true at the end of each boost.asio handler in thread W, at
1452 * least.
1453 *
1454 * Each received-and-undropped packet therefore is placed into #m_rcv_packets_with_gaps, anywhere
1455 * in the middle. If retransmission is off, the data in the packet is added to Receive buffer.
1456 * If retransmission is on, the data in the packet is NOT added to Receive buffer but instead
1457 * saved within the structure for later reassembly (see next paragraph).
1458 *
1459 * If the [#m_rcv_next_seq_num, ...] (first gap) packet is received-and-not-dropped, then
1460 * #m_rcv_next_seq_num is incremented by N (the length of that packet), filling the gap. Moreover,
1461 * any contiguous packets at the front of #m_rcv_packets_with_gaps, assuming the first packet's
1462 * sequence number equals #m_rcv_next_seq_num, must be removed from #m_rcv_packets_with_gaps, and
1463 * #m_rcv_next_seq_num should be incremented accordingly. All of this maintains the invariant. If
1464 * retransmission is on, the data in the byte sequence formed by this operation is to be placed
1465 * (in sequence number order) into the Receive buffer (a/k/a reassembly).
1466 *
1467 * Conceptually, this is the action of receiving a gap packet which links up following
1468 * already-received packets to previous already-received packets, which means all of these can go
1469 * away, as the window slides forward beyond them.
1470 *
1471 * If a packet arrives and is already in #m_rcv_packets_with_gaps, then it is a duplicate and is
1472 * NOT placed on the Receive buffer. The same holds for any packet with sequence numbers
1473 * preceding #m_rcv_next_seq_num.
1474 *
1475 * The type used is a map sorted by the starting sequence number of each packet. Why? We need pretty
1476 * fast middle operations: inserting an arriving packet and checking whether it already exists. We need fast
1477 * access to the earliest (smallest sequence number) packet, for when the first gap is filled.
1478 * `std::map` satisfies these needs: `insert()` and `lower_bound()` are <em>O(log n)</em>; `begin()` gives the
1479 * smallest element and is @em O(1). Iteration is @em O(1) as well. (All amortized.)
1480 *
1481 * ### Memory use ###
1482 * The above scheme allows for unbounded memory use given certain behavior from the
1483 * other side, when retransmission is off. Suppose packets 1, 2, 3 are received; then packets 5,
1484 * 6, ..., 1000 are received. Retransmission is off, so eventually the sender
1485 * may give up on packet 4 and consider it Dropped. So the gap will forever exist; hence
1486 * #m_rcv_packets_with_gaps will always hold per-packet data for 5, 6, ..., 1000 (and any
1487 * subsequent packets). With retransmission, packet 4 would eventually arrive, or the connection
1488 * would get RSTed, but without retransmission that doesn't happen. Thus memory use will just
1489 * grow and grow. Solution: come up with some heuristic that would quite conservatively declare
1490 * that packet 4 has been "received," even though it hasn't. This will plug the hole (packet 4)
1491 * and clear #m_rcv_packets_with_gaps in this example. Then if packet 4 does somehow come in, it
1492 * will get ACKed (like any valid received packet) but will NOT be saved into the Receive buffer,
1493 * since it will be considered "duplicate" due to already being "received." Of course, the
1494 * heuristic must be such that few if any packets considered "received" this way will actually get
1495 * delivered eventually, otherwise we may lose a lot of data. Here is one such heuristic, that is
1496 * both simple and conservative: let N be some constant (e.g., N = 100). If
1497 * `m_rcv_packets_with_gaps.size()` exceeds N (i.e., equals (N + 1)), consider all gap packets
1498 * preceding #m_rcv_packets_with_gaps's first sequence number as "received." This will, through
1499 * gap filling logic described above, reduce `m_rcv_packets_with_gaps.size()` to N or less. Thus it
1500 * puts a simple upper bound on #m_rcv_packets_with_gaps's memory; if N = 100 the memory used by
1501 * the structure is not much (since we don't store the actual packet data there [but this can get
1502 * non-trivial with 100,000 sockets all filled up]). Is it conservative? Yes. 100 packets
1503 * arriving after a gap are a near-guarantee those gap packets will never arrive (especially
1504 * without retransmission, which is the predicate for this entire problem). Besides, the Drop
1505 * heuristics on the Sender side almost certainly will consider gap packets with 100 or near 100
1506 * Acknowledged packets after them as Dropped a long time ago; if the receiving side's heuristics
1507 * are far more conservative, then that is good enough.
1508 *
1509 * If retransmission is on, then (as noted) the sender's CWND and retransmission logic will ensure
1510 * that gaps are filled before more future data are sent, so the above situation will not occur.
1511 * However if the sender is a bad boy and for some reason sends new data and ignores gaps
1512 * (possibly malicious behavior), then it would still be a problem. Since in retransmission mode
1513 * it's not OK to just ignore lost packets, we have no choice but to drop received packets when
1514 * the above situation occurs (similarly to when Receive buffer is exceeded). This is basically a
1515 * security measure and should not matter assuming well-behaved operation from the other side.
1516 * Update: With retransmission on, this structure is now subject to overflow protection with a tighter
1517 * limit than with rexmit-off; namely, the limit controlling #m_rcv_buf overflow actually applies to
1518 * the sum of data being stored in #m_rcv_buf and this structure, together. I.e., a packet is dropped
1519 * if the total data stored in #m_rcv_buf and #m_rcv_packets_with_gaps equal or exceed the configured
1520 * limit. Accordingly, rcv-wnd advertised to other side is based on this sum also.
1521 *
1522 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
1523 * not protected by a mutex.
1524 *
1525 * @see #m_rcv_reassembly_q_data_size.
1526 *
1527 * @todo The memory use of this structure could be greatly optimized if, instead of storing each
1528 * individual received packet's metadata separately, we always merged contiguous sequence number
1529 * ranges. So for example if packet P1, P2, P3 (contiguous) all arrived in sequence, after
1530 * missing packet P0, then we'd store P1's first sequence number and the total data size of
1531 * P1+P2+P3, in a single `struct` instance. Since a typical pattern might include 1 lost packet
1532 * followed by 100 received packets, we'd be able to cut down memory use by a factor of about 100
1533 * in that case (and thus worry much less about the limit). Of course the code would get more
1534 * complex and potentially slower (but not necessarily significantly).
1535 */
1537
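The gap-filling step described above (advance #m_rcv_next_seq_num, then absorb any now-contiguous packets at the front of the map) can be sketched as follows; the `Gap_map` of first-sequence-number -> length is a simplification that stores no packet data and omits overlap handling:

```cpp
#include <cstdint>
#include <map>

using Seq = std::uint64_t;      // Stand-in for Sequence_number.
using Gap_map = std::map<Seq, Seq>; // First seq num -> packet length.

// A packet [seq, seq + len) arrived. If it fills the first gap, slide
// rcv_next_seq_num forward past all now-contiguous packets at the map's
// front, erasing them. Returns true if the packet was new (not a duplicate,
// which would be ACKed but not delivered again).
bool on_packet(Seq seq, Seq len, Seq* rcv_next_seq_num, Gap_map* pkts_with_gaps)
{
  if ((seq < *rcv_next_seq_num) || (pkts_with_gaps->count(seq) != 0))
  {
    return false; // Duplicate: already received-and-handled.
  }
  if (seq != *rcv_next_seq_num)
  {
    (*pkts_with_gaps)[seq] = len; // Past the first gap: store for later.
    return true;
  }
  // Fills the first gap: advance, then absorb contiguous front packets.
  *rcv_next_seq_num = seq + len;
  for (auto it = pkts_with_gaps->begin();
       (it != pkts_with_gaps->end()) && (it->first == *rcv_next_seq_num);
       it = pkts_with_gaps->erase(it))
  {
    *rcv_next_seq_num = it->first + it->second;
  }
  return true;
}
```

With retransmission on, the erased front packets would additionally be copied, in sequence order, into the Receive buffer (reassembly); the sketch tracks only the sequence-number bookkeeping.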
1538 /**
1539 * With retransmission enabled, the sum of Received_packet::m_size over all packets stored in the
1540 * reassembly queue, #m_rcv_packets_with_gaps. Stored for performance.
1541 *
1542 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
1543 * not protected by a mutex.
1544 */
1546
1547 /**
1548 * The received packets to be acknowledged in the next low-level ACK packet to be sent to the
1549 * other side, ordered in the chronological order they were received. They are accumulated in a
1550 * data structure because we may not send each desired acknowledgment right away, combining
1551 * several together, thus reducing overhead at the cost of short delays (or even nearly
1552 * non-existent delays, as in the case of several DATA packets handled in one
1553 * Node::low_lvl_recv_and_handle() invocation, i.e., having arrived nearly at the same time).
1554 *
1555 * Any two packets represented by these Individual_ack objects may be duplicates of each other (same
1556 * Sequence_number, possibly different Individual_ack::m_received_when values). It's up to the sender (receiver
1557 * of ACK) to sort it out. However, again, they MUST be ordered chronologically based on the time
1558 * when they were received; from earliest to latest.
1559 *
1560 * Storing shared pointers to avoid copying of structs (however small) during internal
1561 * reshuffling; shared instead of raw pointers to not worry about delete.
1562 *
1563 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
1564 * not protected by a mutex.
1565 */
1566 std::vector<boost::shared_ptr<Individual_ack>> m_rcv_pending_acks;
1567
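The accumulate-then-flush pattern described above can be sketched like so; `Ack_accumulator` is a hypothetical simplification (no delayed-ACK timer, no packing into actual ACK packets), kept only to show the chronological accumulation and the empty-again-after-flush cycle:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for Individual_ack: which packet, and when it was received.
struct Individual_ack_sketch
{
  std::uint64_t m_seq_num;
  std::uint64_t m_received_when; // Some monotonic timestamp.
};

class Ack_accumulator // Hypothetical; not the real structure.
{
public:
  // Thread W: record one received packet to acknowledge later. Duplicates
  // (same seq num, different receive time) are allowed; the sender of the
  // original data (receiver of the ACK) sorts that out.
  void on_packet_received(std::uint64_t seq, std::uint64_t now)
  {
    m_pending.push_back({seq, now}); // Chronological order preserved.
  }

  // End of handler, or delayed-ACK timer firing: emit everything in one
  // combined ACK, reducing per-packet overhead at the cost of a short delay.
  std::vector<Individual_ack_sketch> flush()
  {
    std::vector<Individual_ack_sketch> out;
    out.swap(m_pending); // m_pending is again empty; the cycle repeats.
    return out;
  }

  std::size_t pending() const { return m_pending.size(); }

private:
  std::vector<Individual_ack_sketch> m_pending;
};
```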
1568 /**
1569 * Helper state, to be used while inside either Node::low_lvl_recv_and_handle() or
1570 * async part of Node::async_wait_latency_then_handle_incoming(), set only at the beginning of either and equal to
1571 * `m_rcv_pending_acks.size()` at that time. Because, for efficiency, individual acknowledgements are
1572 * accumulated over the course of those two methods, and an ACK with those acknowledgments is
1573 * sent at the end of that method (in perform_accumulated_on_recv_tasks()) at the earliest, this
1574 * member is used to determine whether we should start a delayed ACK timer at that point.
1575 *
1576 * This gains meaning only in thread W and only within Node::low_lvl_recv_and_handle()/etc.
1577 * and loses meaning after either method exits. This should NOT
1578 * be accessed outside of thread W and is not protected by a mutex.
1579 */
1581
1582 /**
1583 * While Node::low_lvl_recv_and_handle() or async part of Node::async_wait_latency_then_handle_incoming() is running,
1584 * accumulates the individual acknowledgments contained in all incoming ACK low-level packets
1585 * received in those methods. More precisely, this accumulates the elements of
1586 * `packet.m_rcv_acked_packets` for all packets such that `packet` is an Ack_packet. They are accumulated
1587 * in this data structure for a similar reason that outgoing acknowledgments are accumulated in
1588 * `Peer_socket::m_rcv_pending_acks`. The situation here is simpler, however, since the present
1589 * structure is always scanned and cleared at the end of the current handler and never carried
1590 * over to the next, as we always want to scan all individual acks received within a non-blocking
1591 * amount of time from receipt. See Node::handle_ack_to_established() for details.
1592 *
1593 * This structure starts empty, is accumulated over the course of those methods, is used to scan
1594 * all individual acknowledgments (in the exact order received), and is then cleared for the next
1595 * run.
1596 *
1597 * Storing shared pointers to avoid copying of structs (however small) during internal
1598 * reshuffling; shared instead of raw pointers to not worry about delete.
1599 *
1600 * This gains meaning only in thread W and only within Node::low_lvl_recv_and_handle()/etc.
1601 * and loses meaning after either method exits. This should NOT
1602 * be accessed outside of thread W and is not protected by a mutex.
1603 */
1604 std::vector<Ack_packet::Individual_ack::Ptr> m_rcv_acked_packets;
1605
1606 /**
1607 * While Node::low_lvl_recv_and_handle() or async part of Node::async_wait_latency_then_handle_incoming() is running,
1608 * contains the rcv_wnd (eventual #m_snd_remote_rcv_wnd) value in the last observed ACK low-level
1609 * packet received in those methods. The reasoning is similar to #m_rcv_acked_packets. See
1610 * Node::handle_ack_to_established() for details.
1611 *
1612 * This gains meaning only in thread W and only within Node::low_lvl_recv_and_handle()/etc.
1613 * and loses meaning after either method exits. This should NOT
1614 * be accessed outside of thread W and is not protected by a mutex.
1615 */
1617
1618 /**
1619 * The last rcv_wnd value sent to the other side (in an ACK). This is used to gauge how much the
1620 * true rcv_wnd has increased since the value that the sender probably (assuming ACK was not lost)
1621 * knows.
1622 *
1623 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
1624 * not protected by a mutex.
1625 */
1627
1628 /**
1629 * `true` indicates we are in a state where we've decided the other side needs to be informed that
1630 * our receive window has increased substantially, so that it can resume sending data (probably
1631 * after a zero window was advertised).
1632 *
1633 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
1634 * not protected by a mutex.
1635 */
1637
1638 /**
1639 * Time point at which #m_rcv_in_rcv_wnd_recovery was last set to true. It is only used when the
1640 * latter is indeed true.
1641 *
1642 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
1643 * not protected by a mutex.
1644 */
1646
1647 /**
1648 * When #m_rcv_in_rcv_wnd_recovery is `true`, this is the scheduled task to possibly
1649 * send another unsolicited rcv_wnd-advertising ACK to the other side.
1650 *
1651 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
1652 * not protected by a mutex.
1653 */
1655
1656 /**
1657 * Timer started, assuming delayed ACKs are enabled, when the first Individual_ack is placed onto
1658 * an empty #m_rcv_pending_acks; when it triggers, the pending individual acknowledgments are packed
1659 * into as few as possible ACKs and sent to the other side. After the handler exits
1660 * #m_rcv_pending_acks is again empty and the process can repeat starting with the next received valid
1661 * packet.
1662 *
1663 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
1664 * not protected by a mutex.
1665 *
1666 * ### Implementation notes ###
1667 * In other places we have tended to replace a `Timer` with the far simpler util::schedule_task_from_now() (etc.)
1668 * facility (which internally uses a `Timer` but hides its various annoyances and caveats). Why not here?
1669 * Answer: This timer is scheduled and fires often (could be on the order of every 1-500 milliseconds) and throughout
1670 * a given socket's existence; hence the potential performance effects aren't worth the risk (or at least mental
1671 * energy spent on evaluating that risk, originally and over time). The conservative thing to do is reuse a single
1672 * `Timer` repeatedly, as we do here.
1673 */
1675
1676 /// Stats regarding incoming traffic (and resulting outgoing ACKs) for this connection so far.
1678
1679 /**
1680 * The Initial Sequence Number (ISN) used in our original SYN or SYN_ACK. Useful at least in re-sending the
1681 * original SYN or SYN_ACK if unacknowledged for too long.
1682 *
1683 * @see #m_snd_flying_pkts_by_seq_num and #m_snd_flying_pkts_by_sent_when.
1684 *
1685 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
1686 * not protected by a mutex.
1687 */
1689
1690 /**
1691 * The sequence number for the start of the data in the next new DATA packet to be sent out. By
1692 * "new" I mean not-retransmitted (assuming retransmission is even enabled).
1693 *
1694 * @todo Possibly #m_snd_next_seq_num will apply to other packet types than DATA, probably anything to do with
1695 * connection termination.
1696 *
1697 * @see #m_snd_flying_pkts_by_seq_num and #m_snd_flying_pkts_by_sent_when.
1698 *
1699 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
1700 * not protected by a mutex.
1701 */
1703
1704 /**
1705 * The collection of all In-flight packets, indexed by sequence number and ordered from most to
1706 * least recently sent (including those queued up to wire-send in pacing module). See also
1707 * #m_snd_flying_pkts_by_seq_num which is a similar map but in order of sequence number.
1708 * That map's keys are again sequence numbers, but its values are iterators into the present map
1709 * to save memory and avoid having to sync up data between the two (the only thing we must sync between them
1710 * are their key sets). The two maps together can be considered to be the sender-side "scoreboard."
1711 *
1712 * These are all the packets that have been sent but not Acknowledged that we have not yet considered
1713 * Dropped. (With retransmission on, packets are never considered
1714 * permanently Dropped, but they are considered Dropped until retransmitted.) With retransmission
1715 * off, the ultimate goal of having this structure at all is to handle ACKs, the ultimate goal of
1716 * which is, in turn, for the In-flight vs. Congestion Window comparison for congestion control.
1717 * With retransmission on, the structure additionally stores the data in the In-flight packets, so
1718 * that they can be retransmitted if we determine they were probably dropped.
1719 *
1720 * With retransmission on, this is NOT the retransmission queue itself -- i.e., this does NOT
1721 * store packet data that we know should be retransmitted when possible but rather only the data
1722 * already In-flight (whether from first attempt or from retransmission).
1723 *
1724 * Please see #m_snd_flying_pkts_by_seq_num for a breakdown of the sequence number space.
1725 * Since that structure contains iterators to exactly the values in the present map, that comment will
1726 * explain which packets are in the present map.
1727 *
1728 * #m_snd_flying_pkts_by_sent_when contains In-flight Sent_packet objects as values, with each particular
1729 * Sent_packet's first sequence number as its key. If there are no In-flight Sent_packet objects, then
1730 * `m_snd_flying_pkts_by_sent_when.empty()`.
1731 *
1732 * The above is an invariant, to be true at the end of each boost.asio handler in thread W, at
1733 * least.
1734 *
1735 * Each sent packet therefore is placed into #m_snd_flying_pkts_by_sent_when, at the front (as is standard
1736 * for a Linked_hash_map, and as is expected, since they are ordered by send time). (Note, however,
1737 * that being in this map does not mean a packet has been sent; it may merely be queued up to be sent,
1738 * waiting in the pacing module; but pacing does not reorder packets, only adjusts the exact
1739 * send moment, so it cannot change the position in this queue.)
1740 * When a packet is Acknowledged, it is removed from #m_snd_flying_pkts_by_sent_when -- could be from anywhere
1741 * in the ordering. Similarly to Acknowledged packets, Dropped ones are also removed.
1742 *
1743 * The type used is a map indexed by starting sequence number of each packet but also in order of
1744 * being sent out. Lookup by sequence number is near constant time; insertion near the end is
1745 * near constant time; and iteration by order of when it was sent out is easy/fast, with iterators
1746 * remaining valid as long as the underlying elements are not erased. Why use this particular
1747 * structure? Well, the lookup by sequence number occurs all the time, such as when matching up
1748 * an arriving acknowledgment against a packet that we'd sent out. We'd prefer it to not invalidate
1749 * iterators when something is erased, so Linked_hash_map is good in that way
1750 * also. So finally, why order by time it was queued up for sending (as opposed to by sequence
1751 * number, as would be the case if this were an std::map)? In truth, both are needed, which is why
1752 * #m_snd_flying_pkts_by_seq_num exists. This ordering is needed particularly for the
1753 * `m_acks_after_me` logic, wherein we count how many times packets that were sent after a given packet
1754 * have been acknowledged so far; by arranging the packets in that same order, that value can be
1755 * easily and quickly accumulated by walking back from the most recently sent packet. On the other
1756 * hand, some operations need sequence number ordering, which is why we have #m_snd_flying_pkts_by_seq_num;
1757 * note (again) that the two maps have the same key set, with one's values being iterators pointing into
1758 * the other.
1759 *
1760 * Whenever a packet with `m_sent_when.back().m_sent_time == T` is acknowledged, we need to (by definition of
1761 * Sent_packet::m_acks_after_me) increment `m_acks_after_me` for each packet with
1762 * `m_sent_when.back().m_sent_time < T`. So, if we
1763 * find the latest-sent element that satisfies that, then all packets appearing to the right
1764 * (i.e., "sent less recently than") and including that one, in this ordering, should have `m_acks_after_me`
1765 * incremented. Using a certain priority queue-using algorithm (see Node::handle_accumulated_acks())
1766 * we can do this quite efficiently.
1767 *
1768 * Note that this means Sent_packet::m_acks_after_me is strictly increasing as one walks this map.
1769 *
1770 * Since any packet with `m_acks_after_me >= C`, where `C` is some constant, is considered Dropped and
1771 * removed from #m_snd_flying_pkts_by_seq_num and therefore this map also, we also get the property that
1772 * if we find a packet in this map for which that is true, then it is also true for all packets
1773 * following it ("sent less recently" than it) in this map. This allows us to more quickly determine which
1774 * packets should be removed from #m_snd_flying_pkts_by_sent_when, without having to walk the entire structure(s).
1775 *
1776 * ### Memory use ###
1777 * This structure's memory use is naturally bounded by the Congestion Window.
1778 * Congestion control will not let it grow beyond that many packets (bytes really, but you get the point).
1779 * At that point blocks will stay on the Send buffer, until that fills up too. Then send() will refuse to enqueue
1780 * any more packets (telling the user as much).
1781 *
1782 * ### Thread safety ###
1783 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
1784 * not protected by a mutex.
1785 *
1786 * @see Sent_when and Sent_packet::m_sent_when, where if `X` is the last element of the latter sequence, then
1787 * `X.m_sent_time` is the value by which elements in the present map are ordered. However, this only
1788 * happens to be the case, because by definition an element
1789 * is always placed at the front of the present map (Linked_hash_map), and this order is inductively maintained;
1790 * and, meanwhile, a Sent_when::m_sent_time's constructed value can only increase over time (a guaranteed
1791 * property of the clock we use (::Fine_clock)).
1792 * @see #m_snd_flying_bytes, which must always be updated to be accurate w/r/t
1793 * #m_snd_flying_pkts_by_sent_when. Use Node::snd_flying_pkts_updated() whenever
1794 * #m_snd_flying_pkts_by_sent_when is changed.
1795 * @see #m_snd_flying_pkts_by_seq_num, which provides an ordering of the elements of
1796 * #m_snd_flying_pkts_by_sent_when by sequence number. Whereas the present structure is used to
1797 * determine `m_acks_after_me` (since logically "after" means "sent after"), `..._by_seq_num`
1798 * is akin to the more classic TCP scoreboard, which is used to subdivide the sequence number
1799 * space (closely related to #m_snd_next_seq_num and such). With
1800 * retransmission off, "after" would simply mean "having higher sequence number," so
1801 * #m_snd_flying_pkts_by_sent_when would already provide this ordering, but with retransmission on
1802 * a retransmitted packet with a lower number could be sent after one with a higher number.
1803 * To make the code simpler, we therefore rely on a separate structure in either situation.
1804 */
1806
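The increment walk described above can be sketched with a plain newest-first list standing in for the Linked_hash_map; the types and names here (`Pkt`, `register_ack_at()`) are illustrative stand-ins, not Flow's actual code:

```cpp
#include <cassert>
#include <cstdint>
#include <list>

// Illustrative, simplified stand-in for Sent_packet, keyed by its last send attempt.
struct Pkt
{
  uint64_t m_sent_time;     // Monotonic timestamp of the LAST send attempt.
  uint16_t m_acks_after_me; // Times a later-sent packet has been acknowledged.
};

// Newest-first ordering, mirroring m_snd_flying_pkts_by_sent_when.
using Flying_list = std::list<Pkt>;

// An ack of a packet sent at time T implies: every still-In-flight packet sent
// strictly before T has seen one more ack "after" it.  Since the list is
// newest-first, those packets form a contiguous suffix: skip the prefix sent
// at-or-after T, then increment the rest.
void register_ack_at(Flying_list& flying, uint64_t ack_sent_time)
{
  auto it = flying.begin();
  while ((it != flying.end()) && (it->m_sent_time >= ack_sent_time))
  {
    ++it; // Sent at-or-after T; not "before" the acked packet.
  }
  for (; it != flying.end(); ++it)
  {
    ++(it->m_acks_after_me); // Sent before T; count the ack.
  }
}
```

Because each ack increments only a suffix, `m_acks_after_me` stays non-decreasing when walking from newest to oldest, which is the property the Dropped-detection scan exploits to stop early.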
1807 /**
1808 * The collection of all In-flight packets (including those queued up to send in pacing module),
1809 * indexed AND ordered by sequence number. See also #m_snd_flying_pkts_by_sent_when which is a similar map
1810 * but in order of time sent. Our map's keys are sequence numbers again, but its values are iterators
1811 * into #m_snd_flying_pkts_by_sent_when to save memory and avoid having to sync up data between the two
1812 * (only keys are in sync). The two maps together can be considered to be the "scoreboard," though in fact
1813 * the present structure alone is closer to a classic TCP scoreboard.
1814 *
1815 * The key sets of the two maps are identical. The values in this map are iterators to exactly all
1816 * elements of #m_snd_flying_pkts_by_sent_when. One can think of the present map as essentially achieving an
1817 * alternate ordering of the values in the other map.
1818 *
1819 * That said, the structure's contents and ordering are closely related to a breakdown of the sequence
1820 * number space. I provide this breakdown here. I list the
1821 * sequence number ranges in increasing order starting with the ISN. Let `first_flying_seq_num
1822 * = m_snd_flying_pkts_by_seq_num.begin()->first` (i.e., the first key Sequence_number in
1823 * #m_snd_flying_pkts_by_seq_num) for exposition purposes.
1824 *
1825 * - #m_snd_init_seq_num =
1826 * - SYN or SYN_ACK
1827 *
1828 * - [`m_snd_init_seq_num + 1`, `first_flying_seq_num - 1`] =
1829 * - Largest possible range of sequence numbers such that each datum represented by this range
1830 * has been sent and either:
1831 * - Acknowledged (ACK received for it); or
1832 * - Dropped (ACK not received; we consider it dropped due to some factor like timeout or
1833 * duplicate ACKs);
1834 *
1835 * - [`first_flying_seq_num`, `first_flying_seq_num + N - 1`] =
1836 * - The first packet that has been sent that is neither Acknowledged nor Dropped. `N` is length
1837 * of that packet. This is always the first packet, if any, to be considered Dropped in the
1838 * future. This packet is categorized In-flight.
1839 *
1840 * - [`first_flying_seq_num + N`, `m_snd_next_seq_num - 1`] =
1841 * - All remaining sent packets. Each packet in this range is one of the following:
1842 * - Acknowledged;
1843 * - not Acknowledged and not Dropped = categorized In-flight.
1844 *
1845 * - [#m_snd_next_seq_num, ...] =
1846 * - Unsent packets, if any.
1847 *
1848 * #m_snd_flying_pkts_by_sent_when and #m_snd_flying_pkts_by_seq_num contain In-flight Sent_packet objects as values
1849 * (though the latter indirectly via iterator into the former) with each particular Sent_packet's first sequence
1850 * number as its key in either structure. If there are no In-flight Sent_packet objects, then
1851 * `m_snd_flying_pkts_by_{sent_when|seq_num}.empty()` and hence `first_flying_seq_num` above does not
1852 * exist. Each of the [ranges] above can be null (empty).
1853 *
1854 * Each sent packet therefore is placed into #m_snd_flying_pkts_by_seq_num, at the back (if it's a new packet) or
1855 * possibly elsewhere (if it's retransmitted) -- while it is also placed into #m_snd_flying_pkts_by_sent_when but
1856 * always at the front (as, regardless of retransmission or anything else, it is the latest packet to be SENT). When
1857 * a packet is Acknowledged, it is removed from `m_snd_flying_pkts_by_*` -- could be from anywhere in
1858 * the ordering. Similarly to Acknowledged packets, Dropped ones are also removed.
1859 *
1860 * Why do we need this map type in addition to `Linked_hash_map m_snd_flying_pkts_by_sent_when`? Answer: Essentially,
1861 * when an acknowledgment comes in, we need to be able to determine where in the sequence number space this is.
1862 * If packets are ordered by send time -- not sequence number -- and the sequence number does not match
1863 * exactly one of the elements here (e.g., it erroneously straddles one, or it is a duplicate acknowledgement,
1864 * which means that element isn't in the map any longer), then a tree-sorted-by-key map is invaluable
1865 * (in particular: to get `upper_bound()`, and also to travel to the previous-by-sequence-number packet
1866 * from the latter). So logarithmic-time upper-bound searches and iteration by sequence number are what we want and
1867 * get with this added ordering on top of #m_snd_flying_pkts_by_sent_when.
1868 *
1869 * ### Memory use ###
1870 * This structure's memory use is naturally bounded the same way as that of #m_snd_flying_pkts_by_sent_when.
1871 *
1872 * ### Thread safety ###
1873 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
1874 * not protected by a mutex.
1875 *
1876 * @see #m_snd_flying_pkts_by_sent_when. There's a "see also" comment there that contrasts these two
1877 * important structures.
1878 */
1880
1881 /**
1882 * The number of bytes contained in all In-flight packets, used at least for comparison against
1883 * the congestion window (CWND). More formally, this is the sum of all Sent_packet::m_size values
1884 * in #m_snd_flying_pkts_by_sent_when. We keep this, instead of computing it whenever needed, for
1885 * performance. In various TCP and related RFCs this value (or something spiritually similar, if
1886 * only cumulative ACKs are used) is called "pipe" or "FlightSize."
1887 *
1888 * Though in protocols like DCCP, where CWND is measured in packets instead of bytes, "pipe" is
1889 * actually just `m_snd_flying_pkts_by_sent_when.size()`. Not for us though.
1890 *
1891 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
1892 * not protected by a mutex.
1893 *
1894 * @see #m_snd_flying_pkts_by_sent_when, which must always be updated to be accurate w/r/t
1895 * #m_snd_flying_bytes. Use Node::snd_flying_pkts_updated() whenever #m_snd_flying_pkts_by_sent_when
1896 * is changed.
1897 */
1899
1900 /**
1901 * Helper data structure to store the packet IDs of packets that are marked Dropped during a single run
1902 * through accumulated ACKs; it is a data member instead of local variable for performance only. The pattern is
1903 * to simply `clear()` it just before use, then load it up with stuff in that same round of ACK handling;
1904 * and the same thing each time we need to handle accumulated ACKs. Normally one would just create one
1905 * of these locally within the code `{` block `}` each time instead.
1906 * Not doing so avoids unnecessary various internal-to-`vector` buffer
1907 * allocations. Instead the internal buffer will grow as large as it needs to and not go down from there, so
1908 * that it can be reused in subsequent operations. (Even `clear()` does not internally shrink the buffer.)
1909 * Of course some memory is held unnecessarily, but it's a small amount; on the other hand the performance
1910 * gain may be non-trivial due to the frequency of the ACK-handling code being invoked.
1911 *
1912 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
1913 * not protected by a mutex.
1914 */
1915 std::vector<order_num_t> m_snd_temp_pkts_marked_to_drop;
1916
1917 /**
1918 * For the Sent_packet representing the next packet to be sent, this is the value to assign to
1919 * `m_sent_when.back().m_order_num`. In other words it's an ever-increasing number that is sort of like
1920 * a sequence number but one per packet and represents time at which sent, not order in the byte stream.
1921 * In particular the same packet retransmitted will have the same sequence number the 2nd time but
1922 * an increased order number. Starts at 0.
1923 *
1924 * This is only used for book-keeping locally and never transmitted over network.
1925 *
1926 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
1927 * not protected by a mutex.
1928 */
1930
1931 /**
1932 * If retransmission is on, this is the retransmission queue. It's the queue of packets determined to have
1933 * been dropped and thus to be retransmitted, when Congestion Window allows this. Packet in Sent_packet::m_packet
1934 * field of element at top of the queue is to be retransmitted next; and the element itself is to be inserted into
1935 * #m_snd_flying_pkts_by_sent_when while popped from the present structure. The packet's Data_packet::m_rexmit_id
1936 * should be incremented before sending; and the Sent_packet::m_sent_when `vector` should be appended with the
1937 * then-current time (for future RTT calculation).
1938 *
1939 * If retransmission is off, this is empty.
1940 *
1941 * Why use `list<>` and not `queue<>` or `deque<>`? Answer: I'd like to use `list::splice()`.
1942 *
1943 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
1944 * not protected by a mutex.
1945 */
1946 std::list<boost::shared_ptr<Sent_packet>> m_snd_rexmit_q;
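The `list<>` choice pays off when dropped packets are moved wholesale into the retransmission queue: `splice()` relinks nodes in constant time with no copying and no allocation. A minimal sketch with simplified element types (a real queue holds `Sent_packet::Ptr`):

```cpp
#include <cassert>
#include <list>
#include <string>

using Pkt_queue = std::list<std::string>; // Stand-in for list<Sent_packet::Ptr>.

// Move ALL newly-dropped packets to the back of the rexmit queue in O(1);
// `dropped` is left empty.  No element is copied or destroyed -- the list
// nodes themselves are relinked.
void enqueue_for_rexmit(Pkt_queue& rexmit_q, Pkt_queue& dropped)
{
  rexmit_q.splice(rexmit_q.end(), dropped);
}
```

This is exactly what `queue<>`/`deque<>` cannot offer, hence the choice explained above.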
1947
1948 /// Equals `m_snd_rexmit_q.size()`. Kept since `m_snd_rexmit_q.size()` may be @em O(n) depending on implementation.
1950
1951 /**
1952 * The congestion control strategy in use for this connection on this side. Node informs
1953 * this object of events, such as acknowledgments and loss events; conversely this object informs
1954 * (or can inform if asked) the Node whether or not DATA packets can be sent, by means of
1955 * providing the Node with the socket's current Congestion Window (CWND) computed based on the
1956 * particular Congestion_control_strategy implementation's algorithm (e.g., Reno or Westwood+).
1957 * Node then determines whether data can be sent by comparing #m_snd_flying_bytes (# of bytes we think
1958 * are currently In-flight) to CWND (# of bytes the strategy allows to be In-flight currently).
1959 *
1960 * ### Life cycle ###
1961 * #m_snd_cong_ctl must be initialized to an instance before user gains access to this
1962 * socket; the pointer must never change subsequently except back to null (permanently). The
1963 * Peer_socket destructor, at the latest, will delete the underlying object, as #m_snd_cong_ctl
1964 * will be destructed. (Note `unique_ptr` has no copy operator or
1965 * constructor.) There is a 1-to-1 relationship between `*this` and #m_snd_cong_ctl.
1966 *
1967 * ### Visibility between Congestion_control_strategy and Peer_socket ###
1968 * #m_snd_cong_ctl gets read-only (`const`!) but otherwise complete
1969 * private access (via `friend`) to the contents of `*this` Peer_socket. For example, it can
1970 * read `this->m_snd_smoothed_round_trip_time` (the SRTT) and use it
1971 * for computations if needed. Node and Peer_socket get only strict public API access to
1972 * #m_snd_cong_ctl, which is a black box to it.
1973 *
1974 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
1975 * not protected by a mutex.
1976 */
1977 boost::movelib::unique_ptr<Congestion_control_strategy> m_snd_cong_ctl;
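Node's gating decision can be sketched as below. `Cong_ctl`, `Fixed_cwnd`, and `can_send()` are illustrative stand-ins for the black-box strategy API, and the exact fit test (here requiring the whole packet to fit under CWND) is an assumption rather than Flow's precise rule:

```cpp
#include <cassert>
#include <cstddef>

// Illustrative stand-in for Congestion_control_strategy's public API:
// Node only ever asks for the current CWND.
class Cong_ctl
{
public:
  virtual ~Cong_ctl() = default;
  virtual size_t congestion_window_bytes() const = 0;
};

// Trivial strategy reporting a fixed CWND, for demonstration only.
class Fixed_cwnd : public Cong_ctl
{
public:
  explicit Fixed_cwnd(size_t cwnd) : m_cwnd(cwnd) {}
  size_t congestion_window_bytes() const override { return m_cwnd; }
private:
  size_t m_cwnd;
};

// A packet of packet_size bytes may go out only if it fits into the space
// the strategy currently allows In-flight (CWND minus bytes already flying).
bool can_send(const Cong_ctl& strategy, size_t snd_flying_bytes, size_t packet_size)
{
  const size_t cwnd = strategy.congestion_window_bytes();
  return (snd_flying_bytes < cwnd) && ((cwnd - snd_flying_bytes) >= packet_size);
}
```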
1978
1979 /**
1980 * The receive window: the maximum number of bytes the other side has advertised it would be
1981 * willing to accept into its Receive buffer if they'd arrived at the moment that advertisement
1982 * was generated by the other side. This starts as 0 (undefined) and is originally set at SYN_ACK
1983 * or SYN_ACK_ACK receipt and then subsequently updated upon each ACK received. Each such update
1984 * is called a "rcv_wnd update" or "window update."
1985 *
1986 * #m_snd_cong_ctl provides congestion control; this value provides flow control. The socket's
1987 * state machine must be extremely careful whenever either this value or
1988 * `m_snd_cong_ctl->congestion_window_bytes()` may increase, as when that occurs it should call
1989 * Node::send_worker() in order to possibly send data over the network.
1990 *
1991 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
1992 * not protected by a mutex.
1993 */
1995
1996 /**
1997 * The outgoing available bandwidth estimator for this connection on this side. Node informs
1998 * this object of events, namely acknowledgments; conversely this object informs
1999 * (or can inform if asked) the Node what it thinks is the current available bandwidth for
2000 * user data in DATA packets. This can be useful at least for some forms of congestion control
2001 * but possibly as information for the user, which is why it's an independent object and not
2002 * part of a specific congestion control strategy (I only mention this because the mechanics of such
2003 * a bandwidth estimator typically originate in service of a congestion control algorithm like Westwood+).
2004 *
2005 * ### Life cycle ###
2006 * It must be initialized to an instance before user gains access to this
2007 * socket; the pointer must never change subsequently except back to null (permanently). The
2008 * Peer_socket destructor, at the latest, will delete the underlying object, as #m_snd_bandwidth_estimator
2009 * is destroyed along with `*this`. The only reason it's a pointer is that it takes a Const_ptr in the constructor,
2010 * and that's not available during Peer_socket construction yet. (Note `unique_ptr` has no copy operator or
2011 * constructor.) There is a 1-to-1 relationship between `*this` and #m_snd_bandwidth_estimator.
2012 *
2013 * ### Visibility between Send_bandwidth_estimator and Peer_socket ###
2014 * The former gets read-only (`const`!) but otherwise complete private access (via `friend`) to the contents of
2015 * `*this` Peer_socket. For example, it can read #m_snd_smoothed_round_trip_time (the SRTT) and use it
2016 * for computations if needed. Node and Peer_socket get only strict public API access to
2017 * #m_snd_bandwidth_estimator, which is a black box to it.
2018 *
2019 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
2020 * not protected by a mutex.
2021 */
2022 boost::movelib::unique_ptr<Send_bandwidth_estimator> m_snd_bandwidth_estimator;
2023
2024 /**
2025 * Estimated current round trip time of packets, computed as a smooth value over the
2026 * past individual RTT measurements. This is updated each time we make an RTT measurement (i.e.,
2027 * receive a valid, non-duplicate acknowledgment of a packet we'd sent). The algorithm to compute
2028 * it is taken from RFC 6298. The value is 0 (not a legal value otherwise) until the first RTT
2029 * measurement is made.
2030 *
2031 * We use #Fine_duration (the high fine-grainedness and large bit width corresponding to #Fine_clock) to
2032 * store this, and the algorithm we use to compute it avoids losing digits via unnecessary
2033 * conversions between units (e.g., nanoseconds -> milliseconds).
2034 *
2035 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
2036 * not protected by a mutex.
2037 */
2039
2040 /// RTTVAR used for #m_snd_smoothed_round_trip_time calculation. @see #m_snd_smoothed_round_trip_time.
2042
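The RFC 6298 update referenced above can be sketched in the same spirit, kept in nanoseconds throughout to avoid losing digits; names (`Rtt_state`, `update_srtt()`) are hypothetical, and the first measurement initializes SRTT and RTTVAR specially per the RFC:

```cpp
#include <cassert>
#include <chrono>

using Dur = std::chrono::nanoseconds; // Analogous to Fine_duration.

struct Rtt_state
{
  Dur m_srtt{0};   // 0 => no RTT measurement made yet (not legal otherwise).
  Dur m_rttvar{0};
};

// One RFC 6298 update step; rtt is the just-measured round trip time.
void update_srtt(Rtt_state& s, Dur rtt)
{
  if (s.m_srtt == Dur::zero())
  {
    // First measurement: SRTT <- R; RTTVAR <- R / 2.
    s.m_srtt = rtt;
    s.m_rttvar = rtt / 2;
    return;
  }
  // RTTVAR <- 3/4 * RTTVAR + 1/4 * |SRTT - R|; then SRTT <- 7/8 * SRTT + 1/8 * R.
  const Dur diff = (s.m_srtt > rtt) ? (s.m_srtt - rtt) : (rtt - s.m_srtt);
  s.m_rttvar = (s.m_rttvar * 3) / 4 + diff / 4;
  s.m_srtt = (s.m_srtt * 7) / 8 + rtt / 8;
}

// Timeout in the RFC's style: SRTT + 4 * RTTVAR (lower bounds etc. omitted).
Dur drop_timeout(const Rtt_state& s)
{
  return s.m_srtt + s.m_rttvar * 4;
}
```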
2043 /**
2044 * The Drop Timer engine, which controls how In-flight (#m_snd_flying_pkts_by_sent_when) packets are
2045 * considered Dropped due to being unacknowledged for too long. Used while #m_int_state is
2046 * Int_state::S_ESTABLISHED.
2047 *
2048 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
2049 * not protected by a mutex.
2050 */
2052
2053 /**
2054 * The Drop Timeout: Time period between the next time #m_snd_drop_timer schedules a Drop Timer and that timer
2055 * expiring. This is updated each time #m_snd_smoothed_round_trip_time is updated, and the Drop_timer
2056 * itself may change it under certain circumstances.
2057 *
2058 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
2059 * not protected by a mutex.
2060 */
2062
2063 /**
2064 * The state of outgoing packet pacing for this socket; segregated into a simple `struct` to keep
2065 * Peer_socket shorter and easier to understand. Packet pacing tries to combat the burstiness of
2066 * outgoing low-level packet stream.
2067 *
2068 * @see `struct Send_pacing_data` doc header for much detail.
2069 */
2071
2072 /**
2073 * The last time that Node has detected a packet loss event and so informed #m_snd_cong_ctl by calling
2074 * the appropriate method of class Congestion_control_strategy. Roughly speaking, this is used to
2075 * determine whether the detection of a given dropped packet is part of the same loss event as the
2076 * previous one; if so then #m_snd_cong_ctl is not informed again (presumably to avoid dropping CWND
2077 * too fast); if not it is informed of the new loss event. Even more roughly speaking, if the new
2078 * event is within a certain time frame of #m_snd_last_loss_event_when, then they're considered in the
2079 * same loss event. You can find detailed discussion in a giant comment in
2080 * Node::handle_accumulated_acks().
2081 *
2082 * Before any loss events, this is set to its default value (zero time since epoch).
2083 *
2084 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
2085 * not protected by a mutex.
2086 */
2088
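The coalescing logic can be sketched like so; names are hypothetical, and the width of `window` (typically on the order of an RTT in TCP-like schemes) is an assumption here, not Flow's documented value:

```cpp
#include <cassert>
#include <cstdint>

// Times are monotonic-clock ticks (e.g., nanoseconds since some epoch).
struct Loss_state
{
  uint64_t m_last_loss_event_when = 0; // 0 => no loss event yet (default value).
};

// Returns true if this newly detected drop starts a NEW loss event (so the
// congestion control strategy should be informed); a drop detected within
// `window` of the previous loss event is folded into that same event, so
// CWND is not penalized repeatedly for one burst of loss.
bool on_drop_detected(Loss_state& s, uint64_t now, uint64_t window)
{
  if ((s.m_last_loss_event_when != 0) && ((now - s.m_last_loss_event_when) < window))
  {
    return false; // Same loss event as before; stay quiet.
  }
  s.m_last_loss_event_when = now;
  return true;
}
```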
2089 /**
2090 * Time at which the last Data_packet low-level packet for this connection was sent. We
2091 * use this when determining whether the connection is in Idle Timeout (i.e., has sent no traffic
2092 * for a while, which means there has been no data to send). It's used for congestion control.
2093 *
2094 * Before any packets are sent, this is set to its default value (zero time since epoch).
2095 *
2096 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
2097 * not protected by a mutex.
2098 *
2099 * ### Pacing ###
2100 * See Send_pacing_data #m_snd_pacing_data. See pacing-relevant note on Sent_packet::m_sent_when
2101 * which applies equally to this data member.
2102 */
2104
2105 /// Stats regarding outgoing traffic (and resulting incoming ACKs) for this connection so far.
2107
2108 /**
2109 * Random security token used during SYN_ACK-SYN_ACK_ACK. For a given connection handshake, the
2110 * SYN_ACK_ACK receiver ensures that the #m_security_token it received is equal to the original one it
2111 * had sent in SYN_ACK. This first gains meaning upon sending SYN_ACK and does not change afterwards.
2112 * It is not used unless `!m_active_connect`. See #m_active_connect.
2113 *
2114 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
2115 * not protected by a mutex.
2116 */
2118
2119 /**
2120 * Connection attempt scheduled task; fires if an individual connection request packet is not answered with a reply
2121 * packet in time. It is readied when *any* SYN or SYN_ACK packet is sent, and fired if that packet has gone
2122 * unacknowledged with a SYN_ACK or SYN_ACK_ACK (respectively), long enough to be retransmitted.
2123 *
2124 * Connection establishment is aborted if it fires too many times, but #m_connection_timeout_scheduled_task is how
2125 * "too many times" is determined.
2126 *
2127 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
2128 * not protected by a mutex.
2129 *
2130 * @see #m_connection_timeout_scheduled_task which keeps track of the entire process timing out, as opposed to the
2131 * individual attempts.
2132 */
2134
2135 /**
2136 * If currently using #m_init_rexmit_scheduled_task, this is the number of times the timer has already fired
2137 * in this session. So when the timer is readied the first time it's zero; if it fires and is
2138 * thus readied again it's one; again => two; etc., until timer is canceled or connection is
2139 * aborted due to too many retries.
2140 *
2141 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
2142 * not protected by a mutex.
2143 */
2145
2146 /**
2147 * Connection timeout scheduled task; fires if the entire initial connection process does not complete within a
2148 * certain amount of time. It is started when the SYN or SYN_ACK is sent the very first time (NOT counting
2149 * resends), canceled when SYN_ACK or SYN_ACK_ACK (respectively) is received in response to ANY SYN or
2150 * SYN_ACK (respectively), and fired if the latter does not occur in time.
2151 *
2152 * This gains meaning only in thread W. This should NOT be accessed outside of thread W and is
2153 * not protected by a mutex.
2154 *
2155 * @see #m_init_rexmit_scheduled_task which keeps track of *individual* attempts timing out, as opposed to the
2156 * entire process.
2157 */
2159
2160 /**
2161 * This is the final set of stats collected at the time the socket was moved to S_CLOSED #m_state.
2162 * If it has not yet moved to that state, this is not applicable (but equals Peer_socket_info()).
2163 * It's used by info() to get at the final set of stats, before the source info is purged by the
2164 * resource cleanup in sock_free_memory().
2165 */
2167}; // class Peer_socket
2168
2169
2170/**
2171 * @private
2172 *
2173 * Metadata (and data, if retransmission is on) for a packet that has been sent once (if
2174 * retransmission is off) or possibly more times (if on). This is purely a data store, not a
2175 * class. It is not copyable, and moving them around by smart Sent_packet::Ptr is encouraged.
2176 */
2178 // Endow us with shared_ptr<>s ::Ptr and ::Const_ptr (syntactic sugar).
2179 public util::Shared_ptr_alias_holder<boost::shared_ptr<Peer_socket::Sent_packet>>,
2180 private boost::noncopyable
2181{
2182 // Types.
2183
2184 struct Sent_when;
2185
2186 /**
2187 * Type used for #m_acks_after_me. Use a small type to save memory; this is easily large enough,
2188 * given that we drop a packet after #m_acks_after_me exceeds a very small value (< 10), and
2189 * a given ACK can *conceivably* hold a few thousand individual acknowledgments (most likely many
2190 * fewer). In fact `uint8_t` is probably enough, but it seemed easier to not worry about
2191 * overflows when doing arithmetic with these.
2192 */
2193 using ack_count_t = uint16_t;
2194
2195 // Data.
2196
2197 /// Number of bytes in the Data_packet::m_data field of the sent packet.
2198 const size_t m_size;
2199
2200 /**
2201 * Time stamps, order numbers, and other info at the times when the different attempts (including
2202 * original sending and retransmissions) to send the packet were given to the UDP net-stack. These are
2203 * arranged in the order they were sent (0 = original transmission, 1 = first retransmission, 2 = second
2204 * retransmission, ...). If retransmission is off, this only ever has 1 element (set via
2205 * constructor).
2206 *
2207 * Along with each time stamp, we also store an order number. It is much like a sequence
2208 * number, except (a) we never send it anywhere (internal bookkeeping only), (b) it specifies
2209 * the order in which the packets were sent off. Why use that and not a `m_sent_time` timestamp? Because
2210 * `m_sent_time` can, in theory, have ties (i.e., it's kind of possible for 2 packets to be sent
2211 * within the same high-resolution clock tick). This gives us a solid way to order packets by send time
2212 * in certain key algorithms.
2213 *
2214 * Note: for reasons explained in the pacing module this should be the time stamp when we
2215 * actually send it off to boost.asio UDP net-stack, not when we queue it into the pacing module (which can be
2216 * quite a bit earlier, if pacing is enabled). Subtlety: Note that "time stamp when we actually send it off
2217 * to boost.asio UDP net-stack" IS the same as "time stamp when boost.asio UDP net-stack actually performs the
2218 * `sendto()` call" in practice -- a good thing for accurate RTT measurements.
2219 */
2220 std::vector<Sent_when> m_sent_when;
2221
2222 /**
2223 * The number of times any packet with `m_sent_when.back().m_order_num > this->m_sent_when.back().m_order_num`
2224 * has been acknowledged. Reworded slightly: start at 0; *IF* a packet, other than this one, is In-flight and then
2225 * acknowledged, and that packet was sent out (or, equivalently, scheduled to be sent out -- packets do not
2226 * change order once marked for sending -- a/k/a became In-flight) chronologically AFTER same happened to
2227 * `*this` packet *THEN* add 1. After adding up all such acks, the resulting value = this member.
2228 *
2229 * This is used to determine when `*this` packet should be considered Dropped,
2230 * as in TCP's Fast Retransmit heuristic (but better, since time-when-sent -- not sequence number, which has
2231 * retransmission ambiguities -- is used to order packets).
2232 *
2233 * Note, again, that "after" has little to do with *sequence* number but rather with send time, as ordered via
2234 * the *order* number. E.g., due to retransmission, `*this` may have sequence number greater than another
2235 * packet yet have the lesser order number. (This dichotomy is mentioned a lot, everywhere, and if I'm overdoing
2236 * it, it's only because sometimes TCP cumulative-ack habits die hard, so I have to keep reminding myself.)
2237 * (If retransmission is disabled, however, then sequence number order = order number order.)
2238 */
2240
2241 /**
2242 * If retransmission is on, this is the DATA packet itself that was sent; otherwise null. It is
2243 * stored in case we need to retransmit it. If we don't (it is acknowledged), the entire
2244 * Sent_packet is erased and goes away, so #m_packet's ref-count decreases to zero, and it
2245 * disappears. If we do, all we need to do to retransmit it is increment its Data_packet::m_rexmit_id and
2246 * place it into the retransmission queue for sending, when CWND space is available.
2247 *
2248 * Why store the entire packet and not just its Data_packet::m_data? There's a small bit of memory overhead, but
2249 * we get Data_packet::m_rexmit_id for free and don't have to spend time re-creating the packet to retransmit.
2250 * So it's just convenient.
2251 *
2252 * If retransmission is off, there is no need to store this, as it will not be re-sent.
2253 */
2254 const boost::shared_ptr<Data_packet> m_packet;
2255
2256 // Constructors/destructor.
2257
2258 /**
2259 * Constructs object with the given values and #m_acks_after_me at zero. If rexmit-on option enabled, the
2260 * packet is stored in #m_packet; otherwise #m_packet is initialized to null. Regardless, #m_size
2261 * is set to `packet.m_data.size()`.
2262 *
2263 * @param rexmit_on
2264 * True if and only if `sock->rexmit_on()` for the containing socket.
2265 * @param packet
2266 * The packet that will be sent. Used for #m_size and, if `sock->rexmit_on()`, saved in #m_packet.
2267 * @param sent_when
2268 * #m_sent_when is set to contain one element equalling this argument.
2269 */
2270 explicit Sent_packet(bool rexmit_on, boost::shared_ptr<Data_packet> packet, const Sent_when& sent_when);
2271}; // struct Peer_socket::Sent_packet
2272
2273/**
2274 * @private
2275 *
2276 * Data store to keep timing related info when a packet is sent out. Construct via direct member initialization.
2277 * It is copy-constructible (for initially copying into containers and such) but not assignable to discourage
2278 * unneeded copying (though it is not a heavy structure). Update: A later version of clang does not like
2279 * this technique and warns about it; to avoid any such trouble just forget the non-assignability stuff;
2280 * it's internal code; we should be fine.
2281 */
2283{
2284 // Data.
2285
2286 /**
2287 * Order number of the packet. This can be used to compare two packets sent out; the one with the
2288 * higher order number was sent out later. See Peer_socket::m_snd_last_order_num. The per-socket next-value
2289 * for this is incremented each time another packet is sent out.
2290 *
2291 * @see Peer_socket::Sent_packet::m_sent_when for more discussion.
2292 *
2293 * @internal
2294 * @todo Can we make Sent_when::m_order_num and some of its peers const?
2295 */
2296 // That @internal should not be necessary to hide the @todo in public generated docs -- Doxygen bug?
2298
2299 /**
2300 * The timestamp when the packet is sent out. This may be "corrected" when actually sent *after* pacing delay
2301 * (hence not `const`).
2302 *
2303 * @see Fine_clock. Recall that, due to latter's properties, `Fine_clock::now()` results monotonically increase
2304 * over time -- a property on which we rely, particularly since this data member is only assigned
2305 * `Fine_clock::now()` at various times.
2306 *
2307 * @see Peer_socket::Sent_packet::m_sent_when for more discussion.
2308 */
2310
2311 /**
2312 * The congestion window size (in bytes) that is used when the packet is sent out.
2313 * We store this to pass to the congestion control module when this packet is acked in the future.
2314 * Some congestion control algorithms compute CWND based on an earlier CWND value.
2315 * Not `const` for similar reason as #m_sent_time.
2316 *
2317 * @internal
2318 * @todo Why is #m_sent_cwnd_bytes in Sent_when `struct` and not directly in `Sent_packet`? Or maybe it should
2319 * stay here, but Sent_when should be renamed `Send_attempt` (or `Send_try` for brevity)? Yeah, I think that's
2320 * it. Then Sent_packet::m_sent_when -- a container of Sent_when objects -- would become `m_send_tries`, a
2321 * container of `Send_try` objects. That makes more sense (sentce?!) than the status quo which involves the
2322 * singular-means-plural strangeness.
2323 *
2324 * @todo Can we make #m_sent_cwnd_bytes and some of its peers `const`?
2325 */
2326 // That @internal should not be necessary to hide the @todo in public generated docs -- Doxygen bug?
2328}; // struct Peer_socket::Sent_packet::Sent_when
2329
2330/**
2331 * @private
2332 *
2333 * Metadata (and data, if retransmission is on) for a packet that has been received (and, if
2334 * retransmission is off, copied to Receive buffer). This is purely a data store, not a class.
2335 * It is not copyable, and moving them around by smart Received_packet::Ptr is encouraged.
2336 */
2338 // Endow us with shared_ptr<>s ::Ptr and ::Const_ptr (syntactic sugar).
2339 public util::Shared_ptr_alias_holder<boost::shared_ptr<Peer_socket::Received_packet>>,
2340 private boost::noncopyable
2341{
2342 // Data.
2343
2344 /// Number of bytes in the Data_packet::m_data field of that packet.
2345 const size_t m_size;
2346
2347 /**
2348 * Byte sequence equal to that of Data_packet::m_data of the packet. Kept `empty()` if retransmission is off,
2349 * since in that mode any received packet's `m_data` is immediately moved to Receive buffer. With
2350 * retransmission on, it is stored until all gaps before the packet are filled (a/k/a
2351 * reassembly).
2352 */
2354
2355 // Constructors/destructor.
2356
2357 /**
2358 * Constructs object by storing size of data and, if so instructed, the data themselves.
2359 *
2360 * @param logger_ptr
2361 * The Logger implementation to use subsequently.
2362 * @param size
2363 * #m_size.
2364 * @param src_data
2365 * Pointer to the packet data to be moved into #m_data; or null if we shouldn't store it.
2366 * This should be null if and only if retransmission is off. If not null, for
2367 * performance, `*src_data` is CLEARED by this constructor (its data moved, in constant
2368 * time, into #m_data). In that case, `src_data->size() == size` must be true at entrance to
2369 * constructor, or behavior is undefined.
2370 */
2371 explicit Received_packet(log::Logger* logger_ptr, size_t size, util::Blob* src_data);
2372};
2373
2374/**
2375 * @private
2376 *
2377 * Metadata describing the data sent in the acknowledgment of an individual received packet. (A
2378 * low-level ACK packet may include several sets of such data.) This is purely a data store, not
2379 * a class. It is not copyable; pass instances around via smart Individual_ack::Ptr handles instead.
2380 * Construct this by direct member initialization.
2381 */
2382struct Peer_socket::Individual_ack
2383 // Cannot use boost::noncopyable or Shared_ptr_alias_holder, because that turns off direct initialization.
2384{
2385 // Types.
2386
2387 /// Short-hand for ref-counted pointer to mutable objects of this class.
2388 using Ptr = boost::shared_ptr<Individual_ack>;
2389
2390 /// Short-hand for ref-counted pointer to immutable objects of this class.
2391 using Const_ptr = boost::shared_ptr<const Individual_ack>;
2392
2393 // Data.
2394
2395 /// Sequence number of first datum in packet.
2396  const Sequence_number m_seq_num;
2397
2398 /**
2399 * Retransmit counter of the packet (as reported by sender). Identifies which attempt we are
2400 * acknowledging (0 = initial, 1 = first retransmit, 2 = second retransmit, ...). Always 0
2401 * unless retransmission is on.
2402 */
2403 const unsigned int m_rexmit_id;
2404
2405 /// When was it received? Used for supplying delay before acknowledging (for other side's RTT calculations).
2406  const Fine_time_pt m_received_when;
2407
2408 /// Number of bytes in the packet's user data.
2409 const size_t m_data_size;
2410
2411 // Constructors/destructor.
2412
2413 /// Force direct member initialization even if no member is `const`.
2414  Individual_ack(const Individual_ack&) = delete;
2415
2416 // Methods.
2417
2418 /// Forbid copy assignment.
2419  Individual_ack& operator=(const Individual_ack&) = delete;
2420}; // struct Peer_socket::Individual_ack
2421
2422// Free functions: in *_fwd.hpp.
2423
2424// However the following refer to inner type(s) and hence must be declared here and not _fwd.hpp.
2425
2426/**
2427 * @internal
2428 *
2429 * @todo There are a few guys like this which are marked `@internal` (Doxygen command) to hide from generated
2430 * public documentation, and that works, but really they should not be visible in the publicly-exported
2431 * (not in detail/) header source code; so this should be reorganized for cleanliness. The prototypes like this one
2432 * can be moved to a detail/ header or maybe directly into .cpp that uses them (for those where it's only one).
2433 *
2434 * Prints string representation of given socket state to given standard `ostream` and returns the
2435 * latter.
2436 *
2437 * @param os
2438 * Stream to print to.
2439 * @param state
2440 * Value to serialize.
2441 * @return `os`.
2442 */
2443std::ostream& operator<<(std::ostream& os, Peer_socket::Int_state state);
2444
2445// Template implementations.
2446
2447template<typename Const_buffer_sequence>
2448size_t Peer_socket::send(const Const_buffer_sequence& data, Error_code* err_code)
2449{
2450 namespace bind_ns = util::bind_ns;
2451 using bind_ns::bind;
2452
2453 FLOW_ERROR_EXEC_AND_THROW_ON_ERROR(size_t, Peer_socket::send<Const_buffer_sequence>, bind_ns::cref(data), _1);
2454 // ^-- Call ourselves and return if err_code is null. If got to present line, err_code is not null.
2455
2456 // We are in user thread U != W.
2457
2458 Lock_guard lock(m_mutex); // Lock m_node; also it's a pre-condition for Node::send().
2459
2460 /* Forward the rest of the logic to Node.
2461 * Now, what I really want to do here is simply:
2462 * m_node->send(this, data, err_code);
2463 * I cannot, however, due to C++'s circular reference BS (Node needs Peer_socket, Peer_socket needs Node,
2464 * and both send()s are templates so cannot be implemented in .cpp). So I perform the following
2465 * voodoo. The only part that must be defined in .hpp is the part that actually uses template
2466 * parameters. The only part that uses template parameters is m_snd_buf->feed_bufs_copy().
2467 * Therefore I prepare a canned call to m_snd_buf->feed_bufs_copy(data, ...) and save it in a
2468 * function pointer (Function) to an untemplated, regular function that makes this canned
2469 * call. Finally, I call an untemplated function in .cpp, which has full understanding of Node,
2470 * and pass the pre-canned thingie into that. That guy finally calls Node::send(), passing along
2471 * the precanned thingie. Node::send() can then call the latter when it needs to feed() to the
2472 * buffer. Therefore Node::send() does not need to be templated.
2473 *
2474 * Yeah, I know. I had to do it though. Logic should be in Node, not in Peer_socket.
2475 *
2476 * Update: As of this writing I've added a formal @todo into Node doc header. .inl tricks a-la-Boost might
2477 * help avoid this issue. */
2478 const auto snd_buf_feed_func = [this, &data](size_t max_data_size)
2479 {
2480 return m_snd_buf.feed_bufs_copy(data, max_data_size);
2481 };
2482 // ^-- Important to capture `&data`, not `data`; else `data` (albeit not actual payload) will be copied!
2483
2484 /* Let this untemplated function, in .cpp, deal with Node (since .cpp knows what *m_node really is).
2485 * Note: whatever contraption lambda generated above will be converted to a Function<...> with ... being
2486 * an appropriate signature that node_send() expects, seen in the above explanatory comment. */
2487 return node_send(snd_buf_feed_func, err_code);
2488} // Peer_socket::send()
2489
2490template<typename Const_buffer_sequence>
2491size_t Peer_socket::sync_send(const Const_buffer_sequence& data, Error_code* err_code)
2492{
2493 return sync_send_impl(data, Fine_time_pt(), err_code); // sync_send() with infinite timeout.
2494}
2495
2496template<typename Rep, typename Period, typename Const_buffer_sequence>
2497size_t Peer_socket::sync_send(const Const_buffer_sequence& data,
2498 const boost::chrono::duration<Rep, Period>& max_wait,
2499 Error_code* err_code)
2500{
2501 assert(max_wait.count() > 0);
2502 return sync_send_impl(data,
2503                        util::chrono_duration_from_now_to_fine_time_pt(max_wait),
2504                        err_code);
2505}
2506
2507template<typename Rep, typename Period>
2508bool Peer_socket::sync_send(const boost::asio::null_buffers&,
2509 const boost::chrono::duration<Rep, Period>& max_wait,
2510 Error_code* err_code)
2511{
2512 assert(max_wait.count() > 0);
2513  return sync_send_reactor_pattern_impl(util::chrono_duration_from_now_to_fine_time_pt(max_wait),
2514                                        err_code);
2515}
2516
2517template<typename Const_buffer_sequence>
2518size_t Peer_socket::sync_send_impl(const Const_buffer_sequence& data, const Fine_time_pt& wait_until,
2519 Error_code* err_code)
2520{
2521 namespace bind_ns = util::bind_ns;
2522 using bind_ns::bind;
2523
2524 FLOW_ERROR_EXEC_AND_THROW_ON_ERROR(size_t, Peer_socket::sync_send_impl<Const_buffer_sequence>,
2525 bind_ns::cref(data), bind_ns::cref(wait_until), _1);
2526 // ^-- Call ourselves and return if err_code is null. If got to present line, err_code is not null.
2527
2528 // We are in user thread U != W.
2529
2530 Lock_guard lock(m_mutex); // Lock m_node; also it's a pre-condition for Node::send().
2531
2532 /* Forward the rest of the logic to Node.
2533 * Now, what I really want to do here is simply:
2534 * m_node->sync_send(...args...);
2535 * I cannot, however, due to <see same situation explained in Peer_socket::send()>.
2536 * Therefore I do the following (<see Peer_socket::send() for more comments>). */
2537
2538 const auto snd_buf_feed_func = [this, &data](size_t max_data_size)
2539 {
2540 return m_snd_buf.feed_bufs_copy(data, max_data_size);
2541 };
2542 // ^-- Important to capture `&data`, not `data`; else `data` (albeit not actual payload) will be copied!
2543
2544  lock.release(); // Disassociate the guard from the mutex WITHOUT unlocking it (mutex remains LOCKED).
2545 /* Let this untemplated function, in .cpp, deal with Node (since .cpp knows what *m_node really is).
2546 * Note: whatever contraption lambda generated above will be converted to a Function<...> with ... being
2547 * an appropriate signature that node_sync_send() expects, seen in the above explanatory comment. */
2548 return node_sync_send(snd_buf_feed_func, wait_until, err_code);
2549} // Peer_socket::sync_send_impl()
2550
2551template<typename Mutable_buffer_sequence>
2552size_t Peer_socket::receive(const Mutable_buffer_sequence& target, Error_code* err_code)
2553{
2554 namespace bind_ns = util::bind_ns;
2555 using bind_ns::bind;
2556
2557 FLOW_ERROR_EXEC_AND_THROW_ON_ERROR(size_t, Peer_socket::receive<Mutable_buffer_sequence>, bind_ns::cref(target), _1);
2558 // ^-- Call ourselves and return if err_code is null. If got to present line, err_code is not null.
2559
2560 // We are in user thread U != W.
2561
2562 Lock_guard lock(m_mutex); // Lock m_node/m_state; also it's a pre-condition for Node::receive().
2563
2564 /* Forward the rest of the logic to Node.
2565 * Now, what I really want to do here is simply:
2566 * m_node->receive(this, target, err_code);
2567 * I cannot, however, due to <see same situation explained in Peer_socket::send()>.
2568 * Therefore I do the following (<see Peer_socket::send() for more comments>). */
2569
2570 const auto rcv_buf_consume_func = [this, &target]()
2571 {
2572 // Ensure that if there are data in Receive buffer, we will return at least 1 block as
2573 // advertised. @todo: I don't understand this comment now. What does it mean? Explain.
2574 return m_rcv_buf.consume_bufs_copy(target);
2575 };
2576 // ^-- Important to capture `&target`, not `target`; else `target` (albeit no actual buffers) will be copied!
2577
2578 /* Let this untemplated function, in .cpp, deal with Node (since .cpp knows what *m_node really is).
2579 * Note: whatever contraption lambda generated above will be converted to a Function<...> with ... being
2580 * an appropriate signature that node_receive() expects, seen in the above explanatory comment. */
2581 return node_receive(rcv_buf_consume_func, err_code);
2582}
2583
2584template<typename Mutable_buffer_sequence>
2585size_t Peer_socket::sync_receive(const Mutable_buffer_sequence& target, Error_code* err_code)
2586{
2587 return sync_receive_impl(target, Fine_time_pt(), err_code); // sync_receive() with infinite timeout.
2588}
2589
2590template<typename Rep, typename Period, typename Mutable_buffer_sequence>
2591size_t Peer_socket::sync_receive(const Mutable_buffer_sequence& target,
2592 const boost::chrono::duration<Rep, Period>& max_wait, Error_code* err_code)
2593{
2594 assert(max_wait.count() > 0);
2595 return sync_receive_impl(target,
2596                           util::chrono_duration_from_now_to_fine_time_pt(max_wait),
2597                           err_code);
2598}
2599
2600template<typename Rep, typename Period>
2601bool Peer_socket::sync_receive(const boost::asio::null_buffers&,
2602 const boost::chrono::duration<Rep, Period>& max_wait,
2603 Error_code* err_code)
2604{
2605 assert(max_wait.count() > 0);
2606  return sync_receive_reactor_pattern_impl(util::chrono_duration_from_now_to_fine_time_pt(max_wait), err_code);
2607}
2608
2609template<typename Mutable_buffer_sequence>
2610size_t Peer_socket::sync_receive_impl(const Mutable_buffer_sequence& target,
2611 const Fine_time_pt& wait_until, Error_code* err_code)
2612{
2613 namespace bind_ns = util::bind_ns;
2614 using bind_ns::bind;
2615
2616 FLOW_ERROR_EXEC_AND_THROW_ON_ERROR(size_t, Peer_socket::sync_receive_impl<Mutable_buffer_sequence>,
2617 bind_ns::cref(target), bind_ns::cref(wait_until), _1);
2618 // ^-- Call ourselves and return if err_code is null. If got to present line, err_code is not null.
2619
2620 // We are in user thread U != W.
2621
2622  Lock_guard lock(m_mutex); // Lock m_node; also it's a pre-condition for Node::receive().
2623
2624 /* Forward the rest of the logic to Node.
2625 * Now, what I really want to do here is simply:
2626 * m_node->sync_receive(...args...);
2627 * I cannot, however, due to <see same situation explained in Peer_socket::send()>.
2628 * Therefore I do the following (<see Peer_socket::send() for more comments>). */
2629
2630 const auto rcv_buf_consume_func = [this, &target]()
2631 {
2632 return m_rcv_buf.consume_bufs_copy(target);
2633 };
2634 // ^-- Important to capture `&target`, not `target`; else `target` (albeit no actual buffers) will be copied!
2635
2636  lock.release(); // Disassociate the guard from the mutex WITHOUT unlocking it (mutex remains LOCKED).
2637 /* Let this untemplated function, in .cpp, deal with Node (since .cpp knows what *m_node really is).
2638 * Note: whatever contraption lambda generated above will be converted to a Function<...> with ... being
2639 * an appropriate signature that node_sync_receive() expects, seen in the above explanatory comment. */
2640 return node_sync_receive(rcv_buf_consume_func, wait_until, err_code);
2641} // Peer_socket::sync_receive()
2642
2643template<typename Opt_type>
2644Opt_type Peer_socket::opt(const Opt_type& opt_val_ref) const
2645{
2646 // Similar to Node::opt().
2647  Options_lock lock(m_opts_mutex);
2648  return opt_val_ref;
2649}
2650
2651} // namespace flow::net_flow
2652
Convenience class that simply stores a Logger and/or Component passed into a constructor; and returns...
Definition: log.hpp:1619
Interface that the user should implement, passing the implementing Logger into logging classes (Flow'...
Definition: log.hpp:1291
Utility class for use by Congestion_control_strategy implementations that implements congestion windo...
Classic congestion control but with backoff to bandwidth estimate-based pipe size.
Classic congestion control, based on Reno (TCP RFC 5681), with congestion avoidance,...
Internal net_flow class that maintains the Drop Timer for DATA packet(s) to have been sent out over a...
Definition: drop_timer.hpp:152
An object of this class is a single Flow-protocol networking node, in the sense that: (1) it has a di...
Definition: node.hpp:937
A class that keeps a Peer_socket_receive_stats data store, includes methods to conveniently accumulat...
A class that keeps a Peer_socket_send_stats data store, includes methods to conveniently accumulate d...
A peer (non-server) socket operating over the Flow network protocol, with optional stream-of-bytes an...
Drop_timer_ptr m_snd_drop_timer
The Drop Timer engine, which controls how In-flight (m_snd_flying_pkts_by_sent_when) packets are cons...
friend std::ostream & operator<<(std::ostream &os, Int_state state)
size_t m_snd_pending_rcv_wnd
While Node::low_lvl_recv_and_handle() or async part of Node::async_wait_latency_then_handle_incoming(...
size_t get_connect_metadata(const boost::asio::mutable_buffer &buffer, Error_code *err_code=0) const
Obtains the serialized connect metadata, as supplied by the user during the connection handshake.
unsigned int m_init_rexmit_count
If currently using m_init_rexmit_scheduled_task, this is the number of times the timer has already fi...
size_t max_block_size_multiple(const size_t &opt_val_ref, const unsigned int *inflate_pct_val_ptr=0) const
Returns the smallest multiple of max_block_size() that is >= the given option value,...
size_t m_snd_remote_rcv_wnd
The receive window: the maximum number of bytes the other side has advertised it would be willing to ...
Sequence_number m_rcv_next_seq_num
The maximal sequence number R from the remote side such that all data with sequence numbers strictly ...
size_t m_rcv_syn_rcvd_data_cumulative_size
The running total count of bytes in the m_data fields of m_rcv_syn_rcvd_data_q.
bool sync_send_reactor_pattern_impl(const Fine_time_pt &wait_until, Error_code *err_code)
Helper similar to sync_send_impl() but for the null_buffers versions of sync_send().
std::vector< boost::shared_ptr< Data_packet > > Rcv_syn_rcvd_data_q
Type used for m_rcv_syn_rcvd_data_q.
std::map< Sequence_number, Sent_pkt_ordered_by_when_iter > Sent_pkt_by_seq_num_map
Short-hand for m_snd_flying_pkts_by_seq_num type; see that data member.
bool sync_receive_reactor_pattern_impl(const Fine_time_pt &wait_until, Error_code *err_code)
Helper similar to sync_receive_impl() but for the null_buffers versions of sync_receive().
util::Mutex_non_recursive Options_mutex
Short-hand for high-performance, non-reentrant, exclusive mutex used to lock m_opts.
order_num_t m_snd_last_order_num
For the Sent_packet representing the next packet to be sent, this is the value to assign to m_sent_wh...
Sequence_number m_snd_next_seq_num
The sequence number for the start of the data in the next new DATA packet to be sent out.
Remote_endpoint m_remote_endpoint
See remote_endpoint(). Should be set before user gets access to *this and not changed afterwards.
size_t sync_receive(const Mutable_buffer_sequence &target, const boost::chrono::duration< Rep, Period > &max_wait, Error_code *err_code=0)
Blocking (synchronous) version of receive().
Fine_duration m_round_trip_time_variance
RTTVAR used for m_snd_smoothed_round_trip_time calculation.
bool m_active_connect
true if we connect() to server; false if we are to be/are accept()ed. Should be set once and not modi...
Int_state m_int_state
Current internal state of the socket.
Fine_duration m_snd_drop_timeout
The Drop Timeout: Time period between the next time m_snd_drop_timer schedules a Drop Timer and that ...
Sent_pkt_by_seq_num_map m_snd_flying_pkts_by_seq_num
The collection of all In-flight packets (including those queued up to send in pacing module),...
boost::movelib::unique_ptr< Send_bandwidth_estimator > m_snd_bandwidth_estimator
The outgoing available bandwidth estimator for this connection on this side.
util::Blob m_serialized_metadata
If !m_active_connect, this contains the serialized metadata that the user supplied on the other side ...
security_token_t m_security_token
Random security token used during SYN_ACK-SYN_ACK_ACK.
size_t node_sync_send(const Function< size_t(size_t max_data_size)> &snd_buf_feed_func_or_empty, const Fine_time_pt &wait_until, Error_code *err_code)
This is to sync_send() as node_send() is to send().
boost::movelib::unique_ptr< Congestion_control_strategy > m_snd_cong_ctl
The congestion control strategy in use for this connection on this side.
size_t sync_send_impl(const Const_buffer_sequence &data, const Fine_time_pt &wait_until, Error_code *err_code)
Same as sync_send() but uses a Fine_clock-based Fine_duration non-template type for implementation co...
Rcv_syn_rcvd_data_q m_rcv_syn_rcvd_data_q
The queue of DATA packets received while in Int_state::S_SYN_RCVD state before the Syn_ack_ack_packet...
Error_code m_disconnect_cause
The Error_code causing disconnection (if one has occurred or is occurring) on this socket; otherwise ...
Peer_socket(log::Logger *logger_ptr, util::Task_engine *task_engine, const Peer_socket_options &opts)
Constructs object; initializes most values to well-defined (0, empty, etc.) but not necessarily meani...
Definition: peer_socket.cpp:37
Fine_time_pt m_rcv_wnd_recovery_start_time
Time point at which m_rcv_in_rcv_wnd_recovery was last set to true.
Send_pacing_data m_snd_pacing_data
The state of outgoing packet pacing for this socket; segregated into a simple struct to keep Peer_soc...
Server_socket::Ptr m_originating_serv
For sockets that come a Server_socket, this is the inverse of Server_socket::m_connecting_socks: it i...
uint64_t security_token_t
Type used for m_security_token.
Sequence_number m_rcv_init_seq_num
The Initial Sequence Number (ISN) contained in the original Syn_packet or Syn_ack_packet we received.
const Remote_endpoint & remote_endpoint() const
Intended other side of the connection (regardless of success, failure, or current State).
Sent_pkt_by_sent_when_map m_snd_flying_pkts_by_sent_when
The collection of all In-flight packets, indexed by sequence number and ordered from most to least re...
State
State of a Peer_socket.
Open_sub_state
The sub-state of a Peer_socket when state is State::S_OPEN.
size_t sync_send(const Const_buffer_sequence &data, const boost::chrono::duration< Rep, Period > &max_wait, Error_code *err_code=0)
Blocking (synchronous) version of send().
~Peer_socket() override
Boring virtual destructor. Note that deletion is to be handled exclusively via shared_ptr,...
Definition: peer_socket.cpp:77
Error_code disconnect_cause() const
The error code that perviously caused state() to become State::S_CLOSED, or success code if state is ...
size_t m_rcv_pending_acks_size_at_recv_handler_start
Helper state, to be used while inside either Node::low_lvl_recv_and_handle() or async part of Node::a...
Sequence_number::seq_num_t order_num_t
Short-hand for order number type. 0 is reserved. Caution: Keep in sync with Drop_timer::packet_id_t.
flow_port_t local_port() const
The local Flow-protocol port chosen by the Node (if active or passive open) or user (if passive open)...
flow_port_t m_local_port
See local_port(). Should be set before user gets access to *this and not changed afterwards.
Peer_socket_receive_stats_accumulator m_rcv_stats
Stats regarding incoming traffic (and resulting outgoing ACKs) for this connection so far.
bool set_options(const Peer_socket_options &opts, Error_code *err_code=0)
Dynamically replaces the current options set (options()) with the given options set.
size_t m_rcv_last_sent_rcv_wnd
The last rcv_wnd value sent to the other side (in an ACK).
std::vector< Ack_packet::Individual_ack::Ptr > m_rcv_acked_packets
While Node::low_lvl_recv_and_handle() or async part of Node::async_wait_latency_then_handle_incoming(...
size_t node_send(const Function< size_t(size_t max_data_size)> &snd_buf_feed_func, Error_code *err_code)
Non-template helper for template send() that forwards the send() logic to Node::send().
std::list< boost::shared_ptr< Sent_packet > > m_snd_rexmit_q
If retransmission is on, this is the retransmission queue.
util::Scheduled_task_handle m_init_rexmit_scheduled_task
Connection attempt scheduled task; fires if an individual connection request packet is not answered w...
util::Timer m_rcv_delayed_ack_timer
Timer started, assuming delayed ACKs are enabled, when the first Individual_ack is placed onto an emp...
bool rexmit_on() const
Whether retransmission is enabled on this connection.
size_t node_sync_receive(const Function< size_t()> &rcv_buf_consume_func_or_empty, const Fine_time_pt &wait_until, Error_code *err_code)
This is to sync_receive() as node_receive() is to receive().
util::Lock_guard< Mutex > Lock_guard
Short-hand for RAII lock guard of Mutex.
Sent_pkt_by_seq_num_map::iterator Sent_pkt_ordered_by_seq_iter
Short-hand for m_snd_flying_pkts_by_seq_num iterator type.
Int_state
The state of the socket (and the connection from this end's point of view) for the internal state mac...
util::Lock_guard< Options_mutex > Options_lock
Short-hand for lock that acquires exclusive access to an Options_mutex.
Socket_buffer m_snd_buf
The Send buffer; user feeds data at the back; Node consumes data at the front.
void close_abruptly(Error_code *err_code=0)
Acts as if fatal error error::Code::S_USER_CLOSED_ABRUPTLY has been discovered on the connection.
size_t m_snd_rexmit_q_size
Equals m_snd_rexmit_q.size(). Kept since m_snd_rexmit_q.size() may be O(n) depending on implementatio...
size_t max_block_size() const
The maximum number of bytes of user data per received or sent packet on this connection.
size_t m_rcv_reassembly_q_data_size
With retransmission enabled, the sum of Received_packet::m_size over all packets stored in the reasse...
Node * node() const
Node that produced this Peer_socket.
Definition: peer_socket.cpp:95
Fine_time_pt m_snd_last_loss_event_when
The last time that Node has detected a packet loss event and so informed m_snd_cong_ctl by calling th...
Peer_socket_info info() const
Returns a structure containing the most up-to-date stats about this connection.
Socket_buffer m_rcv_buf
The Receive buffer; Node feeds data at the back; user consumes data at the front.
Recvd_pkt_map::iterator Recvd_pkt_iter
Short-hand for m_rcv_packets_with_gaps iterator type.
Mutex m_mutex
This object's mutex.
Peer_socket_send_stats_accumulator m_snd_stats
Stats regarding outgoing traffic (and resulting incoming ACKs) for this connection so far.
Sent_pkt_by_sent_when_map::iterator Sent_pkt_ordered_by_when_iter
Short-hand for m_snd_flying_pkts_by_sent_when iterator type.
Sent_pkt_by_seq_num_map::const_iterator Sent_pkt_ordered_by_seq_const_iter
Short-hand for m_snd_flying_pkts_by_seq_num const iterator type.
Peer_socket_info m_info_on_close
This is the final set of stats collected at the time the socket was moved to S_CLOSED m_state.
bool ensure_open(Error_code *err_code) const
Helper that is equivalent to Node::ensure_sock_open(this, err_code).
Sent_pkt_by_sent_when_map::const_iterator Sent_pkt_ordered_by_when_const_iter
Short-hand for m_snd_flying_pkts_by_sent_when const iterator type.
Sequence_number m_snd_init_seq_num
The Initial Sequence Number (ISN) used in our original SYN or SYN_ACK.
Opt_type opt(const Opt_type &opt_val_ref) const
Analogous to Node::opt() but for per-socket options.
std::string bytes_blocks_str(size_t bytes) const
Helper that, given a byte count, returns a string with that byte count and the number of max_block_si...
Fine_duration m_snd_smoothed_round_trip_time
Estimated current round trip time of packets, computed as a smooth value over the past individual RTT...
util::Mutex_recursive Mutex
Short-hand for reentrant mutex type.
size_t sync_receive_impl(const Mutable_buffer_sequence &target, const Fine_time_pt &wait_until, Error_code *err_code)
Same as sync_receive() but uses a Fine_clock-based Fine_duration non-template type for implementation...
std::vector< boost::shared_ptr< Individual_ack > > m_rcv_pending_acks
The received packets to be acknowledged in the next low-level ACK packet to be sent to the other side...
Peer_socket_options m_opts
This socket's per-socket set of options.
bool m_rcv_in_rcv_wnd_recovery
true indicates we are in a state where we've decided other side needs to be informed that our receive...
util::Scheduled_task_handle m_connection_timeout_scheduled_task
Connection timeout scheduled task; fires if the entire initial connection process does not complete w...
Peer_socket_options options() const
Copies this socket's option set and returns that copy.
Options_mutex m_opts_mutex
The mutex protecting m_opts.
std::map< Sequence_number, boost::shared_ptr< Received_packet > > Recvd_pkt_map
Short-hand for m_rcv_packets_with_gaps type; see that data member.
Recvd_pkt_map::const_iterator Recvd_pkt_const_iter
Short-hand for m_rcv_packets_with_gaps const iterator type.
Open_sub_state m_open_sub_state
See state().
boost::shared_ptr< Drop_timer > Drop_timer_ptr
Short-hand for shared_ptr to immutable Drop_timer (can't use Drop_timer::Ptr due to C++ and circular ...
Recvd_pkt_map m_rcv_packets_with_gaps
The sequence-number-ordered collection of all received-and-not-dropped-due-to-buffer-overflow packets...
size_t m_snd_flying_bytes
The number of bytes contained in all In-flight packets, used at least for comparison against the cong...
std::vector< order_num_t > m_snd_temp_pkts_marked_to_drop
Helper data structure to store the packet IDs of packets that are marked Dropped during a single run ...
size_t receive(const Mutable_buffer_sequence &target, Error_code *err_code=0)
Receives (consumes from the Receive buffer) bytes of data, up to a given maximum cumulative number of...
Fine_time_pt m_snd_last_data_sent_when
Time at which the last Data_packet low-level packet for this connection was sent.
size_t node_receive(const Function< size_t()> &rcv_buf_consume_func, Error_code *err_code)
Non-template helper for template receive() that forwards the receive() logic to Node::receive().
size_t send(const Const_buffer_sequence &data, Error_code *err_code=0)
Sends (adds to the Send buffer) the given bytes of data up to a maximum internal buffer size; and asy...
State state(Open_sub_state *open_sub_state=0) const
Current State of the socket.
Definition: peer_socket.cpp:85
util::Scheduled_task_handle m_rcv_wnd_recovery_scheduled_task
When m_rcv_in_rcv_wnd_recovery is true, this is the scheduled task to possibly send another unsolicit...
A per-Peer_socket module that tries to estimate the bandwidth available to the outgoing flow.
Definition: bandwidth.hpp:125
An internal net_flow sequence number identifying a piece of data.
Definition: seq_num.hpp:126
uint64_t seq_num_t
Raw sequence number type.
Definition: seq_num.hpp:138
A server socket able to listen on a single Flow port for incoming connections and return peer sockets...
Internal net_flow class that implements a socket buffer, as used by Peer_socket for Send and Receive ...
size_t feed_bufs_copy(const Const_buffer_sequence &data, size_t max_data_size)
Feeds (adds to the back of the byte buffer) the contents of the byte stream composed of the bytes in ...
size_t consume_bufs_copy(const Mutable_buffer_sequence &target_bufs)
Consumes (removes from the front of the internal byte buffer and returns them to the caller) a byte s...
Iterator iterator
For container compliance (hence the irregular capitalization): Iterator type.
Const_iterator const_iterator
For container compliance (hence the irregular capitalization): Const_iterator type.
An empty interface, consisting of nothing but a default virtual destructor, intended as a boiler-plat...
Definition: util.hpp:45
Convenience class template that endows the given subclass T with nested aliases Ptr and Const_ptr ali...
boost::shared_ptr< Server_socket > Ptr
Short-hand for ref-counted pointer to mutable values of type Target_type::element_type (a-la T*).
#define FLOW_ERROR_EXEC_AND_THROW_ON_ERROR(ARG_ret_type, ARG_method_name,...)
Narrow-use macro that implements the error code/exception semantics expected of most public-facing Fl...
Definition: error.hpp:357
Flow module containing the API and implementation of the Flow network protocol, a TCP-inspired stream...
Definition: node.cpp:25
uint16_t flow_port_t
Logical Flow port type (analogous to a UDP/TCP port in spirit but in no way relevant to UDP/TCP).
std::ostream & operator<<(std::ostream &os, const Congestion_control_selector::Strategy_choice &strategy_choice)
Serializes a Peer_socket_options::Congestion_control_strategy_choice enum to a standard ostream – the...
Definition: cong_ctl.cpp:146
Fine_time_pt chrono_duration_from_now_to_fine_time_pt(const boost::chrono::duration< Rep, Period > &dur)
Helper that takes a non-negative duration of arbitrary precision/period and converts it to Fine_durat...
Definition: util.hpp:42
boost::unique_lock< Mutex > Lock_guard
Short-hand for advanced-capability RAII lock guard for any mutex, ensuring exclusive ownership of tha...
Definition: util_fwd.hpp:265
boost::recursive_mutex Mutex_recursive
Short-hand for reentrant, exclusive mutex.
Definition: util_fwd.hpp:218
boost::shared_ptr< Scheduled_task_handle_state > Scheduled_task_handle
Black-box type that represents a handle to a scheduled task as scheduled by schedule_task_at() or sch...
boost::mutex Mutex_non_recursive
Short-hand for non-reentrant, exclusive mutex. ("Reentrant" = one can lock an already-locked-in-that-...
Definition: util_fwd.hpp:215
boost::asio::io_service Task_engine
Short-hand for boost.asio event service, the central class of boost.asio.
Definition: util_fwd.hpp:135
boost::asio::basic_waitable_timer< Fine_clock > Timer
boost.asio timer.
Definition: util_fwd.hpp:202
Blob_with_log_context<> Blob
A concrete Blob_with_log_context that compile-time-disables Basic_blob::share() and the sharing API d...
Definition: blob_fwd.hpp:60
boost::system::error_code Error_code
Short-hand for a boost.system error code (which basically encapsulates an integer/enum error code and...
Definition: common.hpp:503
Fine_clock::duration Fine_duration
A high-res time duration as computed from two Fine_time_pts.
Definition: common.hpp:411
Fine_clock::time_point Fine_time_pt
A high-res time point as returned by Fine_clock::now() and suitable for precise time math in general.
Definition: common.hpp:408
Metadata describing the data sent in the acknowledgment of an individual received packet.
const size_t m_data_size
Number of bytes in the packet's user data.
boost::shared_ptr< const Individual_ack > Const_ptr
Short-hand for ref-counted pointer to immutable objects of this class.
const unsigned int m_rexmit_id
Retransmit counter of the packet (as reported by sender).
const Sequence_number m_seq_num
Sequence number of first datum in packet.
Individual_ack(const Individual_ack &)=delete
Force direct member initialization even if no member is const.
Individual_ack & operator=(const Individual_ack &)=delete
Forbid copy assignment.
boost::shared_ptr< Individual_ack > Ptr
Short-hand for ref-counted pointer to mutable objects of this class.
const Fine_time_pt m_received_when
When was it received? Used for supplying delay before acknowledging (for other side's RTT calculation...
Metadata (and data, if retransmission is on) for a packet that has been received (and,...
const size_t m_size
Number of bytes in the Data_packet::m_data field of that packet.
Received_packet(log::Logger *logger_ptr, size_t size, util::Blob *src_data)
Constructs object by storing size of data and, if so instructed, the data themselves.
util::Blob m_data
Byte sequence equal to that of Data_packet::m_data of the packet.
Data store to keep timing related info when a packet is sent out.
const order_num_t m_order_num
Order number of the packet.
size_t m_sent_cwnd_bytes
The congestion window size (in bytes) that is used when the packet is sent out.
Fine_time_pt m_sent_time
The timestamp when the packet is sent out.
Metadata (and data, if retransmission is on) for a packet that has been sent once (if retransmission i...
Sent_packet(bool rexmit_on, boost::shared_ptr< Data_packet > packet, const Sent_when &sent_when)
Constructs object with the given values and m_acks_after_me at zero.
std::vector< Sent_when > m_sent_when
Time stamps, order numbers, and other info at the times when the different attempts (including origin...
const size_t m_size
Number of bytes in the Data_packet::m_data field of the sent packet.
const boost::shared_ptr< Data_packet > m_packet
If retransmission is on, this is the DATA packet itself that was sent; otherwise null.
uint16_t ack_count_t
Type used for m_acks_after_me.
ack_count_t m_acks_after_me
The number of times any packet with m_sent_when.back().m_order_num > this->m_sent_when....
A data store that keeps stats about a Peer_socket connection.
Definition: info.hpp:456
A set of low-level options affecting a single Peer_socket.
Definition: options.hpp:36
Represents the remote endpoint of a Flow-protocol connection; identifies the UDP endpoint of the remo...
Definition: endpoint.hpp:93
The current outgoing packet pacing state, including queue of low-level packets to be sent,...
Definition: low_lvl_io.hpp:178