#include <boost/algorithm/string.hpp>

#define ADD_CONFIG_OPTION(ARG_opt, ARG_desc) \
  Node_options::add_config_option(opts_desc, #ARG_opt, &target->ARG_opt, defaults_source.ARG_opt, ARG_desc, \

  m_st_capture_interrupt_signals_internally(false),
  m_st_low_lvl_max_buf_size(3 * 1024 * 1024),
  m_st_timer_min_period(0),
  m_dyn_max_packets_per_main_loop_iteration(0),
  m_dyn_low_lvl_max_packet_size(1124),
  m_dyn_guarantee_one_low_lvl_in_buf_per_socket(true)
template<typename Opt_type>
  const std::string& opt_id, Opt_type* target_val, const Opt_type& default_val,
  const char* description, bool printout_only)
  using boost::program_options::value;
  opts_desc->add_options()
    (opt_id_to_str(opt_id).c_str(), value<Opt_type>()->default_value(default_val));
  opts_desc->add_options()
    (opt_id_to_str(opt_id).c_str(), value<Opt_type>(target_val)->default_value(default_val),
  "If and only if this is true, the Node will detect SIGINT and SIGTERM (or your OS's version thereof); "
  "upon seeing such a signal, it will fire Node::interrupt_all_waits(), which will interrupt all "
  "blocking operations, conceptually similarly to POSIX's EINTR. If this is true, the user may register "
  "their own signal handler(s) (for any purpose whatsoever) using boost::asio::signal_set. However, behavior "
  "is undefined if the program registers signal handlers via any other API, such as sigaction() or signal(). "
  "If you need to set up such a non-signal_set signal handler, AND you require EINTR-like behavior, "
  "then (1) set this option to false; (2) trap SIGINT and SIGTERM yourself; (3) in your handlers for the "
  "latter, simply call Node::interrupt_all_waits(). Similarly, if you want custom behavior regarding "
  "Node::interrupt_all_waits(), feel free to call it whenever you want (not necessarily even from a signal "
  "handler), and set this to false. However, if a typical, common-sense behavior is what you're after -- and "
  "either don't need additional signal handling or are OK with using signal_set for it -- then setting this to "
  "true is a good option.");
  "The max size to ask the OS to set our UDP socket's receive buffer to in order to minimize loss "
  "if we can't process datagrams fast enough. This should be as high as possible while being "
  "\"acceptable\" in terms of memory. However, the OS will probably have its own limit and may well "
  "pick a limit that is the minimum of that limit and this value.");
  "A time period such that the boost.asio timer implementation for this platform is able to "
  "accurately schedule events within this time period or greater. If you select 0, the "
  "code will decide what this value is based on the platform, but its logic for this may or may not "
  "be correct (actually it will probably be correct but possibly too conservative [large], causing "
  "timer coarseness in mechanisms like packet pacing).");
  "The UDP net-stack may deliver 2 or more datagrams to the Flow Node at the same time. To lower overhead "
  "and increase efficiency, Flow will process all such datagrams -- and any more that may arrive during this "
  "processing -- before preparing any resulting outgoing messages, such as acknowledgments or more data packets. "
  "In high-speed conditions this may result in excessive burstiness of outgoing traffic. This option's value "
  "places a limit on the number of datagrams to process before constructing and sending any resulting outgoing "
  "messages to prevent this burstiness. If 0, there is no limit.");
  "Any incoming low-level (UDP) packet will be truncated to this size. This should be well above "
  "per-socket max-block-size (# of bytes of user payload per DATA packet). There will only be one buffer "
  "of this size in memory at a time, so no need to be too stingy, but on the other hand certain "
  "allocation/deallocation behavior may cause performance drops if this is unnecessarily large.");
  "This very inside-baseball setting controls the allocation/copy behavior of the UDP receive-deserialize "
  "operation sequence. When enabled, there is exactly one input buffer large enough to "
  "hold any one serialized incoming packet; any deserialized data (including DATA and ACK payloads) are "
  "stored in separately allocated per-packet buffers; and the input buffer is repeatedly reused "
  "without reallocation. When disabled, however, at least some packet types (most notably DATA) will "
  "use the zero-copy principle, having the deserializer take ownership of the input buffer and access pieces "
  "inside it as post-deserialization values (most notably the DATA payload); in this case the input buffer "
  "has to be reallocated between UDP reads. As of this writing the former behavior seems to be somewhat "
  "faster, especially if low-lvl-max-packet-size is unnecessarily large; but arguably the zero-copy behavior "
  "may become faster if some implementation details related to this change. So this switch seemed worth "
  return os << opts_desc;

  using boost::algorithm::starts_with;
  using boost::algorithm::replace_all;
  const string STATIC_PREFIX = "m_st_";
  const string DYNAMIC_PREFIX = "m_dyn_";
  if (starts_with(opt_id, STATIC_PREFIX))
    str.erase(0, STATIC_PREFIX.size());
  else if (starts_with(opt_id, DYNAMIC_PREFIX))
    str.erase(0, DYNAMIC_PREFIX.size());
  replace_all(str, "_", "-");
  m_st_max_block_size(1024),
  m_st_connect_retransmit_period(boost::chrono::milliseconds(125)),
  m_st_connect_retransmit_timeout(boost::chrono::seconds(3)),
  m_st_snd_buf_max_size(6 * 1024 * 1024),
  m_st_rcv_buf_max_size(m_st_snd_buf_max_size),
  m_st_rcv_flow_control_on(true),
  m_st_rcv_buf_max_size_slack_percent(10),
  m_st_rcv_buf_max_size_to_advertise_percent(50),
  m_st_rcv_max_packets_after_unrecvd_packet_ratio_percent(220),
  m_st_delayed_ack_timer_period(boost::chrono::milliseconds(200)),
  m_st_max_full_blocks_before_ack_send(2),
  m_st_rexmit_on(true),
  m_st_max_rexmissions_per_packet(15),
  m_st_init_drop_timeout(boost::chrono::seconds(1)),
  m_st_drop_packet_exactly_after_drop_timeout(false),
  m_st_drop_all_on_drop_timeout(true),
  m_st_out_of_order_ack_restarts_drop_timer(true),
  m_st_snd_pacing_enabled(true),
  m_st_snd_bandwidth_est_sample_period_floor(boost::chrono::milliseconds(50)),
  m_st_cong_ctl_init_cong_wnd_blocks(0),
  m_st_cong_ctl_max_cong_wnd_blocks(640),
  m_st_cong_ctl_cong_wnd_on_drop_timeout_blocks(1),
  m_st_cong_ctl_cong_avoidance_increment_blocks(0),
  m_st_cong_ctl_classic_wnd_decay_percent(0),
  m_dyn_drop_timeout_ceiling(boost::chrono::seconds(60)),
  m_dyn_drop_timeout_backoff_factor(2),
  m_dyn_rcv_wnd_recovery_timer_period(boost::chrono::seconds(1)),
  m_dyn_rcv_wnd_recovery_max_period(boost::chrono::minutes(1))

  using boost::algorithm::join;
  vector<string> cong_strategy_ids;
  const string str_cong_strategy_ids = join(cong_strategy_ids, ", ");
  "The size of block that we will strive to (and will, assuming at least that many bytes are "
  "available in Send buffer) pack into each outgoing DATA packet. It is assumed the other side is "
  "following the same policy (any packets that do not -- i.e., exceed this size -- are dropped). "
  "This is an important control; the higher it is the better for performance AS LONG AS it's not so "
  "high it undergoes IP fragmentation (or does that even happen in UDP? if not, even worse -- "
  "it'll just be dropped and not sent!). The performance impact is major; e.g., assuming no "
  "fragmentation/dropping, we've seen a ~1500 byte MBS result in 20-30% higher throughput than "
  "Additionally, if using net_flow module with no reliability feature -- i.e., if you want to perform FEC "
  "or something else outside the Flow protocol -- then it is absolutely essential that EVERY send*() call "
  "provides a buffer whose size is a multiple of max-block-size. Otherwise packet boundaries "
  "will not be what you expect, and you will get what seems to be corrupted data at the "
  "application layer (since our stream-based API has no way of knowing where your message begins "
  "or ends). Alternatively you can encode message terminators or packet sizes, but since in "
  "unreliable mode integrity of a given block is guaranteed only if all blocks align with "
  "max-block-size boundaries, you'll still probably be screwed.");
  "How often to resend SYN or SYN_ACK while SYN_ACK or SYN_ACK_ACK, respectively, has not been received. "
  "In other words, this controls the pause between retries during the connection opening phase, by either side, "
  "if the other side is not responding with the appropriate response. "
  "Examples: \"250 ms\", \"1 s\".");
  "How long from the first SYN or SYN_ACK to allow for connection handshake before aborting connection. "
  "Examples: \"5500 ms\", \"60 seconds\".");
  "Maximum number of bytes that the Send buffer can hold. This determines how many bytes user can "
  "send() while peer cannot send over network until send() refuses to take any more bytes. Note "
  "that any value given will be increased, if necessary, to the nearest multiple of "
  "max-block-size. This is important to preserve message boundaries when operating in "
  "unreliable mode (guaranteed max-block-size-sized chunks of data will be sent out in their "
  "entirety instead of being fragmented).");
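The rounding rule described for snd-buf-max-size (raise the value, if necessary, to the nearest multiple of max-block-size) can be sketched as follows. This is a minimal illustration, not the library's code; the function name round_up_to_block_multiple() is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns the smallest multiple of block_size that is >= value.
std::size_t round_up_to_block_multiple(std::size_t value, std::size_t block_size)
{
  assert(block_size > 0);
  const std::size_t remainder = value % block_size;
  // Already aligned values are returned unchanged; others are bumped up.
  return (remainder == 0) ? value : (value + (block_size - remainder));
}
```

With the defaults shown earlier (a 6 * 1024 * 1024 byte buffer and a 1024-byte block), the value is already aligned and stays unchanged.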
  "Maximum number of bytes that the Receive buffer can hold. This determines how many bytes "
  "can be received in the background by the Node without user doing any receive()s. "
  "It is also rounded up to the nearest multiple of max-block-size.");
  "Whether flow control (a/k/a receive window a/k/a rcv_wnd management) is enabled. If this is "
  "disabled, an infinite rcv_wnd will always be advertised to the sender; so if the Receive buffer "
  "is exceeded packets are dropped as normal, but the sender will not know it should stop sending "
  "until Receive buffer space is freed. If this is enabled, we keep the sender informed of how "
  "much Receive buffer space is available, so it can suspend the flow as necessary.");
  "% of rcv-buf-max-size such that if Receive buffer stores up to (100 + this many) % of "
  "rcv-buf-max-size bytes, the bytes will still be accepted. In other words, this allows the max "
  "Receive buffer to hold slightly more than rcv-buf-max-size bytes. However, the current Receive "
  "buffer capacity advertised to the other side of the connection will be based on the "
  "non-inflated rcv-buf-max-size. This option provides some protection against the fact that the "
  "receive window value sent to the other side will lag behind reality somewhat.");
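To make the slack arithmetic concrete, here is a small sketch of the two quantities the description distinguishes: the inflated limit used when accepting bytes, and the non-inflated value used for the advertised window. All function names are hypothetical, and the integer truncation and the `<=` comparison are assumptions rather than confirmed library behavior.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical: the most bytes the Receive buffer will hold, including slack.
std::size_t rcv_buf_accept_limit(std::size_t rcv_buf_max_size, unsigned int slack_percent)
{
  return rcv_buf_max_size * (100 + slack_percent) / 100;
}

// Hypothetical: the window advertised to the peer, based on the non-inflated maximum.
std::size_t advertised_rcv_wnd(std::size_t bytes_buffered, std::size_t rcv_buf_max_size)
{
  return (bytes_buffered < rcv_buf_max_size) ? (rcv_buf_max_size - bytes_buffered) : 0;
}

// Hypothetical: whether `incoming` more bytes still fit under the inflated limit.
bool would_accept(std::size_t bytes_buffered, std::size_t incoming,
                  std::size_t rcv_buf_max_size, unsigned int slack_percent)
{
  return (bytes_buffered + incoming) <= rcv_buf_accept_limit(rcv_buf_max_size, slack_percent);
}
```

In this sketch a buffer that is slightly over rcv-buf-max-size still accepts data while advertising a zero window, which is the lag protection the option text describes.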
  "% of rcv-buf-max-size that has to be freed, since the last receive window advertisement, via "
  "user popping data from Receive buffer, before we must send a receive window advertisement. "
  "Normally we send rcv_wnd to the other side opportunistically in every ACK; but there can be "
  "situations when there are no packets to acknowledge, and hence we must specifically make a "
  "packet just to send over rcv_wnd. Typically we should only need to do this if the buffer was "
  "exceeded and is now significantly freed. This value must be in [1, 100], but anything over 50 "
  "is probably pushing it.");
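The trigger described above can be sketched as a simple percentage test. The function name is hypothetical, and treating the threshold as inclusive (`>=`) is an assumption; the option text does not say which comparison the implementation uses.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical: true once enough Receive buffer space has been freed, since the
// last advertisement, to justify sending a dedicated rcv_wnd packet.
bool should_advertise_rcv_wnd(std::size_t freed_since_last_advert,
                              std::size_t rcv_buf_max_size,
                              unsigned int to_advertise_percent)
{
  assert(to_advertise_percent >= 1 && to_advertise_percent <= 100);
  // Compare freed/max against percent/100 without floating point.
  return (freed_since_last_advert * 100) >= (rcv_buf_max_size * to_advertise_percent);
}
```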
  "The limit on the size of Peer_socket::m_rcv_packets_with_gaps, expressed as what percentage the "
  "maximal size of that structure times max-block-size is of the maximal Receive buffer size. For "
  "example, if this is 200, then Peer_socket::m_rcv_packets_with_gaps can represent up to roughly "
  "2x as many full-sized blocks as the Receive buffer can. This should also by far exceed any "
  "sender congestion window max size to avoid packet loss. m_rcv_packets_with_gaps consists of all packets "
  "such that at least one packet sequentially preceding them has not yet been received.");
  "The maximum amount of time to delay sending ACK with individual packet's acknowledgment since "
  "receiving that individual packet. If set to zero duration, any given individual acknowledgment "
  "is sent within a non-blocking amount of time of its DATA packet being read. Inspired by RFC "
  "1122-4.2.3.2. Examples: \"200 ms\", \"550200 microseconds\".");
  "If there are at least this many TIMES max-block-size bytes' worth of individual acknowledgments "
  "to be sent, then the delayed ACK timer is to be short-circuited, and the accumulated "
  "acknowledgments are to be sent as soon as possible. Inspired by RFC 5681.");
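As a rough illustration of the short-circuit condition, the check below compares the bytes covered by pending acknowledgments against the configured number of full blocks. The function and parameter names are hypothetical; how the real code tracks pending acknowledgments is not shown in this listing.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical: true if the delayed ACK timer should be bypassed because enough
// acknowledged payload has accumulated.
bool should_send_acks_now(std::size_t pending_acked_bytes,
                          std::size_t max_block_size,
                          std::size_t max_full_blocks_before_ack_send)
{
  assert(max_block_size > 0);
  return pending_acked_bytes >= (max_full_blocks_before_ack_send * max_block_size);
}
```

With the defaults shown earlier (2 full blocks of 1024 bytes), acknowledgments covering 2048 or more bytes would be sent without waiting for the 200 ms timer.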
  "Whether to enable reliability via retransmission. If false, a detected lost packet may have "
  "implications on congestion control (speed at which further data are sent) but will not cause "
  "that packet to be resent; receiver application code either has to be OK with missing packets or "
  "must implement its own reliability (e.g., FEC). Packets may also be delivered in an order "
  "different from the way they were sent. If true, the receiver need not worry about it, as any "
  "lost packets will be retransmitted with no participation from the application code on either "
  "side, as in TCP. Also as in TCP, this adds order preservation, so that the stream of bytes sent "
  "will be exactly equal to the stream of bytes received. Retransmission removes the requirement "
  "for the very exacting block-based way in which send() and friends must be called. "
  "This option must have the same value on both sides of the connection, or the server will refuse "
  "If retransmission is enabled and a given packet is retransmitted this many times and has to be "
  "retransmitted yet again, the connection is reset. Should be positive.");
  "Once socket enters ESTABLISHED state, this is the value for m_snd_drop_timeout before the first RTT "
  "measurement is made (the first valid acknowledgment arrives). Example: \"2 seconds\".");
  "If true, when scheduling Drop Timer, schedule it for Drop Timeout relative to the send time of "
  "the earliest In-flight packet at the time. If false, instead schedule DTO relative to the time of "
  "scheduling. The latter is less aggressive and is recommended by RFC 6298.");
  "If true, when the Drop Timer fires, all In-flight packets are to be considered Dropped (and "
  "thus the timer is to be disabled). If false, only the earliest In-flight packet is to be "
  "considered Dropped (and thus the timer is to restart). RFC 6298 recommends false. true is more aggressive.");
  "If an In-flight packet is acknowledged, but it is not the earliest In-flight packet (i.e., it's "
  "an out-of-order acknowledgment), and this is true, the timer is restarted. Otherwise the timer "
  "continues to run. The former is less aggressive. RFC 6298 wording is ambiguous on what it "
  "recommends (not clear if cumulative ACK only, or if SACK also qualifies).");
  "When estimating the available send bandwidth, each sample must be compiled over at least this long "
  "of a time period, even if the SRTT is lower. Normally a sample is collected over at least an SRTT, but "
  "computing a bandwidth sample over a quite short time period can produce funky results, hence this floor. "
  "Send bandwidth estimation is used at least for some forms of congestion control.");
  "Enables or disables packet pacing, which attempts to spread out, without sacrificing overall "
  "send throughput, outgoing low-level packets to prevent loss. If disabled, any packet that is "
  "allowed by congestion/flow control to be sent over the wire is immediately sent to the UDP "
  "net-stack; so for example if 200 packets are ready to send and are allowed to be sent, they're sent "
  "at the same time. If enabled, they will be spread out over a reasonable time period instead. "
  "Excessive burstiness can lead to major packet drops, so this can really help.");
  (string("The congestion control algorithm to use for the connection or connections. The choices are: [") +
   str_cong_strategy_ids + "].").c_str());
  "The initial size of the congestion window, given in units of max-block-size-sized blocks. "
  "The special value 0 means RFC 5681's automatic max-block-size-based computation should "
  "The constant that determines the CWND limit in congestion_window_at_limit() and "
  "clamp_congestion_window() (in multiple of max-block-size). When choosing this value, use these "
  "(1) This limits total outgoing throughput. The throughput B will be <= CWND/RTT, where RTT is "
  "roughly the RTT of the connection, and CWND == S_MAX_CONG_WND_BLOCKS * max-block-size. "
  "Therefore, choose B and RTT values and set S_MAX_CONG_WND_BLOCKS = B * RTT / max-block-size "
  "(B in bytes/second, RTT in seconds). "
  "(2) Until we implement Receive window, this value should be much (say, 4x) less than the size "
  "of the Receive buffer, to avoid situations where even a well-behaving user (i.e., user that "
  "receive()s all data ASAP) cannot keep up with reading data off Receive buffer, forcing "
  "net_flow to drop large swaths of incoming traffic. If CWND is much smaller than Receive "
  "buffer size, then this avoids that problem.");
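The sizing rule in item (1), S_MAX_CONG_WND_BLOCKS = B * RTT / max-block-size, can be worked through in code. The sketch below uses integer arithmetic with the RTT in milliseconds and rounds the block count up; both choices, and the function names, are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical: blocks of CWND needed to sustain bytes_per_sec over a given RTT.
std::uint64_t max_cong_wnd_blocks_for(std::uint64_t bytes_per_sec, std::uint64_t rtt_ms,
                                      std::uint64_t max_block_size)
{
  assert(max_block_size > 0);
  const std::uint64_t bytes_in_flight = bytes_per_sec * rtt_ms / 1000;  // B * RTT
  return (bytes_in_flight + max_block_size - 1) / max_block_size;       // Round up.
}

// Hypothetical inverse: the throughput bound CWND / RTT implied by a block limit.
std::uint64_t throughput_ceiling_bytes_per_sec(std::uint64_t max_cong_wnd_blocks,
                                               std::uint64_t max_block_size,
                                               std::uint64_t rtt_ms)
{
  assert(rtt_ms > 0);
  return max_cong_wnd_blocks * max_block_size * 1000 / rtt_ms;
}
```

For example, under these assumptions the default of 640 blocks of 1024 bytes at a 50 ms RTT bounds throughput at 13,107,200 bytes per second, and sustaining 12,500,000 bytes per second at that RTT needs 611 blocks.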
  "On Drop Timeout, set congestion window to this value times max-block-size.");
  "The multiple of max-block-size by which to increment CWND in congestion avoidance mode after receiving "
  "at least a full CWND's worth of clean acknowledgments. RFC 5681 (classic Reno) mandates this is set to 1, "
  "but we allow it to be overridden. The special value 0 causes the RFC value to be used.");
  "In classic congestion control, RFC 5681 specifies the window should be halved on loss; this "
  "option allows one to use a custom percentage instead. This should be a value in [1, "
  "100] to have the window decay to that percentage of its previous value, or 0 to use the RFC "
  "5681-recommended constant (50).");
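The special-value handling described here (0 selects the RFC 5681 constant of 50, anything in [1, 100] is used directly) might look like the following sketch. The function name is hypothetical, and the truncating integer division is an assumption.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical: congestion window after a loss, given the configured decay percent.
std::size_t decayed_cong_wnd(std::size_t cong_wnd_bytes, unsigned int decay_percent)
{
  assert(decay_percent <= 100);
  // 0 is the special value meaning "use the RFC 5681 halving".
  const unsigned int effective_percent = (decay_percent == 0) ? 50 : decay_percent;
  return cong_wnd_bytes * effective_percent / 100;
}
```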
  "Whenever the Drop Timer fires, upon the requisite Dropping of packet(s), the DTO (Drop Timeout) "
  "is set to its current value times this factor, and then the timer is rescheduled accordingly. "
  "RFC 6298 recommends 2. Another value might be 1 (disable feature). The lower the more "
  "When the mode triggered by rcv-buf-max-size-to-advertise-percent being exceeded is in effect, "
  "to counteract the possibility of ACK loss the receive window is periodically advertised "
  "subsequently -- with the period given by this option -- until either some new data arrive or "
  "rcv-wnd-recovery-max-period is exceeded. Example: \"5 s\".");
  "Approximate amount of time since the beginning of rcv_wnd recovery due to "
  "rcv-buf-max-size-to-advertise-percent until we give up and end that phase. Example: \"30 s\".");
  "Ceiling to impose on the Drop Timeout. Example: \"120 s\".");
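The interaction between drop-timeout-backoff-factor and drop-timeout-ceiling can be sketched as a multiply-then-clamp step. This uses std::chrono rather than the boost::chrono types seen in the listing, the function name is hypothetical, and applying the ceiling immediately after the backoff is an assumption about ordering.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>

using Duration = std::chrono::milliseconds;

// Hypothetical: the next Drop Timeout after the Drop Timer fires.
Duration backed_off_drop_timeout(Duration current_dto, unsigned int backoff_factor,
                                 Duration ceiling)
{
  assert(backoff_factor >= 1);  // A factor of 1 disables the backoff.
  const Duration scaled = current_dto * static_cast<Duration::rep>(backoff_factor);
  return std::min(scaled, ceiling);
}
```

Starting from the default 1-second initial timeout with factor 2 and a 60-second ceiling, successive firings would yield 2, 4, 8, 16, 32 and then 60 seconds in this sketch.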
  return os << opts_desc;

#undef ADD_CONFIG_OPTION
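The option-name conversion performed in opt_id_to_str() earlier in this file (strip a leading m_st_ or m_dyn_, then replace underscores with hyphens) can be reproduced with the standard library alone. The following is a stand-alone sketch under that reading of the fragment; the function name opt_id_to_str_sketch() is hypothetical and the real function may differ in details not visible in this listing.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-alone version of the member-name-to-option-name conversion.
std::string opt_id_to_str_sketch(const std::string& opt_id)
{
  assert(!opt_id.empty());
  const std::string STATIC_PREFIX = "m_st_";
  const std::string DYNAMIC_PREFIX = "m_dyn_";

  std::string str = opt_id;
  // Drop whichever known prefix the member name starts with, if any.
  if (str.compare(0, STATIC_PREFIX.size(), STATIC_PREFIX) == 0)
  {
    str.erase(0, STATIC_PREFIX.size());
  }
  else if (str.compare(0, DYNAMIC_PREFIX.size(), DYNAMIC_PREFIX) == 0)
  {
    str.erase(0, DYNAMIC_PREFIX.size());
  }
  // Turn the remaining underscores into hyphens.
  for (char& ch : str)
  {
    if (ch == '_')
    {
      ch = '-';
    }
  }
  return str;
}
```

This is why a member such as m_st_max_block_size appears in the option descriptions as max-block-size.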