Flow 1.0.0
Flow project: Public API.
Flow module containing tools for profiling and optimization.
Classes | |
class | Checkpointing_timer |
The central class in the perf Flow module, this efficiently times the user's operation, with a specified subset of timing methods; and with the optional ability to time intermediate checkpoints within the overall operation. | |
struct | Duration_set |
Convenience wrapper around an array<Duration, N>, which stores a duration for each of the N possible clock types in perf::Clock_type. | |
struct | Time_pt_set |
Convenience wrapper around an array<Time_pt, N>, which stores a time point for each of the N possible clock types in perf::Clock_type. | |
Typedefs | |
using | Time_pt = Fine_time_pt |
Short-hand for a high-precision boost.chrono point in time, formally equivalent to flow::Fine_time_pt. | |
using | Duration = Fine_duration |
Short-hand for a high-precision boost.chrono duration, formally equivalent to flow::Fine_duration. | |
using | duration_rep_t = Duration::rep |
The raw type used in Duration to store its clock ticks. | |
using | Clock_types_subset = std::bitset< size_t(Clock_type::S_END_SENTINEL)> |
Short-hand for a bit-set of N bits which represents the presence or absence of each of the N possible clock types in perf::Clock_type. | |
using | Checkpointing_timer_ptr = boost::shared_ptr< Checkpointing_timer > |
Short-hand for ref-counting pointer to Checkpointing_timer. | |
Enumerations | |
enum class | Clock_type : size_t { S_REAL_HI_RES = 0 , S_CPU_USER_LO_RES , S_CPU_SYS_LO_RES , S_CPU_TOTAL_HI_RES , S_CPU_THREAD_TOTAL_HI_RES , S_END_SENTINEL } |
Clock types supported by flow::perf module facilities, perf::Checkpointing_timer in particular. | |
Functions | |
std::ostream & | operator<< (std::ostream &os, const Checkpointing_timer::Checkpoint &checkpoint) |
Prints string representation of the given Checkpoint to the given ostream. | |
std::ostream & | operator<< (std::ostream &os, const Checkpointing_timer &timer) |
Prints string representation of the given Checkpointing_timer (whether with original data or an aggregated-result timer) to the given ostream. | |
Duration_set | operator- (const Time_pt_set &to, const Time_pt_set &from) |
Returns a Duration_set representing the time that passed since from to to (negative if to happened earlier), for each Clock_type stored. | |
Duration_set & | operator+= (Duration_set &target, const Duration_set &to_add) |
Advances each Duration in the target Duration_set by the given respective addend Durations (negative Duration causes advancing backwards). | |
Time_pt_set & | operator+= (Time_pt_set &target, const Duration_set &to_add) |
Advances each Time_pt in the target Time_pt_set by the given respective addend Durations (negative Duration causes advancing backwards). | |
Duration_set & | operator*= (Duration_set &target, uint64_t mult_scale) |
Scales each Duration in the target Duration_set by the given numerical constant. | |
Duration_set & | operator/= (Duration_set &target, uint64_t div_scale) |
Divides each Duration in the target Duration_set by the given numerical constant. | |
std::ostream & | operator<< (std::ostream &os, Clock_type clock_type) |
Prints string representation of the given clock type enum value to the given ostream. | |
std::ostream & | operator<< (std::ostream &os, const Duration_set &duration_set) |
Prints string representation of the given Duration_set value to the given ostream. | |
template<typename Accumulator, typename Func> | |
auto | timed_function (Clock_type clock_type, Accumulator *accumulator, Func &&function) |
Constructs a closure that times and executes void-returning function(), adding the elapsed time with clock type clock_type – as raw ticks of perf::Duration – to accumulator. | |
template<typename Accumulator, typename Func> | |
auto | timed_function_nvr (Clock_type clock_type, Accumulator *accumulator, Func &&function) |
Constructs a closure that times and executes non-void-returning function(), adding the elapsed time with clock type clock_type – as raw ticks of perf::Duration – to accumulator. | |
template<typename Accumulator, typename Handler> | |
auto | timed_handler (Clock_type clock_type, Accumulator *accumulator, Handler &&handler) |
Identical to timed_function() but suitable for boost.asio-targeted handler functions. | |
Flow module containing tools for profiling and optimization.
As of this writing (around the time the flow::perf Flow module was created) this centers on Checkpointing_timer, a facility for measuring real and processor time elapsed during an arbitrary measured operation. That said, generally speaking, this module is meant to be a "kitchen-sink" set of facilities fitting the sentence at the very top of this doc header.
using flow::perf::Checkpointing_timer_ptr = boost::shared_ptr<Checkpointing_timer>
Short-hand for ref-counting pointer to Checkpointing_timer.
The original use case is to allow Checkpointing_timer::Aggregator to generate and return Checkpointing_timer objects with minimal headaches for the user.
using flow::perf::Clock_types_subset = std::bitset<size_t(Clock_type::S_END_SENTINEL)>
Short-hand for a bit-set of N bits which represents the presence or absence of each of the N possible clock types in perf::Clock_type.
This is what we use to represent such things, as it is more compact and (we suspect) faster in typical operations, especially "is clock type T enabled?".
If C is a Clock_types_subset, and T is a Clock_type, then bit C[size_t(T)] is true if and only if T is in C.
Potential gotcha: bit-sets are indexed right-to-left (LSB-to-MSB); so if the 0th (in enum) clock type is enabled and others are disabled, then a print-out of such a Clock_types_subset would be 0...0001, not 1000...0. So watch out when reading logs.
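The print-out gotcha can be checked with a short sketch. It assumes a local copy of the enumerators listed for perf::Clock_type rather than the Flow header itself; is_enabled() and only_real_clock() are hypothetical helpers introduced only for illustration.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <string>

// Local stand-in mirroring the enumerators listed for perf::Clock_type (assumption: not the Flow header).
enum class Clock_type : std::size_t
{
  S_REAL_HI_RES = 0,
  S_CPU_USER_LO_RES,
  S_CPU_SYS_LO_RES,
  S_CPU_TOTAL_HI_RES,
  S_CPU_THREAD_TOTAL_HI_RES,
  S_END_SENTINEL
};

using Clock_types_subset = std::bitset<std::size_t(Clock_type::S_END_SENTINEL)>;

// Hypothetical helper: true if and only if clock type `t` is in subset `c`.
bool is_enabled(const Clock_types_subset& c, Clock_type t)
{
  assert(t != Clock_type::S_END_SENTINEL); // The sentinel is not a valid clock type.
  return c[std::size_t(t)];
}

// Hypothetical helper: a subset in which only the 0th clock type is enabled.
Clock_types_subset only_real_clock()
{
  Clock_types_subset c;
  c.set(std::size_t(Clock_type::S_REAL_HI_RES));
  return c;
}
```

With five clock types, only_real_clock().to_string() yields "00001": bit 0 is the right-most character, which is the order to expect in logs.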
using flow::perf::Duration = Fine_duration
Short-hand for a high-precision boost.chrono duration, formally equivalent to flow::Fine_duration.
The alias exists half for brevity, half to declare the standard duration type used in the flow::perf Flow module.
using flow::perf::duration_rep_t = Duration::rep
The raw type used in Duration to store its clock ticks.
It is likely int64_t, but try not to rely on that directly.
Useful, e.g., in atomic<duration_rep_t>, when one wants to perform high-performance operations like += and fetch_add() on atomic<>s: these do not exist for chrono::duration, because the latter is not an integral type.
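A minimal sketch of that use case follows. It assumes std::chrono as a stand-in for boost.chrono so that it compiles without Flow or Boost; add_ticks() and to_duration() are hypothetical helpers, not part of flow::perf.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>

// Stand-ins (assumption): std::chrono replaces boost.chrono in this sketch.
using Duration = std::chrono::nanoseconds;
using duration_rep_t = Duration::rep;

// Hypothetical helper: adds an elapsed Duration to an atomic tick counter.
// std::atomic provides += and fetch_add() for integral types only, hence the raw ticks.
void add_ticks(std::atomic<duration_rep_t>* total_ticks, Duration elapsed)
{
  assert(total_ticks != nullptr);
  total_ticks->fetch_add(elapsed.count());
}

// Hypothetical helper: restores type safety once accumulation is done.
Duration to_duration(const std::atomic<duration_rep_t>& total_ticks)
{
  return Duration(total_ticks.load());
}
```

The conversion back to Duration is the point at which the units of the ticks are reattached, so it is worth doing as early as possible after accumulation ends.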
using flow::perf::Time_pt = Fine_time_pt
Short-hand for a high-precision boost.chrono point in time, formally equivalent to flow::Fine_time_pt.
The alias exists half for brevity, half to declare the standard time point type used in the flow::perf Flow module.
enum class flow::perf::Clock_type : size_t
Clock types supported by flow::perf module facilities, perf::Checkpointing_timer in particular.
These are used, among other things, as array/vector indices and therefore numerically equal 0, 1, .... Clock_type::S_END_SENTINEL is an invalid clock type whose numerical value equals the number of clock types available.
S_REAL_HI_RES would always be preferable. Nevertheless it would be interesting to "officially" see its characteristics including in particular (1) resolution and (2) its own perf cost especially vs. S_REAL_HI_RES, which we know is quite fast itself. This may also help a certain to-do listed as of this writing in the doc header of flow::log FLOW_LOG_WITHOUT_CHECKING() (the main worker bee of the log system, the one that generates each log time stamp).
Duration_set & operator*= (Duration_set & target, uint64_t mult_scale)
Scales each Duration in the target Duration_set by the given numerical constant.
operator*=(Duration_set) by a potentially negative number; same for division.
target | The set of Durations each of which may be modified. |
mult_scale | Constant by which to multiply each target Duration. |
Returns target, to enable standard *= semantics.
Duration_set & operator+= (Duration_set & target, const Duration_set & to_add)
Advances each Duration in the target Duration_set by the given respective addend Durations (negative Duration causes advancing backwards).
target | The set of Durations each of which may be modified. |
to_add | The set of Durations each of which is added to a target Duration. |
Returns target, to enable standard += semantics.
Time_pt_set & flow::perf::operator+= (Time_pt_set & target, const Duration_set & to_add)
Advances each Time_pt in the target Time_pt_set by the given respective addend Durations (negative Duration causes advancing backwards).
target | The set of Time_pts each of which may be modified. |
to_add | The set of Durations each of which is added to a target Time_pt. |
Returns target, to enable standard += semantics.
Duration_set operator- (const Time_pt_set & to, const Time_pt_set & from)
Returns a Duration_set representing the time that passed since from to to (negative if to happened earlier), for each Clock_type stored.
to | The minuend set of time points. |
from | The subtrahend set of time points. |
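The element-wise behavior of these set operators can be sketched as follows. This is an illustration under stated assumptions, not Flow's implementation: std::chrono stands in for boost.chrono, the member name m_values and the constant N_CLOCK_TYPES are hypothetical, and only subtraction, addition and division are shown.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>

// Stand-ins (assumption): std::chrono replaces boost.chrono in this sketch.
using Duration = std::chrono::nanoseconds;
using Time_pt = std::chrono::time_point<std::chrono::steady_clock, Duration>;

// Mirrors size_t(Clock_type::S_END_SENTINEL) for the five listed clock types.
constexpr std::size_t N_CLOCK_TYPES = 5;

// Hypothetical layouts: one slot per clock type.
struct Duration_set { std::array<Duration, N_CLOCK_TYPES> m_values; };
struct Time_pt_set { std::array<Time_pt, N_CLOCK_TYPES> m_values; };

// Per clock type: to minus from; negative if `to` happened earlier.
Duration_set operator-(const Time_pt_set& to, const Time_pt_set& from)
{
  Duration_set result{};
  for (std::size_t i = 0; i != N_CLOCK_TYPES; ++i)
  {
    result.m_values[i] = to.m_values[i] - from.m_values[i];
  }
  return result;
}

// Per clock type: advances target by the respective addend.
Duration_set& operator+=(Duration_set& target, const Duration_set& to_add)
{
  for (std::size_t i = 0; i != N_CLOCK_TYPES; ++i)
  {
    target.m_values[i] += to_add.m_values[i];
  }
  return target;
}

// Per clock type: divides by an unsigned constant, e.g. to average over iterations.
Duration_set& operator/=(Duration_set& target, std::uint64_t div_scale)
{
  assert(div_scale != 0);
  for (auto& value : target.m_values)
  {
    value /= static_cast<Duration::rep>(div_scale);
  }
  return target;
}
```

Returning the target by reference from the compound operators is what allows chaining such as (total += step) /= n.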
Duration_set & operator/= (Duration_set & target, uint64_t div_scale)
Divides each Duration in the target Duration_set by the given numerical constant.
target | The set of Durations each of which may be modified. |
div_scale | Constant by which to divide each target Duration. |
Returns target, to enable standard /= semantics.
std::ostream & flow::perf::operator<< (std::ostream & os, Clock_type clock_type)
Prints string representation of the given clock type enum value to the given ostream.
os | Stream to which to write. |
clock_type | Object to serialize. |
Returns os.
std::ostream & operator<< (std::ostream & os, const Checkpointing_timer & timer)
Prints string representation of the given Checkpointing_timer (whether with original data or an aggregated-result timer) to the given ostream.
Note that this is multi-line output that does not end in a newline.
os | Stream to which to write. |
timer | Object to serialize. |
Returns os.
std::ostream & operator<< (std::ostream & os, const Checkpointing_timer::Checkpoint & checkpoint)
Prints string representation of the given Checkpoint to the given ostream.
See Checkpointing_timer::checkpoint() and Checkpointing_timer::checkpoints().
os | Stream to which to write. |
checkpoint | Object to serialize. |
Returns os.
std::ostream & operator<< (std::ostream & os, const Duration_set & duration_set)
Prints string representation of the given Duration_set value to the given ostream.
os | Stream to which to write. |
duration_set | Object to serialize. |
Returns os.
auto flow::perf::timed_function (Clock_type clock_type, Accumulator * accumulator, Func && function)
Constructs a closure that times and executes void-returning function(), adding the elapsed time with clock type clock_type – as raw ticks of perf::Duration – to accumulator.
Consider other overload(s) and similarly named functions as well. With this one you get:
- function() is treated as returning void (any return value is ignored).
- function() is a generally-used timed function: not necessarily a boost.asio or flow::async handler. Any associated executor (such as a strand) will be lost. See timed_handler(), if you have a handler.
- Accumulation is done via +=(duration_rep_t), where perf::duration_rep_t is – as a reminder – a raw integer type like int64_t. If accumulation may occur in a multi-threaded situation concurrently, this can improve performance vs. using an explicit lock, if one uses Accumulator = atomic<duration_rep_t>.
- No chrono-style type safety: It is up to you to interpret the *accumulator-stored ticks as their appropriate units.
Time a function that happens to take a couple of args. Don't worry about the timing also happening concurrently: not using atomic.
Same thing but with an atomic to support timing/execution occurring concurrently:
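Both variants described above can be approximated with a self-contained sketch. It does not call Flow: timed_function_sketch() is a hypothetical stand-in that drops the clock_type parameter and always uses std::chrono::steady_clock, and busy_loop() is an invented workload. Only the accumulator type differs between the two variants.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <utility>

// Stand-ins (assumption): std::chrono replaces boost.chrono in this sketch.
using Duration = std::chrono::nanoseconds;
using duration_rep_t = Duration::rep;

// Hypothetical stand-in, not Flow's implementation: returns a closure that runs
// function(), ignores its result, and adds the elapsed raw ticks to *accumulator via +=.
template<typename Accumulator, typename Func>
auto timed_function_sketch(Accumulator* accumulator, Func&& function)
{
  assert(accumulator != nullptr);
  return [accumulator, function = std::forward<Func>(function)](auto&&... args) mutable
  {
    const auto start = std::chrono::steady_clock::now();
    function(std::forward<decltype(args)>(args)...);
    const auto elapsed = std::chrono::steady_clock::now() - start;
    *accumulator += std::chrono::duration_cast<Duration>(elapsed).count();
  };
}

// Invented workload taking a couple of args.
void busy_loop(int x, int y)
{
  volatile int sink = 0;
  for (int i = 0; i < x * y; ++i)
  {
    sink = sink + 1;
  }
}

// Variant 1: plain integer accumulator; the closure must not be used concurrently.
Duration time_with_plain_ticks(int x, int y)
{
  duration_rep_t accumulated_ticks = 0;
  auto timed_func = timed_function_sketch(&accumulated_ticks, busy_loop);
  timed_func(x, y);
  return Duration(accumulated_ticks); // Type-unsafe ticks become a type-safe Duration.
}

// Variant 2: atomic accumulator; copies of timed_func could be invoked from several threads.
// The calls are sequential here to keep the sketch free of threading dependencies.
Duration time_with_atomic_ticks(int x, int y)
{
  std::atomic<duration_rep_t> accumulated_ticks(0);
  auto timed_func = timed_function_sketch(&accumulated_ticks, busy_loop);
  timed_func(x, y);
  timed_func(x, y);
  return Duration(accumulated_ticks.load());
}
```

In both variants the closure can only be called void-style, and the final construction of a Duration from the accumulated ticks is the caller's responsibility.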
Accumulator A type requirements/recommendations
It must have A += duration_rep_t(...). This operation must be safe for concurrent execution with itself, if timed_function() is potentially used concurrently. In that case consider atomic<duration_rep_t>. If concurrency is not a concern, you can just use duration_rep_t to avoid the strict-ordering overhead involved in the atomic plus-equals operation.
Accumulator is understood to store raw ticks of Duration – not actual Duration – for performance reasons (to wit: so that atomic plus-equals can be made use of, if it exists). If you need a Duration ultimately – and for type safety you really should – it is up to you to construct a Duration from the accumulated duration_rep_t. This is trivially done via the Duration(duration_rep_t) constructor.
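Because the only hard requirement on the accumulator is +=, a user-defined type can pick its own memory ordering. The following is a minimal sketch of such a type, assuming std::chrono in place of boost.chrono; Relaxed_accumulator and its members are hypothetical names and are not provided by flow::perf.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>

// Stand-ins (assumption): std::chrono replaces boost.chrono in this sketch.
using Duration = std::chrono::nanoseconds;
using duration_rep_t = Duration::rep;

// Hypothetical accumulator: satisfies `A += duration_rep_t(...)` but accumulates with
// relaxed ordering instead of the sequentially consistent ordering of atomic's own +=.
class Relaxed_accumulator
{
public:
  Relaxed_accumulator& operator+=(duration_rep_t ticks)
  {
    m_ticks.fetch_add(ticks, std::memory_order_relaxed);
    return *this;
  }

  // A read may lag behind additions made concurrently by other threads.
  Duration total() const
  {
    return Duration(m_ticks.load(std::memory_order_relaxed));
  }

private:
  std::atomic<duration_rep_t> m_ticks{0};
};
```

The trade-off is the one the requirements hint at: cheaper accumulation, at the cost of totals that are not guaranteed to be up to date in every thread at every moment.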
With atomic<duration_rep_t>, timed_function() uses += for accumulation, which may be lock-free but uses strict ordering; a version that uses fetch_add() with relaxed ordering may be desirable for extra performance at the cost of not-always-up-to-date accumulation results in all threads. As of this writing this can be done by the user by providing a custom type that defines += as explicitly using fetch_add() with relaxed ordering; but we could provide an API for this.
timed_function() times with a single Clock_type, but simultaneous multi-clock timing using the perf::Clock_types_subset paradigm (as used, e.g., in Checkpointing_timer) would be a useful and consistent API. E.g., one could measure user and system elapsed time simultaneously. As of this writing this is absent only due to time constraints: a perf-frugal version targeting one clock type was necessary.
Accumulator | Integral accumulator of clock ticks. See above for details. |
Func | A function that is called void-style taking any arbitrary number of args, possibly none. |
clock_type | The type of clock to use for timing function(). |
accumulator | The accumulator to add time elapsed when calling function() to. See instructions above regarding concurrency, atomic, etc. |
function | The function to execute and time. |
Returns a closure that times and executes function(), adding the elapsed time to accumulator.
auto flow::perf::timed_function_nvr (Clock_type clock_type, Accumulator * accumulator, Func && function)
Constructs a closure that times and executes non-void-returning function(), adding the elapsed time with clock type clock_type – as raw ticks of perf::Duration – to accumulator.
"Nvr" stands for non-void-returning.
Consider other overload(s) and similarly named functions as well. With this one you get:
- function() is treated as returning non-void (any return value returned by it is then returned by the returned closure accordingly).
- function() cannot be a boost.asio handler, as those are always void-returning. So there is no timed_handler() counterpart to the present function.
Similar to the 2nd example in timed_function() doc header: Time a function that happens to take a couple of args, allowing for concurrency by using an atomic. The difference: timed_func() returns a value.
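A self-contained approximation of that example is sketched below. As before, it does not call Flow: timed_function_nvr_sketch() is a hypothetical stand-in that omits the clock_type parameter and times with std::chrono::steady_clock. It also returns the result by value, which is a simplification chosen for this sketch.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <utility>

// Stand-ins (assumption): std::chrono replaces boost.chrono in this sketch.
using Duration = std::chrono::nanoseconds;
using duration_rep_t = Duration::rep;

// Hypothetical stand-in, not Flow's implementation: returns a closure that runs
// function(), adds the elapsed raw ticks to *accumulator, then forwards the result.
template<typename Accumulator, typename Func>
auto timed_function_nvr_sketch(Accumulator* accumulator, Func&& function)
{
  assert(accumulator != nullptr);
  return [accumulator, function = std::forward<Func>(function)](auto&&... args) mutable
  {
    const auto start = std::chrono::steady_clock::now();
    auto result = function(std::forward<decltype(args)>(args)...); // Result is held by value.
    const auto elapsed = std::chrono::steady_clock::now() - start;
    *accumulator += std::chrono::duration_cast<Duration>(elapsed).count();
    return result;
  };
}

// Usage sketch: an atomic accumulator, and a timed function whose value is returned to the caller.
int timed_product(int x, int y, std::atomic<duration_rep_t>* accumulated_ticks)
{
  auto timed_func = timed_function_nvr_sketch(accumulated_ticks, [](int a, int b)
  {
    return a * b;
  });
  return timed_func(x, y);
}
```

The timing stops before the result is returned, so the cost of handing the value back to the caller is not counted in this sketch.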
Accumulator A type requirements/recommendations
See timed_function().
Accumulator | See timed_function(). |
Func | A function that is called non-void-style taking any arbitrary number of args, possibly none. |
clock_type | The type of clock to use for timing function(). |
accumulator | The accumulator to add time elapsed when calling function() to. See instructions above regarding concurrency, atomic, etc. |
function | The function to execute and time. |
Returns a closure that times and executes function(), adding the elapsed time to accumulator.
auto flow::perf::timed_handler (Clock_type clock_type, Accumulator * accumulator, Handler && handler)
Identical to timed_function() but suitable for boost.asio-targeted handler functions.
In other words, if you want to post(handler) or async_...(handler) in a boost.asio Task_engine, and you'd like to time handler() when it is executed by boost.asio, then use timed_handler(..., handler).
Consider other overload(s) and similarly named functions as well. With this one you get:
- handler() is a boost.asio or flow::async handler.
- timed_function(handler) would "work" too, in that it would compile and at first glance appear to work fine. The problem: If handler is bound to an executor – most commonly a boost.asio strand (util::Strand) – then using timed_function() would "unbind it." So if it was bound to Strand S, meant to make certain handler() never executed concurrently with other handlers bound to S, then that constraint would (silently!) no longer be observed – leading to terrible intermittent concurrency bugs.
- boost.asio handlers are always treated as returning void (meaning anything else they might return is ignored). Hence there is no timed_handler_nvr(), even though there is a timed_function_nvr().
Similar to the 2nd example in timed_function() doc header: Time a function that happens to take a couple of args, allowing for concurrency by using an atomic. The difference: it is first bound to a strand. In this case we post() the handler, so it takes no args in this example. However, if used with, say, boost::asio::ip::tcp::socket::async_read_some(), it would take args such as bytes-received and error code.
Accumulator A type requirements/recommendations
See timed_function().
Accumulator | See timed_function(). |
Handler | Handler meant to be post()ed or otherwise async-executed on a Task_engine. Can take any arbitrary number of args, possibly none. |
clock_type | See timed_function(). |
accumulator | See timed_function(). |
handler | The handler to execute and time. |
Returns a closure that times and executes handler(), adding the elapsed time to accumulator; bound to the same executor (if any; e.g., a util::Strand) to which handler is bound.