Flow 1.0.0
Flow project: Public API.
Classes | Public Member Functions | Public Attributes | List of all members
flow::log::Async_file_logger Class Reference

An implementation of Logger that logs messages to a given file-system path but never blocks any logging thread for file I/O; suitable for heavy-duty file logging. More...

#include <async_file_logger.hpp>

Inheritance diagram for flow::log::Async_file_logger:
[legend]
Collaboration diagram for flow::log::Async_file_logger:
[legend]

Classes

struct  Throttling_cfg
 Controls behavior of the throttling algorithm as described in Async_file_logger doc header Throttling section. More...
 

Public Member Functions

 Async_file_logger (Logger *backup_logger_ptr, Config *config, const fs::path &log_path, bool capture_rotate_signals_internally)
 Constructs logger to subsequently log to the given file-system path. More...
 
 ~Async_file_logger () override
 Flushes out anything buffered, returns resources/closes output file(s); then returns.
 
bool should_log (Sev sev, const Component &component) const override
 Implements interface method by returning true if the severity and component (which is allowed to be null) indicate it should; and if so potentially applies the throttling algorithm's result as well. More...
 
bool logs_asynchronously () const override
 Implements interface method by returning true, indicating that this Logger may need the contents of *metadata and msg passed to do_log() even after that method returns. More...
 
void do_log (Msg_metadata *metadata, util::String_view msg) override
 Implements interface method by asynchronously logging the message and some subset of the metadata in a fashion controlled by m_config. More...
 
void log_flush_and_reopen (bool async=true)
 Causes the log at the file-system path to be flushed/closed (if needed) and re-opened; this will happen as soon as possible but may occur asynchronously after this method exits, unless directed otherwise via async argument. More...
 
Throttling_cfg throttling_cfg () const
 Accessor returning a copy of the current set of throttling knobs. More...
 
bool throttling_active () const
 Whether the throttling feature is currently in effect. More...
 
void throttling_cfg (bool active, const Throttling_cfg &cfg)
 Mutator that sets the throttling knobs. More...
 
- Public Member Functions inherited from flow::log::Logger
virtual bool should_log (Sev sev, const Component &component) const =0
 Given attributes of a hypothetical message that would be logged, return true if that message should be logged and false otherwise (e.g., if the verbosity of the message is above the current configured verbosity threshold for the Component specified). More...
 
virtual bool logs_asynchronously () const =0
 Must return true if do_log() at least sometimes logs the given message and metadata (e.g., time stamp) after do_log() returns; false if this never occurs (i.e., it logs synchronously, always). More...
 
virtual void do_log (Msg_metadata *metadata, util::String_view msg)=0
 Given a message and its severity, logs that message and possibly severity WITHOUT checking whether it should be logged (i.e., without performing logic that should_log() performs). More...
 
std::ostream * this_thread_ostream () const
 Returns the stream dedicated to the executing thread and this Logger, so that the caller can apply state-setting formatters to it. More...
 
- Public Member Functions inherited from flow::util::Null_interface
virtual ~Null_interface ()=0
 Boring virtual destructor. More...
 
- Public Member Functions inherited from flow::util::Unique_id_holder
 Unique_id_holder ()
 Thread-safely construct an ID whose value is different from any other object of this class, past or future.
 
 Unique_id_holder (const Unique_id_holder &)
 This copy constructor is identical in behavior to Unique_id_holder(), the default ctor. More...
 
id_t unique_id () const
 Raw unique ID identifying this object as well as any object of a derived type. More...
 
const Unique_id_holderoperator= (const Unique_id_holder &) const
 This assignment operator is a const no-op. More...
 

Public Attributes

Config *const m_config
 Reference to the config object passed to constructor. More...
 

Additional Inherited Members

- Public Types inherited from flow::util::Unique_id_holder
using id_t = uint64_t
 Raw integer type to uniquely identify a thing. 64-bit width should make overflow extremely hard to reach.
 
- Static Public Member Functions inherited from flow::log::Logger
static void this_thread_set_logged_nickname (util::String_view thread_nickname=util::String_view(), Logger *logger_ptr=0, bool also_set_os_name=true)
 Sets or unsets the current thread's logging-worthy string name; optionally sets the OS thread name (such as visible in top output). More...
 
static std::ostream & this_thread_logged_name_os_manip (std::ostream &os)
 ostream manipulator function that, if output via operator<< to an ostream, will cause the current thread's logging-worthy string name to be output to that stream. More...
 
static void set_thread_info_in_msg_metadata (Msg_metadata *msg_metadata)
 Loads msg_metadata->m_call_thread_nickname (if set) or else msg_metadata->m_call_thread_id, based on whether/how this_thread_set_logged_nickname() was last called in the current thread. More...
 
static void set_thread_info (std::string *call_thread_nickname, flow::util::Thread_id *call_thread_id)
 Same as set_thread_info_in_msg_metadata() but targets the given two variables as opposed to a Msg_metadata. More...
 
- Static Public Member Functions inherited from flow::util::Unique_id_holder
static id_t create_unique_id ()
 Short-hand for Unique_id_holder().unique_id(); useful when all you want is the unique integer itself. More...
 
- Protected Member Functions inherited from flow::log::Log_context
 Log_context (Logger *logger=0)
 Constructs Log_context by storing the given pointer to a Logger and a null Component. More...
 
template<typename Component_payload >
 Log_context (Logger *logger, Component_payload component_payload)
 Constructs Log_context by storing the given pointer to a Logger and a new Component storing the specified generically typed payload (an enum value). More...
 
 Log_context (const Log_context &src)
 Copy constructor that stores equal Logger* and Component values as the source. More...
 
 Log_context (Log_context &&src)
 Move constructor that makes this equal to src, while the latter becomes as-if default-constructed. More...
 
Log_contextoperator= (const Log_context &src)
 Assignment operator that behaves similarly to the copy constructor. More...
 
Log_contextoperator= (Log_context &&src)
 Move assignment operator that behaves similarly to the move constructor. More...
 
void swap (Log_context &other)
 Swaps Logger pointers and Component objects held by *this and other. More...
 
Loggerget_logger () const
 Returns the stored Logger pointer, particularly as many FLOW_LOG_*() macros expect. More...
 
const Componentget_log_component () const
 Returns reference to the stored Component object, particularly as many FLOW_LOG_*() macros expect. More...
 

Detailed Description

An implementation of Logger that logs messages to a given file-system path but never blocks any logging thread for file I/O; suitable for heavy-duty file logging.

Protects against garbling due to simultaneous logging from multiple threads.

For file logging, the two out-of-the-box Loggers currently suitable are this Async_file_logger and Simple_ostream_logger. Vaguely speaking, Simple_ostream_logger is suitable for console (cout, cerr) output; and, in a pinch outside of a heavy-duty production/server environment, for file (ofstream) output. For heavy-duty file logging one should use this Async_file_logger. The primary reason is performance; this is discussed in the Logger class doc header; note Async_file_logger logs asynchronously. A secondary reason is additional file-logging-specific utilities – such as rotation – are now or in the future going to be in Async_file_logger, as its purpose is heavy-duty file logging specifically.

Todo:
Lacking feature: Compress-as-you-log in Async_file_logger. So, optionally, when characters are actually written out to file-system, gzip/zip/whatever them instead of writing plain text. (This is possible at least for gzip.) Background: It is common-place to compress a log file after it has been rotated (e.g., around rotation time: F.log.1.gz -> F.log.2.gz, F.log -> F.log.1 -> F.log.1.gz). It is more space-efficient (at least), however, to write to F.log.gz directly already in compressed form; then rotation requires only renaming (e.g.: F.log.1.gz -> F.log.2.gz, F.log.gz [already gzipped from the start] -> F.log.1.gz).

Throttling

By default this feature is disabled, but one can enable/disable/configure it at will via throttling_cfg() mutator. For example a viable tactic is to call it once right after construction, before any logging via *this; then call it subsequently in case of dynamic config update.

Throttling deals with the following potential problem which can occur under very heavy do_log() throughput (lots of calls being made per unit time, probably from many threads); in practice this would typically only happen if one sets effective verbosity to Sev::S_TRACE or more-verbose – as S_INFO-or-less-verbose means should_log() should be preventing do_log() from being called frequently. For example one might enable TRACE logging temporarily in production to see some details, causing a heavy do_log() execution rate. (That said, if your code does INFO-level logging too often, it could happen then too. The official convention is to log INFO-or-more-severe messages only if this would not meaningfully affect the overall perf and responsiveness of the system; but sometimes mistakes are made.)

Internally each do_log() call pushes this log request into a central (per *this) queue of log requests; a single (per *this) background thread is always popping this queue ASAP, writing the message to the file, etc., until the queue is emptied; then it sleeps until it becomes non-empty; and so on. If many threads are pushing messages faster than this thread can get through them, then more and more RAM is used to store the enqueued messages. If the throughput doesn't decrease in time, letting this central thread catch up, then the RAM use might cause swapping and eventually out-of-memory (congestive collapse). To be clear, unless congestive collapse is in fact reached, the system is self-correcting, in that given a respite from the log-requests (do_log() calls), the queue size will get down to zero, as will the corresponding memory use. However, if this does not occur early enough, congestive collapse occurs.

Throttling is a safeguard against this. It works as follows. There's a limit H in bytes. We start at Not-Throttling state. If the memory use M grows beyond H, we enter Throttling state. In this state, we reject incoming log requests – thus letting the system deal with just logging-to-file what it has queued up (in FIFO fashion, naturally) – until no more such queued-up requests remain. Once the queue is empty, we re-enter Not-Throttling state. In other words beyond a certain total memory use, we throttle; then to exit this state total memory use has to go down to 0, before we un-throttle. The idea is to encourage a sawtooth /\/\/\ memory-use pattern when subjected to a firehose of log requests.

Note
An earlier functional design contemplated having a limit L (smaller than H; e.g., picture L = 50% of H), so that mem-use would need to merely get down to L to re-enter Not-Throttling state. However, upon prototyping this, it became clear this hardly improved the experience, instead making it rather confusing to understand in action. E.g., if there's a firehose going at full-blast, and you're fighting it by turning off the fire-hose and letting water drain, but then turn it on again once the container is 50% full, then the draining will "fight" the filling again, potentially losing quite decisively. Trying to then read resulting logs is messy and strange, depending on the 2 rates relative to each other. By comparison, the fill-drain-fill-drain /\/\/\ pattern is straightforward to understand; and one can cleanly point out the minima and maxima with log messages. Plus, less configuration/tuning with little to no functional loss = a good thing.

The particular mechanic of throttling is as follows. If throttling is enabled during a particular call to should_log(), and if and only if should_log() would return true based on considerations excluding the throttling feature, then:

Since standard FLOW_LOG_* macros avoid do_log() (or even the evaluation of the message – which is itself quite expensive, possibly quite a bit more expensive than the do_log()) if should_log() returns false for a given log call site, during Throttling state enqueuing of messages is blocked (letting the logging-to-file thread catch up). should_log() == false is how we turn off the firehose.

M starts at 0. Each do_log() (queued-up log request) increments M based on the memory-use estimate of the message+metadata passed to do_log(); then it enqueues the log request. Each time the background thread actually writes out the queued-up message+metadata, it decrements M by the same value by which it was incremented in do_log(); accordingly the log-request's memory is freed.

This algorithm (computing M via increments and decrements; setting state to Throttling or Not-Throttling) is carried out at all times. However, should_log() consults the state (Throttling versus Not-Throttling), if and only if the throttling feature is enabled at that time. If it is not, that state is simply ignored, and should_log() only makes the usual Config verbosity check(s); if that results in true then the message is enqueued. Therefore one can enable throttling at any time and count on its having immediate effect based on actual memory use at that time.

The limit H can also be reconfigured at any time. Essentially throttling_cfg() mutator takes 2 orthogonal sets of info: 1, whether throttling is to be possible at all (whether Throttling versus Not-Throttling affects should_log() from now on); and 2, the limit H which controls the policy about setting the state (Throttling versus Not-Throttling). (2) affects the algorithm that computes that binary state; whereas (1) affects whether that binary state actually controls whether to prevent logging to save memory or not.

If H is modified, the binary state is reinitialized: it is set to Throttling if and only if memory use M at that time exceeds H; else to Not-Throttling. The state prior to the throttling_cfg() mutator call does not matter in this situation; it is overwritten. This avoids various annoying corner cases and ambiguities around config updates.

Lastly: Initially throttling is disabled, while a certain default value of H is assumed. Hence the above algorithm is active but has no effect, unless you call throttling_cfg(true, ...) to make it have effect and/or change H. You may use throttling_cfg() accessor and throttling_active() to get a copy of the current config values.

Thread safety

As noted above, simultaneous logging from multiple threads is safe from output corruption, in that simultaneous do_log() calls for the same Logger targeting the same stream will log serially to each other. However, if some other code or process writes to the same file, then all bets are off – so don't.

See thread safety notes and to-dos regarding m_config in Simple_ostream_logger doc header. These apply here also.

throttling_cfg() mutator does not add any thread safety restrictions: it can be called concurrently with any other method, including should_log(), do_log(), same-named accessor, and throttling_active(). There is one formal exception: it must not be called concurrently with itself.

There are no other mutable data (state), so that's that.

Throttling: Functional design rationale notes

The throttling feature could have been designed differently (in terms of how it should act, functionally speaking), and a couple of questions tend to come up, so let's answer here.

All in all, these choices are defensible but not necessarily the only good ones.

Constructor & Destructor Documentation

◆ Async_file_logger()

flow::log::Async_file_logger::Async_file_logger ( Logger backup_logger_ptr,
Config config,
const fs::path &  log_path,
bool  capture_rotate_signals_internally 
)
explicit

Constructs logger to subsequently log to the given file-system path.

It will append.

Todo:
Consider adding Async_file_logger constructor option to overwrite the file instead of appending.
Parameters
configControls behavior of this Logger. In particular, it affects should_log() logic (verbosity default and per-component) and output format (such as time stamp format). See thread safety notes in class doc header. This is saved in m_config.
log_pathFile-system path to which to write subsequently. Note that no writing occurs until the first do_log() call.
backup_logger_ptrThe Logger to use for *this to log about its logging operations to the actual intended file-system path; or null to not log such things anywhere. If you do not pass in null, but ensure backup_logger_ptr->should_log() lets through nothing more than Sev::S_INFO severity messages for Flow_log_component::S_LOG, then you can expect a reasonable amount of useful output that will not affect performance. Tip: null is a reasonable value. A Simple_ostream_logger logging to cout and cerr (or only cout) is also a good choice, arguably better than null. Lastly, setting verbosity to INFO for *backup_logger_ptr is typically a better choice than TRACE in practice.
capture_rotate_signals_internallyIf and only if this is true, *this will detect SIGHUP (or your OS's version thereof); upon seeing such a signal, it will fire the equivalent of log_flush_and_reopen(), as needed for classic log rotation. (The idea is: If we are writing to path F, then your outside log rotation tool will rename F -> F.1 [and F.1 -> F.2, etc.]; even as we continue writing to the underlying file after it has been renamed; then the tool sends SIGHUP; we flush/close what is really F.1; reopen at the real path F again, which will create it anew post-rotation.) If false then you'd have to do it yourself if desired. If this is true, the user may register their own signal handler(s) (for any purpose whatsoever) using boost::asio::signal_set. However, behavior is undefined if the program registers signal handlers via any other API, such as sigaction() or signal(). If you need to set up such a non-signal_set signal handler, AND you require rotation behavior, then (1) set this option to false; (2) trap SIGHUP yourself; (3) in your handlers for the latter, simply call log_flush_and_reopen(). However, if typical, common-sense behavior is what you're after – and either don't need additional signal handling or are OK with using signal_set for it – then setting this to true is a good option.

Member Function Documentation

◆ do_log()

void flow::log::Async_file_logger::do_log ( Msg_metadata metadata,
util::String_view  msg 
)
overridevirtual

Implements interface method by asynchronously logging the message and some subset of the metadata in a fashion controlled by m_config.

Parameters
metadataAll information to potentially log in addition to msg.
msgThe message.

Implements flow::log::Logger.

◆ log_flush_and_reopen()

void flow::log::Async_file_logger::log_flush_and_reopen ( bool  async = true)

Causes the log at the file-system path to be flushed/closed (if needed) and re-opened; this will happen as soon as possible but may occur asynchronously after this method exits, unless directed otherwise via async argument.

Uses

Flushing: log_flush_and_reopen(false) is a reasonable and safe way to flush anything buffered in memory to the file. Naturally, for performance, it should not be done frequently. For example this might be useful in the event of an abnormal termination (abort(), etc.), in the signal handler before exiting program.

Rotation: log_flush_and_reopen(true) is useful for rotation purposes; however, you need not do this manually if you decided to (properly) use the capture_rotate_signals_internally == true option in Async_file_logger constructor; the procedure will occur on receiving the proper signal automatically.

Todo:
Async_file_logger::log_flush_and_reopen(true) is great for flushing, such as in an abort-signal handler, but providing just the flushing part without the reopening might be useful. At the moment we've left it this way, due to the vague feeling that closing the file upon flushing it is somehow more final and thus safer (in terms of accomplishing its goal) in an abort-signal scenario. Feelings aren't very scientific though.
Parameters
asyncIf true, the operation will execute ASAP but asynchronously, the method exiting immediately; else it will complete fully before this method returns.

◆ logs_asynchronously()

bool flow::log::Async_file_logger::logs_asynchronously ( ) const
overridevirtual

Implements interface method by returning true, indicating that this Logger may need the contents of *metadata and msg passed to do_log() even after that method returns.

Returns
See above.

Implements flow::log::Logger.

◆ should_log()

bool flow::log::Async_file_logger::should_log ( Sev  sev,
const Component component 
) const
overridevirtual

Implements interface method by returning true if the severity and component (which is allowed to be null) indicate it should; and if so potentially applies the throttling algorithm's result as well.

As of this writing not thread-safe against changes to *m_config (but thread-safe agains throttling_cfg() mutator).

Throttling comes into play if and only if: 1, sev and component indicate should_log() should return true in the first place; and 2, throttling_active() == true. In that case the throttling alogorithm's current output (Throttling versus Not-Throttling state) is consulted to determine whether to return true or false. (See Throttling section of class doc header.)

Parameters
sevSeverity of the message.
componentComponent of the message. Reminder: component.empty() == true is allowed.
Returns
See above.

Implements flow::log::Logger.

◆ throttling_active()

bool flow::log::Async_file_logger::throttling_active ( ) const

Whether the throttling feature is currently in effect.

That is: can the throttling computations actually affect should_log() output? (It is not about whether log lines are actually being rejected due to throttling right now.) Please see Async_file_logger doc header Throttling section for more info.

If true should_log() will potentially consider Throttling versus Not-Throttling state; else it will ignore it. If throttling_cfg() mutator is never called, then this shall be false (feature inactive by default).

See also
throttling_cfg() accessor also.
Returns
See above.

◆ throttling_cfg() [1/2]

Async_file_logger::Throttling_cfg flow::log::Async_file_logger::throttling_cfg ( ) const

Accessor returning a copy of the current set of throttling knobs.

Please see Async_file_logger doc header Throttling section for description of their meanings in the algorithm.

See also
throttling_active() also.
Returns
The current knobs controlling the behavior of the algorithm that determines Throttling versus Not-Throttling state. If throttling_cfg() mutator is never called, then the values therein will be some valid defaults.

◆ throttling_cfg() [2/2]

void flow::log::Async_file_logger::throttling_cfg ( bool  active,
const Throttling_cfg cfg 
)

Mutator that sets the throttling knobs.

Please see Async_file_logger doc header Throttling section for description of their meanings in the algorithm.

Thread safety

It is okay to call concurrently with any other method on the same *this, except it must not be called concurrently with itself.

Parameters
activeWhether the feature shall be in effect (if should_log() will potentially consider Throttling versus Not-Throttling state; else it will ignore it).
cfgThe new values for knobs controlling the behavior of the algorithm that determines Throttling versus Not-Throttling state.

Member Data Documentation

◆ m_config

Config* const flow::log::Async_file_logger::m_config

Reference to the config object passed to constructor.

Note that object is mutable; see notes on thread safety.


The documentation for this class was generated from the following files: