Flow 1.0.0
Flow project: Full implementation reference.
thread_lcl_str_appender.hpp
1/* Flow
2 * Copyright 2023 Akamai Technologies, Inc.
3 *
4 * Licensed under the Apache License, Version 2.0 (the
5 * "License"); you may not use this file except in
6 * compliance with the License. You may obtain a copy
7 * of the License at
8 *
9 * https://www.apache.org/licenses/LICENSE-2.0
10 *
11 * Unless required by applicable law or agreed to in
12 * writing, software distributed under the License is
13 * distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
14 * CONDITIONS OF ANY KIND, either express or implied.
15 * See the License for the specific language governing
16 * permissions and limitations under the License. */
17
18/// @file
19#pragma once
20
24#include <boost/iostreams/device/array.hpp>
25#include <boost/iostreams/stream.hpp>
26#include <boost/iostreams/stream_buffer.hpp>
27#include <boost/io/ios_state.hpp>
28#include <boost/move/unique_ptr.hpp>
29#include <boost/unordered_map.hpp>
30#include <boost/thread.hpp>
31#include <string>
32
33namespace flow::log
34{
35
36/**
37 * Internal flow::log class that facilitates a more efficient way to get util::ostream_op_to_string() behavior
38 * by allowing each thread to repeatedly reuse the structures that function creates from scratch on stack each time it
39 * is invoked; furthermore each logging entity is allotted a separate such set of structures to enable each entity to
40 * not affect the streams of other entities.
41 *
42 * The problem statement arises from util::ostream_op_to_string()'s only obvious flaw which is performance.
43 * One invokes that function and passes a target `std::string` and a sequence of `ostream`-writing operations. Usually
44 * the `string` has just been newly created `{` locally `}`; and the function itself will ultimately create some Boost
45 * `iostreams` machinery `{` locally `}` and then use it to efficiently write through that machinery into the `string`.
46 * That's all fine, but if the function is very often invoked, it would be nice
47 * to instead reuse a global `string` *and* associated machinery between invocations. (`ostream_op_to_string()` can
48 * reuse an existing `string`, but even then it'll need to re-create the `iostreams` machinery each time.)
49 * In particular, logging machinery (`FLOW_LOG_*()` especially) indeed very often invokes such behavior and could thus
50 * benefit (hence the inspiration behind this class).
51 *
52 * Well, using global structures is not OK: multiple threads will corrupt any such structures by writing and/or
53 * reading concurrently to/from them. However, if one creates a structure per thread -- a/k/a a thread-local
54 * structure -- then that danger goes away completely. That is the principle behind this class. Firstly, from the
55 * class user's perspective, the class itself is a singleton per thread. Call get_this_thread_string_appender() to
56 * access the current thread's only instance of Thread_local_string_appender. Now you have a pointer to that
57 * object. Further use is very simple: fresh_appender_ostream() clears the internally stored target string and
58 * returns a pointer to an `ostream` object. Write to this `ostream` like any other `ostream`. Once done doing so,
59 * call target_contents(), which will return a reference to the read-only string which has been appended to with
60 * the aforementioned `ostream` writing. Repeat each time you'd like to write to a string, and then quickly access
61 * that string.
62 *
63 * Update: A further level of indirection/lookup was added when I realized that, for each thread, it's not desirable
64 * that logically separate stream writers (typically each allotted its own writing object, e.g., a Logger) all write
65 * to one common stream. Even if their writes are not interleaved within an "atomic" sequence of characters,
66 * state-changing `ostream` formatters (`std::hex` and the like) would cause one writer to affect the formatting of
67 * another, which is not desired (though maybe not immediately obvious due to formatting being a relatively rarely
68 * used feature for many). Therefore, each writing "entity" is allotted a separate instance of this class, and
69 * lookup is (implicitly) via thread ID and (explicitly) via a source object ID which uniquely (over all time)
70 * identifies each object.
71 *
72 * ~~~
73 * // If you frequently do this, from one or from multiple threads:
74 * class Distinct_writer
75 * {
76 * ...
77 * string output;
78 * // Expensive structure setup occurs before the stream writing can execute. (Terminating `flush` is assumed.)
79 * util::ostream_op_to_string(&output, "The answer is: [", std::hex, 42, "].");
80 * log_string(output); // Suppose log_string() takes const std::string&.
81 * log_chars(output.c_str()); // Or suppose log_chars() takes const char*.
82 * // std::hex formatting is forgotten, as each util::ostream_op_to_string() creates a new everything.
83 * ...
84 * }
85 *
86 * // Then consider replacing it with this (for better performance over time):
87 * class Distinct_writer : private util::Unique_id_holder
88 * {
89 * ...
90 * // Note lookup by `this`: we are also a Unique_id_holder which means this can get our unique ID.
91 * auto const appender = log::Thread_local_string_appender::get_this_thread_string_appender(*this);
92 * // Note added explicit `flush`.
93 * *(appender->fresh_appender_ostream()) << "The answer is: [" << std::hex << 42 << "]." << std::flush;
94 * log_string(appender->target_contents());
95 * log_chars(appender->target_contents().c_str());
96 * // Additional feature: std::hex formatting will persist for further such snippets, for the current thread.
97 * ...
98 * }
99 * ~~~
100 *
101 * Note, once again, that because get_this_thread_string_appender() returns a thread-local object, it is by definition
102 * impossible to corrupt anything inside it due to multiple threads writing to it. (That is unless, of course, you
103 * try passing that pointer to another thread and writing to it there, but that's basically malicious behavior;
104 * so don't do that.)
105 *
106 * ### Thread safety ###
107 * Well, see above. If you use the class as prescribed, then it's safe to read and write without locking around
108 * an object of this class.
109 *
110 * ### Memory use ###
111 * The object itself is not large, storing some streams and ancillary stream-related objects.
112 * The lookup table just indexes by object ID and thread ID, so a few integers per writing entity per writing thread.
113 * The implementation we use ensures that once a given thread disappears, all the data for that thread ID are freed.
114 * Let us then discuss, from this point forth, what happens while a given thread is alive. Namely:
115 *
116 * There is no way to remove an appender from this table once it has been added, which can be seen as a
117 * memory leak. In practice, the original use case of Logger means the number of lookup table entries is not likely
118 * to grow large enough to matter -- but extreme scenarios contradicting this estimation may be contrived or occur.
119 *
120 * In terms of implementation, getting cleanup to work is extremely difficult and possibly essentially impossible.
121 * (This is true because it's not possible, without some kind of undocumented and possibly un-portable hackery, to
122 * enumerate all the threads from which a given object has created a Thread_local_string_appender. Even if one could,
123 * it is similarly not really possible to do anything about it without entering each such thread, which is quite hard
124 * to do elegantly and unintrusively w/r/t calling code. Even if one could enter each such thread, some
125 * kind of locking would probably need to be added, eliminating the elegance of the basic premise of the class which
126 * is that of a thread-local singleton per source object.) For this reason, adding cleanup is not even listed as a
127 * to-do.
128 *
129 * Since util::Unique_id_holder is unique over all time, not just at any given time, there is no danger that the reuse
130 * of some dead object's ID will cause a collision. Historically this was a problem when we used `this` pointers
131 * as IDs (as once an object is gone, its `this` value can be reused by another, new object).
132 *
133 * ### Implementation notes ###
134 * We use boost.thread's `thread_specific_ptr` to implement a lazily initialized per-thread singleton (in which
135 * a per-object-ID sub-table of Thread_local_string_appender objects lies). An alternative implementation would
136 * be C++11's built-in `thread_local` keyword, probably with a `unique_ptr` to wrap each sub-table (to allow for
137 * lazy initialization instead of thread-startup initialization). The main reason I chose to keep `thread_specific_ptr`
138 * even upon moving from C++03 to C++1x is that we use `boost::thread` -- not `std::thread`. Experiments show
139 * `thread_local` behaves appropriately (crucially, including cleanup on thread exit) even with
140 * `boost::thread`, but I don't see this documented anywhere (which doesn't mean it isn't documented), and without
141 * that it could be an implementation coincidence as opposed to a formal guarantee. A secondary reason -- which can
142 * be thought of as the straw that broke the camel's back in this case, as it is fairly minor -- is that
143 * `thread_specific_ptr` provides lazy initialization by default, without needing a `unique_ptr` wrapper:
144 * a given thread's `p.get()` returns null the first time it is invoked, and `delete p.get();` is executed at thread exit
145 * (one need not supply a deleter function, although one could if more complex cleanup were needed). This is
146 * what a default-constructed `unique_ptr` would give us, but we get it for "free" (in the sense that no added code
147 * is necessary to achieve the same behavior) with `thread_specific_ptr`.
148 */
149class Thread_local_string_appender :
150 private boost::noncopyable
151{
152public:
153 // Methods.
154
155 /**
156 * Returns a pointer to the exactly one Thread_local_string_appender object that is accessible from
157 * the current thread for the given source object. The source object is given by its ID. The source object
158 * can store or contain (or be otherwise mapped to) its own util::Unique_id_holder; and thus it can be
159 * any object (e.g., a Logger, in original use case) that desires the use of one distinct (from other
160 * objects) but continuous (meaning any stream state including characters output will persist over time)
161 * `ostream`. (The `ostream` is not returned directly but rather as the wrapping Thread_local_string_appender
162 * object for that stream.) Ultimately there is exactly one `ostream` (and Thread_local_string_appender) per
163 * (invoking thread T, `source_obj_id`) pair that has invoked the present method so far, including the current
164 * invocation itself.
165 *
166 * @param source_obj_id
167 * An ID attached in a 1-to-1 (over all time until program exit) fashion to the entity (typically, class
168 * instance of any type, e.g., Logger) desiring its own Thread_local_string_appender.
169 * @return See above.
170 */
171 static Thread_local_string_appender* get_this_thread_string_appender(const util::Unique_id_holder& source_obj_id);
172
173 /**
174 * Clears the internally stored string (accessible for reading via target_contents()), and returns an
175 * output stream writing to which will append to that string. You must `flush` (or flush any other way like
176 * `endl`) the stream in order to ensure characters are actually written to the string. (I am fairly sure flushing
177 * is in fact THE thing that actually writes to the string.)
178 *
179 * The pointer's value never changes for `*this` object. This fact is critical when it comes to the logic
180 * of sequential save_formatting_state_and_restore_prev() calls, as well as (for example) the `ostream` continuity
181 * semantics described in class Logger header (that class uses the present class).
182 *
183 * Behavior is undefined if, at the time the present method is called, the last `flush` of
184 * `*fresh_appender_ostream()` precedes the last writing of actual data to same. In other words,
185 * always `flush` the stream immediately after writing to it, or else who knows what will happen
186 * with any un-flushed data when one subsequently calls fresh_appender_ostream(), clearing the string?
187 *
188 * Behavior is undefined if you call this from any thread other than the one in which
189 * get_this_thread_string_appender() was called in order to obtain `this`.
190 *
191 * @see save_formatting_state_and_restore_prev() is a valuable technique to enable usability in
192 * user-facing APIs that use Thread_local_string_appender within the implementation (notably logging
193 * APIs). See its doc header before using this method.
194 *
195 * @return Pointer to `ostream` writing to which (followed by flushing) will append to string that
196 * is accessible via target_contents().
197 */
198 std::ostream* fresh_appender_ostream();
199
200 /**
201 * Same as fresh_appender_ostream() but does not clear the result string, enabling piecemeal writing to
202 * that string between clearings of the latter.
203 *
204 * @return Identical to fresh_appender_ostream()'s return value.
205 */
206 std::ostream* appender_ostream();
207
208 /**
209 * Saves the formatting state of the `ostream` returned by appender_ostream() and sets that same `ostream`
210 * to the state saved last time this method was called on `*this`, or at its construction, whichever happened
211 * later. Examples of formatting state are `std::hex` and locale imbuings.
212 *
213 * This is useful if you tend to follow a pattern like the following
214 * macro definition that takes `user_supplied_stream_args` macro argument:
215 *
216 * ~~~
217 * *(appender->fresh_appender_ostream())
218 * << __FILE__ << ':' << __LINE__ << ": " // First standard prefix to identify source of log line...
219 * << user_supplied_stream_args // ...then user-given ostream args (token may expand to multiple <<s)...
220 * << '\n' // ...then any terminating characters...
221 * << std::flush; // ...and ensure it's all flushed into `string target_contents()`.
222 * ~~~
223 *
224 * If this is done repeatedly, then `user_supplied_stream_args` might include formatting changes like `std::hex`.
225 * In this example, there is the danger that such an `std::hex` from log statement N would then "infect"
226 * the `__LINE__` output from log statement (N + 1), the line # showing up in hex form instead of decimal.
227 * If you use the present feature, the problem will not occur.
228 * To avoid similarly surprising the user, restore their own formatting with the same call.
229 * The above example would become:
230 *
231 * ~~~
232 * auto& os = *appender->fresh_appender_ostream();
233 * appender->save_formatting_state_and_restore_prev(); // Restore pristine formatting from construction time.
234 * os << __FILE__ << ':' << __LINE__ << ": "; // Can log prefix without fear of surprising formatting.
235 * // (Note that if you apply formatters here for more exotic output, undo them before the following call.)
236 * appender->save_formatting_state_and_restore_prev(); // Restore formatting from previous user_supplied_stream_args.
237 * os << user_supplied_stream_args
238 * << '\n' // Formatting probably doesn't affect this, so no need to worry about restoring state here.
239 * << std::flush;
240 * ~~~
241 *
242 * @note Reminder: All access to a `*this` must occur in one thread by definition. Therefore there are no
243 * thread safety concerns to do with the suggested usage patterns.
244 * @note Detail: The formatting features affected are as described in documentation for
245 * `boost::io::basic_ios_all_saver`. This can be summarized as everything except user-defined formatters.
246 */
247 void save_formatting_state_and_restore_prev();
248
249 /**
250 * Read-only accessor for the contents of the `string`, as written to it since the last fresh_appender_ostream()
251 * call, or object construction, whichever occurred later.
252 *
253 * The reference's value never changes for `*this` object. The string's value may change depending on
254 * whether the user writes to `*fresh_appender_ostream()` (which writes to the string) or calls that
255 * method (which clears it).
256 *
257 * Behavior is undefined if you call this from any thread other than the one in which
258 * get_this_thread_string_appender() was called in order to obtain `this`.
259 *
260 * ### Rationale ###
261 * Why return `const string&` instead of util::String_view? Answer: Same as in doc header of String_ostream::str().
262 *
263 * @return Read-only reference to `string`.
264 */
265 const std::string& target_contents() const;
266
267private:
268
269 // Types.
270
271 /**
272 * Short-hand for map of a given thread's appender objects indexed by the IDs of their respective source objects.
273 * Smart pointers are stored to ensure the Thread_local_string_appender is deleted once removed from such a map.
274 * Thus once a per-thread map disappears at thread exit, all the stored objects within are freed also.
275 */
276 using Source_obj_to_appender_map = boost::unordered_map<util::Unique_id_holder::id_t,
277 boost::movelib::unique_ptr<Thread_local_string_appender>>;
278
279 // Constructors/destructor.
280
281 /**
282 * Initializes object with an empty string and the streams machinery available to write to that string.
283 * Note this is not publicly accessible.
284 */
285 Thread_local_string_appender();
286
287 // Data.
288
289 /**
290 * Thread-local storage for each thread's map storing objects of this class (lazily set to non-null on 1st access).
291 * Recall `delete s_this_thread_appender_ptrs.get();` is executed at each thread's exit; so if that is non-null
292 * for a given thread, this map is freed at that time. Since smart pointers to Thread_local_string_appender
293 * are stored in the map, the Thread_local_string_appender objects thus stored are also freed at that time.
294 */
295 static boost::thread_specific_ptr<Source_obj_to_appender_map> s_this_thread_appender_ptrs;
296
297 /// The target string wrapped by an `ostream`. Emptied at construction and in fresh_appender_ostream() *only*.
298 util::String_ostream m_target_appender_ostream;
299
300 /**
301 * Stores the `ostream` formatter state from construction time or time of last
302 * save_formatting_state_and_restore_prev(), whichever occurred later; clearing (including as part of reassigning)
303 * this pointer will invoke the destructor `~ios_all_saver()`, restoring that state to #m_target_appender_ostream.
304 *
305 * A pointer is used in order to be able to re-construct this at will: `ios_all_saver` does not have a "re-do
306 * construction on existing object" API (not that I'm saying it should... less state is good, all else being equal...
307 * but I digress).
308 *
309 * @see save_formatting_state_and_restore_prev() for explanation of the feature enabled by this member.
310 */
311 boost::movelib::unique_ptr<boost::io::ios_all_saver> m_target_appender_ostream_prev_os_state;
312}; // class Thread_local_string_appender
313
314} // namespace flow::log
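The lookup scheme described in the class doc header (one appender per invoking thread and source object ID pair) can be sketched with the standard library alone. This is a minimal illustration, not the Flow implementation: `Simple_appender`, `get_this_thread_appender()`, and `write_answer()` are hypothetical names; a plain `std::uint64_t` stands in for the `util::Unique_id_holder` ID; `thread_local` with a `unique_ptr` (the alternative mentioned under "Implementation notes") replaces `boost::thread_specific_ptr`; and `std::ostringstream` replaces `util::String_ostream`.

```cpp
#include <cassert>
#include <cstdint>
#include <ios>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <unordered_map>

// Hypothetical, simplified stand-in for Thread_local_string_appender (illustration only).
class Simple_appender
{
public:
  // Returns the calling thread's appender for `source_id`, creating it on first use.
  static Simple_appender* get_this_thread_appender(std::uint64_t source_id)
  {
    using Appender_map = std::unordered_map<std::uint64_t, std::unique_ptr<Simple_appender>>;
    // One lazily created map per thread; it is destroyed automatically at thread exit.
    thread_local std::unique_ptr<Appender_map> appenders;
    if (!appenders)
    {
      appenders.reset(new Appender_map);
    }
    std::unique_ptr<Simple_appender>& slot = (*appenders)[source_id];
    if (!slot)
    {
      slot.reset(new Simple_appender);
    }
    return slot.get();
  }

  // Clears the stored text (formatting flags are left untouched) and returns the stream.
  std::ostream* fresh_appender_ostream()
  {
    m_os.str(std::string());
    return &m_os;
  }

  // Unlike the real class, this returns a copy, because std::ostringstream::str() does.
  std::string target_contents() const { return m_os.str(); }

private:
  Simple_appender() = default; // Only reachable through get_this_thread_appender().

  std::ostringstream m_os;
};

// Formats the number 42 through the appender owned by (calling thread, source_id).
std::string write_answer(std::uint64_t source_id, bool use_hex)
{
  Simple_appender* const appender = Simple_appender::get_this_thread_appender(source_id);
  std::ostream& os = *appender->fresh_appender_ostream();
  if (use_hex)
  {
    os << std::hex; // Sticky: stays set on this entity's stream until changed.
  }
  os << "answer=" << 42 << std::flush;
  return appender->target_contents();
}
```

Under these assumptions, `write_answer(1, true)` yields `answer=2a`; a following `write_answer(2, false)` yields `answer=42`, because entity 2 has its own stream; and a later `write_answer(1, false)` still yields `answer=2a`, because entity 1's `std::hex` persists. A different thread calling with the same ID would get its own, separate appender, since the map is `thread_local`.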
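The state-swapping behavior documented for save_formatting_state_and_restore_prev() can be approximated as follows. This is a sketch under stated assumptions, not the Flow implementation: the header above stores a `boost::io::ios_all_saver`, whereas this version swaps state with the standard `std::ios::copyfmt()`, which covers a somewhat different set of stream properties. `Format_swapper`, `save_and_restore_prev()`, and `log_line()` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: remembers one formatting state and swaps it with the stream's on demand.
class Format_swapper
{
public:
  explicit Format_swapper(std::ostream& os) :
    m_os(os),
    m_saved(nullptr) // A buffer-less std::ios, used purely as a container for formatting state.
  {
    m_saved.copyfmt(m_os); // Remember the construction-time (pristine) formatting.
  }

  // Saves the stream's current formatting and applies the formatting saved by the previous call.
  void save_and_restore_prev()
  {
    std::ios current(nullptr);
    current.copyfmt(m_os);    // Capture what the stream uses right now...
    m_os.copyfmt(m_saved);    // ...apply what was saved last time...
    m_saved.copyfmt(current); // ...and keep the captured state for the next call.
  }

private:
  std::ostream& m_os;
  std::ios m_saved;
};

// Writes one log line: a prefix with pristine formatting, then the user's part with the user's formatting.
std::string log_line(std::ostringstream& os, Format_swapper& fmt, int line, bool use_hex, int value)
{
  os.str(std::string());       // Plays the role of fresh_appender_ostream(): clear the text only.
  fmt.save_and_restore_prev(); // Switch to pristine formatting for the prefix.
  os << "file.cpp:" << line << ": ";
  fmt.save_and_restore_prev(); // Switch back to the user's formatting from the previous line.
  if (use_hex)
  {
    os << std::hex;            // A user-supplied formatter that should persist for the user only.
  }
  os << value << '\n' << std::flush;
  return os.str();
}
```

With one `std::ostringstream os` and one `Format_swapper fmt(os)`, calling `log_line(os, fmt, 10, true, 255)` produces `file.cpp:10: ff`, and a following `log_line(os, fmt, 11, false, 255)` produces `file.cpp:11: ff`: the line number stays decimal while the user's `std::hex` carries over. Without the two swap calls, the second prefix would show the line number as `b`.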