blob: 368e8499f59c3a61c99b4af77bb6404420b55b3a [file] [log] [blame]
% together/count.tex
% mainfile: ../perfbook.tex
% SPDX-License-Identifier: CC-BY-SA-3.0
\section{Counter Conundrums}
\label{sec:together:Counter Conundrums}
%
\epigraph{Ford carried on counting quietly.
This is about the most aggressive thing you can do to a
computer, the equivalent of going up to a human being and saying
``Blood \dots blood \dots blood \dots blood \dots''}
{Douglas Adams}
This \lcnamecref{sec:together:Counter Conundrums}
outlines solutions to counter conundrums.
\subsection{Counting Updates}
\label{sec:together:Counting Updates}
Suppose that Schr\"odinger (see
\cref{sec:datastruct:Motivating Application})
wants to count the number of updates for each animal,
and that these updates are synchronized using a per-data-element lock.
How can this counting best be done?
Of course, any number of counting algorithms from \cref{chp:Counting}
might qualify, but the optimal approach is quite simple.
Just place a counter in each data element, and increment it under the
protection of that element's lock!
If readers access the count locklessly, then updaters should use
\co{WRITE_ONCE()} to update the counter and lockless readers should
use \co{READ_ONCE()} to load it.
\subsection{Counting Lookups}
\label{sec:together:Counting Lookups}
Suppose that Schr\"odinger also wants to count the number of lookups for
each animal, where lookups are protected by RCU\@.
How can this counting best be done?
One approach would be to protect a lookup counter with the per-element
lock, as discussed in \cref{sec:together:Counting Updates}.
Unfortunately, this would require all lookups to acquire this lock,
which would be a severe bottleneck on large systems.
Another approach is to ``just say no'' to counting, following the example
of the \co{noatime} mount option.
If this approach is feasible, it is clearly the best:
After all, nothing is faster than doing nothing.
If the lookup count cannot be dispensed with, read on!
Any of the counters from \cref{chp:Counting}
could be pressed into service, with the statistical counters described in
\cref{sec:count:Statistical Counters} being perhaps the most common choice.
However, this results in a large memory footprint:
The number of counters required is the number of data elements multiplied
by the number of threads.
If this memory overhead is excessive, then one approach is to keep
per-core or even per-socket counters rather than per-CPU counters,
with an eye to the hash-table performance results depicted in
\cref{fig:datastruct:Read-Only Hash-Table Performance For Schroedinger's Zoo; 448 CPUs}.
This will require that the counter increments be atomic operations,
especially for user-mode execution where a given thread could migrate
to another CPU at any time.
If some elements are looked up very frequently, there are a number
of approaches that batch updates by maintaining a per-thread log,
where multiple log entries for a given element can be merged.
After a given log entry has a sufficiently large increment or after
sufficient time has passed, the log entries may be applied to the
corresponding data elements.
Silas Boyd-Wickizer has done some work formalizing this
notion~\cite{SilasBoydWickizerPhD}.