defer/hazptr.tex - pub/scm/linux/kernel/git/paulmck/perfbook - Git at Google

 % defer/hazptr.tex
 % mainfile: ../perfbook.tex
 % From an C++ Standards Committee meeting:  "Can I hazptr cheezeberger?"

 \section{Hazard Pointers}
 \label{sec:defer:Hazard Pointers}
 %
 \epigraph{If in doubt, turn it inside out.}{Zara Carpenter}

 One way of avoiding problems with concurrent reference counting
 is to implement the reference counters
 inside out, that is, rather than incrementing an integer stored in the
 data element, instead store a pointer to that data element in
 per-CPU (or per-thread) lists.
 Each element of these lists is called a
 \emph{\IX{hazard pointer}}~\cite{MagedMichael04a}.\footnote{
 	Also independently invented by others~\cite{HerlihyLM02}.}
 The value of a given data element's ``virtual reference counter'' can
 then be obtained by counting the number of hazard pointers referencing
 that element.
 Therefore, if that element has been rendered inaccessible to readers,
 and there are no longer any hazard pointers referencing it, that element
 may safely be freed.

 \begin{listing}
 \input{CodeSamples/defer/hazptr@record_clear.fcv}
 \caption{Hazard-Pointer Recording and Clearing}
 \label{lst:defer:Hazard-Pointer Recording and Clearing}
 \end{listing}

 Of course, this means that hazard-pointer acquisition must be carried
 out quite carefully in order to avoid destructive races with concurrent
 deletion.
 \begin{fcvref}[ln:defer:hazptr:record_clear]
 One implementation is shown in
 \cref{lst:defer:Hazard-Pointer Recording and Clearing},
 which shows \co{hp_try_record()} on \clnrefrange{htr:b}{htr:e},
 \co{hp_record()} on \clnrefrange{hr:b}{hr:e}, and
 \co{hp_clear()} on
 \clnrefrange{hc:b}{hc:e} (\path{hazptr.h}).

 The \co{hp_try_record()} macro on \clnref{htr:e} is simply a casting
 wrapper for the \co{_h_t_r_impl()} function, which attempts to store
 the pointer referenced by \co{p} into the hazard pointer referenced
 by \co{hp}.
 If successful, it returns the value of the stored pointer.
 If it fails due to that pointer being \co{NULL}, it returns \co{NULL}.
 Finally, if it fails due to racing with an update, it returns a special
 \co{HAZPTR_POISON} token.

 \QuickQuiz{
 	Given that papers on hazard pointers use the bottom bits
 	of each pointer to mark deleted elements, what is up with
 	\co{HAZPTR_POISON}?
 }\QuickQuizAnswer{
 	The published implementations of hazard pointers used
 	non-blocking synchronization techniques for insertion
 	and deletion.
 	These techniques require that readers traversing the
 	data structure ``help'' updaters complete their updates,
 	which in turn means that readers need to look at the successor
 	of a deleted element.

 	In contrast, we will be using locking to synchronize updates,
 	which does away with the need for readers to help updaters
 	complete their updates, which in turn allows us to leave
 	pointers' bottom bits alone.
 	This approach allows read-side code to be simpler and faster.
 }\QuickQuizEnd

 \Clnref{htr:ro1} reads the pointer to the object to be protected.
 If \clnref{htr:race1} finds that this pointer was either \co{NULL} or
 the special \co{HAZPTR_POISON} deleted-object token, it returns
 the pointer's value to inform the caller of the failure.
 Otherwise, \clnref{htr:store} stores the pointer into the specified
 hazard pointer, and \clnref{htr:mb} forces full ordering of that
 store with the reload of the original pointer on \clnref{htr:ro2}.
 (See \cref{chp:Advanced Synchronization: Memory Ordering}
 for more information on memory ordering.)
 If the value of the original pointer has not changed, then the hazard
 pointer protects the pointed-to object, and in that case,
 \clnref{htr:success} returns a pointer to that object, which also
 indicates success to the caller.
 Otherwise, if the pointer changed between the two \co{READ_ONCE()}
 invocations, \clnref{htr:race2} indicates failure.

 \QuickQuiz{
 	Why does \co{hp_try_record()} in
 	\cref{lst:defer:Hazard-Pointer Recording and Clearing}
 	take a double indirection to the data element?
 	Why not \co{void *} instead of \co{void **}?
 }\QuickQuizAnswer{
 	Because \co{hp_try_record()} must check for concurrent modifications.
 	To do that job, it needs a pointer to a pointer to the element,
 	so that it can check for a modification to the pointer to the
 	element.
 }\QuickQuizEnd

 The \co{hp_record()} function is quite straightforward:
 It repeatedly invokes \co{hp_try_record()} until the return value
 is something other than \co{HAZPTR_POISON}.

 \QuickQuiz{
 	Why bother with \co{hp_try_record()}?
 	Wouldn't it be easier to just use the failure-immune
 	\co{hp_record()} function?
 }\QuickQuizAnswer{
 	It might be easier in some sense, but as will be seen in the
 	Pre-BSD routing example, there are situations for which
 	\co{hp_record()} simply does not work.
 }\QuickQuizEnd

 The \co{hp_clear()} function is even more straightforward, with
 an \co{smp_mb()} to force full ordering between the caller's uses
 of the object protected by the hazard pointer and the setting of
 the hazard pointer to \co{NULL}.
 \end{fcvref}

 \begin{listing}
 \ebresizeverb{.91}{\input{CodeSamples/defer/hazptr@scan_free.fcv}}
 \caption{Hazard-Pointer Scanning and Freeing}
 \label{lst:defer:Hazard-Pointer Scanning and Freeing}
 \end{listing}

 \begin{fcvref}[ln:defer:hazptr:scan_free:free]
 Once a hazard-pointer-protected object has been removed from its
 linked data structure, so that it is now inaccessible to future
 hazard-pointer readers, it is passed to \co{hazptr_free_later()},
 which is shown on \clnrefrange{b}{e} of
 \cref{lst:defer:Hazard-Pointer Scanning and Freeing}
 (\path{hazptr.c}).
 \Clnref{enq:b,enq:e}
 enqueue the object on a per-thread list \co{rlist}
 and \clnref{count} counts the object in \co{rcount}.
 If \clnref{check} sees that a sufficiently large number of objects are now
 queued, \clnref{scan} invokes \co{hazptr_scan()} to attempt to
 free some of them.
 \end{fcvref}

 \begin{fcvref}[ln:defer:hazptr:scan_free:scan]
 The \co{hazptr_scan()} function is shown on \clnrefrange{b}{e}
 of the listing.
 This function relies on a fixed maximum number of threads (\co{NR_THREADS})
 and a fixed maximum number of hazard pointers per thread (\co{K}),
 which allows a fixed-size array of hazard pointers to be used.
 Because any thread might need to scan the hazard pointers, each thread
 maintains its own array, which is referenced by the per-thread variable
 \co{gplist}.
 If \clnref{check} determines that this thread has not yet allocated its
 \co{gplist}, \clnrefrange{alloc:b}{alloc:e} carry out the allocation.
 The \IX{memory barrier} on \clnref{mb1} ensures that all threads see the
 removal of all objects by this thread before
 \clnrefrange{loop:b}{loop:e} scan
 all of the hazard pointers, accumulating non-NULL pointers into
 the \co{plist} array and counting them in \co{psize}.
 The memory barrier on \clnref{mb2} ensures that the reads of
 the hazard pointers
 happen before any objects are freed.
 \Clnref{sort} then sorts this array to enable use of binary search below.

 \Clnref{rem:b,rem:e}
 remove all elements from this thread's list of
 to-be-freed objects, placing them on the local \co{tmplist}
 and \clnref{zero} zeroes the count.
 Each pass through the loop spanning
 \clnrefrange{loop2:b}{loop2:e} processes each
 of the to-be-freed objects.
 \Clnref{rem1st:b,rem1st:e}
 remove the first object from \co{tmplist},
 and if \clnref{chkhazp:b,chkhazp:e}
 determine that there is a hazard pointer
 protecting this object, \clnrefrange{back:b}{back:e}
 place it back onto \co{rlist}.
 Otherwise, \clnref{free} frees the object.
 \end{fcvref}

 \begin{listing}
 \input{CodeSamples/defer/route_hazptr@lookup.fcv}
 \caption{Hazard-Pointer Pre-BSD Routing Table Lookup}
 \label{lst:defer:Hazard-Pointer Pre-BSD Routing Table Lookup}
 \end{listing}

 The Pre-BSD routing example can use hazard pointers as shown in
 \cref{lst:defer:Hazard-Pointer Pre-BSD Routing Table Lookup}
 for data structures and \co{route_lookup()}, and in
 \cref{lst:defer:Hazard-Pointer Pre-BSD Routing Table Add/Delete}
 for \co{route_add()} and \co{route_del()}
 (\path{route_hazptr.c}).
 As with reference counting, the hazard-pointers implementation
 is quite similar to the sequential algorithm shown in
 \cref{lst:defer:Sequential Pre-BSD Routing Table}
 on
 \cpageref{lst:defer:Sequential Pre-BSD Routing Table},
 so only differences will be discussed.

 \begin{fcvref}[ln:defer:route_hazptr:lookup]
 Starting with
 \cref{lst:defer:Hazard-Pointer Pre-BSD Routing Table Lookup},
 \clnref{hh} shows the \co{->hh} field used to queue objects pending
 hazard-pointer free,
 \clnref{re_freed} shows the \co{->re_freed} field used to detect
 use-after-free bugs, and \clnref{tryrecord} invokes
 \co{hp_try_record()} to attempt to acquire a hazard pointer.
 If the return value is \co{NULL}, \clnref{NULL} returns a not-found
 indication to the caller.
 If the call to \co{hp_try_record()} raced with deletion, \clnref{deleted}
 branches back to \clnref{retry}'s \co{retry} to re-traverse the list
 from the beginning.
 The \co{do}--\co{while} loop falls through when the desired element is
 located, but if this element has already been freed, \clnref{abort}
 terminates the program.
 Otherwise, the element's \co{->iface} field is returned to the caller.

 Note that \clnref{tryrecord} invokes \co{hp_try_record()} rather
 than the easier-to-use \co{hp_record()}, restarting the full search
 upon \co{hp_try_record()} failure.
 And such restarting is absolutely required for correctness.
 To see this, consider a hazard-pointer-protected linked list
 containing elements~A, B, and~C that is subjected to the following
 sequence of events:
 \end{fcvref}

 \begin{enumerate}
 \item	Thread~0 stores a hazard pointer to element~B
 	(having presumably traversed to element~B from element~A).
 \item	Thread~1 removes element~B from the list, which sets
 	the pointer from element~B to element~C to the special
 	\co{HAZPTR_POISON} value in order to mark the deletion.
 	Because Thread~0 has a hazard pointer to element~B,
 	it cannot yet be freed.
 \item	Thread~1 removes element~C from the list.
 	Because there are no hazard pointers referencing element~C,
 	it is immediately freed.
 \item	Thread~0 attempts to acquire a hazard pointer to now-removed
 	element~B's successor, but \co{hp_try_record()} returns the
 	\co{HAZPTR_POISON} value, forcing the caller to restart its
 	traversal from the beginning of the list.
 \end{enumerate}

 Which is a very good thing, because B's successor is the now-freed
 element~C, which means that Thread~0's subsequent accesses might have
 resulted in arbitrarily horrible memory corruption, especially if the
 memory for element~C had since been re-allocated for some other purpose.
 Therefore, hazard-pointer readers must typically restart the full
 traversal in the face of a concurrent deletion.
 Often the restart must go back to some global (and thus immortal) pointer,
 but it is sometimes possible to restart at some intermediate location
 if that location is guaranteed to still be live, for example, due to
 the current thread holding a lock, a reference count, etc.

 \QuickQuiz{
 	Readers must ``typically'' restart?
 	What are some exceptions?
 }\QuickQuizAnswer{
 	If the pointer emanates from a global variable or is otherwise
 	not subject to being freed, then \co{hp_record()} may be
 	used to repeatedly attempt to record the hazard pointer,
 	even in the face of concurrent deletions.

 	In certain cases, restart can be avoided by using link counting
 	as exemplified by the UnboundedQueue and ConcurrentHashMap data
 	structures implemented in Folly open-source library.\footnote{
 		\url{https://github.com/facebook/folly}}
 }\QuickQuizEnd

 Because algorithms using hazard pointers might be restarted at any
 step of their traversal through the linked data structure, such algorithms
 must typically take care to avoid making any changes to the data
 structure until after they have acquired all the hazard pointers that
 are required for the update in question.

 \QuickQuiz{
 	But don't these restrictions on hazard pointers also apply
 	to other forms of reference counting?
 }\QuickQuizAnswer{
 	Yes and no.
 	These restrictions apply only to reference-counting mechanisms whose
 	reference acquisition can fail.
 }\QuickQuizEnd

 These hazard-pointer restrictions result in great benefits to readers,
 courtesy of the fact that the hazard pointers are stored local to each
 CPU or thread, which in turn allows traversals to be carried out without
 any writes to the data structures being traversed.
 Referring back to
 \cref{fig:count:Optimization and the Four Parallel-Programming Tasks}
 on
 \cpageref{fig:count:Optimization and the Four Parallel-Programming Tasks},
 hazard pointers enable the CPU caches to do resource replication, which
 in turn allows weakening of the parallel-access-control mechanism,
 thus boosting performance and scalability.

 Another advantage of restarting hazard pointers traversals is a reduction in
 minimal memory footprint:
 Any object not currently referenced by some hazard pointer may be
 immediately freed.
 In contrast,
 \cref{sec:defer:Read-Copy Update (RCU)}
 will discuss a mechanism that avoids read-side retries (and minimizes
 read-side overhead), but which can result in a much larger memory
 footprint.

 \begin{listing}
 \input{CodeSamples/defer/route_hazptr@add_del.fcv}
 \caption{Hazard-Pointer Pre-BSD Routing Table Add\slash Delete}
 \label{lst:defer:Hazard-Pointer Pre-BSD Routing Table Add/Delete}
 \end{listing}

 \begin{fcvref}[ln:defer:route_hazptr:add_del]
 The \co{route_add()} and \co{route_del()} functions are shown in
 \cref{lst:defer:Hazard-Pointer Pre-BSD Routing Table Add/Delete}.
 \Clnref{init_freed} initializes \co{->re_freed},
 \clnref{poison} poisons the \co{->re_next} field of the newly removed
 object, and
 \clnref{free_later} passes that object to the
 \co{hazptr_free_later()} function, which will free that object once it
 is safe to do so.
 The spinlocks work the same as in
 \cref{lst:defer:Reference-Counted Pre-BSD Routing Table Add/Delete}.
 \end{fcvref}

 \begin{figure}
 \centering
 \resizebox{2.5in}{!}{\includegraphics{CodeSamples/defer/data/hps.2019.12.17a/perf-hazptr}}
 \caption{Pre-BSD Routing Table Protected by Hazard Pointers}
 \label{fig:defer:Pre-BSD Routing Table Protected by Hazard Pointers}
 \end{figure}

 \Cref{fig:defer:Pre-BSD Routing Table Protected by Hazard Pointers}
 shows the hazard-pointers-protected Pre-BSD routing algorithm's
 performance on the same read-only workload as for
 \cref{fig:defer:Pre-BSD Routing Table Protected by Reference Counting}.
 Although hazard pointers scale far better than does reference counting,
 hazard pointers still require readers to do writes to shared
 memory (albeit with much improved locality of reference),
 and also require a full memory barrier and retry check for each
 object traversed.
 Therefore, hazard-pointers performance is still far short of ideal.
 On the other hand, unlike naive approaches to concurrent
 reference-counting, hazard pointers not only operate correctly for
 workloads involving concurrent updates, but also exhibit excellent
 scalability.
 Additional performance comparisons with other mechanisms may be found in
 \cref{chp:Data Structures}
 and in other publications~\cite{ThomasEHart2007a,McKenney:2013:SDS:2483852.2483867,MagedMichael04a}.

 \QuickQuizSeries{%
 \QuickQuizB{
 	\Cref{fig:defer:Pre-BSD Routing Table Protected by Hazard Pointers}
 	shows no sign of hyperthread-induced flattening at 224 threads.
 	Why is that?
 }\QuickQuizAnswerB{
 	Modern microprocessors are complicated beasts, so significant
 	skepticism is appropriate for any simple answer.
 	That aside, the most likely reason is the full memory barriers
 	required by hazard-pointers readers.
 	Any delays resulting from those memory barriers would make time
 	available to the other hardware thread sharing the core, resulting
 	in greater scalability at the expense of per-hardware-thread
 	performance.
 }\QuickQuizEndB
 %
 \QuickQuizE{
 	The paper ``Structured Deferral:
 	Synchronization via
 	Procrastination''~\cite{McKenney:2013:SDS:2483852.2483867}
 	shows that hazard pointers have near-ideal performance.
 	Whatever happened in
 	\cref{fig:defer:Pre-BSD Routing Table Protected by Hazard Pointers}???
 }\QuickQuizAnswerE{
 	First,
 	\cref{fig:defer:Pre-BSD Routing Table Protected by Hazard Pointers}
 	has a linear y-axis, while most of the graphs in the
 	``Structured Deferral'' paper have logscale y-axes.
 	Next, that paper uses lightly-loaded hash tables, while
 	\cref{fig:defer:Pre-BSD Routing Table Protected by Hazard Pointers}'s
 	uses a 10-element simple linked list, which means that hazard pointers
 	face a larger memory-barrier penalty in this workload than in
 	that of the ``Structured Deferral'' paper.
 	Finally, that paper used an older modest-sized x86 system, while
 	a much newer and larger system was used to generate the data
 	shown in
 	\cref{fig:defer:Pre-BSD Routing Table Protected by Hazard Pointers}.

 	In addition, use of pairwise asymmetric
 	barriers~\cite{Windows2008FlushProcessWriteBuffers,JonathanCorbet2010sys-membarrier,Linuxmanpage2018sys-membarrier}
 	has been proposed to eliminate the read-side hazard-pointer
 	memory barriers on systems supporting this notion~\cite{DavidGoldblatt2018asymmetricFences},
 	which might improve the performance of hazard pointers beyond
 	what is shown in the figure.

 	As always, your mileage may vary.
 	Given the difference in performance, it is clear that hazard
 	pointers give you the best performance either for
 	very large data structures (where the memory-barrier overhead
 	will at least partially overlap cache-miss penalties) and
 	for data structures such as hash tables where a lookup
 	operation needs a minimal number of hazard pointers.
 }\QuickQuizEndE
 }

 On June 17, 2023, the ISO C++ Standards committee voted hazard pointers
 into C++26~\cite{MagedMichael2023C++26Hazptr3}.
 Daniel Anderson has produced a prototype C++ atomic shared-pointer
 implementation based on hazard
 pointers~\cite{DanielAnderson2023HazptrSharedPtr}.

 And hazard pointers are the concurrent reference counter mentioned
 on \cpageref{sec:defer:Mysteries hazard pointers}.
 The next section attempts to improve on hazard pointers by using
 sequence locks, which avoid both read-side writes and per-object memory
 barriers.
	% defer/hazptr.tex
	% mainfile: ../perfbook.tex
	% From an C++ Standards Committee meeting: "Can I hazptr cheezeberger?"

	\section{Hazard Pointers}
	\label{sec:defer:Hazard Pointers}
	%
	\epigraph{If in doubt, turn it inside out.}{Zara Carpenter}

	One way of avoiding problems with concurrent reference counting
	is to implement the reference counters
	inside out, that is, rather than incrementing an integer stored in the
	data element, instead store a pointer to that data element in
	per-CPU (or per-thread) lists.
	Each element of these lists is called a
	\emph{\IX{hazard pointer}}~\cite{MagedMichael04a}.\footnote{
	Also independently invented by others~\cite{HerlihyLM02}.}
	The value of a given data element's ``virtual reference counter'' can
	then be obtained by counting the number of hazard pointers referencing
	that element.
	Therefore, if that element has been rendered inaccessible to readers,
	and there are no longer any hazard pointers referencing it, that element
	may safely be freed.

	\begin{listing}
	\input{CodeSamples/defer/hazptr@record_clear.fcv}
	\caption{Hazard-Pointer Recording and Clearing}
	\label{lst:defer:Hazard-Pointer Recording and Clearing}
	\end{listing}

	Of course, this means that hazard-pointer acquisition must be carried
	out quite carefully in order to avoid destructive races with concurrent
	deletion.
	\begin{fcvref}[ln:defer:hazptr:record_clear]
	One implementation is shown in
	\cref{lst:defer:Hazard-Pointer Recording and Clearing},
	which shows \co{hp_try_record()} on \clnrefrange{htr:b}{htr:e},
	\co{hp_record()} on \clnrefrange{hr:b}{hr:e}, and
	\co{hp_clear()} on
	\clnrefrange{hc:b}{hc:e} (\path{hazptr.h}).

	The \co{hp_try_record()} macro on \clnref{htr:e} is simply a casting
	wrapper for the \co{_h_t_r_impl()} function, which attempts to store
	the pointer referenced by \co{p} into the hazard pointer referenced
	by \co{hp}.
	If successful, it returns the value of the stored pointer.
	If it fails due to that pointer being \co{NULL}, it returns \co{NULL}.
	Finally, if it fails due to racing with an update, it returns a special
	\co{HAZPTR_POISON} token.

	\QuickQuiz{
	Given that papers on hazard pointers use the bottom bits
	of each pointer to mark deleted elements, what is up with
	\co{HAZPTR_POISON}?
	}\QuickQuizAnswer{
	The published implementations of hazard pointers used
	non-blocking synchronization techniques for insertion
	and deletion.
	These techniques require that readers traversing the
	data structure ``help'' updaters complete their updates,
	which in turn means that readers need to look at the successor
	of a deleted element.

	In contrast, we will be using locking to synchronize updates,
	which does away with the need for readers to help updaters
	complete their updates, which in turn allows us to leave
	pointers' bottom bits alone.
	This approach allows read-side code to be simpler and faster.
	}\QuickQuizEnd

	\Clnref{htr:ro1} reads the pointer to the object to be protected.
	If \clnref{htr:race1} finds that this pointer was either \co{NULL} or
	the special \co{HAZPTR_POISON} deleted-object token, it returns
	the pointer's value to inform the caller of the failure.
	Otherwise, \clnref{htr:store} stores the pointer into the specified
	hazard pointer, and \clnref{htr:mb} forces full ordering of that
	store with the reload of the original pointer on \clnref{htr:ro2}.
	(See \cref{chp:Advanced Synchronization: Memory Ordering}
	for more information on memory ordering.)
	If the value of the original pointer has not changed, then the hazard
	pointer protects the pointed-to object, and in that case,
	\clnref{htr:success} returns a pointer to that object, which also
	indicates success to the caller.
	Otherwise, if the pointer changed between the two \co{READ_ONCE()}
	invocations, \clnref{htr:race2} indicates failure.

	\QuickQuiz{
	Why does \co{hp_try_record()} in
	\cref{lst:defer:Hazard-Pointer Recording and Clearing}
	take a double indirection to the data element?
	Why not \co{void } instead of \co{void *}?
	}\QuickQuizAnswer{
	Because \co{hp_try_record()} must check for concurrent modifications.
	To do that job, it needs a pointer to a pointer to the element,
	so that it can check for a modification to the pointer to the
	element.
	}\QuickQuizEnd

	The \co{hp_record()} function is quite straightforward:
	It repeatedly invokes \co{hp_try_record()} until the return value
	is something other than \co{HAZPTR_POISON}.

	\QuickQuiz{
	Why bother with \co{hp_try_record()}?
	Wouldn't it be easier to just use the failure-immune
	\co{hp_record()} function?
	}\QuickQuizAnswer{
	It might be easier in some sense, but as will be seen in the
	Pre-BSD routing example, there are situations for which
	\co{hp_record()} simply does not work.
	}\QuickQuizEnd

	The \co{hp_clear()} function is even more straightforward, with
	an \co{smp_mb()} to force full ordering between the caller's uses
	of the object protected by the hazard pointer and the setting of
	the hazard pointer to \co{NULL}.
	\end{fcvref}

	\begin{listing}
	\ebresizeverb{.91}{\input{CodeSamples/defer/hazptr@scan_free.fcv}}
	\caption{Hazard-Pointer Scanning and Freeing}
	\label{lst:defer:Hazard-Pointer Scanning and Freeing}
	\end{listing}

	\begin{fcvref}[ln:defer:hazptr:scan_free:free]
	Once a hazard-pointer-protected object has been removed from its
	linked data structure, so that it is now inaccessible to future
	hazard-pointer readers, it is passed to \co{hazptr_free_later()},
	which is shown on \clnrefrange{b}{e} of
	\cref{lst:defer:Hazard-Pointer Scanning and Freeing}
	(\path{hazptr.c}).
	\Clnref{enq:b,enq:e}
	enqueue the object on a per-thread list \co{rlist}
	and \clnref{count} counts the object in \co{rcount}.
	If \clnref{check} sees that a sufficiently large number of objects are now
	queued, \clnref{scan} invokes \co{hazptr_scan()} to attempt to
	free some of them.
	\end{fcvref}

	\begin{fcvref}[ln:defer:hazptr:scan_free:scan]
	The \co{hazptr_scan()} function is shown on \clnrefrange{b}{e}
	of the listing.
	This function relies on a fixed maximum number of threads (\co{NR_THREADS})
	and a fixed maximum number of hazard pointers per thread (\co{K}),
	which allows a fixed-size array of hazard pointers to be used.
	Because any thread might need to scan the hazard pointers, each thread
	maintains its own array, which is referenced by the per-thread variable
	\co{gplist}.
	If \clnref{check} determines that this thread has not yet allocated its
	\co{gplist}, \clnrefrange{alloc:b}{alloc:e} carry out the allocation.
	The \IX{memory barrier} on \clnref{mb1} ensures that all threads see the
	removal of all objects by this thread before
	\clnrefrange{loop:b}{loop:e} scan
	all of the hazard pointers, accumulating non-NULL pointers into
	the \co{plist} array and counting them in \co{psize}.
	The memory barrier on \clnref{mb2} ensures that the reads of
	the hazard pointers
	happen before any objects are freed.
	\Clnref{sort} then sorts this array to enable use of binary search below.

	\Clnref{rem:b,rem:e}
	remove all elements from this thread's list of
	to-be-freed objects, placing them on the local \co{tmplist}
	and \clnref{zero} zeroes the count.
	Each pass through the loop spanning
	\clnrefrange{loop2:b}{loop2:e} processes each
	of the to-be-freed objects.
	\Clnref{rem1st:b,rem1st:e}
	remove the first object from \co{tmplist},
	and if \clnref{chkhazp:b,chkhazp:e}
	determine that there is a hazard pointer
	protecting this object, \clnrefrange{back:b}{back:e}
	place it back onto \co{rlist}.
	Otherwise, \clnref{free} frees the object.
	\end{fcvref}

	\begin{listing}
	\input{CodeSamples/defer/route_hazptr@lookup.fcv}
	\caption{Hazard-Pointer Pre-BSD Routing Table Lookup}
	\label{lst:defer:Hazard-Pointer Pre-BSD Routing Table Lookup}
	\end{listing}

	The Pre-BSD routing example can use hazard pointers as shown in
	\cref{lst:defer:Hazard-Pointer Pre-BSD Routing Table Lookup}
	for data structures and \co{route_lookup()}, and in
	\cref{lst:defer:Hazard-Pointer Pre-BSD Routing Table Add/Delete}
	for \co{route_add()} and \co{route_del()}
	(\path{route_hazptr.c}).
	As with reference counting, the hazard-pointers implementation
	is quite similar to the sequential algorithm shown in
	\cref{lst:defer:Sequential Pre-BSD Routing Table}
	on
	\cpageref{lst:defer:Sequential Pre-BSD Routing Table},
	so only differences will be discussed.

	\begin{fcvref}[ln:defer:route_hazptr:lookup]
	Starting with
	\cref{lst:defer:Hazard-Pointer Pre-BSD Routing Table Lookup},
	\clnref{hh} shows the \co{->hh} field used to queue objects pending
	hazard-pointer free,
	\clnref{re_freed} shows the \co{->re_freed} field used to detect
	use-after-free bugs, and \clnref{tryrecord} invokes
	\co{hp_try_record()} to attempt to acquire a hazard pointer.
	If the return value is \co{NULL}, \clnref{NULL} returns a not-found
	indication to the caller.
	If the call to \co{hp_try_record()} raced with deletion, \clnref{deleted}
	branches back to \clnref{retry}'s \co{retry} to re-traverse the list
	from the beginning.
	The \co{do}--\co{while} loop falls through when the desired element is
	located, but if this element has already been freed, \clnref{abort}
	terminates the program.
	Otherwise, the element's \co{->iface} field is returned to the caller.

	Note that \clnref{tryrecord} invokes \co{hp_try_record()} rather
	than the easier-to-use \co{hp_record()}, restarting the full search
	upon \co{hp_try_record()} failure.
	And such restarting is absolutely required for correctness.
	To see this, consider a hazard-pointer-protected linked list
	containing elements~A, B, and~C that is subjected to the following
	sequence of events:
	\end{fcvref}

	\begin{enumerate}
	\item Thread~0 stores a hazard pointer to element~B
	(having presumably traversed to element~B from element~A).
	\item Thread~1 removes element~B from the list, which sets
	the pointer from element~B to element~C to the special
	\co{HAZPTR_POISON} value in order to mark the deletion.
	Because Thread~0 has a hazard pointer to element~B,
	it cannot yet be freed.
	\item Thread~1 removes element~C from the list.
	Because there are no hazard pointers referencing element~C,
	it is immediately freed.
	\item Thread~0 attempts to acquire a hazard pointer to now-removed
	element~B's successor, but \co{hp_try_record()} returns the
	\co{HAZPTR_POISON} value, forcing the caller to restart its
	traversal from the beginning of the list.
	\end{enumerate}

	Which is a very good thing, because B's successor is the now-freed
	element~C, which means that Thread~0's subsequent accesses might have
	resulted in arbitrarily horrible memory corruption, especially if the
	memory for element~C had since been re-allocated for some other purpose.
	Therefore, hazard-pointer readers must typically restart the full
	traversal in the face of a concurrent deletion.
	Often the restart must go back to some global (and thus immortal) pointer,
	but it is sometimes possible to restart at some intermediate location
	if that location is guaranteed to still be live, for example, due to
	the current thread holding a lock, a reference count, etc.

	\QuickQuiz{
	Readers must ``typically'' restart?
	What are some exceptions?
	}\QuickQuizAnswer{
	If the pointer emanates from a global variable or is otherwise
	not subject to being freed, then \co{hp_record()} may be
	used to repeatedly attempt to record the hazard pointer,
	even in the face of concurrent deletions.

	In certain cases, restart can be avoided by using link counting
	as exemplified by the UnboundedQueue and ConcurrentHashMap data
	structures implemented in Folly open-source library.\footnote{
	\url{https://github.com/facebook/folly}}
	}\QuickQuizEnd

	Because algorithms using hazard pointers might be restarted at any
	step of their traversal through the linked data structure, such algorithms
	must typically take care to avoid making any changes to the data
	structure until after they have acquired all the hazard pointers that
	are required for the update in question.

	\QuickQuiz{
	But don't these restrictions on hazard pointers also apply
	to other forms of reference counting?
	}\QuickQuizAnswer{
	Yes and no.
	These restrictions apply only to reference-counting mechanisms whose
	reference acquisition can fail.
	}\QuickQuizEnd

	These hazard-pointer restrictions result in great benefits to readers,
	courtesy of the fact that the hazard pointers are stored local to each
	CPU or thread, which in turn allows traversals to be carried out without
	any writes to the data structures being traversed.
	Referring back to
	\cref{fig:count:Optimization and the Four Parallel-Programming Tasks}
	on
	\cpageref{fig:count:Optimization and the Four Parallel-Programming Tasks},
	hazard pointers enable the CPU caches to do resource replication, which
	in turn allows weakening of the parallel-access-control mechanism,
	thus boosting performance and scalability.

	Another advantage of restarting hazard pointers traversals is a reduction in
	minimal memory footprint:
	Any object not currently referenced by some hazard pointer may be
	immediately freed.
	In contrast,
	\cref{sec:defer:Read-Copy Update (RCU)}
	will discuss a mechanism that avoids read-side retries (and minimizes
	read-side overhead), but which can result in a much larger memory
	footprint.

	\begin{listing}
	\input{CodeSamples/defer/route_hazptr@add_del.fcv}
	\caption{Hazard-Pointer Pre-BSD Routing Table Add\slash Delete}
	\label{lst:defer:Hazard-Pointer Pre-BSD Routing Table Add/Delete}
	\end{listing}

	\begin{fcvref}[ln:defer:route_hazptr:add_del]
	The \co{route_add()} and \co{route_del()} functions are shown in
	\cref{lst:defer:Hazard-Pointer Pre-BSD Routing Table Add/Delete}.
	\Clnref{init_freed} initializes \co{->re_freed},
	\clnref{poison} poisons the \co{->re_next} field of the newly removed
	object, and
	\clnref{free_later} passes that object to the
	\co{hazptr_free_later()} function, which will free that object once it
	is safe to do so.
	The spinlocks work the same as in
	\cref{lst:defer:Reference-Counted Pre-BSD Routing Table Add/Delete}.
	\end{fcvref}

	\begin{figure}
	\centering
	\resizebox{2.5in}{!}{\includegraphics{CodeSamples/defer/data/hps.2019.12.17a/perf-hazptr}}
	\caption{Pre-BSD Routing Table Protected by Hazard Pointers}
	\label{fig:defer:Pre-BSD Routing Table Protected by Hazard Pointers}
	\end{figure}

	\Cref{fig:defer:Pre-BSD Routing Table Protected by Hazard Pointers}
	shows the hazard-pointers-protected Pre-BSD routing algorithm's
	performance on the same read-only workload as for
	\cref{fig:defer:Pre-BSD Routing Table Protected by Reference Counting}.
	Although hazard pointers scale far better than does reference counting,
	hazard pointers still require readers to do writes to shared
	memory (albeit with much improved locality of reference),
	and also require a full memory barrier and retry check for each
	object traversed.
	Therefore, hazard-pointers performance is still far short of ideal.
	On the other hand, unlike naive approaches to concurrent
	reference-counting, hazard pointers not only operate correctly for
	workloads involving concurrent updates, but also exhibit excellent
	scalability.
	Additional performance comparisons with other mechanisms may be found in
	\cref{chp:Data Structures}
	and in other publications~\cite{ThomasEHart2007a,McKenney:2013:SDS:2483852.2483867,MagedMichael04a}.

	\QuickQuizSeries{%
	\QuickQuizB{
	\Cref{fig:defer:Pre-BSD Routing Table Protected by Hazard Pointers}
	shows no sign of hyperthread-induced flattening at 224 threads.
	Why is that?
	}\QuickQuizAnswerB{
	Modern microprocessors are complicated beasts, so significant
	skepticism is appropriate for any simple answer.
	That aside, the most likely reason is the full memory barriers
	required by hazard-pointers readers.
	Any delays resulting from those memory barriers would make time
	available to the other hardware thread sharing the core, resulting
	in greater scalability at the expense of per-hardware-thread
	performance.
	}\QuickQuizEndB
	%
	\QuickQuizE{
	The paper ``Structured Deferral:
	Synchronization via
	Procrastination''~\cite{McKenney:2013:SDS:2483852.2483867}
	shows that hazard pointers have near-ideal performance.
	Whatever happened in
	\cref{fig:defer:Pre-BSD Routing Table Protected by Hazard Pointers}???
	}\QuickQuizAnswerE{
	First,
	\cref{fig:defer:Pre-BSD Routing Table Protected by Hazard Pointers}
	has a linear y-axis, while most of the graphs in the
	``Structured Deferral'' paper have logscale y-axes.
	Next, that paper uses lightly-loaded hash tables, while
	\cref{fig:defer:Pre-BSD Routing Table Protected by Hazard Pointers}'s
	uses a 10-element simple linked list, which means that hazard pointers
	face a larger memory-barrier penalty in this workload than in
	that of the ``Structured Deferral'' paper.
	Finally, that paper used an older modest-sized x86 system, while
	a much newer and larger system was used to generate the data
	shown in
	\cref{fig:defer:Pre-BSD Routing Table Protected by Hazard Pointers}.

	In addition, use of pairwise asymmetric
	barriers~\cite{Windows2008FlushProcessWriteBuffers,JonathanCorbet2010sys-membarrier,Linuxmanpage2018sys-membarrier}
	has been proposed to eliminate the read-side hazard-pointer
	memory barriers on systems supporting this notion~\cite{DavidGoldblatt2018asymmetricFences},
	which might improve the performance of hazard pointers beyond
	what is shown in the figure.

	As always, your mileage may vary.
	Given the difference in performance, it is clear that hazard
	pointers give you the best performance either for
	very large data structures (where the memory-barrier overhead
	will at least partially overlap cache-miss penalties) and
	for data structures such as hash tables where a lookup
	operation needs a minimal number of hazard pointers.
	}\QuickQuizEndE
	}

	On June 17, 2023, the ISO C++ Standards committee voted hazard pointers
	into C++26~\cite{MagedMichael2023C++26Hazptr3}.
	Daniel Anderson has produced a prototype C++ atomic shared-pointer
	implementation based on hazard
	pointers~\cite{DanielAnderson2023HazptrSharedPtr}.

	And hazard pointers are the concurrent reference counter mentioned
	on \cpageref{sec:defer:Mysteries hazard pointers}.
	The next section attempts to improve on hazard pointers by using
	sequence locks, which avoid both read-side writes and per-object memory
	barriers.