blob: 805334ad5fdc48c069b567cfaf3fbc4352c8e257 [file] [log] [blame]
% defer/rcuexercises.tex
\subsection{RCU Exercises}
\label{sec:defer:RCU Exercises}
This section is organized as a series of Quick Quizzes that invite you
to apply RCU to a number of examples earlier in this book.
The answer to each Quick Quiz gives some hints, and also contains a
pointer to a later section where the solution is explained at length.
The \co{rcu_read_lock()}, \co{rcu_read_unlock()}, \co{rcu_dereference()},
\co{rcu_assign_pointer()}, and \co{synchronize_rcu()} primitives should
suffice for most of these exercises.
\QuickQuiz{}
The statistical-counter implementation shown in
Figure~\ref{fig:count:Per-Thread Statistical Counters}
(\url{count_end.c})
used a global lock to guard the summation in \co{read_count()},
which resulted in poor performance and negative scalability.
How could you use RCU to provide \co{read_count()} with
excellent performance and good scalability.
(Keep in mind that \co{read_count()}'s scalability will
necessarily be limited by its need to scan all threads'
counters.)
\QuickQuizAnswer{
Hint: place the global variable \co{finalcount} and the
array \co{counterp[]} into a single RCU-protected struct.
At initialization time, this structure would be allocated
and set to all zero and \co{NULL}.
The \co{inc_count()} function would be unchanged.
The \co{read_count()} function would use \co{rcu_read_lock()}
instead of acquiring \co{final_mutex}, and would need to
use \co{rcu_dereference()} to acquire a reference to the
current structure.
The \co{count_register_thread()} function would set the
array element corresponding to the newly created thread
to reference that thread's per-thread \co{counter} variable.
The \co{count_unregister_thread()} function would need to
allocate a new structure, acquire \co{final_mutex},
copy the old structure to the new one, add the outgoing
thread's \co{counter} variable to the total, \co{NULL}
the pointer to this same \co{counter} variable,
use \co{rcu_assign_pointer()} to install the new structure
in place of the old one, release \co{final_mutex},
wait for a grace period, and finally free the old structure.
Does this really work?
Why or why not?
See
Section~\ref{sec:together:RCU and Per-Thread-Variable-Based Statistical Counters}
on
page~\pageref{sec:together:RCU and Per-Thread-Variable-Based Statistical Counters}
for more details.
} \QuickQuizEnd
\QuickQuiz{}
Section~\ref{sec:count:Applying Specialized Parallel Counters}
showed a fanciful pair of code fragments that dealt with counting
I/O accesses to removable devices.
These code fragments suffered from high overhead on the fastpath
(starting an I/O) due to the need to acquire a reader-writer
lock.
How would you use RCU to provide excellent performance and
scalability?
(Keep in mind that the performance of the common-case first
code fragment that does I/O accesses is much more important
than that of the device-removal code fragment.)
\QuickQuizAnswer{
Hint: replace the read-acquisitions of the reader-writer lock
with RCU read-side critical sections, then adjust the
device-removal code fragment to suit.
See
Section~\ref{sec:together:RCU and Counters for Removable I/O Devices}
on
Page~\pageref{sec:together:RCU and Counters for Removable I/O Devices}
for one solution to this problem.
} \QuickQuizEnd