formal/dyntickrcu.tex - pub/scm/linux/kernel/git/paulmck/perfbook - Git at Google

 % formal/dyntickrcu.tex
 % mainfile: ../perfbook.tex
 % SPDX-License-Identifier: CC-BY-SA-3.0

 % Disable frame around VerbatimN in two-column layout
 \IfTwoColumn{
 \RecustomVerbatimEnvironment{VerbatimN}{Verbatim}%
 {numbers=left,numbersep=5pt,xleftmargin=10pt,xrightmargin=0pt,frame=none}
 \setlength{\lnlblraise}{0pt}
 }{}

 \subsection{Promela Parable:
 			     {dynticks} and Preemptible RCU}
 \label{sec:formal:Promela Parable: dynticks and Preemptible RCU}

 In early 2008, a preemptible variant of RCU was accepted into
 mainline Linux in support of real-time workloads,
 a variant similar to the RCU implementations in
 the \rt\ patchset~\cite{IngoMolnar05a}
 since August 2005.
 Preemptible RCU is needed for real-time workloads because older
 RCU implementations disable preemption across RCU read-side
 critical sections, resulting in excessive real-time latencies.

 However, one disadvantage of the older \rt\ implementation
 was that each \IX{grace period}
 requires work to be done on each CPU, even if that CPU is in a low-power
 ``dynticks-idle'' state,
 and thus incapable of executing RCU read-side critical sections.
 The idea behind the dynticks-idle state is that idle CPUs
 should be physically powered down in order to conserve energy.
 In short, preemptible RCU can disable a valuable energy-conservation
 feature of recent Linux kernels.
 Although Josh Triplett and Paul McKenney
 had discussed some approaches for allowing
 CPUs to remain in low-power state throughout an RCU grace period
 (thus preserving the Linux kernel's ability to conserve energy), matters
 did not come to a head until Steve Rostedt integrated a new dyntick
 implementation with preemptible RCU in the \rt\ patchset.

 This combination caused one of Steve's systems to hang on boot, so in
 October, Paul coded up a dynticks-friendly modification to preemptible RCU's
 grace-period processing.
 Steve coded up \co{rcu_irq_enter()} and \co{rcu_irq_exit()}
 interfaces called from the
 \co{irq_enter()} and \co{irq_exit()} interrupt
 entry/exit functions.
 These \co{rcu_irq_enter()} and \co{rcu_irq_exit()}
 functions are needed to allow RCU to reliably handle situations where
 a dynticks-idle CPU is momentarily powered up for an interrupt
 handler containing RCU read-side critical sections.
 With these changes in place, Steve's system booted reliably,
 but Paul continued inspecting the code periodically on the assumption
 that we could not possibly have gotten the code right on the first try.

 Paul reviewed the code repeatedly from October 2007 to February 2008,
 and almost always found at least one bug.
 In one case, Paul even coded and tested a fix before realizing that the
 bug was illusory, and in fact in all cases, the ``bug'' turned out to be
 illusory.

 Near the end of February, Paul grew tired of this game.
 He therefore decided to enlist the aid of
 Promela and Spin.
 The following presents a series of seven increasingly realistic
 Promela models, the last of which passes, consuming about
 40\,GB of main memory for the state space.

 More important, Promela and Spin did find a very subtle bug for me!

 \QuickQuiz{
 	Yeah, that's just great!
 	Now, just what am I supposed to do if I don't happen to have a
 	machine with 40\,GB of main memory???
 }\QuickQuizAnswer{
 	Relax, there are a number of lawful answers to
 	this question:
 	\begin{enumerate}
 	\item	Try compiler flags \co{-DCOLLAPSE} and \co{-DMA=N}
 		to reduce memory consumption.
 		See \cref{sec:formal:Running the QRCU Example}.
 	\item	Further optimize the model, reducing its memory consumption.
 	\item	Work out a pencil-and-paper proof, perhaps starting with the
 		comments in the code in the Linux kernel.
 	\item	Devise careful torture tests, which, though they cannot prove
 		the code correct, can find hidden bugs.
 	\item	There is some movement towards tools that do model
 		checking on clusters of smaller machines.
 		However, please note that we have not actually used such
 		tools myself, courtesy of some large machines that Paul has
 		occasional access to.
 	\item	Wait for memory sizes of affordable systems to expand
 		to fit your problem.
 	\item	Use one of a number of cloud-computing services to rent
 		a large system for a short time period.
 	\end{enumerate}
 }\QuickQuizEnd

 Still better would be to come up with a simpler and faster algorithm
 that has a smaller state space.
 Even better would be an algorithm so simple that its correctness was
 obvious to the casual observer!

 \Crefrange{sec:formal:Introduction to Preemptible RCU and dynticks}
 {sec:formal:Grace-Period Interface}
 give an overview of preemptible RCU's dynticks interface,
 followed by
 \cref{sec:formal:Validating Preemptible RCU and dynticks}'s
 discussion of the validation of the interface.

 \subsubsection{Introduction to Preemptible RCU and dynticks}
 \label{sec:formal:Introduction to Preemptible RCU and dynticks}

 The per-CPU \co{dynticks_progress_counter} variable is
 central to the interface between dynticks and preemptible RCU\@.
 This variable has an even value whenever the corresponding CPU
 is in dynticks-idle mode, and an odd value otherwise.
 A CPU exits dynticks-idle mode for the following three reasons:

 \begin{enumerate}
 \item	To start running a task,
 \item	When entering the outermost of a possibly nested set of interrupt
 	handlers, and
 \item	When entering an \IXacr{nmi} handler.
 \end{enumerate}

 Preemptible RCU's grace-period machinery samples the value of
 the \co{dynticks_progress_counter} variable in order to
 determine when a dynticks-idle CPU may safely be ignored.

 The following three sections give an overview of the task
 interface, the interrupt/NMI interface, and the use of
 the \co{dynticks_progress_counter} variable by the
 grace-period machinery as of Linux kernel v2.6.25-rc4.

 \subsubsection{Task Interface}
 \label{sec:formal:Task Interface}

 When a given CPU enters dynticks-idle mode because it has no more
 tasks to run, it invokes \co{rcu_enter_nohz()}:

 \begin{VerbatimN}
 static inline void rcu_enter_nohz(void)
 {
 	mb();
 	__get_cpu_var(dynticks_progress_counter)++;
 	WARN_ON(__get_cpu_var(dynticks_progress_counter) &
 	        0x1);
 }
 \end{VerbatimN}

 This function simply increments \co{dynticks_progress_counter} and
 checks that the result is even, but first executing a memory barrier
 to ensure that any other CPU that sees the new value of
 \co{dynticks_progress_counter} will also see the completion
 of any prior RCU read-side critical sections.

 Similarly, when a CPU that is in dynticks-idle mode prepares to
 start executing a newly runnable task, it invokes
 \co{rcu_exit_nohz()}:

 \begin{VerbatimN}
 static inline void rcu_exit_nohz(void)
 {
 	__get_cpu_var(dynticks_progress_counter)++;
 	mb();
 	WARN_ON(!(__get_cpu_var(dynticks_progress_counter) &
 	          0x1));
 }
 \end{VerbatimN}

 This function again increments \co{dynticks_progress_counter},
 but follows it with a memory barrier to ensure that if any other CPU
 sees the result of any subsequent RCU read-side critical section,
 then that other CPU will also see the incremented value of
 \co{dynticks_progress_counter}.
 Finally, \co{rcu_exit_nohz()} checks that the result of the
 increment is an odd value.

 The \co{rcu_enter_nohz()} and \co{rcu_exit_nohz()}
 functions handle the case where a CPU enters and exits dynticks-idle
 mode due to task execution, but does not handle interrupts, which are
 covered in the following section.

 \subsubsection{Interrupt Interface}
 \label{sec:formal:Interrupt Interface}

 The \co{rcu_irq_enter()} and \co{rcu_irq_exit()}
 functions handle interrupt/NMI entry and exit, respectively.
 Of course, nested interrupts must also be properly accounted for.
 The possibility of nested interrupts is handled by a second per-CPU
 variable, \co{rcu_update_flag}, which is incremented upon
 entry to an interrupt or NMI handler (in \co{rcu_irq_enter()})
 and is decremented upon exit (in \co{rcu_irq_exit()}).
 In addition, the pre-existing \co{in_interrupt()} primitive is
 used to distinguish between an outermost or a nested interrupt/NMI\@.

 Interrupt entry is handled by the \co{rcu_irq_enter()}
 shown below:

 \begin{fcvlabel}[ln:formal:dyntickrcu:rcu_irq_enter]
 \begin{VerbatimN}[commandchars=\\\[\]]
 void rcu_irq_enter(void)
 {
 	int cpu = smp_processor_id();	\lnlbl[fetch]

 	if (per_cpu(rcu_update_flag, cpu))	\lnlbl[inc:b]
 		per_cpu(rcu_update_flag, cpu)++; \lnlbl[inc:e]
 	if (!in_interrupt() &&			\lnlbl[chk_lv:b]
 	    (per_cpu(dynticks_progress_counter,
 	             cpu) & 0x1) == 0) {	\lnlbl[chk_lv:e]
 		per_cpu(dynticks_progress_counter, cpu)++; \lnlbl[inc_cnt]
 		smp_mb();			\lnlbl[mb]
 		per_cpu(rcu_update_flag, cpu)++;\lnlbl[inc_flg]
 	}
 }
 \end{VerbatimN}
 \end{fcvlabel}

 \begin{fcvref}[ln:formal:dyntickrcu:rcu_irq_enter]
 \Clnref{fetch} fetches the current CPU's number, while \clnref{inc:b,inc:e}
 increment the \co{rcu_update_flag} nesting counter if it
 is already non-zero.
 \Clnrefrange{chk_lv:b}{chk_lv:e} check to see whether we are
 the outermost level of
 interrupt, and, if so, whether \co{dynticks_progress_counter}
 needs to be incremented.
 If so, \clnref{inc_cnt} increments \co{dynticks_progress_counter},
 \clnref{mb} executes a memory barrier, and \clnref{inc_flg} increments
 \co{rcu_update_flag}.
 As with \co{rcu_exit_nohz()}, the memory barrier ensures that
 any other CPU that sees the effects of an RCU read-side critical section
 in the interrupt handler (following the \co{rcu_irq_enter()}
 invocation) will also see the increment of
 \co{dynticks_progress_counter}.
 \end{fcvref}

 \QuickQuizSeries{%
 \QuickQuizB{
 	Why not simply increment \co{rcu_update_flag}, and then only
 	increment \co{dynticks_progress_counter} if the old value
 	of \co{rcu_update_flag} was zero???
 }\QuickQuizAnswerB{
 	This fails in presence of NMIs.
 	To see this, suppose an NMI was received just after
 	\co{rcu_irq_enter()} incremented \co{rcu_update_flag},
 	but before it incremented \co{dynticks_progress_counter}.
 	The instance of \co{rcu_irq_enter()} invoked by the NMI
 	would see that the original value of \co{rcu_update_flag}
 	was non-zero, and would therefore refrain from incrementing
 	\co{dynticks_progress_counter}.
 	This would leave the RCU grace-period machinery no clue that the
 	NMI handler was executing on this CPU, so that any RCU read-side
 	critical sections in the NMI handler would lose their RCU protection.

 	The possibility of NMI handlers, which, by definition cannot
 	be masked, does complicate this code.
 }\QuickQuizEndB
 %
 \QuickQuizE{
 	\begin{fcvref}[ln:formal:dyntickrcu:rcu_irq_enter]
 	But if \clnref{chk_lv:b} finds that we are the outermost interrupt,
 	wouldn't we \emph{always} need to increment
 	\co{dynticks_progress_counter}?
 	\end{fcvref}
 }\QuickQuizAnswerE{
 	Not if we interrupted a running task!
 	In that case, \co{dynticks_progress_counter} would
 	have already been incremented by \co{rcu_exit_nohz()},
 	and there would be no need to increment it again.
 }\QuickQuizEndE
 }

 Interrupt exit is handled similarly by
 \co{rcu_irq_exit()}:

 \begin{fcvlabel}[ln:formal:dyntickrcu:rcu_irq_exit]
 \begin{VerbatimN}[commandchars=\\\[\]]
 void rcu_irq_exit(void)
 {
 	int cpu = smp_processor_id();	\lnlbl[fetch]

 	if (per_cpu(rcu_update_flag, cpu)) { \lnlbl[chk_flg]
 		if (--per_cpu(rcu_update_flag, cpu)) \lnlbl[dec_flg]
 			return;
 		WARN_ON(in_interrupt());	\lnlbl[verify]
 		smp_mb();			\lnlbl[mb]
 		per_cpu(dynticks_progress_counter, cpu)++; \lnlbl[inc_cnt]
 		WARN_ON(per_cpu(dynticks_progress_counter, \lnlbl[vrf_even:b]
 		                cpu) & 0x1); \lnlbl[vrf_even:e]
 	}
 }
 \end{VerbatimN}
 \end{fcvlabel}

 \begin{fcvref}[ln:formal:dyntickrcu:rcu_irq_exit]
 \Clnref{fetch} fetches the current CPU's number, as before.
 \Clnref{chk_flg} checks to see if the \co{rcu_update_flag} is
 non-zero, returning immediately (via falling off the end of the
 function) if not.
 Otherwise, \clnrefthro{dec_flg}{vrf_even:e} come into play.
 \Clnref{dec_flg} decrements \co{rcu_update_flag}, returning
 if the result is not zero.
 \Clnref{verify} verifies that we are indeed leaving the outermost
 level of nested interrupts, \clnref{mb} executes a memory barrier,
 \clnref{inc_cnt} increments \co{dynticks_progress_counter},
 and \clnref{vrf_even:b,vrf_even:e} verify that this
 variable is now even.
 As with \co{rcu_enter_nohz()}, the memory barrier ensures that
 any other CPU that sees the increment of
 \co{dynticks_progress_counter}
 will also see the effects of an RCU read-side critical section
 in the interrupt handler (preceding the \co{rcu_irq_exit()}
 invocation).
 \end{fcvref}

 These two sections have described how the
 \co{dynticks_progress_counter} variable is maintained during
 entry to and exit from dynticks-idle mode, both by tasks and by
 interrupts and NMIs.
 The following section describes how this variable is used by
 preemptible RCU's grace-period machinery.

 \subsubsection{Grace-Period Interface}
 \label{sec:formal:Grace-Period Interface}

 Of the four preemptible RCU grace-period states shown in
 \cref{fig:formal:Preemptible RCU State Machine},
 only the \co{rcu_try_flip_waitack_state}
 and \co{rcu_try_flip_waitmb_state} states need to wait
 for other CPUs to respond.

 \begin{figure}
 \centering
 \resizebox{2.5in}{!}{\includegraphics{formal/RCUpreemptStates}}
 \caption{Preemptible RCU State Machine}
 \label{fig:formal:Preemptible RCU State Machine}
 \end{figure}

 Of course, if a given CPU is in dynticks-idle state, we shouldn't
 wait for it.
 Therefore, just before entering one of these two states,
 the preceding state takes a snapshot of each CPU's
 \co{dynticks_progress_counter} variable, placing the
 snapshot in another per-CPU variable,
 \co{rcu_dyntick_snapshot}.
 This is accomplished by invoking
 \co{dyntick_save_progress_counter()}, shown below:

 \begin{VerbatimN}
 static void dyntick_save_progress_counter(int cpu)
 {
 	per_cpu(rcu_dyntick_snapshot, cpu) =
 		per_cpu(dynticks_progress_counter, cpu);
 }
 \end{VerbatimN}

 The \co{rcu_try_flip_waitack_state} state invokes
 \co{rcu_try_flip_waitack_needed()}, shown below:

 \begin{fcvlabel}[ln:formal:dyntickrcu:rcu_try_flip_waitack_needed]
 \begin{VerbatimN}[samepage=true,commandchars=\\\[\]]
 static inline int
 rcu_try_flip_waitack_needed(int cpu)
 {
 	long curr;
 	long snap;

 	curr = per_cpu(dynticks_progress_counter, cpu); \lnlbl[curr]
 	snap = per_cpu(rcu_dyntick_snapshot, cpu); \lnlbl[snap]
 	smp_mb();				\lnlbl[mb]
 	if ((curr == snap) && ((curr & 0x1) == 0)) \lnlbl[chk_remain]
 		return 0;			\lnlbl[ret_0_r]
 	if ((curr - snap) > 2 || (snap & 0x1) == 0) \lnlbl[chk_idle]
 		return 0;			\lnlbl[ret_0_i]
 	return 1;				\lnlbl[ret_1]
 }
 \end{VerbatimN}
 \end{fcvlabel}

 \begin{fcvref}[ln:formal:dyntickrcu:rcu_try_flip_waitack_needed]
 \Clnref{curr,snap} pick up current and snapshot versions of
 \co{dynticks_progress_counter}, respectively.
 The memory barrier on \clnref{mb} ensures that the counter checks
 in the later \co{rcu_try_flip_waitzero_state} follow
 the fetches of these counters.
 \Clnref{chk_remain,ret_0_r} return zero
 (meaning no communication with the
 specified CPU is required) if that CPU has remained in dynticks-idle
 state since the time that the snapshot was taken.
 Similarly, \clnref{chk_idle,ret_0_i} return zero
 if that CPU was initially
 in dynticks-idle state or if it has completely passed through a
 dynticks-idle state.
 In both these cases, there is no way that the CPU could have retained
 the old value of the grace-period counter.
 If neither of these conditions hold, \clnref{ret_1} returns one, meaning
 that the CPU needs to explicitly respond.
 \end{fcvref}

 For its part, the \co{rcu_try_flip_waitmb_state} state
 invokes \co{rcu_try_flip_waitmb_needed()}, shown below:

 \begin{fcvlabel}[ln:formal:dyntickrcu:rcu_try_flip_waitmb_needed]
 \begin{VerbatimN}[commandchars=\\\[\]]
 static inline int
 rcu_try_flip_waitmb_needed(int cpu)
 {
 	long curr;
 	long snap;

 	curr = per_cpu(dynticks_progress_counter, cpu);
 	snap = per_cpu(rcu_dyntick_snapshot, cpu);
 	smp_mb();
 	if ((curr == snap) && ((curr & 0x1) == 0))
 		return 0;
 	if (curr != snap)		\lnlbl[chk_to_from]
 		return 0;		\lnlbl[ret_0]
 	return 1;
 }
 \end{VerbatimN}
 \end{fcvlabel}

 \begin{fcvref}[ln:formal:dyntickrcu:rcu_try_flip_waitmb_needed]
 This is quite similar to \co{rcu_try_flip_waitack_needed()},
 the difference being in \clnref{chk_to_from,ret_0}, because any transition
 either to or from dynticks-idle state executes the memory barrier
 needed by the \co{rcu_try_flip_waitmb_state} state.
 \end{fcvref}

 We now have seen all the code involved in the interface between
 RCU and the dynticks-idle state.
 The next section builds up the Promela model used to verify this
 code.

 \QuickQuiz{
 	Can you spot any bugs in any of the code in this section?
 }\QuickQuizAnswer{
 	Read the next section to see if you were correct.
 }\QuickQuizEnd

 \subsection{Validating Preemptible RCU and dynticks}
 \label{sec:formal:Validating Preemptible RCU and dynticks}

 This section develops a Promela model for the interface between
 dynticks and RCU step by step, with each of
 \crefrange{sec:formal:Basic Model}{sec:formal:Validating NMI Handlers}
 illustrating one step, starting with the process-level code,
 adding assertions, interrupts, and finally NMIs.

 \Cref{sec:formal:Lessons (Re)Learned} lists
 lessons (re)learned during this effort, and
 \crefrange{sec:formal:Simplicity Avoids Formal Verification}{sec:formal:Discussion}
 present a simpler solution to RCU's dynticks problem.

 \subsubsection{Basic Model}
 \label{sec:formal:Basic Model}

 This section translates the process-level dynticks entry/exit
 code and the grace-period processing into
 Promela~\cite{Holzmann03a}.
 We start with \co{rcu_exit_nohz()} and
 \co{rcu_enter_nohz()}
 from the 2.6.25-rc4 kernel, placing these in a single Promela
 process that models exiting and entering dynticks-idle mode in
 a loop as follows:

 \input{CodeSamples/formal/promela/dyntick/dyntickRCU-base@dyntick_nohz.fcv}

 \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-base:dyntick_nohz]
 \Clnref{do,od} define a loop.
 \Clnref{break} exits the loop once the loop counter~\co{i}
 has exceeded the limit \co{MAX_DYNTICK_LOOP_NOHZ}.
 \Clnref{kick_loop} tells the loop construct to execute
 \clnrefrange{ex_inc:b}{inc_i}
 for each pass through the loop.
 Because the conditionals on \clnref{break,kick_loop}
 are exclusive of
 each other, the normal Promela random selection of true conditions
 is disabled.
 \Clnref{ex_inc:b,ex_inc:e} model
 \co{rcu_exit_nohz()}'s non-atomic
 increment of \co{dynticks_progress_counter}, while
 \clnref{ex_assert} models the \co{WARN_ON()}.
 The \co{atomic} construct simply reduces the Promela state space,
 given that the \co{WARN_ON()} is not strictly speaking part
 of the algorithm.
 \Clnrefrange{ent_inc:b}{ent_inc:e} similarly model the increment and
 \co{WARN_ON()} for \co{rcu_enter_nohz()}.
 Finally, \clnref{inc_i} increments the loop counter.
 \end{fcvref}

 Each pass through the loop therefore models a CPU exiting
 dynticks-idle mode (for example, starting to execute a task), then
 re-entering dynticks-idle mode (for example, that same task blocking).

 \QuickQuizSeries{%
 \QuickQuizB{
 	Why isn't the memory barrier in \co{rcu_exit_nohz()}
 	and \co{rcu_enter_nohz()} modeled in Promela?
 }\QuickQuizAnswerB{
 	Promela assumes sequential consistency, so
 	it is not necessary to model memory barriers.
 	In fact, one must instead explicitly model lack of memory barriers,
 	for example, as shown in
 	\cref{lst:formal:QRCU Unordered Summation} on
 	\cpageref{lst:formal:QRCU Unordered Summation}.
 }\QuickQuizEndB
 %
 \QuickQuizE{
 	Isn't it a bit strange to model \co{rcu_exit_nohz()}
 	followed by \co{rcu_enter_nohz()}?
 	Wouldn't it be more natural to instead model entry before exit?
 }\QuickQuizAnswerE{
 	It probably would be more natural, but we will need
 	this particular order for the liveness checks that we will add later.
 }\QuickQuizEndE
 }

 The next step is to model the interface to RCU's grace-period
 processing.
 For this, we need to model
 \co{dyntick_save_progress_counter()},
 \co{rcu_try_flip_waitack_needed()},
 \co{rcu_try_flip_waitmb_needed()},
 as well as portions of
 \co{rcu_try_flip_waitack()} and
 \co{rcu_try_flip_waitmb()}, all from the 2.6.25-rc4 kernel.
 The following \co{grace_period()} Promela process models
 these functions as they would be invoked during a single pass
 through preemptible RCU's grace-period processing.

 \input{CodeSamples/formal/promela/dyntick/dyntickRCU-base@grace_period.fcv}

 \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-base:grace_period]
 \Clnrefrange{print:b}{print:e} print out the loop limit
 (but only into the ``.trail'' file
 in case of error) and models a line of code
 from \co{rcu_try_flip_idle()} and its call to
 \co{dyntick_save_progress_counter()}, which takes a
 snapshot of the current CPU's \co{dynticks_progress_counter}
 variable.
 These two lines are executed atomically to reduce state space.

 \Clnrefrange{do1}{od1} model the relevant code in
 \co{rcu_try_flip_waitack()} and its call to
 \co{rcu_try_flip_waitack_needed()}.
 This loop is modeling the grace-period state machine waiting for
 a counter-flip acknowledgement from each CPU, but only that part
 that interacts with dynticks-idle CPUs.

 \Clnref{snap} models a line from \co{rcu_try_flip_waitzero()}
 and its call to \co{dyntick_save_progress_counter()}, again
 taking a snapshot of the CPU's \co{dynticks_progress_counter}
 variable.

 Finally, \clnrefrange{do2}{od2} model the relevant code in
 \co{rcu_try_flip_waitack()} and its call to
 \co{rcu_try_flip_waitack_needed()}.
 This loop is modeling the grace-period state-machine waiting for
 each CPU to execute a memory barrier, but again only that part
 that interacts with dynticks-idle CPUs.
 \end{fcvref}

 \QuickQuiz{
 	Wait a minute!
 	In the Linux kernel, both \co{dynticks_progress_counter} and
 	\co{rcu_dyntick_snapshot} are per-CPU variables.
 	So why are they instead being modeled as single global variables?
 }\QuickQuizAnswer{
 	Because the grace-period code processes each
 	CPU's \co{dynticks_progress_counter} and
 	\co{rcu_dyntick_snapshot} variables separately,
 	we can collapse the state onto a single CPU\@.
 	If the grace-period code were instead to do something special
 	given specific values on specific CPUs, then we would indeed need
 	to model multiple CPUs.
 	But fortunately, we can safely confine ourselves to two CPUs, the
 	one running the grace-period processing and the one entering and
 	leaving dynticks-idle mode.
 }\QuickQuizEnd

 The resulting model (\path{dyntickRCU-base.spin}),
 when run with the
 \path{runspin.sh} script,
 generates 691 states and
 passes without errors, which is not at all surprising given that
 it completely lacks the assertions that could find failures.
 The next section therefore adds safety assertions.

 \subsubsection{Validating Safety}
 \label{sec:formal:Validating Safety}

 A safe RCU implementation must never permit a grace period to
 complete before the completion of any RCU readers that started
 before the start of the grace period.
 This is modeled by a \co{grace_period_state} variable that
 can take on three states as follows:

 \input{CodeSamples/formal/promela/dyntick/dyntickRCU-base-s@grace_period_state.fcv}

 The \co{grace_period()} process sets this variable as it
 progresses through the grace-period phases, as shown below:

 \input{CodeSamples/formal/promela/dyntick/dyntickRCU-base-s@grace_period.fcv}

 \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-base-s:grace_period]
 \Clnref{upd_gps1,upd_gps2,upd_gps3,upd_gps4,upd_gps5,upd_gps6}
 update this variable (combining
 atomically with algorithmic operations where feasible) to
 allow the \co{dyntick_nohz()} process to verify the basic
 RCU safety property.
 The form of this verification is to assert that the value of the
 \co{grace_period_state} variable cannot jump from
 \co{GP_IDLE} to \co{GP_DONE} during a time period
 over which RCU readers could plausibly persist.
 \end{fcvref}

 \QuickQuiz{
 	\begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-base-s:grace_period]
 	Given there are a pair of back-to-back changes to
 	\co{grace_period_state} on \clnref{upd_gps3,upd_gps4},
 	how can we be sure that \clnref{upd_gps3}'s changes won't be lost?
 	\end{fcvref}
 }\QuickQuizAnswer{
 	Recall that Promela and Spin trace out
 	every possible sequence of state changes.
 	Therefore, timing is irrelevant:
 	Promela/Spin will be quite happy to jam the entire rest of
 	the model between those two statements unless some state
 	variable specifically prohibits doing so.
 }\QuickQuizEnd

 The \co{dyntick_nohz()} Promela process implements
 this verification as shown below:

 \input{CodeSamples/formal/promela/dyntick/dyntickRCU-base-s@dyntick_nohz.fcv}

 \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-base-s:dyntick_nohz]
 \Clnref{new_flg} sets a new \co{old_gp_idle} flag if the
 value of the \co{grace_period_state} variable is
 \co{GP_IDLE} at the beginning of task execution,
 and the assertion at \clnref{assert:b,assert:e}
 fire if the \co{grace_period_state}
 variable has advanced to \co{GP_DONE} during task execution,
 which would be illegal given that a single RCU read-side critical
 section could span the entire intervening time period.
 \end{fcvref}

 The resulting
 model (\path{dyntickRCU-base-s.spin}),
 when run with the \path{runspin.sh} script,
 generates 964 states and passes without errors, which is reassuring.
 That said, although safety is critically important, it is also quite
 important to avoid indefinitely stalling grace periods.
 The next section therefore covers verifying liveness.

 \subsubsection{Validating Liveness}
 \label{sec:formal:Validating Liveness}

 \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-base-sl-busted:dyntick_nohz]
 Although liveness can be difficult to prove, there is a simple
 trick that applies here.
 The first step is to make \co{dyntick_nohz()} indicate that
 it is done via a \co{dyntick_nohz_done} variable, as shown on
 \clnref{done} of the following:
 \end{fcvref}

 \input{CodeSamples/formal/promela/dyntick/dyntickRCU-base-sl-busted@dyntick_nohz.fcv}

 With this variable in place, we can add assertions to
 \co{grace_period()} to check for unnecessary blockage
 as follows:

 \input{CodeSamples/formal/promela/dyntick/dyntickRCU-base-sl-busted@grace_period.fcv}

 \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-base-sl-busted:grace_period]
 We have added the \co{shouldexit} variable on \clnref{shex},
 which we initialize to zero on \clnref{init_shex}.
 \Clnref{assert_shex} asserts that \co{shouldexit} is not set, while
 \clnref{set_shex} sets \co{shouldexit} to the \co{dyntick_nohz_done}
 variable maintained by \co{dyntick_nohz()}.
 This assertion will therefore trigger if we attempt to take more than
 one pass through the wait-for-counter-flip-acknowledgement
 loop after \co{dyntick_nohz()} has completed
 execution.
 After all, if \co{dyntick_nohz()} is done, then there cannot be
 any more state changes to force us out of the loop, so going through twice
 in this state means an infinite loop, which in turn means no end to the
 grace period.

 \Clnref{init_shex2,assert_shex2,set_shex2} operate in a similar manner
 for the second (memory-barrier) loop.

 However, running this
 model (\path{dyntickRCU-base-sl-busted.spin})
 results in failure, as \clnref{chk_2} is checking that
 the wrong variable
 is even.
 Upon failure, \co{spin} writes out a
 ``trail'' file
 (\path{dyntickRCU-base-sl-busted.spin.trail}),
 which records the sequence of states that lead to the failure.
 Use the ``\co{spin -t -p -g -l\ }%
 \path{dyntickRCU-base-sl-busted.spin}''
 command to cause \co{spin} to retrace this sequence of states,
 printing the statements executed and the values of variables
 (\path{dyntickRCU-base-sl-busted.spin.trail.txt}).
 Note that the line numbers do not match the listing above due to
 the fact that \co{spin} takes both functions in a single file.
 However, the line numbers \emph{do} match the full
 model (\path{dyntickRCU-base-sl-busted.spin}).

 We see that the \co{dyntick_nohz()} process completed
 at step 34 (search for ``34:''), but that the
 \co{grace_period()} process nonetheless failed to exit the loop.
 The value of \co{curr} is \co{6} (see step 35)
 and that the value of \co{snap} is \co{5} (see step 17).
 Therefore the first condition on \clnref{chk_1} above
 does not hold because
 \qco{curr != snap}, and the second condition on \clnref{chk_2}
 does not hold either because \co{snap} is odd and because
 \co{curr} is only one greater than \co{snap}.
 \end{fcvref}

 So one of these two conditions has to be incorrect.
 Referring to the comment block in \co{rcu_try_flip_waitack_needed()}
 for the first condition:

 \begin{quote}
 	If the CPU remained in dynticks mode for the entire time
 	and didn't take any interrupts, NMIs, SMIs, or whatever,
 	then it cannot be in the middle of an \co{rcu_read_lock()}, so
 	the next \co{rcu_read_lock()} it executes must use the new value
 	of the counter.
 	So we can safely pretend that this CPU already acknowledged
 	the counter.
 \end{quote}

 The first condition does match this, because if \qco{curr == snap}
 and if \co{curr} is even, then the corresponding CPU has been
 in dynticks-idle mode the entire time, as required.
 So let's look at the comment block for the second condition:

 \begin{quote}
 	If the CPU passed through or entered a dynticks idle phase with
 	no active irq handlers, then, as above, we can safely pretend
 	that this CPU already acknowledged the counter.
 \end{quote}

 The first part of the condition is correct, because if \co{curr}
 and \co{snap} differ by two, there will be at least one even
 number in between, corresponding to having passed completely through
 a dynticks-idle phase.
 However, the second part of the condition corresponds to having
 \emph{started} in dynticks-idle mode, not having \emph{finished}
 in this mode.
 We therefore need to be testing \co{curr} rather than
 \co{snap} for being an even number.

 The corrected C code is as follows:

 \begin{fcvlabel}[ln:formal:dyntickrcu:rcu_try_flip_waitack_needed_fixed]
 \begin{VerbatimN}[commandchars=\\\[\]]
 static inline int
 rcu_try_flip_waitack_needed(int cpu)
 {
 	long curr;
 	long snap;

 	curr = per_cpu(dynticks_progress_counter, cpu);
 	snap = per_cpu(rcu_dyntick_snapshot, cpu);
 	smp_mb();
 	if ((curr == snap) && ((curr & 0x1) == 0)) \lnlbl[if:b]
 		return 0;
 	if ((curr - snap) > 2 || (curr & 0x1) == 0)
 		return 0;			\lnlbl[if:e]
 	return 1;
 }
 \end{VerbatimN}
 \end{fcvlabel}

 \begin{fcvref}[ln:formal:dyntickrcu:rcu_try_flip_waitack_needed_fixed]
 \Clnrefrange{if:b}{if:e} can now be combined and simplified,
 resulting in the following.
 A similar simplification can be applied to
 \co{rcu_try_flip_waitmb_needed()}.
 \end{fcvref}

 \begin{VerbatimN}[commandchars=\\\[\]]
 static inline int
 rcu_try_flip_waitack_needed(int cpu)
 {
 	long curr;
 	long snap;

 	curr = per_cpu(dynticks_progress_counter, cpu);
 	snap = per_cpu(rcu_dyntick_snapshot, cpu);
 	smp_mb();
 	if ((curr - snap) >= 2 || (curr & 0x1) == 0)
 		return 0;
 	return 1;
 }
 \end{VerbatimN}

 Making the corresponding correction in the
 model (\path{dyntickRCU-base-sl.spin})
 results in a correct verification with 661 states that passes without
 errors.
 However, it is worth noting that the first version of the liveness
 verification failed to catch this bug, due to a bug in the liveness
 verification itself.
 This liveness-verification bug was located by inserting an infinite
 loop in the \co{grace_period()} process, and noting that
 the liveness-verification code failed to detect this problem!

 We have now successfully verified both safety and liveness
 conditions, but only for processes running and blocking.
 We also need to handle interrupts, a task taken up in the next section.

 \subsubsection{Interrupts}
 \label{sec:formal:Interrupts}

 There are a couple of ways to model interrupts in Promela:
 \begin{enumerate}
 \item	Using C-preprocessor tricks to insert the interrupt handler
 	between each and every statement of the \co{dynticks_nohz()}
 	process, or
 \item	Modeling the interrupt handler with a separate process.
 \end{enumerate}

 A bit of thought indicated that the second approach would have a
 smaller state space, though it requires that the interrupt handler
 somehow run atomically with respect to the \co{dynticks_nohz()}
 process, but not with respect to the \co{grace_period()}
 process.

 Fortunately, it turns out that Promela permits you to branch
 out of atomic statements.
 This trick allows us to have the interrupt handler set a flag, and
 recode \co{dynticks_nohz()} to atomically check this flag
 and execute only when the flag is not set.
 This can be accomplished with a C-preprocessor macro that takes
 a label and a Promela statement as follows:

 \input{CodeSamples/formal/promela/dyntick/dyntickRCU-irqnn-ssl@EXECUTE_MAINLINE.fcv}

 One might use this macro as follows:

 \begin{VerbatimU}
 EXECUTE_MAINLINE(stmt1,
                  tmp = dynticks_progress_counter)
 \end{VerbatimU}

 \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-irqnn-ssl:EXECUTE_MAINLINE]
 \Clnref{label} of the macro creates the specified statement label.
 \Clnrefrange{atm:b}{atm:e} are an atomic block that tests
 the \co{in_dyntick_irq}
 variable, and if this variable is set (indicating that the interrupt
 handler is active), branches out of the atomic block back to the
 label.
 Otherwise, \clnref{else} executes the specified statement.
 The overall effect is that mainline execution stalls any time an interrupt
 is active, as required.
 \end{fcvref}

 \subsubsection{Validating Interrupt Handlers}
 \label{sec:formal:Validating Interrupt Handlers}

 The first step is to convert \co{dyntick_nohz()} to
 \co{EXECUTE_MAINLINE()} form, as follows:

 \input{CodeSamples/formal/promela/dyntick/dyntickRCU-irqnn-ssl@dyntick_nohz.fcv}

 \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-irqnn-ssl:dyntick_nohz]
 It is important to note that when a group of statements is passed
 to \co{EXECUTE_MAINLINE()}, as in \clnrefrange{stmt2:b}{stmt2:e}, all
 statements in that group execute atomically.
 \end{fcvref}

 \QuickQuizSeries{%
 \QuickQuizB{
 	But what would you do if you needed the statements in a single
 	\co{EXECUTE_MAINLINE()} group to execute non-atomically?
 }\QuickQuizAnswerB{
 	The easiest thing to do would be to put
 	each such statement in its own \co{EXECUTE_MAINLINE()}
 	statement.
 }\QuickQuizEndB
 %
 \QuickQuizE{
 	But what if the \co{dynticks_nohz()} process had
 	\qco{if} or \qco{do} statements with conditions,
 	where the statement bodies of these constructs
 	needed to execute non-atomically?
 }\QuickQuizAnswerE{
 	One approach, as we will see in a later section,
 	is to use explicit labels and \qco{goto} statements.
 	For example, the construct:

 \begin{VerbatimU}
 	if
 	:: i == 0 -> a = -1;
 	:: else -> a = -2;
 	fi;
 \end{VerbatimU}
 %
 	could be modeled as something like:

 \begin{VerbatimU}
 	EXECUTE_MAINLINE(stmt1,
 	                 if
 	                 :: i == 0 -> goto stmt1_then;
 	                 :: else -> goto stmt1_else;
 	                 fi)
 	stmt1_then: skip;
 	EXECUTE_MAINLINE(stmt1_then1, a = -1; goto stmt1_end)
 	stmt1_else: skip;
 	EXECUTE_MAINLINE(stmt1_then1, a = -2)
 	stmt1_end: skip;
 \end{VerbatimU}

 	However, it is not clear that the macro is helping much in the case
 	of the \qco{if} statement, so these sorts of situations will
 	be open-coded in the following sections.
 }\QuickQuizEndE
 }

 The next step is to write a \co{dyntick_irq()} process
 to model an interrupt handler:

 \input{CodeSamples/formal/promela/dyntick/dyntickRCU-irqnn-ssl@dyntick_irq.fcv}

 \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-irqnn-ssl:dyntick_irq]
 The loop from \clnrefrange{do}{od} models up to \co{MAX_DYNTICK_LOOP_IRQ}
 interrupts, with \clnref{cond1,cond2} forming the loop condition and
 \clnref{inc_i} incrementing the control variable.
 \Clnref{in_irq} tells \co{dyntick_nohz()} that an interrupt handler
 is running, and \clnref{clr_in_irq} tells \co{dyntick_nohz()} that this
 handler has completed.
 \Clnref{irq_done} is used for liveness verification, just like the corresponding
 line of \co{dyntick_nohz()}.
 \end{fcvref}

 \QuickQuiz{
 	\begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-irqnn-ssl:dyntick_irq]
 	Why are \clnref{clr_in_irq,inc_i} (the \qtco{in_dyntick_irq = 0;}
 	and the \qco{i++;}) executed atomically?
 	\end{fcvref}
 }\QuickQuizAnswer{
 	These lines of code pertain to controlling the
 	model, not to the code being modeled, so there is no reason to
 	model them non-atomically.
 	The motivation for modeling them atomically is to reduce the size
 	of the state space.
 }\QuickQuizEnd

 \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-irqnn-ssl:dyntick_irq]
 \Clnrefrange{enter:b}{enter:e} model \co{rcu_irq_enter()}, and
 \clnref{add_prmt_cnt:b,add_prmt_cnt:e} model the relevant snippet
 of \co{__irq_enter()}.
 \Clnrefrange{vrf_safe:b}{vrf_safe:e} verify safety in much the same manner
 as do the corresponding lines of \co{dynticks_nohz()}.
 \Clnref{irq_exit:b,irq_exit:e} model the relevant snippet of \co{__irq_exit()},
 and finally \clnrefrange{exit:b}{exit:e} model \co{rcu_irq_exit()}.
 \end{fcvref}

 \QuickQuiz{
 	What property of interrupts is this \co{dynticks_irq()}
 	process unable to model?
 }\QuickQuizAnswer{
 	One such property is nested interrupts,
 	which are handled in the following section.
 }\QuickQuizEnd

 The \co{grace_period()} process then becomes as follows:

 \input{CodeSamples/formal/promela/dyntick/dyntickRCU-irqnn-ssl@grace_period.fcv}

 \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-irqnn-ssl:grace_period]
 The implementation of \co{grace_period()} is very similar
 to the earlier one.
 The only changes are the addition of \clnref{MDLI} to add the new
 interrupt-count parameter, changes to
 \clnref{edit1,edit3} to add the new \co{dyntick_irq_done} variable
 to the liveness checks, and of course the optimizations on \clnref{edit2,edit4}.
 \end{fcvref}

 This model (\path{dyntickRCU-irqnn-ssl.spin})
 results in a correct verification with roughly half a million
 states, passing without errors.
 However, this version of the model does not handle nested
 interrupts.
 This topic is taken up in the next section.

 \subsubsection{Validating Nested Interrupt Handlers}
 \label{sec:formal:Validating Nested Interrupt Handlers}

 Nested interrupt handlers may be modeled by splitting the body of
 the loop in \co{dyntick_irq()} as follows:

 \input{CodeSamples/formal/promela/dyntick/dyntickRCU-irq-ssl@dyntick_irq.fcv}

 \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-irq-ssl:dyntick_irq]
 This is similar to the earlier \co{dynticks_irq()} process.
 It adds a second counter variable~\co{j} on \clnref{j}, so that
 \co{i}~counts entries to interrupt handlers and \co{j}~counts
 exits.
 The \co{outermost} variable on \clnref{om} helps determine
 when the \co{grace_period_state} variable needs to be sampled
 for the safety checks.
 The loop-exit check on \clnref{chk_ex:b,chk_ex:e} is updated to require that the
 specified number of interrupt handlers are exited as well as entered,
 and the increment of~\co{i} is moved to \clnref{inc_i}, which is
 the end of the interrupt-entry model.
 \Clnrefrange{atm1:b}{atm1:e} set the \co{outermost} variable to indicate
 whether this is the outermost of a set of nested interrupts and to
 set the \co{in_dyntick_irq} variable that is used by the
 \co{dyntick_nohz()} process.
 \Clnrefrange{atm2:b}{atm2:e} capture the state of the \co{grace_period_state}
 variable, but only when in the outermost interrupt handler.

 \Clnref{cnd_ex} has the do-loop conditional for interrupt-exit modeling:
 As long as we have exited fewer interrupts than we have entered, it is
 legal to exit another interrupt.
 \Clnrefrange{atm3:b}{atm3:e}
 check the safety criterion, but only if we are exiting
 from the outermost interrupt level.
 Finally, \clnrefrange{atm4:b}{atm4:e} increment the interrupt-exit count~\co{j}
 and, if this is the outermost interrupt level, clears
 \co{in_dyntick_irq}.
 \end{fcvref}

 This model (\path{dyntickRCU-irq-ssl.spin})
 results in a correct verification with a bit more than half a million
 states, passing without errors.
 However, this version of the model does not handle NMIs,
 which are taken up in the next section.

 \subsubsection{Validating NMI Handlers}
 \label{sec:formal:Validating NMI Handlers}

 We take the same general approach for NMIs as we do for interrupts,
 keeping in mind that NMIs do not nest.
 This results in a \co{dyntick_nmi()} process as follows:

 \input{CodeSamples/formal/promela/dyntick/dyntickRCU-irq-nmi-ssl@dyntick_nmi.fcv}

 Of course, the fact that we have NMIs requires adjustments in
 the other components.
 For example, the \co{EXECUTE_MAINLINE()} macro now needs to
 pay attention to the NMI handler (\co{in_dyntick_nmi}) as well
 as the interrupt handler (\co{in_dyntick_irq}) by checking
 the \co{dyntick_nmi_done} variable as follows:

 \input{CodeSamples/formal/promela/dyntick/dyntickRCU-irq-nmi-ssl@EXECUTE_MAINLINE.fcv}

 We will also need to introduce an \co{EXECUTE_IRQ()}
 macro that checks \co{in_dyntick_nmi} in order to allow
 \co{dyntick_irq()} to exclude \co{dyntick_nmi()}:

 \input{CodeSamples/formal/promela/dyntick/dyntickRCU-irq-nmi-ssl@EXECUTE_IRQ.fcv}

 It is further necessary to convert \co{dyntick_irq()}
 to \co{EXECUTE_IRQ()} as follows:

 \input{CodeSamples/formal/promela/dyntick/dyntickRCU-irq-nmi-ssl@dyntick_irq.fcv}

 \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-irq-nmi-ssl:dyntick_irq]
 Note that we have open-coded the \qco{if} statements
 (for example, \clnrefrange{stmt1:b}{stmt1:e}).
 In addition, statements that process strictly local state
 (such as \clnref{inc_i}) need not exclude \co{dyntick_nmi()}.
 \end{fcvref}

 Finally, \co{grace_period()} requires only a few changes:

 \input{CodeSamples/formal/promela/dyntick/dyntickRCU-irq-nmi-ssl@grace_period.fcv}

 \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-irq-nmi-ssl:grace_period]
 We have added the \co{printf()} for the new
 \co{MAX_DYNTICK_LOOP_NMI} parameter on \clnref{MDL_NMI} and
 added \co{dyntick_nmi_done} to the \co{shouldexit}
 assignments on \clnref{nmi_done1,nmi_done2}.
 \end{fcvref}

 The model (\path{dyntickRCU-irq-nmi-ssl.spin})
 results in a correct verification with several hundred million
 states, passing without errors.

 \QuickQuiz{
 	Does Paul \emph{always} write his code in this painfully incremental
 	manner?
 }\QuickQuizAnswer{
 	Not always, but more and more frequently.
 	In this case, Paul started with the smallest slice of code that
 	included an interrupt handler, because he was not sure how best
 	to model interrupts in Promela.
 	Once he got that working, he added other features.
 	(But if he was doing it again, he would start with a ``toy'' handler.
 	For example, he might have the handler increment a variable twice and
 	have the mainline code verify that the value was always even.)

 	Why the incremental approach?
 	Consider the following, attributed to Brian W. Kernighan:

 	\begin{quote}
 		Debugging is twice as hard as writing the code in the first
 		place.
 		Therefore, if you write the code as cleverly as possible,
 		you are, by definition, not smart enough to debug it.
 	\end{quote}

 	This means that any attempt to optimize the production of code should
 	place at least 66\pct\ of its emphasis on optimizing the debugging process,
 	even at the expense of increasing the time and effort spent coding.
 	Incremental coding and testing is one way to optimize the debugging
 	process, at the expense of some increase in coding effort.
 	Paul uses this approach because he rarely has the luxury of
 	devoting full days (let alone weeks) to coding and debugging.
 }\QuickQuizEnd

 \subsubsection{Lessons (Re)Learned}
 \label{sec:formal:Lessons (Re)Learned}

 This effort provided some lessons (re)learned:

 \begin{enumerate}
 \item	{\bf Promela and Spin can verify interrupt\slash NMI\-/handler
 	interactions.}
 \item	{\bf Documenting code can help locate bugs.}
 	In this case, the documentation effort located
 	a misplaced memory barrier in
 	\co{rcu_enter_nohz()} and \co{rcu_exit_nohz()},
 	as shown by the following patch~\cite{PaulEMcKenney2008commit:ae66be9b71b1}.

 \begin{VerbatimU}
  static inline void rcu_enter_nohz(void)
  {
 +       mb();
         __get_cpu_var(dynticks_progress_counter)++;
 -       mb();
  }

  static inline void rcu_exit_nohz(void)
  {
 -       mb();
         __get_cpu_var(dynticks_progress_counter)++;
 +       mb();
  }
 \end{VerbatimU}

 \item	{\bf Validate your code early, often, and up to the point
 	of destruction.}
 	This effort located one subtle bug in
 	\co{rcu_try_flip_waitack_needed()}
 	that would have been quite difficult to test or debug, as
 	shown by the following patch~\cite{PaulEMcKenney2008commit:d7c0651390b6}.

 \begin{VerbatimU}
 -       if ((curr - snap) > 2 || (snap & 0x1) == 0)
 +       if ((curr - snap) > 2 || (curr & 0x1) == 0)
 \end{VerbatimU}

 \item	{\bf Always verify your verification code.}
 	The usual way to do this is to insert a deliberate bug
 	and verify that the verification code catches it.
 	Of course, if the verification code fails to catch this bug,
 	you may also need to verify the bug itself, and so on,
 	recursing infinitely.
 	However, if you find yourself in this position,
 	getting a good night's sleep
 	can be an extremely effective debugging technique.
 	You will then see that the obvious verify-the-verification
 	technique is to deliberately insert bugs in the code being
 	verified.
 	If the verification fails to find them, the verification clearly
 	is buggy.
 \item	{\bf Use of atomic instructions can simplify verification.}
 	Unfortunately, use of the \co{cmpxchg} atomic instruction
 	would also slow down the critical \IXacr{irq} fastpath, so they
 	are not appropriate in this case.
 \item	{\bf The need for complex formal verification often indicates
 	a need to re-think your design.}
 \end{enumerate}

 To this last point, it turns out that there is a much simpler solution to
 the dynticks problem, which is presented in the next section.

 \subsubsection{Simplicity Avoids Formal Verification}
 \label{sec:formal:Simplicity Avoids Formal Verification}

 The complexity of the dynticks interface for preemptible RCU is primarily
 due to the fact that both \IRQ s and NMIs use the same code path and the
 same state variables.
 This leads to the notion of providing separate code paths and variables
 for \IRQ s and NMIs, as has been done for
 hierarchical RCU~\cite{PaulEMcKenney2008HierarchicalRCU}
 as indirectly suggested by
 Manfred Spraul~\cite{ManfredSpraul2008StateMachineRCU}.
 This work was pulled into mainline kernel during the v2.6.29
 development cycle~\cite{PaulEMcKenney2008commit:64db4cfff99c}.

 \subsubsection{State Variables for Simplified Dynticks Interface}
 \label{sec:formal:State Variables for Simplified Dynticks Interface}

 \Cref{lst:formal:Variables for Simple Dynticks Interface}
 shows the new per-CPU state variables.
 These variables are grouped into structs to allow multiple independent
 RCU implementations (e.g., \co{rcu} and \co{rcu_bh}) to conveniently
 and efficiently share dynticks state.
 In what follows, they can be thought of as independent per-CPU variables.

 \begin{listing}
 \begin{VerbatimL}
 struct rcu_dynticks {
 	int dynticks_nesting;
 	int dynticks;
 	int dynticks_nmi;
 };

 struct rcu_data {
 	...
 	int dynticks_snap;
 	int dynticks_nmi_snap;
 	...
 };
 \end{VerbatimL}
 \caption{Variables for Simple Dynticks Interface}
 \label{lst:formal:Variables for Simple Dynticks Interface}
 \end{listing}

 The \co{dynticks_nesting}, \co{dynticks}, and \co{dynticks_snap} variables
 are for the \IRQ\ code paths, and the \co{dynticks_nmi} and
 \co{dynticks_nmi_snap} variables are for the NMI code paths, although
 the NMI code path will also reference (but not modify) the
 \co{dynticks_nesting} variable.
 These variables are used as follows:

 \begin{description}[style=nextline]
 \item	[\tco{dynticks_nesting}]
 	This counts the number of reasons that the corresponding
 	CPU should be monitored for RCU read-side critical sections.
 	If the CPU is in dynticks-idle mode, then this counts the
 	\IRQ\ nesting level, otherwise it is one greater than the
 	\IRQ\ nesting level.
 \item	[\tco{dynticks}]
 	This counter's value is even if the corresponding CPU is
 	in dynticks-idle mode and there are no \IRQ\ handlers currently
 	running on that CPU, otherwise the counter's value is odd.
 	In other words, if this counter's value is odd, then the
 	corresponding CPU might be in an RCU read-side critical section.
 \item	[\tco{dynticks_nmi}]
 	This counter's value is odd if the corresponding CPU is
 	in an NMI handler, but only if the NMI arrived while this
 	CPU was in dyntick-idle mode with no \IRQ\ handlers running.
 	Otherwise, the counter's value will be even.
 \item	[\tco{dynticks_snap}]
 	This will be a snapshot of the \co{dynticks} counter, but
 	only if the current RCU grace period has extended for too
 	long a duration.
 \item	[\tco{dynticks_nmi_snap}]
 	This will be a snapshot of the \co{dynticks_nmi} counter, but
 	again only if the current RCU grace period has extended for too
 	long a duration.
 \end{description}

 If both \co{dynticks} and \co{dynticks_nmi} have taken on an even
 value during a given time interval, then the corresponding CPU has
 passed through a \IX{quiescent state} during that interval.

 \QuickQuiz{
 	But what happens if an NMI handler starts running before
 	an \IRQ\ handler completes, and if that NMI handler continues
 	running until a second \IRQ\ handler starts?
 }\QuickQuizAnswer{
 	This cannot happen within the confines of a single CPU\@.
 	The first \IRQ\ handler cannot complete until the NMI handler
 	returns.
 	Therefore, if each of the \co{dynticks} and \co{dynticks_nmi}
 	variables have taken on an even value during a given time
 	interval, the corresponding CPU really was in a quiescent
 	state at some time during that interval.
 }\QuickQuizEnd

 \subsubsection{Entering and Leaving Dynticks-Idle Mode}
 \label{sec:formal:Entering and Leaving Dynticks-Idle Mode}

 \Cref{lst:formal:Entering and Exiting Dynticks-Idle Mode}
 shows the \co{rcu_enter_nohz()} and \co{rcu_exit_nohz()},
 which enter and exit dynticks-idle mode, also known as ``nohz'' mode.
 These two functions are invoked from process context.

 \begin{listing}
 \begin{fcvlabel}[ln:formal:Entering and Exiting Dynticks-Idle Mode]
 \begin{VerbatimL}[commandchars=\\\[\]]
 void rcu_enter_nohz(void)
 {
 	unsigned long flags;
 	struct rcu_dynticks *rdtp;

 	smp_mb();			\lnlbl[mb]
 	local_irq_save(flags);		\lnlbl[irq_sv]
 	rdtp = &__get_cpu_var(rcu_dynticks); \lnlbl[get_ptr]
 	rdtp->dynticks++;		\lnlbl[inc_cnt]
 	rdtp->dynticks_nesting--;	\lnlbl[dec_nst]
 	WARN_ON(rdtp->dynticks & 0x1);
 	local_irq_restore(flags);	\lnlbl[irq_rs]
 }

 void rcu_exit_nohz(void)
 {
 	unsigned long flags;
 	struct rcu_dynticks *rdtp;

 	local_irq_save(flags);
 	rdtp = &__get_cpu_var(rcu_dynticks);
 	rdtp->dynticks++;
 	rdtp->dynticks_nesting++;
 	WARN_ON(!(rdtp->dynticks & 0x1));
 	local_irq_restore(flags);
 	smp_mb();
 }
 \end{VerbatimL}
 \end{fcvlabel}
 \caption{Entering and Exiting Dynticks-Idle Mode}
 \label{lst:formal:Entering and Exiting Dynticks-Idle Mode}
 \end{listing}

 \begin{fcvref}[ln:formal:Entering and Exiting Dynticks-Idle Mode]
 \Clnref{mb} ensures that any prior memory accesses (which might
 include accesses from RCU read-side critical sections) are seen
 by other CPUs before those marking entry to dynticks-idle mode.
 \Clnref{irq_sv,irq_rs} disable and reenable \IRQ s.
 \Clnref{get_ptr} acquires a pointer to the current CPU's \co{rcu_dynticks}
 structure, and
 \clnref{inc_cnt} increments the current CPU's \co{dynticks} counter, which
 should now be even, given that we are entering dynticks-idle mode
 in process context.
 Finally, \clnref{dec_nst} decrements \co{dynticks_nesting},
 which should now be zero.
 \end{fcvref}

 The \co{rcu_exit_nohz()} function is quite similar, but increments
 \co{dynticks_nesting} rather than decrementing it and checks for
 the opposite \co{dynticks} polarity.

 \subsubsection{NMIs From Dynticks-Idle Mode}
 \label{sec:formal:NMIs From Dynticks-Idle Mode}

 \begin{fcvref}[ln:formal:NMIs From Dynticks-Idle Mode]
 \Cref{lst:formal:NMIs From Dynticks-Idle Mode}
 shows the \co{rcu_nmi_enter()} and \co{rcu_nmi_exit()} functions,
 which inform RCU of NMI entry and exit, respectively, from dynticks-idle
 mode.
 However, if the NMI arrives during an \IRQ\ handler, then RCU will already
 be on the lookout for RCU read-side critical sections from this CPU,
 so \clnref{chk_ext1,ret1} of \co{rcu_nmi_enter()} and \clnref{chk_ext2,ret2}
 of \co{rcu_nmi_exit()} silently return if \co{dynticks} is odd.
 Otherwise, the two functions increment \co{dynticks_nmi}, with
 \co{rcu_nmi_enter()} leaving it with an odd value and \co{rcu_nmi_exit()}
 leaving it with an even value.
 Both functions execute memory barriers between this increment
 and possible RCU read-side critical sections on \clnref{mb1,mb2},
 respectively.
 \end{fcvref}

 \begin{listing}
 \begin{fcvlabel}[ln:formal:NMIs From Dynticks-Idle Mode]
 \begin{VerbatimL}[commandchars=\\\[\]]
 void rcu_nmi_enter(void)
 {
 	struct rcu_dynticks *rdtp;

 	rdtp = &__get_cpu_var(rcu_dynticks);
 	if (rdtp->dynticks & 0x1)	\lnlbl[chk_ext1]
 		return;			\lnlbl[ret1]
 	rdtp->dynticks_nmi++;
 	WARN_ON(!(rdtp->dynticks_nmi & 0x1));
 	smp_mb();			\lnlbl[mb1]
 }

 void rcu_nmi_exit(void)
 {
 	struct rcu_dynticks *rdtp;

 	rdtp = &__get_cpu_var(rcu_dynticks);
 	if (rdtp->dynticks & 0x1)	\lnlbl[chk_ext2]
 		return;			\lnlbl[ret2]
 	smp_mb();			\lnlbl[mb2]
 	rdtp->dynticks_nmi++;
 	WARN_ON(rdtp->dynticks_nmi & 0x1);
 }
 \end{VerbatimL}
 \end{fcvlabel}
 \caption{NMIs From Dynticks-Idle Mode}
 \label{lst:formal:NMIs From Dynticks-Idle Mode}
 \end{listing}

 \subsubsection{Interrupts From Dynticks-Idle Mode}
 \label{sec:formal:Interrupts From Dynticks-Idle Mode}

 \begin{fcvref}[ln:formal:Interrupts From Dynticks-Idle Mode]
 \Cref{lst:formal:Interrupts From Dynticks-Idle Mode}
 shows \co{rcu_irq_enter()} and \co{rcu_irq_exit()}, which
 inform RCU of entry to and exit from, respectively, \IRQ\ context.
 \Clnref{inc_nst} of \co{rcu_irq_enter()} increments \co{dynticks_nesting},
 and if this variable was already non-zero, \clnref{ret1} silently returns.
 Otherwise, \clnref{inc_dynt1} increments \co{dynticks},
 which will then have
 an odd value, consistent with the fact that this CPU can now
 execute RCU read-side critical sections.
 \Clnref{mb1} therefore executes a memory barrier to ensure that
 the increment of \co{dynticks} is seen before any
 RCU read-side critical sections that the subsequent \IRQ\ handler
 might execute.

 \begin{listing}
 \begin{fcvlabel}[ln:formal:Interrupts From Dynticks-Idle Mode]
 \begin{VerbatimL}[commandchars=\\\[\]]
 void rcu_irq_enter(void)
 {
 	struct rcu_dynticks *rdtp;

 	rdtp = &__get_cpu_var(rcu_dynticks);
 	if (rdtp->dynticks_nesting++)		\lnlbl[inc_nst]
 		return;				\lnlbl[ret1]
 	rdtp->dynticks++;			\lnlbl[inc_dynt1]
 	WARN_ON(!(rdtp->dynticks & 0x1));
 	smp_mb();				\lnlbl[mb1]
 }

 void rcu_irq_exit(void)
 {
 	struct rcu_dynticks *rdtp;

 	rdtp = &__get_cpu_var(rcu_dynticks);
 	if (--rdtp->dynticks_nesting)		\lnlbl[dec_nst]
 		return;				\lnlbl[ret2]
 	smp_mb();				\lnlbl[mb2]
 	rdtp->dynticks++;			\lnlbl[inc_dynt2]
 	WARN_ON(rdtp->dynticks & 0x1);		\lnlbl[chk_even]
 	if (__get_cpu_var(rcu_data).nxtlist ||	\lnlbl[chk_cb:b]
 	    __get_cpu_var(rcu_bh_data).nxtlist)
 		set_need_resched();		\lnlbl[chk_cb:e]
 }
 \end{VerbatimL}
 \end{fcvlabel}
 \caption{Interrupts From Dynticks-Idle Mode}
 \label{lst:formal:Interrupts From Dynticks-Idle Mode}
 \end{listing}

 \Clnref{dec_nst} of \co{rcu_irq_exit()} decrements
 \co{dynticks_nesting}, and
 if the result is non-zero, \clnref{ret2} silently returns.
 Otherwise, \clnref{mb2} executes a memory barrier to ensure that the
 increment of \co{dynticks} on \clnref{inc_dynt2} is seen after any RCU
 read-side critical sections that the prior \IRQ\ handler might have executed.
 \Clnref{chk_even} verifies that \co{dynticks} is now even, consistent with
 the fact that no RCU read-side critical sections may appear in
 dynticks-idle mode.
 \Clnrefrange{chk_cb:b}{chk_cb:e} check to see
 if the prior \IRQ\ handlers enqueued any
 RCU callbacks, forcing this CPU out of dynticks-idle mode via
 a reschedule API if so.
 \end{fcvref}

 \subsubsection{Checking For Dynticks Quiescent States}
 \label{sec:formal:Checking For Dynticks Quiescent States}

 \begin{fcvref}[ln:formal:Saving Dyntick Progress Counters]
 \Cref{lst:formal:Saving Dyntick Progress Counters}
 shows \co{dyntick_save_progress_counter()}, which takes a snapshot
 of the specified CPU's \co{dynticks} and \co{dynticks_nmi}
 counters.
 \Clnref{snap,snapn} snapshot these two variables to locals, \clnref{mb}
 executes a memory barrier to pair with the memory barriers in the functions in
 \cref{lst:formal:Entering and Exiting Dynticks-Idle Mode,%
 lst:formal:NMIs From Dynticks-Idle Mode,%
 lst:formal:Interrupts From Dynticks-Idle Mode}.
 \Clnref{rec_snap,rec_snapn} record the snapshots for later calls to
 \co{rcu_implicit_dynticks_qs()},
 and \clnref{chk_prgs} checks to see if the CPU is in dynticks-idle mode with
 neither \IRQ s nor NMIs in progress (in other words, both snapshots
 have even values), hence in an extended quiescent state.
 If so, \clnref{cnt:b,cnt:e} count this event, and \clnref{ret} returns
 true if the CPU was in a quiescent state.
 \end{fcvref}

 \begin{listing}
 \begin{fcvlabel}[ln:formal:Saving Dyntick Progress Counters]
 \begin{VerbatimL}[commandchars=\\\[\]]
 static int
 dyntick_save_progress_counter(struct rcu_data *rdp)
 {
 	int ret;
 	int snap;
 	int snap_nmi;

 	snap = rdp->dynticks->dynticks;		\lnlbl[snap]
 	snap_nmi = rdp->dynticks->dynticks_nmi;	\lnlbl[snapn]
 	smp_mb();				\lnlbl[mb]
 	rdp->dynticks_snap = snap;		\lnlbl[rec_snap]
 	rdp->dynticks_nmi_snap = snap_nmi;	\lnlbl[rec_snapn]
 	ret = ((snap & 0x1) == 0) && ((snap_nmi & 0x1) == 0); \lnlbl[chk_prgs]
 	if (ret)				\lnlbl[cnt:b]
 		rdp->dynticks_fqs++;		\lnlbl[cnt:e]
 	return ret;				\lnlbl[ret]
 }
 \end{VerbatimL}
 \end{fcvlabel}
 \caption{Saving Dyntick Progress Counters}
 \label{lst:formal:Saving Dyntick Progress Counters}
 \end{listing}

 \begin{fcvref}[ln:formal:Checking Dyntick Progress Counters]
 \Cref{lst:formal:Checking Dyntick Progress Counters}
 shows \co{rcu_implicit_dynticks_qs()}, which is called to check
 whether a CPU has entered dyntick-idle mode subsequent to a call
 to \co{dynticks_save_progress_counter()}.
 \Clnref{curr,currn} take new snapshots of the corresponding CPU's
 \co{dynticks} and \co{dynticks_nmi} variables, while
 \clnref{snap,snapn} retrieve the snapshots saved earlier by
 \co{dynticks_save_progress_counter()}.
 \Clnref{mb} then
 executes a memory barrier to pair with the memory barriers in
 the functions in
 \cref{lst:formal:Entering and Exiting Dynticks-Idle Mode,%
 lst:formal:NMIs From Dynticks-Idle Mode,%
 lst:formal:Interrupts From Dynticks-Idle Mode}.
 \Clnrefrange{chk_q:b}{chk_q:e}
 then check to see if the CPU is either currently in
 a quiescent state (\co{curr} and \co{curr_nmi} having even values) or
 has passed through a quiescent state since the last call to
 \co{dynticks_save_progress_counter()} (the values of
 \co{dynticks} and \co{dynticks_nmi} having changed).
 If these checks confirm that the CPU has passed through a dyntick-idle
 quiescent state, then \clnref{cnt} counts that fact and
 \clnref{ret_1} returns an indication of this fact.
 Either way, \clnref{chk_race}
 checks for race conditions that can result in RCU
 waiting for a CPU that is offline.
 \end{fcvref}

 \begin{listing}
 \begin{fcvlabel}[ln:formal:Checking Dyntick Progress Counters]
 \begin{VerbatimL}[commandchars=\\\[\]]
 static int
 rcu_implicit_dynticks_qs(struct rcu_data *rdp)
 {
 	long curr;
 	long curr_nmi;
 	long snap;
 	long snap_nmi;

 	curr = rdp->dynticks->dynticks;		\lnlbl[curr]
 	snap = rdp->dynticks_snap;		\lnlbl[snap]
 	curr_nmi = rdp->dynticks->dynticks_nmi;	\lnlbl[currn]
 	snap_nmi = rdp->dynticks_nmi_snap;	\lnlbl[snapn]
 	smp_mb();				\lnlbl[mb]
 	if ((curr != snap || (curr & 0x1) == 0) && \lnlbl[chk_q:b]
 	    (curr_nmi != snap_nmi || (curr_nmi & 0x1) == 0)) { \lnlbl[chk_q:e]
 		rdp->dynticks_fqs++;		\lnlbl[cnt]
 		return 1;			\lnlbl[ret_1]
 	}
 	return rcu_implicit_offline_qs(rdp);	\lnlbl[chk_race]
 }
 \end{VerbatimL}
 \end{fcvlabel}
 \caption{Checking Dyntick Progress Counters}
 \label{lst:formal:Checking Dyntick Progress Counters}
 \end{listing}

 \QuickQuiz{
 	This is still pretty complicated.
 	Why not just have a \co{cpumask_t} with per-CPU bits, clearing
 	the bit when entering an \IRQ\ or NMI handler, and setting it
 	upon exit?
 }\QuickQuizAnswer{
 	Although this approach would be functionally correct, it
 	would result in excessive \IRQ\ entry/exit overhead on
 	large machines.
 	In contrast, the approach laid out in this section allows
 	each CPU to touch only per-CPU data on \IRQ\ and NMI entry/exit,
 	resulting in much lower \IRQ\ entry/exit overhead, especially
 	on large machines.
 }\QuickQuizEnd

 Linux-kernel RCU's dyntick-idle code has since been rewritten yet again
 based on a suggestion from
 Andy Lutomirski~\cite{PaulMcKenney2015dyntickAndyNMI},
 but it is time to sum up and move on to other topics.

 \subsubsection{Discussion}
 \label{sec:formal:Discussion}

 A slight shift in viewpoint resulted in a substantial simplification
 of the dynticks interface for RCU\@.
 The key change leading to this simplification was minimizing of
 sharing between \IRQ\ and NMI contexts.
 The only sharing in this simplified interface is references from NMI
 context to \IRQ\ variables (the \co{dynticks} variable).
 This type of sharing is benign, because the NMI functions never update
 this variable, so that its value remains constant through the lifetime
 of the NMI handler.
 This limitation of sharing allows the individual functions to be
 understood one at a time, in happy contrast to the situation
 described in
 \cref{sec:formal:Promela Parable: dynticks and Preemptible RCU},
 where an NMI might change shared state at any point during execution of
 the \IRQ\ functions.

 Verification can be a good thing, but simplicity is even better.

 % Restore frame around VerbatimN in two-column layout
 \IfTwoColumn{
 \RecustomVerbatimEnvironment{VerbatimN}{Verbatim}%
 {numbers=left,numbersep=3pt,xleftmargin=5pt,xrightmargin=5pt,frame=single}
 \setlength{\lnlblraise}{6pt}
 }{}