| % formal/dyntickrcu.tex |
| % mainfile: ../perfbook.tex |
| % SPDX-License-Identifier: CC-BY-SA-3.0 |
| |
| % Disable frame around VerbatimN in two-column layout |
| \IfTwoColumn{ |
| \RecustomVerbatimEnvironment{VerbatimN}{Verbatim}% |
| {numbers=left,numbersep=5pt,xleftmargin=10pt,xrightmargin=0pt,frame=none} |
| \setlength{\lnlblraise}{0pt} |
| }{} |
| |
| \subsection{Promela Parable: |
| {dynticks} and Preemptible RCU} |
| \label{sec:formal:Promela Parable: dynticks and Preemptible RCU} |
| |
| In early 2008, a preemptible variant of RCU was accepted into |
| mainline Linux in support of real-time workloads, |
| a variant similar to the RCU implementations in |
| the \rt\ patchset~\cite{IngoMolnar05a} |
| since August 2005. |
| Preemptible RCU is needed for real-time workloads because older |
| RCU implementations disable preemption across RCU read-side |
| critical sections, resulting in excessive real-time latencies. |
| |
| However, one disadvantage of the older \rt\ implementation |
| was that each \IX{grace period} |
| requires work to be done on each CPU, even if that CPU is in a low-power |
| ``dynticks-idle'' state, |
| and thus incapable of executing RCU read-side critical sections. |
| The idea behind the dynticks-idle state is that idle CPUs |
| should be physically powered down in order to conserve energy. |
| In short, preemptible RCU can disable a valuable energy-conservation |
| feature of recent Linux kernels. |
| Although Josh Triplett and Paul McKenney |
| had discussed some approaches for allowing |
| CPUs to remain in low-power state throughout an RCU grace period |
| (thus preserving the Linux kernel's ability to conserve energy), matters |
| did not come to a head until Steve Rostedt integrated a new dyntick |
| implementation with preemptible RCU in the \rt\ patchset. |
| |
| This combination caused one of Steve's systems to hang on boot, so in |
| October, Paul coded up a dynticks-friendly modification to preemptible RCU's |
| grace-period processing. |
| Steve coded up \co{rcu_irq_enter()} and \co{rcu_irq_exit()} |
| interfaces called from the |
| \co{irq_enter()} and \co{irq_exit()} interrupt |
| entry/exit functions. |
| These \co{rcu_irq_enter()} and \co{rcu_irq_exit()} |
| functions are needed to allow RCU to reliably handle situations where |
| a dynticks-idle CPU is momentarily powered up for an interrupt |
| handler containing RCU read-side critical sections. |
| With these changes in place, Steve's system booted reliably, |
| but Paul continued inspecting the code periodically on the assumption |
| that we could not possibly have gotten the code right on the first try. |
| |
| Paul reviewed the code repeatedly from October 2007 to February 2008, |
| and almost always found at least one bug. |
| In one case, Paul even coded and tested a fix before realizing that the |
| bug was illusory, and in fact in all cases, the ``bug'' turned out to be |
| illusory. |
| |
| Near the end of February, Paul grew tired of this game. |
| He therefore decided to enlist the aid of |
| Promela and Spin. |
| The following presents a series of seven increasingly realistic |
| Promela models, the last of which passes, consuming about |
| 40\,GB of main memory for the state space. |
| |
| More important, Promela and Spin did find a very subtle bug for me! |
| |
| \QuickQuiz{ |
| Yeah, that's just great! |
| Now, just what am I supposed to do if I don't happen to have a |
| machine with 40\,GB of main memory??? |
| }\QuickQuizAnswer{ |
| Relax, there are a number of lawful answers to |
| this question: |
| \begin{enumerate} |
| \item Try compiler flags \co{-DCOLLAPSE} and \co{-DMA=N} |
| to reduce memory consumption. |
| See \cref{sec:formal:Running the QRCU Example}. |
| \item Further optimize the model, reducing its memory consumption. |
| \item Work out a pencil-and-paper proof, perhaps starting with the |
| comments in the code in the Linux kernel. |
| \item Devise careful torture tests, which, though they cannot prove |
| the code correct, can find hidden bugs. |
| \item There is some movement towards tools that do model |
| checking on clusters of smaller machines. |
| However, please note that we have not actually used such |
| tools myself, courtesy of some large machines that Paul has |
| occasional access to. |
| \item Wait for memory sizes of affordable systems to expand |
| to fit your problem. |
| \item Use one of a number of cloud-computing services to rent |
| a large system for a short time period. |
| \end{enumerate} |
| }\QuickQuizEnd |
| |
| Still better would be to come up with a simpler and faster algorithm |
| that has a smaller state space. |
| Even better would be an algorithm so simple that its correctness was |
| obvious to the casual observer! |
| |
| \Crefrange{sec:formal:Introduction to Preemptible RCU and dynticks} |
| {sec:formal:Grace-Period Interface} |
| give an overview of preemptible RCU's dynticks interface, |
| followed by |
| \cref{sec:formal:Validating Preemptible RCU and dynticks}'s |
| discussion of the validation of the interface. |
| |
| \subsubsection{Introduction to Preemptible RCU and dynticks} |
| \label{sec:formal:Introduction to Preemptible RCU and dynticks} |
| |
| The per-CPU \co{dynticks_progress_counter} variable is |
| central to the interface between dynticks and preemptible RCU\@. |
| This variable has an even value whenever the corresponding CPU |
| is in dynticks-idle mode, and an odd value otherwise. |
| A CPU exits dynticks-idle mode for the following three reasons: |
| |
| \begin{enumerate} |
| \item To start running a task, |
| \item When entering the outermost of a possibly nested set of interrupt |
| handlers, and |
| \item When entering an \IXacr{nmi} handler. |
| \end{enumerate} |
| |
| Preemptible RCU's grace-period machinery samples the value of |
| the \co{dynticks_progress_counter} variable in order to |
| determine when a dynticks-idle CPU may safely be ignored. |
| |
| The following three sections give an overview of the task |
| interface, the interrupt/NMI interface, and the use of |
| the \co{dynticks_progress_counter} variable by the |
| grace-period machinery as of Linux kernel v2.6.25-rc4. |
| |
| \subsubsection{Task Interface} |
| \label{sec:formal:Task Interface} |
| |
| When a given CPU enters dynticks-idle mode because it has no more |
| tasks to run, it invokes \co{rcu_enter_nohz()}: |
| |
| \begin{VerbatimN} |
| static inline void rcu_enter_nohz(void) |
| { |
| mb(); |
| __get_cpu_var(dynticks_progress_counter)++; |
| WARN_ON(__get_cpu_var(dynticks_progress_counter) & |
| 0x1); |
| } |
| \end{VerbatimN} |
| |
| This function simply increments \co{dynticks_progress_counter} and |
| checks that the result is even, but first executing a memory barrier |
| to ensure that any other CPU that sees the new value of |
| \co{dynticks_progress_counter} will also see the completion |
| of any prior RCU read-side critical sections. |
| |
| Similarly, when a CPU that is in dynticks-idle mode prepares to |
| start executing a newly runnable task, it invokes |
| \co{rcu_exit_nohz()}: |
| |
| \begin{VerbatimN} |
| static inline void rcu_exit_nohz(void) |
| { |
| __get_cpu_var(dynticks_progress_counter)++; |
| mb(); |
| WARN_ON(!(__get_cpu_var(dynticks_progress_counter) & |
| 0x1)); |
| } |
| \end{VerbatimN} |
| |
| This function again increments \co{dynticks_progress_counter}, |
| but follows it with a memory barrier to ensure that if any other CPU |
| sees the result of any subsequent RCU read-side critical section, |
| then that other CPU will also see the incremented value of |
| \co{dynticks_progress_counter}. |
| Finally, \co{rcu_exit_nohz()} checks that the result of the |
| increment is an odd value. |
| |
| The \co{rcu_enter_nohz()} and \co{rcu_exit_nohz()} |
| functions handle the case where a CPU enters and exits dynticks-idle |
| mode due to task execution, but does not handle interrupts, which are |
| covered in the following section. |
| |
| \subsubsection{Interrupt Interface} |
| \label{sec:formal:Interrupt Interface} |
| |
| The \co{rcu_irq_enter()} and \co{rcu_irq_exit()} |
| functions handle interrupt/NMI entry and exit, respectively. |
| Of course, nested interrupts must also be properly accounted for. |
| The possibility of nested interrupts is handled by a second per-CPU |
| variable, \co{rcu_update_flag}, which is incremented upon |
| entry to an interrupt or NMI handler (in \co{rcu_irq_enter()}) |
| and is decremented upon exit (in \co{rcu_irq_exit()}). |
| In addition, the pre-existing \co{in_interrupt()} primitive is |
| used to distinguish between an outermost or a nested interrupt/NMI\@. |
| |
| Interrupt entry is handled by the \co{rcu_irq_enter()} |
| shown below: |
| |
| \begin{fcvlabel}[ln:formal:dyntickrcu:rcu_irq_enter] |
| \begin{VerbatimN}[commandchars=\\\[\]] |
| void rcu_irq_enter(void) |
| { |
| int cpu = smp_processor_id(); \lnlbl[fetch] |
| |
| if (per_cpu(rcu_update_flag, cpu)) \lnlbl[inc:b] |
| per_cpu(rcu_update_flag, cpu)++; \lnlbl[inc:e] |
| if (!in_interrupt() && \lnlbl[chk_lv:b] |
| (per_cpu(dynticks_progress_counter, |
| cpu) & 0x1) == 0) { \lnlbl[chk_lv:e] |
| per_cpu(dynticks_progress_counter, cpu)++; \lnlbl[inc_cnt] |
| smp_mb(); \lnlbl[mb] |
| per_cpu(rcu_update_flag, cpu)++;\lnlbl[inc_flg] |
| } |
| } |
| \end{VerbatimN} |
| \end{fcvlabel} |
| |
| \begin{fcvref}[ln:formal:dyntickrcu:rcu_irq_enter] |
| \Clnref{fetch} fetches the current CPU's number, while \clnref{inc:b,inc:e} |
| increment the \co{rcu_update_flag} nesting counter if it |
| is already non-zero. |
| \Clnrefrange{chk_lv:b}{chk_lv:e} check to see whether we are |
| the outermost level of |
| interrupt, and, if so, whether \co{dynticks_progress_counter} |
| needs to be incremented. |
| If so, \clnref{inc_cnt} increments \co{dynticks_progress_counter}, |
| \clnref{mb} executes a memory barrier, and \clnref{inc_flg} increments |
| \co{rcu_update_flag}. |
| As with \co{rcu_exit_nohz()}, the memory barrier ensures that |
| any other CPU that sees the effects of an RCU read-side critical section |
| in the interrupt handler (following the \co{rcu_irq_enter()} |
| invocation) will also see the increment of |
| \co{dynticks_progress_counter}. |
| \end{fcvref} |
| |
| \QuickQuizSeries{% |
| \QuickQuizB{ |
| Why not simply increment \co{rcu_update_flag}, and then only |
| increment \co{dynticks_progress_counter} if the old value |
| of \co{rcu_update_flag} was zero??? |
| }\QuickQuizAnswerB{ |
| This fails in presence of NMIs. |
| To see this, suppose an NMI was received just after |
| \co{rcu_irq_enter()} incremented \co{rcu_update_flag}, |
| but before it incremented \co{dynticks_progress_counter}. |
| The instance of \co{rcu_irq_enter()} invoked by the NMI |
| would see that the original value of \co{rcu_update_flag} |
| was non-zero, and would therefore refrain from incrementing |
| \co{dynticks_progress_counter}. |
| This would leave the RCU grace-period machinery no clue that the |
| NMI handler was executing on this CPU, so that any RCU read-side |
| critical sections in the NMI handler would lose their RCU protection. |
| |
| The possibility of NMI handlers, which, by definition cannot |
| be masked, does complicate this code. |
| }\QuickQuizEndB |
| % |
| \QuickQuizE{ |
| \begin{fcvref}[ln:formal:dyntickrcu:rcu_irq_enter] |
| But if \clnref{chk_lv:b} finds that we are the outermost interrupt, |
| wouldn't we \emph{always} need to increment |
| \co{dynticks_progress_counter}? |
| \end{fcvref} |
| }\QuickQuizAnswerE{ |
| Not if we interrupted a running task! |
| In that case, \co{dynticks_progress_counter} would |
| have already been incremented by \co{rcu_exit_nohz()}, |
| and there would be no need to increment it again. |
| }\QuickQuizEndE |
| } |
| |
| Interrupt exit is handled similarly by |
| \co{rcu_irq_exit()}: |
| |
| \begin{fcvlabel}[ln:formal:dyntickrcu:rcu_irq_exit] |
| \begin{VerbatimN}[commandchars=\\\[\]] |
| void rcu_irq_exit(void) |
| { |
| int cpu = smp_processor_id(); \lnlbl[fetch] |
| |
| if (per_cpu(rcu_update_flag, cpu)) { \lnlbl[chk_flg] |
| if (--per_cpu(rcu_update_flag, cpu)) \lnlbl[dec_flg] |
| return; |
| WARN_ON(in_interrupt()); \lnlbl[verify] |
| smp_mb(); \lnlbl[mb] |
| per_cpu(dynticks_progress_counter, cpu)++; \lnlbl[inc_cnt] |
| WARN_ON(per_cpu(dynticks_progress_counter, \lnlbl[vrf_even:b] |
| cpu) & 0x1); \lnlbl[vrf_even:e] |
| } |
| } |
| \end{VerbatimN} |
| \end{fcvlabel} |
| |
| \begin{fcvref}[ln:formal:dyntickrcu:rcu_irq_exit] |
| \Clnref{fetch} fetches the current CPU's number, as before. |
| \Clnref{chk_flg} checks to see if the \co{rcu_update_flag} is |
| non-zero, returning immediately (via falling off the end of the |
| function) if not. |
| Otherwise, \clnrefthro{dec_flg}{vrf_even:e} come into play. |
| \Clnref{dec_flg} decrements \co{rcu_update_flag}, returning |
| if the result is not zero. |
| \Clnref{verify} verifies that we are indeed leaving the outermost |
| level of nested interrupts, \clnref{mb} executes a memory barrier, |
| \clnref{inc_cnt} increments \co{dynticks_progress_counter}, |
| and \clnref{vrf_even:b,vrf_even:e} verify that this |
| variable is now even. |
| As with \co{rcu_enter_nohz()}, the memory barrier ensures that |
| any other CPU that sees the increment of |
| \co{dynticks_progress_counter} |
| will also see the effects of an RCU read-side critical section |
| in the interrupt handler (preceding the \co{rcu_irq_exit()} |
| invocation). |
| \end{fcvref} |
| |
| These two sections have described how the |
| \co{dynticks_progress_counter} variable is maintained during |
| entry to and exit from dynticks-idle mode, both by tasks and by |
| interrupts and NMIs. |
| The following section describes how this variable is used by |
| preemptible RCU's grace-period machinery. |
| |
| \subsubsection{Grace-Period Interface} |
| \label{sec:formal:Grace-Period Interface} |
| |
| Of the four preemptible RCU grace-period states shown in |
| \cref{fig:formal:Preemptible RCU State Machine}, |
| only the \co{rcu_try_flip_waitack_state} |
| and \co{rcu_try_flip_waitmb_state} states need to wait |
| for other CPUs to respond. |
| |
| \begin{figure} |
| \centering |
| \resizebox{2.5in}{!}{\includegraphics{formal/RCUpreemptStates}} |
| \caption{Preemptible RCU State Machine} |
| \label{fig:formal:Preemptible RCU State Machine} |
| \end{figure} |
| |
| Of course, if a given CPU is in dynticks-idle state, we shouldn't |
| wait for it. |
| Therefore, just before entering one of these two states, |
| the preceding state takes a snapshot of each CPU's |
| \co{dynticks_progress_counter} variable, placing the |
| snapshot in another per-CPU variable, |
| \co{rcu_dyntick_snapshot}. |
| This is accomplished by invoking |
| \co{dyntick_save_progress_counter()}, shown below: |
| |
| \begin{VerbatimN} |
| static void dyntick_save_progress_counter(int cpu) |
| { |
| per_cpu(rcu_dyntick_snapshot, cpu) = |
| per_cpu(dynticks_progress_counter, cpu); |
| } |
| \end{VerbatimN} |
| |
| The \co{rcu_try_flip_waitack_state} state invokes |
| \co{rcu_try_flip_waitack_needed()}, shown below: |
| |
| \begin{fcvlabel}[ln:formal:dyntickrcu:rcu_try_flip_waitack_needed] |
| \begin{VerbatimN}[samepage=true,commandchars=\\\[\]] |
| static inline int |
| rcu_try_flip_waitack_needed(int cpu) |
| { |
| long curr; |
| long snap; |
| |
| curr = per_cpu(dynticks_progress_counter, cpu); \lnlbl[curr] |
| snap = per_cpu(rcu_dyntick_snapshot, cpu); \lnlbl[snap] |
| smp_mb(); \lnlbl[mb] |
| if ((curr == snap) && ((curr & 0x1) == 0)) \lnlbl[chk_remain] |
| return 0; \lnlbl[ret_0_r] |
| if ((curr - snap) > 2 || (snap & 0x1) == 0) \lnlbl[chk_idle] |
| return 0; \lnlbl[ret_0_i] |
| return 1; \lnlbl[ret_1] |
| } |
| \end{VerbatimN} |
| \end{fcvlabel} |
| |
| \begin{fcvref}[ln:formal:dyntickrcu:rcu_try_flip_waitack_needed] |
| \Clnref{curr,snap} pick up current and snapshot versions of |
| \co{dynticks_progress_counter}, respectively. |
| The memory barrier on \clnref{mb} ensures that the counter checks |
| in the later \co{rcu_try_flip_waitzero_state} follow |
| the fetches of these counters. |
| \Clnref{chk_remain,ret_0_r} return zero |
| (meaning no communication with the |
| specified CPU is required) if that CPU has remained in dynticks-idle |
| state since the time that the snapshot was taken. |
| Similarly, \clnref{chk_idle,ret_0_i} return zero |
| if that CPU was initially |
| in dynticks-idle state or if it has completely passed through a |
| dynticks-idle state. |
| In both these cases, there is no way that the CPU could have retained |
| the old value of the grace-period counter. |
| If neither of these conditions hold, \clnref{ret_1} returns one, meaning |
| that the CPU needs to explicitly respond. |
| \end{fcvref} |
| |
| For its part, the \co{rcu_try_flip_waitmb_state} state |
| invokes \co{rcu_try_flip_waitmb_needed()}, shown below: |
| |
| \begin{fcvlabel}[ln:formal:dyntickrcu:rcu_try_flip_waitmb_needed] |
| \begin{VerbatimN}[commandchars=\\\[\]] |
| static inline int |
| rcu_try_flip_waitmb_needed(int cpu) |
| { |
| long curr; |
| long snap; |
| |
| curr = per_cpu(dynticks_progress_counter, cpu); |
| snap = per_cpu(rcu_dyntick_snapshot, cpu); |
| smp_mb(); |
| if ((curr == snap) && ((curr & 0x1) == 0)) |
| return 0; |
| if (curr != snap) \lnlbl[chk_to_from] |
| return 0; \lnlbl[ret_0] |
| return 1; |
| } |
| \end{VerbatimN} |
| \end{fcvlabel} |
| |
| \begin{fcvref}[ln:formal:dyntickrcu:rcu_try_flip_waitmb_needed] |
| This is quite similar to \co{rcu_try_flip_waitack_needed()}, |
| the difference being in \clnref{chk_to_from,ret_0}, because any transition |
| either to or from dynticks-idle state executes the memory barrier |
| needed by the \co{rcu_try_flip_waitmb_state} state. |
| \end{fcvref} |
| |
| We now have seen all the code involved in the interface between |
| RCU and the dynticks-idle state. |
| The next section builds up the Promela model used to verify this |
| code. |
| |
| \QuickQuiz{ |
| Can you spot any bugs in any of the code in this section? |
| }\QuickQuizAnswer{ |
| Read the next section to see if you were correct. |
| }\QuickQuizEnd |
| |
| \subsection{Validating Preemptible RCU and dynticks} |
| \label{sec:formal:Validating Preemptible RCU and dynticks} |
| |
| This section develops a Promela model for the interface between |
| dynticks and RCU step by step, with each of |
| \crefrange{sec:formal:Basic Model}{sec:formal:Validating NMI Handlers} |
| illustrating one step, starting with the process-level code, |
| adding assertions, interrupts, and finally NMIs. |
| |
| \Cref{sec:formal:Lessons (Re)Learned} lists |
| lessons (re)learned during this effort, and |
| \crefrange{sec:formal:Simplicity Avoids Formal Verification}{sec:formal:Discussion} |
| present a simpler solution to RCU's dynticks problem. |
| |
| \subsubsection{Basic Model} |
| \label{sec:formal:Basic Model} |
| |
| This section translates the process-level dynticks entry/exit |
| code and the grace-period processing into |
| Promela~\cite{Holzmann03a}. |
| We start with \co{rcu_exit_nohz()} and |
| \co{rcu_enter_nohz()} |
| from the 2.6.25-rc4 kernel, placing these in a single Promela |
| process that models exiting and entering dynticks-idle mode in |
| a loop as follows: |
| |
| \input{CodeSamples/formal/promela/dyntick/dyntickRCU-base@dyntick_nohz.fcv} |
| |
| \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-base:dyntick_nohz] |
| \Clnref{do,od} define a loop. |
| \Clnref{break} exits the loop once the loop counter~\co{i} |
| has exceeded the limit \co{MAX_DYNTICK_LOOP_NOHZ}. |
| \Clnref{kick_loop} tells the loop construct to execute |
| \clnrefrange{ex_inc:b}{inc_i} |
| for each pass through the loop. |
| Because the conditionals on \clnref{break,kick_loop} |
| are exclusive of |
| each other, the normal Promela random selection of true conditions |
| is disabled. |
| \Clnref{ex_inc:b,ex_inc:e} model |
| \co{rcu_exit_nohz()}'s non-atomic |
| increment of \co{dynticks_progress_counter}, while |
| \clnref{ex_assert} models the \co{WARN_ON()}. |
| The \co{atomic} construct simply reduces the Promela state space, |
| given that the \co{WARN_ON()} is not strictly speaking part |
| of the algorithm. |
| \Clnrefrange{ent_inc:b}{ent_inc:e} similarly model the increment and |
| \co{WARN_ON()} for \co{rcu_enter_nohz()}. |
| Finally, \clnref{inc_i} increments the loop counter. |
| \end{fcvref} |
| |
| Each pass through the loop therefore models a CPU exiting |
| dynticks-idle mode (for example, starting to execute a task), then |
| re-entering dynticks-idle mode (for example, that same task blocking). |
| |
| \QuickQuizSeries{% |
| \QuickQuizB{ |
| Why isn't the memory barrier in \co{rcu_exit_nohz()} |
| and \co{rcu_enter_nohz()} modeled in Promela? |
| }\QuickQuizAnswerB{ |
| Promela assumes sequential consistency, so |
| it is not necessary to model memory barriers. |
| In fact, one must instead explicitly model lack of memory barriers, |
| for example, as shown in |
| \cref{lst:formal:QRCU Unordered Summation} on |
| \cpageref{lst:formal:QRCU Unordered Summation}. |
| }\QuickQuizEndB |
| % |
| \QuickQuizE{ |
| Isn't it a bit strange to model \co{rcu_exit_nohz()} |
| followed by \co{rcu_enter_nohz()}? |
| Wouldn't it be more natural to instead model entry before exit? |
| }\QuickQuizAnswerE{ |
| It probably would be more natural, but we will need |
| this particular order for the liveness checks that we will add later. |
| }\QuickQuizEndE |
| } |
| |
| The next step is to model the interface to RCU's grace-period |
| processing. |
| For this, we need to model |
| \co{dyntick_save_progress_counter()}, |
| \co{rcu_try_flip_waitack_needed()}, |
| \co{rcu_try_flip_waitmb_needed()}, |
| as well as portions of |
| \co{rcu_try_flip_waitack()} and |
| \co{rcu_try_flip_waitmb()}, all from the 2.6.25-rc4 kernel. |
| The following \co{grace_period()} Promela process models |
| these functions as they would be invoked during a single pass |
| through preemptible RCU's grace-period processing. |
| |
| \input{CodeSamples/formal/promela/dyntick/dyntickRCU-base@grace_period.fcv} |
| |
| \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-base:grace_period] |
| \Clnrefrange{print:b}{print:e} print out the loop limit |
| (but only into the ``.trail'' file |
| in case of error) and models a line of code |
| from \co{rcu_try_flip_idle()} and its call to |
| \co{dyntick_save_progress_counter()}, which takes a |
| snapshot of the current CPU's \co{dynticks_progress_counter} |
| variable. |
| These two lines are executed atomically to reduce state space. |
| |
| \Clnrefrange{do1}{od1} model the relevant code in |
| \co{rcu_try_flip_waitack()} and its call to |
| \co{rcu_try_flip_waitack_needed()}. |
| This loop is modeling the grace-period state machine waiting for |
| a counter-flip acknowledgement from each CPU, but only that part |
| that interacts with dynticks-idle CPUs. |
| |
| \Clnref{snap} models a line from \co{rcu_try_flip_waitzero()} |
| and its call to \co{dyntick_save_progress_counter()}, again |
| taking a snapshot of the CPU's \co{dynticks_progress_counter} |
| variable. |
| |
| Finally, \clnrefrange{do2}{od2} model the relevant code in |
| \co{rcu_try_flip_waitack()} and its call to |
| \co{rcu_try_flip_waitack_needed()}. |
| This loop is modeling the grace-period state-machine waiting for |
| each CPU to execute a memory barrier, but again only that part |
| that interacts with dynticks-idle CPUs. |
| \end{fcvref} |
| |
| \QuickQuiz{ |
| Wait a minute! |
| In the Linux kernel, both \co{dynticks_progress_counter} and |
| \co{rcu_dyntick_snapshot} are per-CPU variables. |
| So why are they instead being modeled as single global variables? |
| }\QuickQuizAnswer{ |
| Because the grace-period code processes each |
| CPU's \co{dynticks_progress_counter} and |
| \co{rcu_dyntick_snapshot} variables separately, |
| we can collapse the state onto a single CPU\@. |
| If the grace-period code were instead to do something special |
| given specific values on specific CPUs, then we would indeed need |
| to model multiple CPUs. |
| But fortunately, we can safely confine ourselves to two CPUs, the |
| one running the grace-period processing and the one entering and |
| leaving dynticks-idle mode. |
| }\QuickQuizEnd |
| |
| The resulting model (\path{dyntickRCU-base.spin}), |
| when run with the |
| \path{runspin.sh} script, |
| generates 691 states and |
| passes without errors, which is not at all surprising given that |
| it completely lacks the assertions that could find failures. |
| The next section therefore adds safety assertions. |
| |
| \subsubsection{Validating Safety} |
| \label{sec:formal:Validating Safety} |
| |
| A safe RCU implementation must never permit a grace period to |
| complete before the completion of any RCU readers that started |
| before the start of the grace period. |
| This is modeled by a \co{grace_period_state} variable that |
| can take on three states as follows: |
| |
| \input{CodeSamples/formal/promela/dyntick/dyntickRCU-base-s@grace_period_state.fcv} |
| |
| The \co{grace_period()} process sets this variable as it |
| progresses through the grace-period phases, as shown below: |
| |
| \input{CodeSamples/formal/promela/dyntick/dyntickRCU-base-s@grace_period.fcv} |
| |
| \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-base-s:grace_period] |
| \Clnref{upd_gps1,upd_gps2,upd_gps3,upd_gps4,upd_gps5,upd_gps6} |
| update this variable (combining |
| atomically with algorithmic operations where feasible) to |
| allow the \co{dyntick_nohz()} process to verify the basic |
| RCU safety property. |
| The form of this verification is to assert that the value of the |
| \co{grace_period_state} variable cannot jump from |
| \co{GP_IDLE} to \co{GP_DONE} during a time period |
| over which RCU readers could plausibly persist. |
| \end{fcvref} |
| |
| \QuickQuiz{ |
| \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-base-s:grace_period] |
| Given there are a pair of back-to-back changes to |
| \co{grace_period_state} on \clnref{upd_gps3,upd_gps4}, |
| how can we be sure that \clnref{upd_gps3}'s changes won't be lost? |
| \end{fcvref} |
| }\QuickQuizAnswer{ |
| Recall that Promela and Spin trace out |
| every possible sequence of state changes. |
| Therefore, timing is irrelevant: |
| Promela/Spin will be quite happy to jam the entire rest of |
| the model between those two statements unless some state |
| variable specifically prohibits doing so. |
| }\QuickQuizEnd |
| |
| The \co{dyntick_nohz()} Promela process implements |
| this verification as shown below: |
| |
| \input{CodeSamples/formal/promela/dyntick/dyntickRCU-base-s@dyntick_nohz.fcv} |
| |
| \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-base-s:dyntick_nohz] |
| \Clnref{new_flg} sets a new \co{old_gp_idle} flag if the |
| value of the \co{grace_period_state} variable is |
| \co{GP_IDLE} at the beginning of task execution, |
| and the assertion at \clnref{assert:b,assert:e} |
| fire if the \co{grace_period_state} |
| variable has advanced to \co{GP_DONE} during task execution, |
| which would be illegal given that a single RCU read-side critical |
| section could span the entire intervening time period. |
| \end{fcvref} |
| |
| The resulting |
| model (\path{dyntickRCU-base-s.spin}), |
| when run with the \path{runspin.sh} script, |
| generates 964 states and passes without errors, which is reassuring. |
| That said, although safety is critically important, it is also quite |
| important to avoid indefinitely stalling grace periods. |
| The next section therefore covers verifying liveness. |
| |
| \subsubsection{Validating Liveness} |
| \label{sec:formal:Validating Liveness} |
| |
| \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-base-sl-busted:dyntick_nohz] |
| Although liveness can be difficult to prove, there is a simple |
| trick that applies here. |
| The first step is to make \co{dyntick_nohz()} indicate that |
| it is done via a \co{dyntick_nohz_done} variable, as shown on |
| \clnref{done} of the following: |
| \end{fcvref} |
| |
| \input{CodeSamples/formal/promela/dyntick/dyntickRCU-base-sl-busted@dyntick_nohz.fcv} |
| |
| With this variable in place, we can add assertions to |
| \co{grace_period()} to check for unnecessary blockage |
| as follows: |
| |
| \input{CodeSamples/formal/promela/dyntick/dyntickRCU-base-sl-busted@grace_period.fcv} |
| |
| \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-base-sl-busted:grace_period] |
| We have added the \co{shouldexit} variable on \clnref{shex}, |
| which we initialize to zero on \clnref{init_shex}. |
| \Clnref{assert_shex} asserts that \co{shouldexit} is not set, while |
| \clnref{set_shex} sets \co{shouldexit} to the \co{dyntick_nohz_done} |
| variable maintained by \co{dyntick_nohz()}. |
| This assertion will therefore trigger if we attempt to take more than |
| one pass through the wait-for-counter-flip-acknowledgement |
| loop after \co{dyntick_nohz()} has completed |
| execution. |
| After all, if \co{dyntick_nohz()} is done, then there cannot be |
| any more state changes to force us out of the loop, so going through twice |
| in this state means an infinite loop, which in turn means no end to the |
| grace period. |
| |
| \Clnref{init_shex2,assert_shex2,set_shex2} operate in a similar manner |
| for the second (memory-barrier) loop. |
| |
| However, running this |
| model (\path{dyntickRCU-base-sl-busted.spin}) |
| results in failure, as \clnref{chk_2} is checking that |
| the wrong variable |
| is even. |
| Upon failure, \co{spin} writes out a |
| ``trail'' file |
| (\path{dyntickRCU-base-sl-busted.spin.trail}), |
| which records the sequence of states that lead to the failure. |
| Use the ``\co{spin -t -p -g -l\ }% |
| \path{dyntickRCU-base-sl-busted.spin}'' |
| command to cause \co{spin} to retrace this sequence of states, |
| printing the statements executed and the values of variables |
| (\path{dyntickRCU-base-sl-busted.spin.trail.txt}). |
| Note that the line numbers do not match the listing above due to |
| the fact that \co{spin} takes both functions in a single file. |
| However, the line numbers \emph{do} match the full |
| model (\path{dyntickRCU-base-sl-busted.spin}). |
| |
| We see that the \co{dyntick_nohz()} process completed |
| at step 34 (search for ``34:''), but that the |
| \co{grace_period()} process nonetheless failed to exit the loop. |
| The value of \co{curr} is \co{6} (see step 35) |
| and that the value of \co{snap} is \co{5} (see step 17). |
| Therefore the first condition on \clnref{chk_1} above |
| does not hold because |
| \qco{curr != snap}, and the second condition on \clnref{chk_2} |
| does not hold either because \co{snap} is odd and because |
| \co{curr} is only one greater than \co{snap}. |
| \end{fcvref} |
| |
| So one of these two conditions has to be incorrect. |
| Referring to the comment block in \co{rcu_try_flip_waitack_needed()} |
| for the first condition: |
| |
| \begin{quote} |
| If the CPU remained in dynticks mode for the entire time |
| and didn't take any interrupts, NMIs, SMIs, or whatever, |
| then it cannot be in the middle of an \co{rcu_read_lock()}, so |
| the next \co{rcu_read_lock()} it executes must use the new value |
| of the counter. |
| So we can safely pretend that this CPU already acknowledged |
| the counter. |
| \end{quote} |
| |
| The first condition does match this, because if \qco{curr == snap} |
| and if \co{curr} is even, then the corresponding CPU has been |
| in dynticks-idle mode the entire time, as required. |
| So let's look at the comment block for the second condition: |
| |
| \begin{quote} |
| If the CPU passed through or entered a dynticks idle phase with |
| no active irq handlers, then, as above, we can safely pretend |
| that this CPU already acknowledged the counter. |
| \end{quote} |
| |
| The first part of the condition is correct, because if \co{curr} |
| and \co{snap} differ by two, there will be at least one even |
| number in between, corresponding to having passed completely through |
| a dynticks-idle phase. |
| However, the second part of the condition corresponds to having |
| \emph{started} in dynticks-idle mode, not having \emph{finished} |
| in this mode. |
| We therefore need to be testing \co{curr} rather than |
| \co{snap} for being an even number. |
| |
| The corrected C code is as follows: |
| |
| \begin{fcvlabel}[ln:formal:dyntickrcu:rcu_try_flip_waitack_needed_fixed] |
| \begin{VerbatimN}[commandchars=\\\[\]] |
| static inline int |
| rcu_try_flip_waitack_needed(int cpu) |
| { |
| long curr; |
| long snap; |
| |
| curr = per_cpu(dynticks_progress_counter, cpu); |
| snap = per_cpu(rcu_dyntick_snapshot, cpu); |
| smp_mb(); |
| if ((curr == snap) && ((curr & 0x1) == 0)) \lnlbl[if:b] |
| return 0; |
| if ((curr - snap) > 2 || (curr & 0x1) == 0) |
| return 0; \lnlbl[if:e] |
| return 1; |
| } |
| \end{VerbatimN} |
| \end{fcvlabel} |
| |
| \begin{fcvref}[ln:formal:dyntickrcu:rcu_try_flip_waitack_needed_fixed] |
| \Clnrefrange{if:b}{if:e} can now be combined and simplified, |
| resulting in the following. |
| A similar simplification can be applied to |
| \co{rcu_try_flip_waitmb_needed()}. |
| \end{fcvref} |
| |
| \begin{VerbatimN}[commandchars=\\\[\]] |
| static inline int |
| rcu_try_flip_waitack_needed(int cpu) |
| { |
| long curr; |
| long snap; |
| |
| curr = per_cpu(dynticks_progress_counter, cpu); |
| snap = per_cpu(rcu_dyntick_snapshot, cpu); |
| smp_mb(); |
| if ((curr - snap) >= 2 || (curr & 0x1) == 0) |
| return 0; |
| return 1; |
| } |
| \end{VerbatimN} |
| |
| Making the corresponding correction in the |
| model (\path{dyntickRCU-base-sl.spin}) |
| results in a correct verification with 661 states that passes without |
| errors. |
| However, it is worth noting that the first version of the liveness |
| verification failed to catch this bug, due to a bug in the liveness |
| verification itself. |
| This liveness-verification bug was located by inserting an infinite |
| loop in the \co{grace_period()} process, and noting that |
| the liveness-verification code failed to detect this problem! |
| |
| We have now successfully verified both safety and liveness |
| conditions, but only for processes running and blocking. |
| We also need to handle interrupts, a task taken up in the next section. |
| |
| \subsubsection{Interrupts} |
| \label{sec:formal:Interrupts} |
| |
| There are a couple of ways to model interrupts in Promela: |
| \begin{enumerate} |
| \item Using C-preprocessor tricks to insert the interrupt handler |
| between each and every statement of the \co{dynticks_nohz()} |
| process, or |
| \item Modeling the interrupt handler with a separate process. |
| \end{enumerate} |
| |
| A bit of thought indicated that the second approach would have a |
| smaller state space, though it requires that the interrupt handler |
| somehow run atomically with respect to the \co{dynticks_nohz()} |
| process, but not with respect to the \co{grace_period()} |
| process. |
| |
| Fortunately, it turns out that Promela permits you to branch |
| out of atomic statements. |
| This trick allows us to have the interrupt handler set a flag, and |
| recode \co{dynticks_nohz()} to atomically check this flag |
| and execute only when the flag is not set. |
| This can be accomplished with a C-preprocessor macro that takes |
| a label and a Promela statement as follows: |
| |
| \input{CodeSamples/formal/promela/dyntick/dyntickRCU-irqnn-ssl@EXECUTE_MAINLINE.fcv} |
| |
| One might use this macro as follows: |
| |
| \begin{VerbatimU} |
| EXECUTE_MAINLINE(stmt1, |
| tmp = dynticks_progress_counter) |
| \end{VerbatimU} |
| |
| \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-irqnn-ssl:EXECUTE_MAINLINE] |
| \Clnref{label} of the macro creates the specified statement label. |
| \Clnrefrange{atm:b}{atm:e} are an atomic block that tests |
| the \co{in_dyntick_irq} |
| variable, and if this variable is set (indicating that the interrupt |
| handler is active), branches out of the atomic block back to the |
| label. |
| Otherwise, \clnref{else} executes the specified statement. |
| The overall effect is that mainline execution stalls any time an interrupt |
| is active, as required. |
| \end{fcvref} |
| |
| \subsubsection{Validating Interrupt Handlers} |
| \label{sec:formal:Validating Interrupt Handlers} |
| |
| The first step is to convert \co{dyntick_nohz()} to |
| \co{EXECUTE_MAINLINE()} form, as follows: |
| |
| \input{CodeSamples/formal/promela/dyntick/dyntickRCU-irqnn-ssl@dyntick_nohz.fcv} |
| |
| \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-irqnn-ssl:dyntick_nohz] |
| It is important to note that when a group of statements is passed |
| to \co{EXECUTE_MAINLINE()}, as in \clnrefrange{stmt2:b}{stmt2:e}, all |
| statements in that group execute atomically. |
| \end{fcvref} |
| |
| \QuickQuizSeries{% |
| \QuickQuizB{ |
| But what would you do if you needed the statements in a single |
| \co{EXECUTE_MAINLINE()} group to execute non-atomically? |
| }\QuickQuizAnswerB{ |
| The easiest thing to do would be to put |
| each such statement in its own \co{EXECUTE_MAINLINE()} |
| statement. |
| }\QuickQuizEndB |
| % |
| \QuickQuizE{ |
| But what if the \co{dynticks_nohz()} process had |
| \qco{if} or \qco{do} statements with conditions, |
| where the statement bodies of these constructs |
| needed to execute non-atomically? |
| }\QuickQuizAnswerE{ |
| One approach, as we will see in a later section, |
| is to use explicit labels and \qco{goto} statements. |
| For example, the construct: |
| |
| \begin{VerbatimU} |
| if |
| :: i == 0 -> a = -1; |
| :: else -> a = -2; |
| fi; |
| \end{VerbatimU} |
| % |
| could be modeled as something like: |
| |
| \begin{VerbatimU} |
| EXECUTE_MAINLINE(stmt1, |
| if |
| :: i == 0 -> goto stmt1_then; |
| :: else -> goto stmt1_else; |
| fi) |
| stmt1_then: skip; |
| EXECUTE_MAINLINE(stmt1_then1, a = -1; goto stmt1_end) |
| stmt1_else: skip; |
| EXECUTE_MAINLINE(stmt1_then1, a = -2) |
| stmt1_end: skip; |
| \end{VerbatimU} |
| |
| However, it is not clear that the macro is helping much in the case |
| of the \qco{if} statement, so these sorts of situations will |
| be open-coded in the following sections. |
| }\QuickQuizEndE |
| } |
| |
| The next step is to write a \co{dyntick_irq()} process |
| to model an interrupt handler: |
| |
| \input{CodeSamples/formal/promela/dyntick/dyntickRCU-irqnn-ssl@dyntick_irq.fcv} |
| |
| \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-irqnn-ssl:dyntick_irq] |
| The loop from \clnrefrange{do}{od} models up to \co{MAX_DYNTICK_LOOP_IRQ} |
| interrupts, with \clnref{cond1,cond2} forming the loop condition and |
| \clnref{inc_i} incrementing the control variable. |
| \Clnref{in_irq} tells \co{dyntick_nohz()} that an interrupt handler |
| is running, and \clnref{clr_in_irq} tells \co{dyntick_nohz()} that this |
| handler has completed. |
| \Clnref{irq_done} is used for liveness verification, just like the corresponding |
| line of \co{dyntick_nohz()}. |
| \end{fcvref} |
| |
| \QuickQuiz{ |
| \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-irqnn-ssl:dyntick_irq] |
| Why are \clnref{clr_in_irq,inc_i} (the \qtco{in_dyntick_irq = 0;} |
| and the \qco{i++;}) executed atomically? |
| \end{fcvref} |
| }\QuickQuizAnswer{ |
| These lines of code pertain to controlling the |
| model, not to the code being modeled, so there is no reason to |
| model them non-atomically. |
| The motivation for modeling them atomically is to reduce the size |
| of the state space. |
| }\QuickQuizEnd |
| |
| \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-irqnn-ssl:dyntick_irq] |
| \Clnrefrange{enter:b}{enter:e} model \co{rcu_irq_enter()}, and |
| \clnref{add_prmt_cnt:b,add_prmt_cnt:e} model the relevant snippet |
| of \co{__irq_enter()}. |
| \Clnrefrange{vrf_safe:b}{vrf_safe:e} verify safety in much the same manner |
| as do the corresponding lines of \co{dynticks_nohz()}. |
| \Clnref{irq_exit:b,irq_exit:e} model the relevant snippet of \co{__irq_exit()}, |
| and finally \clnrefrange{exit:b}{exit:e} model \co{rcu_irq_exit()}. |
| \end{fcvref} |
| |
| \QuickQuiz{ |
| What property of interrupts is this \co{dynticks_irq()} |
| process unable to model? |
| }\QuickQuizAnswer{ |
| One such property is nested interrupts, |
| which are handled in the following section. |
| }\QuickQuizEnd |
| |
| The \co{grace_period()} process then becomes as follows: |
| |
| \input{CodeSamples/formal/promela/dyntick/dyntickRCU-irqnn-ssl@grace_period.fcv} |
| |
| \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-irqnn-ssl:grace_period] |
| The implementation of \co{grace_period()} is very similar |
| to the earlier one. |
| The only changes are the addition of \clnref{MDLI} to add the new |
| interrupt-count parameter, changes to |
| \clnref{edit1,edit3} to add the new \co{dyntick_irq_done} variable |
| to the liveness checks, and of course the optimizations on \clnref{edit2,edit4}. |
| \end{fcvref} |
| |
| This model (\path{dyntickRCU-irqnn-ssl.spin}) |
| results in a correct verification with roughly half a million |
| states, passing without errors. |
| However, this version of the model does not handle nested |
| interrupts. |
| This topic is taken up in the next section. |
| |
| \subsubsection{Validating Nested Interrupt Handlers} |
| \label{sec:formal:Validating Nested Interrupt Handlers} |
| |
| Nested interrupt handlers may be modeled by splitting the body of |
| the loop in \co{dyntick_irq()} as follows: |
| |
| \input{CodeSamples/formal/promela/dyntick/dyntickRCU-irq-ssl@dyntick_irq.fcv} |
| |
| \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-irq-ssl:dyntick_irq] |
| This is similar to the earlier \co{dynticks_irq()} process. |
| It adds a second counter variable~\co{j} on \clnref{j}, so that |
| \co{i}~counts entries to interrupt handlers and \co{j}~counts |
| exits. |
| The \co{outermost} variable on \clnref{om} helps determine |
| when the \co{grace_period_state} variable needs to be sampled |
| for the safety checks. |
| The loop-exit check on \clnref{chk_ex:b,chk_ex:e} is updated to require that the |
| specified number of interrupt handlers are exited as well as entered, |
| and the increment of~\co{i} is moved to \clnref{inc_i}, which is |
| the end of the interrupt-entry model. |
| \Clnrefrange{atm1:b}{atm1:e} set the \co{outermost} variable to indicate |
| whether this is the outermost of a set of nested interrupts and to |
| set the \co{in_dyntick_irq} variable that is used by the |
| \co{dyntick_nohz()} process. |
| \Clnrefrange{atm2:b}{atm2:e} capture the state of the \co{grace_period_state} |
| variable, but only when in the outermost interrupt handler. |
| |
| \Clnref{cnd_ex} has the do-loop conditional for interrupt-exit modeling: |
| As long as we have exited fewer interrupts than we have entered, it is |
| legal to exit another interrupt. |
| \Clnrefrange{atm3:b}{atm3:e} |
| check the safety criterion, but only if we are exiting |
| from the outermost interrupt level. |
| Finally, \clnrefrange{atm4:b}{atm4:e} increment the interrupt-exit count~\co{j} |
| and, if this is the outermost interrupt level, clears |
| \co{in_dyntick_irq}. |
| \end{fcvref} |
| |
| This model (\path{dyntickRCU-irq-ssl.spin}) |
| results in a correct verification with a bit more than half a million |
| states, passing without errors. |
| However, this version of the model does not handle NMIs, |
| which are taken up in the next section. |
| |
| \subsubsection{Validating NMI Handlers} |
| \label{sec:formal:Validating NMI Handlers} |
| |
| We take the same general approach for NMIs as we do for interrupts, |
| keeping in mind that NMIs do not nest. |
| This results in a \co{dyntick_nmi()} process as follows: |
| |
| \input{CodeSamples/formal/promela/dyntick/dyntickRCU-irq-nmi-ssl@dyntick_nmi.fcv} |
| |
| Of course, the fact that we have NMIs requires adjustments in |
| the other components. |
| For example, the \co{EXECUTE_MAINLINE()} macro now needs to |
| pay attention to the NMI handler (\co{in_dyntick_nmi}) as well |
| as the interrupt handler (\co{in_dyntick_irq}) by checking |
| the \co{dyntick_nmi_done} variable as follows: |
| |
| \input{CodeSamples/formal/promela/dyntick/dyntickRCU-irq-nmi-ssl@EXECUTE_MAINLINE.fcv} |
| |
| We will also need to introduce an \co{EXECUTE_IRQ()} |
| macro that checks \co{in_dyntick_nmi} in order to allow |
| \co{dyntick_irq()} to exclude \co{dyntick_nmi()}: |
| |
| \input{CodeSamples/formal/promela/dyntick/dyntickRCU-irq-nmi-ssl@EXECUTE_IRQ.fcv} |
| |
| It is further necessary to convert \co{dyntick_irq()} |
| to \co{EXECUTE_IRQ()} as follows: |
| |
| \input{CodeSamples/formal/promela/dyntick/dyntickRCU-irq-nmi-ssl@dyntick_irq.fcv} |
| |
| \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-irq-nmi-ssl:dyntick_irq] |
| Note that we have open-coded the \qco{if} statements |
| (for example, \clnrefrange{stmt1:b}{stmt1:e}). |
| In addition, statements that process strictly local state |
| (such as \clnref{inc_i}) need not exclude \co{dyntick_nmi()}. |
| \end{fcvref} |
| |
| Finally, \co{grace_period()} requires only a few changes: |
| |
| \input{CodeSamples/formal/promela/dyntick/dyntickRCU-irq-nmi-ssl@grace_period.fcv} |
| |
| \begin{fcvref}[ln:formal:promela:dyntick:dyntickRCU-irq-nmi-ssl:grace_period] |
| We have added the \co{printf()} for the new |
| \co{MAX_DYNTICK_LOOP_NMI} parameter on \clnref{MDL_NMI} and |
| added \co{dyntick_nmi_done} to the \co{shouldexit} |
| assignments on \clnref{nmi_done1,nmi_done2}. |
| \end{fcvref} |
| |
| The model (\path{dyntickRCU-irq-nmi-ssl.spin}) |
| results in a correct verification with several hundred million |
| states, passing without errors. |
| |
| \QuickQuiz{ |
| Does Paul \emph{always} write his code in this painfully incremental |
| manner? |
| }\QuickQuizAnswer{ |
| Not always, but more and more frequently. |
| In this case, Paul started with the smallest slice of code that |
| included an interrupt handler, because he was not sure how best |
| to model interrupts in Promela. |
| Once he got that working, he added other features. |
| (But if he was doing it again, he would start with a ``toy'' handler. |
| For example, he might have the handler increment a variable twice and |
| have the mainline code verify that the value was always even.) |
| |
| Why the incremental approach? |
| Consider the following, attributed to Brian W. Kernighan: |
| |
| \begin{quote} |
| Debugging is twice as hard as writing the code in the first |
| place. |
| Therefore, if you write the code as cleverly as possible, |
| you are, by definition, not smart enough to debug it. |
| \end{quote} |
| |
| This means that any attempt to optimize the production of code should |
| place at least 66\pct\ of its emphasis on optimizing the debugging process, |
| even at the expense of increasing the time and effort spent coding. |
| Incremental coding and testing is one way to optimize the debugging |
| process, at the expense of some increase in coding effort. |
| Paul uses this approach because he rarely has the luxury of |
| devoting full days (let alone weeks) to coding and debugging. |
| }\QuickQuizEnd |
| |
| \subsubsection{Lessons (Re)Learned} |
| \label{sec:formal:Lessons (Re)Learned} |
| |
| This effort provided some lessons (re)learned: |
| |
| \begin{enumerate} |
| \item {\bf Promela and Spin can verify interrupt\slash NMI\-/handler |
| interactions.} |
| \item {\bf Documenting code can help locate bugs.} |
| In this case, the documentation effort located |
| a misplaced memory barrier in |
| \co{rcu_enter_nohz()} and \co{rcu_exit_nohz()}, |
| as shown by the following patch~\cite{PaulEMcKenney2008commit:ae66be9b71b1}. |
| |
| \begin{VerbatimU} |
| static inline void rcu_enter_nohz(void) |
| { |
| + mb(); |
| __get_cpu_var(dynticks_progress_counter)++; |
| - mb(); |
| } |
| |
| static inline void rcu_exit_nohz(void) |
| { |
| - mb(); |
| __get_cpu_var(dynticks_progress_counter)++; |
| + mb(); |
| } |
| \end{VerbatimU} |
| |
| \item {\bf Validate your code early, often, and up to the point |
| of destruction.} |
| This effort located one subtle bug in |
| \co{rcu_try_flip_waitack_needed()} |
| that would have been quite difficult to test or debug, as |
| shown by the following patch~\cite{PaulEMcKenney2008commit:d7c0651390b6}. |
| |
| \begin{VerbatimU} |
| - if ((curr - snap) > 2 || (snap & 0x1) == 0) |
| + if ((curr - snap) > 2 || (curr & 0x1) == 0) |
| \end{VerbatimU} |
| |
| \item {\bf Always verify your verification code.} |
| The usual way to do this is to insert a deliberate bug |
| and verify that the verification code catches it. |
| Of course, if the verification code fails to catch this bug, |
| you may also need to verify the bug itself, and so on, |
| recursing infinitely. |
| However, if you find yourself in this position, |
| getting a good night's sleep |
| can be an extremely effective debugging technique. |
| You will then see that the obvious verify-the-verification |
| technique is to deliberately insert bugs in the code being |
| verified. |
| If the verification fails to find them, the verification clearly |
| is buggy. |
| \item {\bf Use of atomic instructions can simplify verification.} |
| Unfortunately, use of the \co{cmpxchg} atomic instruction |
| would also slow down the critical \IXacr{irq} fastpath, so they |
| are not appropriate in this case. |
| \item {\bf The need for complex formal verification often indicates |
| a need to re-think your design.} |
| \end{enumerate} |
| |
| To this last point, it turns out that there is a much simpler solution to |
| the dynticks problem, which is presented in the next section. |
| |
| \subsubsection{Simplicity Avoids Formal Verification} |
| \label{sec:formal:Simplicity Avoids Formal Verification} |
| |
| The complexity of the dynticks interface for preemptible RCU is primarily |
| due to the fact that both \IRQ s and NMIs use the same code path and the |
| same state variables. |
| This leads to the notion of providing separate code paths and variables |
| for \IRQ s and NMIs, as has been done for |
| hierarchical RCU~\cite{PaulEMcKenney2008HierarchicalRCU} |
| as indirectly suggested by |
| Manfred Spraul~\cite{ManfredSpraul2008StateMachineRCU}. |
| This work was pulled into mainline kernel during the v2.6.29 |
| development cycle~\cite{PaulEMcKenney2008commit:64db4cfff99c}. |
| |
| \subsubsection{State Variables for Simplified Dynticks Interface} |
| \label{sec:formal:State Variables for Simplified Dynticks Interface} |
| |
| \Cref{lst:formal:Variables for Simple Dynticks Interface} |
| shows the new per-CPU state variables. |
| These variables are grouped into structs to allow multiple independent |
| RCU implementations (e.g., \co{rcu} and \co{rcu_bh}) to conveniently |
| and efficiently share dynticks state. |
| In what follows, they can be thought of as independent per-CPU variables. |
| |
| \begin{listing} |
| \begin{VerbatimL} |
| struct rcu_dynticks { |
| int dynticks_nesting; |
| int dynticks; |
| int dynticks_nmi; |
| }; |
| |
| struct rcu_data { |
| ... |
| int dynticks_snap; |
| int dynticks_nmi_snap; |
| ... |
| }; |
| \end{VerbatimL} |
| \caption{Variables for Simple Dynticks Interface} |
| \label{lst:formal:Variables for Simple Dynticks Interface} |
| \end{listing} |
| |
| The \co{dynticks_nesting}, \co{dynticks}, and \co{dynticks_snap} variables |
| are for the \IRQ\ code paths, and the \co{dynticks_nmi} and |
| \co{dynticks_nmi_snap} variables are for the NMI code paths, although |
| the NMI code path will also reference (but not modify) the |
| \co{dynticks_nesting} variable. |
| These variables are used as follows: |
| |
| \begin{description}[style=nextline] |
| \item [\tco{dynticks_nesting}] |
| This counts the number of reasons that the corresponding |
| CPU should be monitored for RCU read-side critical sections. |
| If the CPU is in dynticks-idle mode, then this counts the |
| \IRQ\ nesting level, otherwise it is one greater than the |
| \IRQ\ nesting level. |
| \item [\tco{dynticks}] |
| This counter's value is even if the corresponding CPU is |
| in dynticks-idle mode and there are no \IRQ\ handlers currently |
| running on that CPU, otherwise the counter's value is odd. |
| In other words, if this counter's value is odd, then the |
| corresponding CPU might be in an RCU read-side critical section. |
| \item [\tco{dynticks_nmi}] |
| This counter's value is odd if the corresponding CPU is |
| in an NMI handler, but only if the NMI arrived while this |
| CPU was in dyntick-idle mode with no \IRQ\ handlers running. |
| Otherwise, the counter's value will be even. |
| \item [\tco{dynticks_snap}] |
| This will be a snapshot of the \co{dynticks} counter, but |
| only if the current RCU grace period has extended for too |
| long a duration. |
| \item [\tco{dynticks_nmi_snap}] |
| This will be a snapshot of the \co{dynticks_nmi} counter, but |
| again only if the current RCU grace period has extended for too |
| long a duration. |
| \end{description} |
| |
| If both \co{dynticks} and \co{dynticks_nmi} have taken on an even |
| value during a given time interval, then the corresponding CPU has |
| passed through a \IX{quiescent state} during that interval. |
| |
| \QuickQuiz{ |
| But what happens if an NMI handler starts running before |
| an \IRQ\ handler completes, and if that NMI handler continues |
| running until a second \IRQ\ handler starts? |
| }\QuickQuizAnswer{ |
| This cannot happen within the confines of a single CPU\@. |
| The first \IRQ\ handler cannot complete until the NMI handler |
| returns. |
| Therefore, if each of the \co{dynticks} and \co{dynticks_nmi} |
| variables have taken on an even value during a given time |
| interval, the corresponding CPU really was in a quiescent |
| state at some time during that interval. |
| }\QuickQuizEnd |
| |
| \subsubsection{Entering and Leaving Dynticks-Idle Mode} |
| \label{sec:formal:Entering and Leaving Dynticks-Idle Mode} |
| |
| \Cref{lst:formal:Entering and Exiting Dynticks-Idle Mode} |
| shows the \co{rcu_enter_nohz()} and \co{rcu_exit_nohz()}, |
| which enter and exit dynticks-idle mode, also known as ``nohz'' mode. |
| These two functions are invoked from process context. |
| |
| \begin{listing} |
| \begin{fcvlabel}[ln:formal:Entering and Exiting Dynticks-Idle Mode] |
| \begin{VerbatimL}[commandchars=\\\[\]] |
| void rcu_enter_nohz(void) |
| { |
| unsigned long flags; |
| struct rcu_dynticks *rdtp; |
| |
| smp_mb(); \lnlbl[mb] |
| local_irq_save(flags); \lnlbl[irq_sv] |
| rdtp = &__get_cpu_var(rcu_dynticks); \lnlbl[get_ptr] |
| rdtp->dynticks++; \lnlbl[inc_cnt] |
| rdtp->dynticks_nesting--; \lnlbl[dec_nst] |
| WARN_ON(rdtp->dynticks & 0x1); |
| local_irq_restore(flags); \lnlbl[irq_rs] |
| } |
| |
| void rcu_exit_nohz(void) |
| { |
| unsigned long flags; |
| struct rcu_dynticks *rdtp; |
| |
| local_irq_save(flags); |
| rdtp = &__get_cpu_var(rcu_dynticks); |
| rdtp->dynticks++; |
| rdtp->dynticks_nesting++; |
| WARN_ON(!(rdtp->dynticks & 0x1)); |
| local_irq_restore(flags); |
| smp_mb(); |
| } |
| \end{VerbatimL} |
| \end{fcvlabel} |
| \caption{Entering and Exiting Dynticks-Idle Mode} |
| \label{lst:formal:Entering and Exiting Dynticks-Idle Mode} |
| \end{listing} |
| |
| \begin{fcvref}[ln:formal:Entering and Exiting Dynticks-Idle Mode] |
| \Clnref{mb} ensures that any prior memory accesses (which might |
| include accesses from RCU read-side critical sections) are seen |
| by other CPUs before those marking entry to dynticks-idle mode. |
| \Clnref{irq_sv,irq_rs} disable and reenable \IRQ s. |
| \Clnref{get_ptr} acquires a pointer to the current CPU's \co{rcu_dynticks} |
| structure, and |
| \clnref{inc_cnt} increments the current CPU's \co{dynticks} counter, which |
| should now be even, given that we are entering dynticks-idle mode |
| in process context. |
| Finally, \clnref{dec_nst} decrements \co{dynticks_nesting}, |
| which should now be zero. |
| \end{fcvref} |
| |
| The \co{rcu_exit_nohz()} function is quite similar, but increments |
| \co{dynticks_nesting} rather than decrementing it and checks for |
| the opposite \co{dynticks} polarity. |
| |
| \subsubsection{NMIs From Dynticks-Idle Mode} |
| \label{sec:formal:NMIs From Dynticks-Idle Mode} |
| |
| \begin{fcvref}[ln:formal:NMIs From Dynticks-Idle Mode] |
| \Cref{lst:formal:NMIs From Dynticks-Idle Mode} |
| shows the \co{rcu_nmi_enter()} and \co{rcu_nmi_exit()} functions, |
| which inform RCU of NMI entry and exit, respectively, from dynticks-idle |
| mode. |
| However, if the NMI arrives during an \IRQ\ handler, then RCU will already |
| be on the lookout for RCU read-side critical sections from this CPU, |
| so \clnref{chk_ext1,ret1} of \co{rcu_nmi_enter()} and \clnref{chk_ext2,ret2} |
| of \co{rcu_nmi_exit()} silently return if \co{dynticks} is odd. |
| Otherwise, the two functions increment \co{dynticks_nmi}, with |
| \co{rcu_nmi_enter()} leaving it with an odd value and \co{rcu_nmi_exit()} |
| leaving it with an even value. |
| Both functions execute memory barriers between this increment |
| and possible RCU read-side critical sections on \clnref{mb1,mb2}, |
| respectively. |
| \end{fcvref} |
| |
| \begin{listing} |
| \begin{fcvlabel}[ln:formal:NMIs From Dynticks-Idle Mode] |
| \begin{VerbatimL}[commandchars=\\\[\]] |
| void rcu_nmi_enter(void) |
| { |
| struct rcu_dynticks *rdtp; |
| |
| rdtp = &__get_cpu_var(rcu_dynticks); |
| if (rdtp->dynticks & 0x1) \lnlbl[chk_ext1] |
| return; \lnlbl[ret1] |
| rdtp->dynticks_nmi++; |
| WARN_ON(!(rdtp->dynticks_nmi & 0x1)); |
| smp_mb(); \lnlbl[mb1] |
| } |
| |
| void rcu_nmi_exit(void) |
| { |
| struct rcu_dynticks *rdtp; |
| |
| rdtp = &__get_cpu_var(rcu_dynticks); |
| if (rdtp->dynticks & 0x1) \lnlbl[chk_ext2] |
| return; \lnlbl[ret2] |
| smp_mb(); \lnlbl[mb2] |
| rdtp->dynticks_nmi++; |
| WARN_ON(rdtp->dynticks_nmi & 0x1); |
| } |
| \end{VerbatimL} |
| \end{fcvlabel} |
| \caption{NMIs From Dynticks-Idle Mode} |
| \label{lst:formal:NMIs From Dynticks-Idle Mode} |
| \end{listing} |
| |
| \subsubsection{Interrupts From Dynticks-Idle Mode} |
| \label{sec:formal:Interrupts From Dynticks-Idle Mode} |
| |
| \begin{fcvref}[ln:formal:Interrupts From Dynticks-Idle Mode] |
| \Cref{lst:formal:Interrupts From Dynticks-Idle Mode} |
| shows \co{rcu_irq_enter()} and \co{rcu_irq_exit()}, which |
| inform RCU of entry to and exit from, respectively, \IRQ\ context. |
| \Clnref{inc_nst} of \co{rcu_irq_enter()} increments \co{dynticks_nesting}, |
| and if this variable was already non-zero, \clnref{ret1} silently returns. |
| Otherwise, \clnref{inc_dynt1} increments \co{dynticks}, |
| which will then have |
| an odd value, consistent with the fact that this CPU can now |
| execute RCU read-side critical sections. |
| \Clnref{mb1} therefore executes a memory barrier to ensure that |
| the increment of \co{dynticks} is seen before any |
| RCU read-side critical sections that the subsequent \IRQ\ handler |
| might execute. |
| |
| \begin{listing} |
| \begin{fcvlabel}[ln:formal:Interrupts From Dynticks-Idle Mode] |
| \begin{VerbatimL}[commandchars=\\\[\]] |
| void rcu_irq_enter(void) |
| { |
| struct rcu_dynticks *rdtp; |
| |
| rdtp = &__get_cpu_var(rcu_dynticks); |
| if (rdtp->dynticks_nesting++) \lnlbl[inc_nst] |
| return; \lnlbl[ret1] |
| rdtp->dynticks++; \lnlbl[inc_dynt1] |
| WARN_ON(!(rdtp->dynticks & 0x1)); |
| smp_mb(); \lnlbl[mb1] |
| } |
| |
| void rcu_irq_exit(void) |
| { |
| struct rcu_dynticks *rdtp; |
| |
| rdtp = &__get_cpu_var(rcu_dynticks); |
| if (--rdtp->dynticks_nesting) \lnlbl[dec_nst] |
| return; \lnlbl[ret2] |
| smp_mb(); \lnlbl[mb2] |
| rdtp->dynticks++; \lnlbl[inc_dynt2] |
| WARN_ON(rdtp->dynticks & 0x1); \lnlbl[chk_even] |
| if (__get_cpu_var(rcu_data).nxtlist || \lnlbl[chk_cb:b] |
| __get_cpu_var(rcu_bh_data).nxtlist) |
| set_need_resched(); \lnlbl[chk_cb:e] |
| } |
| \end{VerbatimL} |
| \end{fcvlabel} |
| \caption{Interrupts From Dynticks-Idle Mode} |
| \label{lst:formal:Interrupts From Dynticks-Idle Mode} |
| \end{listing} |
| |
| \Clnref{dec_nst} of \co{rcu_irq_exit()} decrements |
| \co{dynticks_nesting}, and |
| if the result is non-zero, \clnref{ret2} silently returns. |
| Otherwise, \clnref{mb2} executes a memory barrier to ensure that the |
| increment of \co{dynticks} on \clnref{inc_dynt2} is seen after any RCU |
| read-side critical sections that the prior \IRQ\ handler might have executed. |
| \Clnref{chk_even} verifies that \co{dynticks} is now even, consistent with |
| the fact that no RCU read-side critical sections may appear in |
| dynticks-idle mode. |
| \Clnrefrange{chk_cb:b}{chk_cb:e} check to see |
| if the prior \IRQ\ handlers enqueued any |
| RCU callbacks, forcing this CPU out of dynticks-idle mode via |
| a reschedule API if so. |
| \end{fcvref} |
| |
| \subsubsection{Checking For Dynticks Quiescent States} |
| \label{sec:formal:Checking For Dynticks Quiescent States} |
| |
| \begin{fcvref}[ln:formal:Saving Dyntick Progress Counters] |
| \Cref{lst:formal:Saving Dyntick Progress Counters} |
| shows \co{dyntick_save_progress_counter()}, which takes a snapshot |
| of the specified CPU's \co{dynticks} and \co{dynticks_nmi} |
| counters. |
| \Clnref{snap,snapn} snapshot these two variables to locals, \clnref{mb} |
| executes a memory barrier to pair with the memory barriers in the functions in |
| \cref{lst:formal:Entering and Exiting Dynticks-Idle Mode,% |
| lst:formal:NMIs From Dynticks-Idle Mode,% |
| lst:formal:Interrupts From Dynticks-Idle Mode}. |
| \Clnref{rec_snap,rec_snapn} record the snapshots for later calls to |
| \co{rcu_implicit_dynticks_qs()}, |
| and \clnref{chk_prgs} checks to see if the CPU is in dynticks-idle mode with |
| neither \IRQ s nor NMIs in progress (in other words, both snapshots |
| have even values), hence in an extended quiescent state. |
| If so, \clnref{cnt:b,cnt:e} count this event, and \clnref{ret} returns |
| true if the CPU was in a quiescent state. |
| \end{fcvref} |
| |
| \begin{listing} |
| \begin{fcvlabel}[ln:formal:Saving Dyntick Progress Counters] |
| \begin{VerbatimL}[commandchars=\\\[\]] |
| static int |
| dyntick_save_progress_counter(struct rcu_data *rdp) |
| { |
| int ret; |
| int snap; |
| int snap_nmi; |
| |
| snap = rdp->dynticks->dynticks; \lnlbl[snap] |
| snap_nmi = rdp->dynticks->dynticks_nmi; \lnlbl[snapn] |
| smp_mb(); \lnlbl[mb] |
| rdp->dynticks_snap = snap; \lnlbl[rec_snap] |
| rdp->dynticks_nmi_snap = snap_nmi; \lnlbl[rec_snapn] |
| ret = ((snap & 0x1) == 0) && ((snap_nmi & 0x1) == 0); \lnlbl[chk_prgs] |
| if (ret) \lnlbl[cnt:b] |
| rdp->dynticks_fqs++; \lnlbl[cnt:e] |
| return ret; \lnlbl[ret] |
| } |
| \end{VerbatimL} |
| \end{fcvlabel} |
| \caption{Saving Dyntick Progress Counters} |
| \label{lst:formal:Saving Dyntick Progress Counters} |
| \end{listing} |
| |
| \begin{fcvref}[ln:formal:Checking Dyntick Progress Counters] |
| \Cref{lst:formal:Checking Dyntick Progress Counters} |
| shows \co{rcu_implicit_dynticks_qs()}, which is called to check |
| whether a CPU has entered dyntick-idle mode subsequent to a call |
| to \co{dynticks_save_progress_counter()}. |
| \Clnref{curr,currn} take new snapshots of the corresponding CPU's |
| \co{dynticks} and \co{dynticks_nmi} variables, while |
| \clnref{snap,snapn} retrieve the snapshots saved earlier by |
| \co{dynticks_save_progress_counter()}. |
| \Clnref{mb} then |
| executes a memory barrier to pair with the memory barriers in |
| the functions in |
| \cref{lst:formal:Entering and Exiting Dynticks-Idle Mode,% |
| lst:formal:NMIs From Dynticks-Idle Mode,% |
| lst:formal:Interrupts From Dynticks-Idle Mode}. |
| \Clnrefrange{chk_q:b}{chk_q:e} |
| then check to see if the CPU is either currently in |
| a quiescent state (\co{curr} and \co{curr_nmi} having even values) or |
| has passed through a quiescent state since the last call to |
| \co{dynticks_save_progress_counter()} (the values of |
| \co{dynticks} and \co{dynticks_nmi} having changed). |
| If these checks confirm that the CPU has passed through a dyntick-idle |
| quiescent state, then \clnref{cnt} counts that fact and |
| \clnref{ret_1} returns an indication of this fact. |
| Either way, \clnref{chk_race} |
| checks for race conditions that can result in RCU |
| waiting for a CPU that is offline. |
| \end{fcvref} |
| |
| \begin{listing} |
| \begin{fcvlabel}[ln:formal:Checking Dyntick Progress Counters] |
| \begin{VerbatimL}[commandchars=\\\[\]] |
| static int |
| rcu_implicit_dynticks_qs(struct rcu_data *rdp) |
| { |
| long curr; |
| long curr_nmi; |
| long snap; |
| long snap_nmi; |
| |
| curr = rdp->dynticks->dynticks; \lnlbl[curr] |
| snap = rdp->dynticks_snap; \lnlbl[snap] |
| curr_nmi = rdp->dynticks->dynticks_nmi; \lnlbl[currn] |
| snap_nmi = rdp->dynticks_nmi_snap; \lnlbl[snapn] |
| smp_mb(); \lnlbl[mb] |
| if ((curr != snap || (curr & 0x1) == 0) && \lnlbl[chk_q:b] |
| (curr_nmi != snap_nmi || (curr_nmi & 0x1) == 0)) { \lnlbl[chk_q:e] |
| rdp->dynticks_fqs++; \lnlbl[cnt] |
| return 1; \lnlbl[ret_1] |
| } |
| return rcu_implicit_offline_qs(rdp); \lnlbl[chk_race] |
| } |
| \end{VerbatimL} |
| \end{fcvlabel} |
| \caption{Checking Dyntick Progress Counters} |
| \label{lst:formal:Checking Dyntick Progress Counters} |
| \end{listing} |
| |
| \QuickQuiz{ |
| This is still pretty complicated. |
| Why not just have a \co{cpumask_t} with per-CPU bits, clearing |
| the bit when entering an \IRQ\ or NMI handler, and setting it |
| upon exit? |
| }\QuickQuizAnswer{ |
| Although this approach would be functionally correct, it |
| would result in excessive \IRQ\ entry/exit overhead on |
| large machines. |
| In contrast, the approach laid out in this section allows |
| each CPU to touch only per-CPU data on \IRQ\ and NMI entry/exit, |
| resulting in much lower \IRQ\ entry/exit overhead, especially |
| on large machines. |
| }\QuickQuizEnd |
| |
| Linux-kernel RCU's dyntick-idle code has since been rewritten yet again |
| based on a suggestion from |
| Andy Lutomirski~\cite{PaulMcKenney2015dyntickAndyNMI}, |
| but it is time to sum up and move on to other topics. |
| |
| \subsubsection{Discussion} |
| \label{sec:formal:Discussion} |
| |
| A slight shift in viewpoint resulted in a substantial simplification |
| of the dynticks interface for RCU\@. |
| The key change leading to this simplification was minimizing of |
| sharing between \IRQ\ and NMI contexts. |
| The only sharing in this simplified interface is references from NMI |
| context to \IRQ\ variables (the \co{dynticks} variable). |
| This type of sharing is benign, because the NMI functions never update |
| this variable, so that its value remains constant through the lifetime |
| of the NMI handler. |
| This limitation of sharing allows the individual functions to be |
| understood one at a time, in happy contrast to the situation |
| described in |
| \cref{sec:formal:Promela Parable: dynticks and Preemptible RCU}, |
| where an NMI might change shared state at any point during execution of |
| the \IRQ\ functions. |
| |
| Verification can be a good thing, but simplicity is even better. |
| |
| % Restore frame around VerbatimN in two-column layout |
| \IfTwoColumn{ |
| \RecustomVerbatimEnvironment{VerbatimN}{Verbatim}% |
| {numbers=left,numbersep=3pt,xleftmargin=5pt,xrightmargin=5pt,frame=single} |
| \setlength{\lnlblraise}{6pt} |
| }{} |