| Proper Locking Under a Preemptible Kernel: |
| Keeping Kernel Code Preempt-Safe |
| Robert Love <rml@tech9.net> |
| Last Updated: 22 Jan 2002 |
| |
| |
| INTRODUCTION |
| |
| |
| A preemptible kernel creates new locking issues. The issues are the same as |
| those under SMP: concurrency and reentrancy. Thankfully, the Linux preemptible |
| kernel model leverages existing SMP locking mechanisms. Thus, the kernel |
| requires explicit additional locking for very few additional situations. |
| |
| This document is for all kernel hackers. Developing code in the kernel |
| requires protecting these situations. |
| |
| |
| RULE #1: Per-CPU data structures need explicit protection |
| |
| |
| Two similar problems arise. An example code snippet: |
| |
| struct this_needs_locking tux[NR_CPUS]; |
| tux[smp_processor_id()] = some_value; |
| /* task is preempted here... */ |
| something = tux[smp_processor_id()]; |
| |
| First, since the data is per-CPU, it may not have explicit SMP locking, but |
| require it otherwise. Second, when a preempted task is finally rescheduled, |
| the previous value of smp_processor_id may not equal the current. You must |
| protect these situations by disabling preemption around them. |
| |
| |
| RULE #2: CPU state must be protected. |
| |
| |
| Under preemption, the state of the CPU must be protected. This is arch- |
| dependent, but includes CPU structures and state not preserved over a context |
| switch. For example, on x86, entering and exiting FPU mode is now a critical |
| section that must occur while preemption is disabled. Think what would happen |
| if the kernel is executing a floating-point instruction and is then preempted. |
| Remember, the kernel does not save FPU state except for user tasks. Therefore, |
| upon preemption, the FPU registers will be sold to the lowest bidder. Thus, |
| preemption must be disabled around such regions. |
| |
| Note, some FPU functions are already explicitly preempt safe. For example, |
| kernel_fpu_begin and kernel_fpu_end will disable and enable preemption. |
| However, math_state_restore must be called with preemption disabled. |
| |
| |
| RULE #3: Lock acquire and release must be performed by same task |
| |
| |
| A lock acquired in one task must be released by the same task. This |
| means you can't do oddball things like acquire a lock and go off to |
| play while another task releases it. If you want to do something |
| like this, acquire and release the task in the same code path and |
| have the caller wait on an event by the other task. |
| |
| |
| SOLUTION |
| |
| |
| Data protection under preemption is achieved by disabling preemption for the |
| duration of the critical region. |
| |
| preempt_enable() decrement the preempt counter |
| preempt_disable() increment the preempt counter |
| preempt_enable_no_resched() decrement, but do not immediately preempt |
| preempt_get_count() return the preempt counter |
| |
| The functions are nestable. In other words, you can call preempt_disable |
| n-times in a code path, and preemption will not be reenabled until the n-th |
| call to preempt_enable. The preempt statements define to nothing if |
| preemption is not enabled. |
| |
| Note that you do not need to explicitly prevent preemption if you are holding |
| any locks or interrupts are disabled, since preemption is implicitly disabled |
| in those cases. |
| |
| Example: |
| |
| cpucache_t *cc; /* this is per-CPU */ |
| preempt_disable(); |
| cc = cc_data(searchp); |
| if (cc && cc->avail) { |
| __free_block(searchp, cc_entry(cc), cc->avail); |
| cc->avail = 0; |
| } |
| preempt_enable(); |
| return 0; |
| |
| Notice how the preemption statements must encompass every reference of the |
| critical variables. Another example: |
| |
| int buf[NR_CPUS]; |
| set_cpu_val(buf); |
| if (buf[smp_processor_id()] == -1) printf(KERN_INFO "wee!\n"); |
| spin_lock(&buf_lock); |
| /* ... */ |
| |
| This code is not preempt-safe, but see how easily we can fix it by simply |
| moving the spin_lock up two lines. |