preempt: Debug for possible missed preemption checks
Peter Zijlstra mentioned that he wanted to catch the following problem:
local_irq_disable();
preempt_disable();
local_irq_enable();
local_irq_disable();
preempt_enable();
local_irq_enable();
Now what's wrong with the above? What happens if an interrupt comes in
the middle of that (between the local_irq_enable() and
local_irq_disable()) and sets the NEED_RESCHED flag? Because preemption
is disabled, it wont schedule there. It expects the schedule will
happen when preemption is enabled.
But the problem is that the preempt_enable() happens when a schedule
can not take place (interrupts are disabled), so it is ignored. Now
when interrupts are enabled, there's no NEED_RESCHED check, and we miss
our schedule.
Note, the first preempt_disable() did not need to even be within
the irq_disabled section. You can get the same problem with:
preempt_disable();
<interrupt - set NEED_RESCHED>
local_irq_disable();
preempt_enable();
local_irq_enable();
Basically anytime you go from preemption disabled and interrupts
enabled, to preemption enabled and interrupts disabled without ever
going back to preemption enabled and interrupts disabled state.
This can be solved by keeping track of the preemption and interrupt
states for the CPU. Here's the state diagram:
State 0: Preempt Enabled, Interrupts Enabled (PEIE)
State 1: Preempt Enabled, Interrupts Disabled (PEID)
State 2: Preempt Disabled, Interrupts Enabled (PDIEX) *
State 3: Preempt Disabled, Interrupts Enabled (PDIE) **
State 4: Preempt Disabled, Interrupts Disabled (PDID)
State 5: Preempt Disabled, Interrupts Disabled (PDIDX)
State 6: Preempt Enabled, Interrupts Disabled (PEIDX)
State 7: Preempt Enabled, Interrupts Enabled (PEIEX) ***
(*) State 2 is the state where problems can occur (an interrupt
setting NEED_RESCHED while preemption is disabled).
Notice that some of the states have the same preemption and interrupts
disabled state. The difference between them is that those that went
through state 2 (denoted with an "X"), can lead us to state 6 which is
the state that can miss a preemption point.
(**) The difference between state 2 and state 3, is that state 3 is
state 2 when it is in an interrupt. Ideally we would just switch
state 7 to state 0 if we are in an interrupt, but this code can
be called outside the setting of the "in_interrupt()" counter, and
we can not detect it. To work around this, state 3 is created to
keep from going into states 5, 6 and 7 while in an interrupt.
(***) If we hit state 7, we know that there's a path that exists that
can lead us to miss a required schedule.
The state transactions are:
[preemption state changes] [interrupt state changes]
State 0: (PEIE) State 2 State 1
State 1: (PEID) State 4 State 0
State 2: (PDIEX) State 0 State 5
State 3: (PDIE) State 0 State 4
State 4: (PDID) State 1 State 2
State 5: (PDIDX) State 6 State 2
State 6: (PEIDX) State 5 State 7
State 7: (PEIEX) [End] [End]
When PROVE_LOCKING and PREEMPT is configured, the preempt state
tracking is active. Testing this out, I added a module that did the
following:
preempt_disable();
local_irq_disable();
preempt_enable();
local_irq_enable();
I also tested against:
local_irq_disable();
preempt_disable();
local_irq_enable();
local_irq_disable();
preempt_enable();
local_irq_enable();
And here's what the output looks like:
===============================
[INFO: preempt check hit problem state]
irq event stamp: 12
hardirqs last enabled at (11): [<ffffffff81666510>] _raw_spin_unlock_irq+0x30/0x60
hardirqs last disabled at (12): [<ffffffffa050d01e>] dumb_thread+0x1e/0x80 [preempt_bug]
softirqs last enabled at (0): [<ffffffff8104f038>] copy_process+0x788/0x1a50
softirqs last disabled at (0): [< (null)>] (null)
Entered dangerous state at:
[<ffffffff8166a83b>] preempt_count_add+0xab/0x110
[<ffffffffa050d018>] dumb_thread+0x18/0x80 [preempt_bug]
[<ffffffff81076c03>] kthread+0xf3/0x110
[<ffffffff8166e52c>] ret_from_fork+0x7c/0xb0
[<ffffffffffffffff>] 0xffffffffffffffff
stack backtrace:
CPU: 4 PID: 3405 Comm: task1 Tainted: G O 3.13.0-rc8-test+ #60
Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
0000000000000006 ffff8800d47e5da8 ffffffff8165d459 0000000000000002
ffffffffa050d028 ffff8800d47e5de8 ffffffff81099fc8 ffff8800d47e5df8
ffffffffa050d028 0000000000000000 ffffffffa050d000 0000000000000000
Call Trace:
[<ffffffff8165d459>] dump_stack+0x4f/0x7c
[<ffffffffa050d028>] ? dumb_thread+0x28/0x80 [preempt_bug]
[<ffffffff81099fc8>] update_pied_state+0x398/0x3b0
[<ffffffffa050d028>] ? dumb_thread+0x28/0x80 [preempt_bug]
[<ffffffffa050d000>] ? 0xffffffffa050cfff
[<ffffffff8109de40>] trace_preempt_on+0x20/0x30
[<ffffffff8166a749>] preempt_count_sub+0xb9/0x100
[<ffffffffa050d028>] dumb_thread+0x28/0x80 [preempt_bug]
[<ffffffff81076c03>] kthread+0xf3/0x110
[<ffffffff81076b10>] ? flush_kthread_worker+0x150/0x150
[<ffffffff8166e52c>] ret_from_fork+0x7c/0xb0
[<ffffffff81076b10>] ? flush_kthread_worker+0x150/0x150
Last states (starting with most recent):
1) State 5 (PDIDX)
pc: 00000001 irqs: D
.. [<ffffffffa050d01e>] .... dumb_thread+0x1e/0x80 [preempt_bug]
2) State 2 (PDIEX)
pc: 00000001 irqs: E
.. [<ffffffffa050d018>] .... dumb_thread+0x18/0x80 [preempt_bug]
3) State 0 (PEIE)
pc: 00000001 irqs: E
.. [<ffffffff81660d64>] .... __schedule+0x3b4/0x840
4) State 2 (PDIEX)
pc: 00000002 irqs: D
.. [<ffffffff81666510>] .... _raw_spin_unlock_irq+0x30/0x60
5) State 5 (PDIDX)
pc: 00000001 irqs: D
.. [<ffffffff81665dc7>] .... _raw_spin_lock_irq+0x17/0x50
6) State 2 (PDIEX)
pc: 00000001 irqs: D
.. [<ffffffff810b8e19>] .... rcu_note_context_switch+0x99/0x300
7) State 5 (PDIDX)
pc: 00000001 irqs: D
.. [<ffffffff810b8de7>] .... rcu_note_context_switch+0x67/0x300
8) State 2 (PDIEX)
pc: 00000001 irqs: E
.. [<ffffffff816609f9>] .... __schedule+0x49/0x840
9) State 0 (PEIE)
pc: 00000001 irqs: E
.. [<ffffffff81661633>] .... schedule_preempt_disabled+0x13/0x30
10) State 2 (PDIEX)
pc: 00000001 irqs: D
.. [<ffffffff810c49e9>] .... tick_nohz_idle_exit+0x129/0x190
11) State 5 (PDIDX)
pc: 00000001 irqs: D
.. [<ffffffff810c48f1>] .... tick_nohz_idle_exit+0x31/0x190
12) State 2 (PDIEX)
pc: 00000001 irqs: D
.. [<ffffffff810b6019>] .... rcu_idle_exit+0x79/0xe0
13) State 5 (PDIDX)
pc: 00000001 irqs: D
.. [<ffffffff810b5fc2>] .... rcu_idle_exit+0x22/0xe0
14) State 2 (PDIEX)
pc: 00000001 irqs: D
.. [<ffffffff814f122f>] .... cpuidle_enter_state+0x5f/0xe0
15) State 5 (PDIDX)
pc: 00000001 irqs: D
.. [<ffffffff810aca4f>] .... cpu_startup_entry+0xbf/0x320
16) State 2 (PDIEX)
pc: 00000001 irqs: D
.. [<ffffffff810c47aa>] .... tick_nohz_idle_enter+0x4a/0x80
Link: http://lkml.kernel.org/r/20140116174536.GB9655@laptop.programming.kicks-ass.net
Link: http://lkml.kernel.org/r/20140116235751.084728c8@gandalf.local.home
Cc: linux-rt-users <linux-rt-users@vger.kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
5 files changed