timers: Expand DEBUG_OBJECTS_TIMER to check if it ever was used
There have been too many bugs where a timer is removed, either by
del_timer() or even del_timer_sync(), but gets re-armed by a workqueue
or some other task. The timer is then freed while it is still queued to
go off. When the timer eventually fires, its contents no longer exist,
which causes a crash in the timer code.
This is very hard to debug because all evidence of who added the timer is
gone.
Currently, DEBUG_OBJECTS_TIMER will trigger if this happens. But the race
is rare on any single machine (although it is hit thousands of times in
the field) and may depend on performing specific tasks (USB unplug, CPU
hotplug, suspend and resume). On top of that, enabling DEBUG_OBJECTS_TIMER
has too much overhead to run in the field, so it seldom catches these
types of bugs.
Now that timer_shutdown_sync() is to be called before freeing, change the
DEBUG_OBJECTS_TIMER checks to track whether a timer was ever armed, and to
clear that state only where timer_shutdown_sync() is called. If a timer is
armed and then freed without calling timer_shutdown_sync(),
DEBUG_OBJECTS_TIMER will now trigger on it.
This catches cases that are potential issues instead of just catching
when the race condition occurs.
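The idea can be illustrated with a small user-space model. This is only a
sketch, not the debugobjects implementation: every name in it
(model_timer, model_timer_arm() and so on) is hypothetical, and it assumes
that a shut down timer refuses to be re-armed.

```c
#include <stdbool.h>

/* Hypothetical user-space model; these are NOT kernel types or APIs. */
enum model_timer_state {
	MODEL_TIMER_INIT,	/* initialized, never armed */
	MODEL_TIMER_ARMED,	/* armed at least once, not shut down since */
	MODEL_TIMER_SHUTDOWN,	/* shut down, safe to free */
};

struct model_timer {
	enum model_timer_state state;
};

void model_timer_init(struct model_timer *t)
{
	t->state = MODEL_TIMER_INIT;
}

/*
 * Models arming the timer. Assumption of this sketch: a timer that has
 * been shut down cannot be armed again, so the call reports failure.
 */
bool model_timer_arm(struct model_timer *t)
{
	if (t->state == MODEL_TIMER_SHUTDOWN)
		return false;
	t->state = MODEL_TIMER_ARMED;
	return true;
}

/*
 * Models a del_timer()-style removal: the timer is dequeued, but the
 * "was armed" state is kept, because something may still re-arm it.
 */
void model_timer_del(struct model_timer *t)
{
	(void)t;	/* state intentionally unchanged */
}

/* Models a timer_shutdown_sync()-style call: the only way to clear the state. */
void model_timer_shutdown(struct model_timer *t)
{
	t->state = MODEL_TIMER_SHUTDOWN;
}

/* Returns true if freeing the timer now should trigger the debug check. */
bool model_timer_free_warns(const struct model_timer *t)
{
	return t->state == MODEL_TIMER_ARMED;
}
```

In this model, a timer that was armed and then only deleted still warns on
free, even though it is not queued at that moment. That is the difference
from a check that only looks at whether the timer is active when freed.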
Note that delayed work items use timers, but the workqueue code does not
supply a shutdown method of its own, so there is no way to call
timer_shutdown() on delayed work timers correctly. Because of this,
delayed work timers add a state that informs the DEBUG_OBJECTS_TIMER code
that the timer belongs to a delayed work. These timers are treated the old
way: an issue is triggered only if the timer is still active when freed,
and the timer does not need to be shut down first.
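As a rough sketch of the resulting free-time decision, the following
user-space model shows how the delayed work exemption could fit in. The
structure and field names (dw_model_timer, is_delayed_work and so on) are
hypothetical and are not taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical model of the tracked state; NOT the kernel's data layout. */
struct dw_model_timer {
	bool active;		/* currently queued to go off */
	bool ever_armed;	/* armed at least once since init */
	bool shut_down;		/* a shutdown call has been made */
	bool is_delayed_work;	/* timer belongs to a delayed work */
};

/* Returns true if freeing the timer should trigger the debug check. */
bool dw_model_free_warns(const struct dw_model_timer *t)
{
	/* Old rule, applies to every timer: freed while still active. */
	if (t->active)
		return true;

	/* Delayed work timers are exempt from the shutdown requirement. */
	if (t->is_delayed_work)
		return false;

	/* New rule: armed at some point, but never shut down before free. */
	return t->ever_armed && !t->shut_down;
}
```

With this split, a delayed work timer that was armed earlier but is idle
when freed does not warn, while a regular timer in the same situation does.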
Further work may be needed to give the workqueue code a shutdown state as
well.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <jstultz@google.com>
Cc: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
3 files changed