| From foo@baz Mon Sep 17 12:33:31 CEST 2018 |
| From: Gaurav Kohli <gkohli@codeaurora.org> |
| Date: Thu, 2 Aug 2018 14:21:03 +0530 |
| Subject: timers: Clear timer_base::must_forward_clk with timer_base::lock held |
| |
| From: Gaurav Kohli <gkohli@codeaurora.org> |
| |
| [ Upstream commit 363e934d8811d799c88faffc5bfca782fd728334 ] |
| |
| timer_base::must_forward_clock is indicating that the base clock might be |
| stale due to a long idle sleep. |
| |
| The forwarding of the base clock takes place in the timer softirq or when a |
| timer is enqueued to a base which is idle. If the enqueue of timer to an |
| idle base happens from a remote CPU, then the following race can happen: |
| |
| CPU0 CPU1 |
| run_timer_softirq mod_timer |
| |
| base = lock_timer_base(timer); |
| base->must_forward_clk = false |
| if (base->must_forward_clk) |
| forward(base); -> skipped |
| |
| enqueue_timer(base, timer, idx); |
| -> idx is calculated high due to |
| stale base |
| unlock_timer_base(timer); |
| base = lock_timer_base(timer); |
| forward(base); |
| |
| The root cause is that timer_base::must_forward_clk is cleared outside the |
| timer_base::lock held region, so the remote queuing CPU observes it as |
| cleared, but the base clock is still stale. This can cause large |
| granularity values for timers, i.e. the accuracy of the expiry time |
| suffers. |
| |
| Prevent this by clearing the flag with timer_base::lock held, so that the |
| forwarding takes place before the cleared flag is observable by a remote |
| CPU. |
| |
| Signed-off-by: Gaurav Kohli <gkohli@codeaurora.org> |
| Signed-off-by: Thomas Gleixner <tglx@linutronix.de> |
| Cc: john.stultz@linaro.org |
| Cc: sboyd@kernel.org |
| Cc: linux-arm-msm@vger.kernel.org |
| Link: https://lkml.kernel.org/r/1533199863-22748-1-git-send-email-gkohli@codeaurora.org |
| Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> |
| Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| --- |
| kernel/time/timer.c | 29 ++++++++++++++++------------- |
| 1 file changed, 16 insertions(+), 13 deletions(-) |
| |
| --- a/kernel/time/timer.c |
| +++ b/kernel/time/timer.c |
| @@ -1609,6 +1609,22 @@ static inline void __run_timers(struct t |
| |
| raw_spin_lock_irq(&base->lock); |
| |
| + /* |
| + * timer_base::must_forward_clk must be cleared before running |
| + * timers so that any timer functions that call mod_timer() will |
| + * not try to forward the base. Idle tracking / clock forwarding |
| + * logic is only used with BASE_STD timers. |
| + * |
| + * The must_forward_clk flag is cleared unconditionally also for |
| + * the deferrable base. The deferrable base is not affected by idle |
| + * tracking and never forwarded, so clearing the flag is a NOOP. |
| + * |
| + * The fact that the deferrable base is never forwarded can cause |
| + * large variations in granularity for deferrable timers, but they |
| + * can be deferred for long periods due to idle anyway. |
| + */ |
| + base->must_forward_clk = false; |
| + |
| while (time_after_eq(jiffies, base->clk)) { |
| |
| levels = collect_expired_timers(base, heads); |
| @@ -1628,19 +1644,6 @@ static __latent_entropy void run_timer_s |
| { |
| struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_STD]); |
| |
| - /* |
| - * must_forward_clk must be cleared before running timers so that any |
| - * timer functions that call mod_timer will not try to forward the |
| - * base. idle trcking / clock forwarding logic is only used with |
| - * BASE_STD timers. |
| - * |
| - * The deferrable base does not do idle tracking at all, so we do |
| - * not forward it. This can result in very large variations in |
| - * granularity for deferrable timers, but they can be deferred for |
| - * long periods due to idle. |
| - */ |
| - base->must_forward_clk = false; |
| - |
| __run_timers(base); |
| if (IS_ENABLED(CONFIG_NO_HZ_COMMON)) |
| __run_timers(this_cpu_ptr(&timer_bases[BASE_DEF])); |