| From 10c535787436d62ea28156a4b91365fd89b5a432 Mon Sep 17 00:00:00 2001 |
| From: "Paul E. McKenney" <paulmck@kernel.org> |
| Date: Fri, 21 Jan 2022 12:40:08 -0800 |
| Subject: rcu: Don't deboost before reporting expedited quiescent state |
| |
| From: Paul E. McKenney <paulmck@kernel.org> |
| |
| commit 10c535787436d62ea28156a4b91365fd89b5a432 upstream. |
| |
| Currently rcu_preempt_deferred_qs_irqrestore() releases rnp->boost_mtx |
| before reporting the expedited quiescent state. Under heavy real-time |
| load, this can result in this function being preempted before the |
| quiescent state is reported, which can in turn prevent the expedited grace |
| period from completing. Tim Murray reports that the resulting expedited |
| grace periods can take hundreds of milliseconds and even more than one |
| second, when they should normally complete in less than a millisecond. |
| |
| This was fine given that there were no particular response-time |
| constraints for synchronize_rcu_expedited(), as it was designed |
| for throughput rather than latency. However, some users now need |
| sub-100-millisecond response-time constraints. |
| |
| This patch therefore follows Neeraj's suggestion (seconded by Tim and |
| by Uladzislau Rezki) of simply reversing the two operations. |
| |
| Reported-by: Tim Murray <timmurray@google.com> |
| Reported-by: Joel Fernandes <joelaf@google.com> |
| Reported-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> |
| Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com> |
| Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> |
| Tested-by: Tim Murray <timmurray@google.com> |
| Cc: Todd Kjos <tkjos@google.com> |
| Cc: Sandeep Patil <sspatil@google.com> |
| Cc: <stable@vger.kernel.org> # 5.4.x |
| Signed-off-by: Paul E. McKenney <paulmck@kernel.org> |
| Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| --- |
| kernel/rcu/tree_plugin.h | 8 ++++---- |
| 1 file changed, 4 insertions(+), 4 deletions(-) |
| |
| --- a/kernel/rcu/tree_plugin.h |
| +++ b/kernel/rcu/tree_plugin.h |
| @@ -554,16 +554,16 @@ rcu_preempt_deferred_qs_irqrestore(struc |
| raw_spin_unlock_irqrestore_rcu_node(rnp, flags); |
| } |
| |
| - /* Unboost if we were boosted. */ |
| - if (IS_ENABLED(CONFIG_RCU_BOOST) && drop_boost_mutex) |
| - rt_mutex_futex_unlock(&rnp->boost_mtx.rtmutex); |
| - |
| /* |
| * If this was the last task on the expedited lists, |
| * then we need to report up the rcu_node hierarchy. |
| */ |
| if (!empty_exp && empty_exp_now) |
| rcu_report_exp_rnp(rnp, true); |
| + |
| + /* Unboost if we were boosted. */ |
| + if (IS_ENABLED(CONFIG_RCU_BOOST) && drop_boost_mutex) |
| + rt_mutex_futex_unlock(&rnp->boost_mtx.rtmutex); |
| } else { |
| local_irq_restore(flags); |
| } |