Subject: net: Avoid livelock in net_tx_action() on RT
From: Steven Rostedt <srostedt@redhat.com>
Date: Thu, 06 Oct 2011 10:48:39 -0400

qdisc_lock is taken w/o disabling interrupts or bottom halves. So code
holding qdisc_lock can be interrupted, and on !RT softirqs can run on
return from interrupt.

The spin_trylock() in net_tx_action() makes sure that the softirq
does not deadlock. When the lock cannot be acquired, q is requeued and
the NET_TX softirq is raised. That causes the softirq to run over and
over.

That works in mainline as do_softirq() has a retry loop limit and
leaves the softirq processing in the interrupt return path and
schedules ksoftirqd. The task which holds qdisc_lock cannot be
preempted, so the lock is released and either ksoftirqd or the next
softirq in the return from interrupt path can proceed. Though it's a
bit strange to actually run MAX_SOFTIRQ_RESTART (10) loops before it
decides to bail out even if it's clear in the first iteration :)

On RT all softirq processing is done in a FIFO thread and we don't
have a loop limit, so ksoftirqd preempts the lock holder forever and
dequeues and requeues until the reset button is hit.

Due to the forced threading of ksoftirqd on RT we actually cannot
deadlock on qdisc_lock because it's a "sleeping lock". So it's safe to
replace the spin_trylock() with a spin_lock(). When contended,
ksoftirqd is scheduled out and the lock holder can proceed.

[ tglx: Massaged changelog and code comments ]

Solved-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Tested-by: Carsten Emde <cbe@osadl.org>
Cc: Clark Williams <williams@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Claudio R. Goncalves <lclaudio@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 net/core/dev.c |   32 +++++++++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3128,6 +3128,36 @@ int netif_rx_ni(struct sk_buff *skb)
 }
 EXPORT_SYMBOL(netif_rx_ni);
 
+#ifdef CONFIG_PREEMPT_RT_FULL
+/*
+ * RT runs ksoftirqd as a real time thread and the root_lock is a
+ * "sleeping spinlock". If the trylock fails then we can go into an
+ * infinite loop when ksoftirqd preempted the task which actually
+ * holds the lock, because we requeue q and raise NET_TX softirq
+ * causing ksoftirqd to loop forever.
+ *
+ * It's safe to use spin_lock on RT here as softirqs run in thread
+ * context and cannot deadlock against the thread which is holding
+ * root_lock.
+ *
+ * On !RT the trylock might fail, but there we bail out from the
+ * softirq loop after 10 attempts which we can't do on RT. And the
+ * task holding root_lock cannot be preempted, so the only downside of
+ * that trylock is that we need 10 loops to decide that we should have
+ * given up in the first one :)
+ */
+static inline int take_root_lock(spinlock_t *lock)
+{
+	spin_lock(lock);
+	return 1;
+}
+#else
+static inline int take_root_lock(spinlock_t *lock)
+{
+	return spin_trylock(lock);
+}
+#endif
+
 static void net_tx_action(struct softirq_action *h)
 {
 	struct softnet_data *sd = &__get_cpu_var(softnet_data);
@@ -3166,7 +3196,7 @@ static void net_tx_action(struct softirq
 			head = head->next_sched;
 
 			root_lock = qdisc_lock(q);
-			if (spin_trylock(root_lock)) {
+			if (take_root_lock(root_lock)) {
 				smp_mb__before_clear_bit();
 				clear_bit(__QDISC_STATE_SCHED,
 					  &q->state);