| From 509eb76ebf9771abc9fe51859382df2571f11447 Mon Sep 17 00:00:00 2001 |
| From: Will Deacon <will.deacon@arm.com> |
| Date: Wed, 5 Jun 2013 11:20:33 +0100 |
| Subject: ARM: 7747/1: pcpu: ensure __my_cpu_offset cannot be re-ordered across barrier() |
| |
| From: Will Deacon <will.deacon@arm.com> |
| |
| commit 509eb76ebf9771abc9fe51859382df2571f11447 upstream. |
| |
| __my_cpu_offset is non-volatile, since we want its value to be cached |
| when we access several per-cpu variables in a row with preemption |
disabled. This means we rely on the barrier() in preempt_{en,dis}able()
to hazard against the read, so that we can't end up migrating CPUs
without reloading the per-cpu offset.
| |
| Unfortunately, GCC doesn't treat a "memory" clobber on a non-volatile |
| asm block as a side-effect, and will happily re-order it before other |
memory clobbers (including those in preempt_disable()) and cache the
| value. This has been observed to break the cmpxchg logic in the slub |
| allocator, leading to livelock in kmem_cache_alloc in mainline kernels. |
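
For illustration only (a hypothetical sketch, not code from this
patch; barrier() is inlined for self-containment), the problematic
pattern looks like this:

  #define barrier() asm volatile("" : : : "memory")

  static unsigned long broken_cpu_offset(void)
  {
          unsigned long off;

          /*
           * Non-volatile asm with only a "memory" clobber: GCC does
           * not treat the clobber as a side-effect, so the read can
           * be hoisted across barrier() and its result cached.
           */
          asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (off) : : "memory");
          return off;
  }

  void example(void)
  {
          unsigned long a, b;

          a = broken_cpu_offset();
          barrier();               /* e.g. from preempt_disable() */
          b = broken_cpu_offset(); /* may reuse the value in 'a' */
  }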
| |
| This patch adds a dummy memory input operand to __my_cpu_offset, |
| forcing it to be ordered with respect to the barrier() macro. |
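
Concretely (a sketch mirroring the hunk below; the function name is
illustrative), the dummy operand ties the asm to a memory location, so
it must be ordered against any "memory" clobber, including barrier():

  static unsigned long fixed_cpu_offset(void)
  {
          unsigned long off;
          register unsigned long *sp asm("sp");

          /*
           * The "Q" constraint passes *sp as a dummy memory input.
           * The asm now has a memory dependency, so GCC cannot hoist
           * it across barrier(), yet the result may still be cached
           * between barriers since the asm remains non-volatile.
           */
          asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (off) : "Q" (*sp));
          return off;
  }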
| |
| Reviewed-by: Nicolas Pitre <nico@linaro.org> |
| Cc: Rob Herring <rob.herring@calxeda.com> |
| Signed-off-by: Will Deacon <will.deacon@arm.com> |
| Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> |
| Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| |
| --- |
| arch/arm/include/asm/percpu.h | 11 +++++++++-- |
| 1 file changed, 9 insertions(+), 2 deletions(-) |
| |
| --- a/arch/arm/include/asm/percpu.h |
| +++ b/arch/arm/include/asm/percpu.h |
| @@ -30,8 +30,15 @@ static inline void set_my_cpu_offset(uns |
| static inline unsigned long __my_cpu_offset(void) |
| { |
| unsigned long off; |
| - /* Read TPIDRPRW */ |
| - asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (off) : : "memory"); |
| + register unsigned long *sp asm ("sp"); |
| + |
| + /* |
| + * Read TPIDRPRW. |
| + * We want to allow caching the value, so avoid using volatile and |
| + * instead use a fake stack read to hazard against barrier(). |
| + */ |
| + asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (off) : "Q" (*sp)); |
| + |
| return off; |
| } |
| #define __my_cpu_offset __my_cpu_offset() |