| From a7f4fa01c0db71d7df577343c1bc1235f81e5911 Mon Sep 17 00:00:00 2001 |
| From: Vlastimil Babka <vbabka@suse.cz> |
| Date: Thu, 25 Jun 2020 20:29:24 -0700 |
| Subject: [PATCH] mm, compaction: make capture control handling safe wrt |
| interrupts |
| |
| commit b9e20f0da1f5c9c68689450a8cb436c9486434c8 upstream. |
| |
| Hugh reports: |
| |
| "While stressing compaction, one run oopsed on NULL capc->cc in |
| __free_one_page()'s task_capc(zone): compact_zone_order() had been |
| interrupted, and a page was being freed in the return from interrupt. |
| |
| Though you would not expect it from the source, both gccs I was using |
| (4.8.1 and 7.5.0) had chosen to compile compact_zone_order() with the |
| ".cc = &cc" implemented by mov %rbx,-0xb0(%rbp) immediately before |
| callq compact_zone - long after the "current->capture_control = |
| &capc". An interrupt in between those finds capc->cc NULL (zeroed by |
| an earlier rep stos). |
| |
| This could presumably be fixed by a barrier() before setting |
| current->capture_control in compact_zone_order(); but would also need |
| more care on return from compact_zone(), in order not to risk leaking |
| a page captured by interrupt just before capture_control is reset. |
| |
| Maybe that is the preferable fix, but I felt safer for task_capc() to |
| exclude the rather surprising possibility of capture at interrupt |
| time" |
| |
| I have checked that gcc10 also behaves the same. |
| |
| The advantage of fix in compact_zone_order() is that we don't add |
| another test in the page freeing hot path, and that it might prevent |
| future problems if we stop exposing pointers to uninitialized structures |
| in current task. |
| |
| So this patch implements the suggestion for compact_zone_order() with |
| barrier() (and WRITE_ONCE() to prevent store tearing) for setting |
| current->capture_control, and prevents page leaking with |
| WRITE_ONCE/READ_ONCE in the proper order. |
| |
| Link: http://lkml.kernel.org/r/20200616082649.27173-1-vbabka@suse.cz |
| Fixes: 5e1f0f098b46 ("mm, compaction: capture a page under direct compaction") |
| Signed-off-by: Vlastimil Babka <vbabka@suse.cz> |
| Reported-by: Hugh Dickins <hughd@google.com> |
| Suggested-by: Hugh Dickins <hughd@google.com> |
| Acked-by: Hugh Dickins <hughd@google.com> |
| Cc: Alex Shi <alex.shi@linux.alibaba.com> |
| Cc: Li Wang <liwang@redhat.com> |
| Cc: Mel Gorman <mgorman@techsingularity.net> |
| Cc: <stable@vger.kernel.org> [5.1+] |
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
| Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> |
| |
| diff --git a/mm/compaction.c b/mm/compaction.c |
| index a713628eb6dc..4df0841bd7cd 100644 |
| --- a/mm/compaction.c |
| +++ b/mm/compaction.c |
| @@ -2311,15 +2311,26 @@ static enum compact_result compact_zone_order(struct zone *zone, int order, |
| .page = NULL, |
| }; |
| |
| - current->capture_control = &capc; |
| + /* |
| + * Make sure the structs are really initialized before we expose the |
| + * capture control, in case we are interrupted and the interrupt handler |
| + * frees a page. |
| + */ |
| + barrier(); |
| + WRITE_ONCE(current->capture_control, &capc); |
| |
| ret = compact_zone(&cc, &capc); |
| |
| VM_BUG_ON(!list_empty(&cc.freepages)); |
| VM_BUG_ON(!list_empty(&cc.migratepages)); |
| |
| - *capture = capc.page; |
| - current->capture_control = NULL; |
| + /* |
| + * Make sure we hide capture control first before we read the captured |
| + * page pointer, otherwise an interrupt could free and capture a page |
| + * and we would leak it. |
| + */ |
| + WRITE_ONCE(current->capture_control, NULL); |
| + *capture = READ_ONCE(capc.page); |
| |
| return ret; |
| } |
| -- |
| 2.27.0 |
| |