| From e0a352fabce61f730341d119fbedf71ffdb8663f Mon Sep 17 00:00:00 2001 |
| From: David Hildenbrand <david@redhat.com> |
| Date: Fri, 1 Feb 2019 14:21:19 -0800 |
| Subject: mm: migrate: don't rely on __PageMovable() of newpage after unlocking it |
| |
| From: David Hildenbrand <david@redhat.com> |
| |
| commit e0a352fabce61f730341d119fbedf71ffdb8663f upstream. |
| |
| We had a race in the old balloon compaction code before b1123ea6d3b3 |
| ("mm: balloon: use general non-lru movable page feature") refactored it |
| that became visible after backporting 195a8c43e93d ("virtio-balloon: |
| deflate via a page list") without the refactoring. |
| |
| The bug existed from commit d6d86c0a7f8d ("mm/balloon_compaction: |
| redesign ballooned pages management") till b1123ea6d3b3 ("mm: balloon: |
| use general non-lru movable page feature"). d6d86c0a7f8d |
| ("mm/balloon_compaction: redesign ballooned pages management") was |
| backported to 3.12, so the broken kernels are stable kernels [3.12 - |
| 4.7]. |
| |
| There was a subtle race between dropping the page lock of the newpage in |
| __unmap_and_move() and checking for __is_movable_balloon_page(newpage). |
| |
| Just after dropping this page lock, virtio-balloon could go ahead and |
| deflate the newpage, effectively dequeueing it and clearing PageBalloon, |
| in turn making __is_movable_balloon_page(newpage) fail. |
| |
| This resulted in dropping the reference of the newpage via |
| putback_lru_page(newpage) instead of put_page(newpage), leading to |
| page->lru getting modified and a !LRU page ending up in the LRU lists. |
| With 195a8c43e93d ("virtio-balloon: deflate via a page list") |
| backported, one would suddenly get corrupted lists in |
| release_pages_balloon(): |
| |
| - WARNING: CPU: 13 PID: 6586 at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0 |
| - list_del corruption. prev->next should be ffffe253961090a0, but was dead000000000100 |
| |
| Nowadays this race is no longer possible, but it is hidden behind very |
| ugly handling of __ClearPageMovable() and __PageMovable(). |
| |
| __ClearPageMovable() will not make __PageMovable() fail, only |
| PageMovable(). So the new check (__PageMovable(newpage)) will still |
| hold even after newpage was dequeued by virtio-balloon. |
| |
| If anybody would ever change that special handling, the BUG would be |
| introduced again. So instead, make it explicit and use the information |
| of the original isolated page before migration. |
| |
| This patch can be backported fairly easy to stable kernels (in contrast |
| to the refactoring). |
| |
| Link: http://lkml.kernel.org/r/20190129233217.10747-1-david@redhat.com |
| Fixes: d6d86c0a7f8d ("mm/balloon_compaction: redesign ballooned pages management") |
| Signed-off-by: David Hildenbrand <david@redhat.com> |
| Reported-by: Vratislav Bendel <vbendel@redhat.com> |
| Acked-by: Michal Hocko <mhocko@suse.com> |
| Acked-by: Rafael Aquini <aquini@redhat.com> |
| Cc: Mel Gorman <mgorman@techsingularity.net> |
| Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> |
| Cc: Michal Hocko <mhocko@suse.com> |
| Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> |
| Cc: Jan Kara <jack@suse.cz> |
| Cc: Andrea Arcangeli <aarcange@redhat.com> |
| Cc: Dominik Brodowski <linux@dominikbrodowski.net> |
| Cc: Matthew Wilcox <willy@infradead.org> |
| Cc: Vratislav Bendel <vbendel@redhat.com> |
| Cc: Rafael Aquini <aquini@redhat.com> |
| Cc: Konstantin Khlebnikov <k.khlebnikov@samsung.com> |
| Cc: Minchan Kim <minchan@kernel.org> |
| Cc: <stable@vger.kernel.org> [3.12 - 4.7] |
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
| Signed-off-by: David Hildenbrand <david@redhat.com> |
| Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| |
| --- |
| mm/migrate.c | 7 +++++-- |
| 1 file changed, 5 insertions(+), 2 deletions(-) |
| |
| --- a/mm/migrate.c |
| +++ b/mm/migrate.c |
| @@ -936,6 +936,7 @@ static ICE_noinline int unmap_and_move(n |
| int rc = MIGRATEPAGE_SUCCESS; |
| int *result = NULL; |
| struct page *newpage; |
| + bool is_lru = !isolated_balloon_page(page); |
| |
| newpage = get_new_page(page, private, &result); |
| if (!newpage) |
| @@ -983,11 +984,13 @@ out: |
| /* |
| * If migration was not successful and there's a freeing callback, use |
| * it. Otherwise, putback_lru_page() will drop the reference grabbed |
| - * during isolation. |
| + * during isolation. Use the old state of the isolated source page to |
| + * determine if we migrated a LRU page. newpage was already unlocked |
| + * and possibly modified by its owner - don't rely on the page state. |
| */ |
| if (put_new_page) |
| put_new_page(newpage, private); |
| - else if (unlikely(__is_movable_balloon_page(newpage))) { |
| + else if (rc == MIGRATEPAGE_SUCCESS && unlikely(!is_lru)) { |
| /* drop our reference, page already in the balloon */ |
| put_page(newpage); |
| } else |