| From foo@baz Sun May 27 17:33:38 CEST 2018 |
| From: Huang Ying <ying.huang@intel.com> |
| Date: Thu, 5 Apr 2018 16:23:20 -0700 |
| Subject: mm: fix races between address_space dereference and free in page_evictable |
| |
| From: Huang Ying <ying.huang@intel.com> |
| |
| [ Upstream commit e92bb4dd9673945179b1fc738c9817dd91bfb629 ] |
| |
| When page_mapping() is called and the mapping is dereferenced in |
| page_evictable() through shrink_active_list(), it is possible for the |
| inode to be truncated and the embedded address space to be freed at the |
| same time. This may lead to the following race. |
| |
| CPU1                                                CPU2 |
| |
| truncate(inode)                                     shrink_active_list() |
|   ...                                                 page_evictable(page) |
|   truncate_inode_page(mapping, page); |
|     delete_from_page_cache(page) |
|       spin_lock_irqsave(&mapping->tree_lock, flags); |
|         __delete_from_page_cache(page, NULL) |
|           page_cache_tree_delete(..) |
|             ...                                         mapping = page_mapping(page); |
|             page->mapping = NULL; |
|             ... |
|       spin_unlock_irqrestore(&mapping->tree_lock, flags); |
|       page_cache_free_page(mapping, page) |
|         put_page(page) |
|           if (put_page_testzero(page)) -> false |
| - inode now has no pages and can be freed including embedded address_space |
| |
|                                                       mapping_unevictable(mapping) |
|                                                         test_bit(AS_UNEVICTABLE, &mapping->flags); |
| - we've dereferenced mapping which is potentially already free. |
| |
| A similar race exists between swap cache freeing and page_evictable() |
| too. |
| |
| The address_space in the inode and in the swap cache is freed only after |
| an RCU grace period. So the races are fixed by enclosing the |
| page_mapping() call and the address_space usage in |
| rcu_read_lock/unlock(). Some comments are added in the code to make it |
| clear what is protected by the RCU read lock. |
| |
| Link: http://lkml.kernel.org/r/20180212081227.1940-1-ying.huang@intel.com |
| Signed-off-by: "Huang, Ying" <ying.huang@intel.com> |
| Reviewed-by: Jan Kara <jack@suse.cz> |
| Reviewed-by: Andrew Morton <akpm@linux-foundation.org> |
| Cc: Mel Gorman <mgorman@techsingularity.net> |
| Cc: Minchan Kim <minchan@kernel.org> |
| Cc: "Huang, Ying" <ying.huang@intel.com> |
| Cc: Johannes Weiner <hannes@cmpxchg.org> |
| Cc: Michal Hocko <mhocko@suse.com> |
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
| Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> |
| Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| --- |
| mm/vmscan.c | 8 +++++++- |
| 1 file changed, 7 insertions(+), 1 deletion(-) |
| |
| --- a/mm/vmscan.c |
| +++ b/mm/vmscan.c |
| @@ -3857,7 +3857,13 @@ int node_reclaim(struct pglist_data *pgd |
|   */ |
|  int page_evictable(struct page *page) |
|  { |
| -	return !mapping_unevictable(page_mapping(page)) && !PageMlocked(page); |
| +	int ret; |
| + |
| +	/* Prevent address_space of inode and swap cache from being freed */ |
| +	rcu_read_lock(); |
| +	ret = !mapping_unevictable(page_mapping(page)) && !PageMlocked(page); |
| +	rcu_read_unlock(); |
| +	return ret; |
|  } |
|   |
|  #ifdef CONFIG_SHMEM |