| From 0b28179a6138a5edd9d82ad2687c05b3773c387b Mon Sep 17 00:00:00 2001 |
| From: Vasily Averin <vvs@virtuozzo.com> |
| Date: Fri, 5 Nov 2021 13:38:02 -0700 |
| Subject: mm, oom: pagefault_out_of_memory: don't force global OOM for dying tasks |
| |
| From: Vasily Averin <vvs@virtuozzo.com> |
| |
| commit 0b28179a6138a5edd9d82ad2687c05b3773c387b upstream. |
| |
| Patch series "memcg: prohibit unconditional exceeding the limit of dying tasks", v3. |
| |
| Memory cgroup charging allows killed or exiting tasks to exceed the hard |
| limit. This can be misused to trigger a global OOM from inside a |
| memcg-limited container. On the other hand, if memcg fails an allocation |
| made from inside the #PF handler, it triggers a global OOM from inside |
| pagefault_out_of_memory(). |
| |
| To prevent these problems this patchset: |
| (a) removes execution of out_of_memory() from |
| pagefault_out_of_memory(), because nobody can explain why it is |
| necessary. |
| (b) allows memcg to fail allocation of dying/killed tasks. |
| |
| This patch (of 3): |
| |
| Any allocation failure during the #PF path will return with VM_FAULT_OOM, |
| which in turn results in pagefault_out_of_memory(), which in turn executes |
| out_of_memory() and can kill a random task. |
| |
| An allocation might fail when the current task is the oom victim and |
| there are no memory reserves left. The OOM killer is already handled at |
| the page allocator level for the global OOM and at the charging level |
| for the memcg one. Both have much more information about the scope of |
| allocation/charge request. This means that either the OOM killer has |
| been invoked properly and did not lead to a successful allocation, or it |
| has been skipped because it could not have been invoked. In both cases |
| triggering it from here is pointless and even harmful. |
| |
| It makes much more sense to let the killed task die rather than to wake |
| up an eternally hungry oom-killer and send him to choose a fatter victim |
| for breakfast. |
| |
| Link: https://lkml.kernel.org/r/0828a149-786e-7c06-b70a-52d086818ea3@virtuozzo.com |
| Signed-off-by: Vasily Averin <vvs@virtuozzo.com> |
| Suggested-by: Michal Hocko <mhocko@suse.com> |
| Acked-by: Michal Hocko <mhocko@suse.com> |
| Cc: Johannes Weiner <hannes@cmpxchg.org> |
| Cc: Mel Gorman <mgorman@techsingularity.net> |
| Cc: Roman Gushchin <guro@fb.com> |
| Cc: Shakeel Butt <shakeelb@google.com> |
| Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> |
| Cc: Uladzislau Rezki <urezki@gmail.com> |
| Cc: Vladimir Davydov <vdavydov.dev@gmail.com> |
| Cc: Vlastimil Babka <vbabka@suse.cz> |
| Cc: <stable@vger.kernel.org> |
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
| Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| --- |
| mm/oom_kill.c | 3 +++ |
| 1 file changed, 3 insertions(+) |
| |
| --- a/mm/oom_kill.c |
| +++ b/mm/oom_kill.c |
| @@ -1131,6 +1131,9 @@ void pagefault_out_of_memory(void) |
| if (mem_cgroup_oom_synchronize(true)) |
| return; |
| |
| + if (fatal_signal_pending(current)) |
| + return; |
| + |
| if (!mutex_trylock(&oom_lock)) |
| return; |
| out_of_memory(&oc); |