| From: Michal Hocko <mhocko@suse.com> |
| Subject: memcg, oom: do not bypass oom killer for dying tasks |
| Date: Wed, 2 Apr 2025 11:01:17 +0200 |
| |
| Commit 7775face2079 ("memcg: killed threads should not invoke memcg OOM |
| killer") added a bypass of the oom killer path for dying threads because a |
| very specific workload (described in its changelog) could hit the "no |
| killable tasks" path. This is not a fatal condition in itself, but it |
| could be annoying if it were a common case. |
| |
| On the other hand, the bypass has some issues of its own. Without |
| triggering the oom killer we won't be able to trigger async oom reclaim |
| (oom_reaper), which can also operate on killed tasks as long as they |
| still have their mm available. This could be the case during futex |
| cleanup, as pointed out by Johannes in [1]. That case is still not fully |
| understood, but let's drop this bypass, which was mostly driven by an |
| artificial workload, and allow dying tasks to go into the oom path. This |
| will make the code easier to reason about and also help corner cases |
| where oom_reaper could help to release memory. |
| |
| Link: https://lore.kernel.org/all/20241212183012.GB1026@cmpxchg.org/T/#u [1] |
| Link: https://lkml.kernel.org/r/20250402090117.130245-1-mhocko@kernel.org |
| Signed-off-by: Michal Hocko <mhocko@suse.com> |
| Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> |
| Suggested-by: Johannes Weiner <hannes@cmpxchg.org> |
| Acked-by: Shakeel Butt <shakeel.butt@linux.dev> |
| Acked-by: David Rientjes <rientjes@google.com> |
| Cc: Muchun Song <muchun.song@linux.dev> |
| Cc: Rik van Riel <riel@surriel.com> |
| Cc: Roman Gushchin <roman.gushchin@linux.dev> |
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
| --- |
| |
| mm/memcontrol.c | 2 +- |
| 1 file changed, 1 insertion(+), 1 deletion(-) |
| |
| --- a/mm/memcontrol.c~memcg-oom-do-not-bypass-oom-killer-for-dying-tasks |
| +++ a/mm/memcontrol.c |
| @@ -1664,7 +1664,7 @@ static bool mem_cgroup_out_of_memory(str |
| * A few threads which were not waiting at mutex_lock_killable() can |
| * fail to bail out. Therefore, check again after holding oom_lock. |
| */ |
| - ret = task_is_dying() || out_of_memory(&oc); |
| + ret = out_of_memory(&oc); |
| |
| unlock: |
| mutex_unlock(&oom_lock); |
| _ |