| From bippy-5f407fcff5a0 Mon Sep 17 00:00:00 2001 |
| From: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| To: <linux-cve-announce@vger.kernel.org> |
| Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org> |
| Subject: CVE-2024-57977: memcg: fix soft lockup in the OOM process |
| |
| Description |
| =========== |
| |
| In the Linux kernel, the following vulnerability has been resolved: |
| |
| memcg: fix soft lockup in the OOM process |
| |
| A soft lockup was found in a product in which about 56,000 tasks were in |
| the OOM cgroup; the kernel was traversing them when the soft lockup was |
| triggered. |
| |
| watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [VM Thread:1503066] |
| CPU: 2 PID: 1503066 Comm: VM Thread Kdump: loaded Tainted: G |
| Hardware name: Huawei Cloud OpenStack Nova, BIOS |
| RIP: 0010:console_unlock+0x343/0x540 |
| RSP: 0000:ffffb751447db9a0 EFLAGS: 00000247 ORIG_RAX: ffffffffffffff13 |
| RAX: 0000000000000001 RBX: 0000000000000000 RCX: 00000000ffffffff |
| RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000247 |
| RBP: ffffffffafc71f90 R08: 0000000000000000 R09: 0000000000000040 |
| R10: 0000000000000080 R11: 0000000000000000 R12: ffffffffafc74bd0 |
| R13: ffffffffaf60a220 R14: 0000000000000247 R15: 0000000000000000 |
| CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 |
| CR2: 00007f2fe6ad91f0 CR3: 00000004b2076003 CR4: 0000000000360ee0 |
| DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 |
| DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 |
| Call Trace: |
| vprintk_emit+0x193/0x280 |
| printk+0x52/0x6e |
| dump_task+0x114/0x130 |
| mem_cgroup_scan_tasks+0x76/0x100 |
| dump_header+0x1fe/0x210 |
| oom_kill_process+0xd1/0x100 |
| out_of_memory+0x125/0x570 |
| mem_cgroup_out_of_memory+0xb5/0xd0 |
| try_charge+0x720/0x770 |
| mem_cgroup_try_charge+0x86/0x180 |
| mem_cgroup_try_charge_delay+0x1c/0x40 |
| do_anonymous_page+0xb5/0x390 |
| handle_mm_fault+0xc4/0x1f0 |
| |
| This is because thousands of processes are in the OOM cgroup, and it takes |
| a long time to traverse all of them. As a result, this leads to a soft |
| lockup in the OOM process. |
| |
| To fix this issue, call 'cond_resched' in the 'mem_cgroup_scan_tasks' |
| function once every 1000 iterations. For global OOM, call |
| 'touch_softlockup_watchdog' once every 1000 iterations to avoid this issue. |
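
The pattern described above can be illustrated with a minimal userspace
sketch. This is not the kernel patch: sched_yield() stands in for
'cond_resched', and scan_tasks, count_task, struct scan_stats, and
YIELD_INTERVAL are hypothetical names introduced only for illustration.
The interval of 1000 is taken from the description above.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Assumption: yield once every 1000 iterations, as the description states. */
#define YIELD_INTERVAL 1000

/* Hypothetical callback type: a non-zero return value stops the scan early. */
typedef int (*task_fn)(int task_id, void *arg);

/* Hypothetical bookkeeping so the caller can observe what the loop did. */
struct scan_stats {
    size_t visited; /* number of tasks passed to the callback */
    size_t yields;  /* number of times the loop gave up the CPU */
};

/*
 * Walk a long list of tasks, yielding the CPU periodically so that a single
 * traversal cannot monopolize it. In the kernel the yield point would be
 * cond_resched(); sched_yield() is the userspace stand-in used here.
 */
static int scan_tasks(const int *tasks, size_t n, task_fn fn, void *arg,
                      struct scan_stats *stats)
{
    int ret = 0;

    assert(fn != NULL && stats != NULL);
    stats->visited = 0;
    stats->yields = 0;

    for (size_t i = 0; i < n && ret == 0; i++) {
        /* Every YIELD_INTERVAL-th iteration, let other work run. */
        if ((i + 1) % YIELD_INTERVAL == 0) {
            sched_yield();
            stats->yields++;
        }
        ret = fn(tasks[i], arg);
        stats->visited++;
    }
    return ret;
}

/* Example callback: counts the tasks it sees and never stops the scan. */
static int count_task(int task_id, void *arg)
{
    (void)task_id;
    (*(size_t *)arg)++;
    return 0;
}
```

Called as scan_tasks(tasks, 56000, count_task, &seen, &stats), the loop
visits all 56,000 entries and yields 56 times. Checking a counter keeps the
cost on the common path to one comparison per task, while bounding how long
the traversal can run without a scheduling point.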
| |
| The Linux kernel CVE team has assigned CVE-2024-57977 to this issue. |
| |
| |
| Affected and fixed versions |
| =========================== |
| |
| Issue introduced in 3.6 with commit 9cbb78bb314360a860a8b23723971cb6fcb54176 and fixed in 5.4.291 with commit 72f2c0b7c152c2983ed51d48c3272cab4f34d965 |
| Issue introduced in 3.6 with commit 9cbb78bb314360a860a8b23723971cb6fcb54176 and fixed in 5.10.235 with commit 110399858194c71f11afefad6e7be9e3876b284f |
| Issue introduced in 3.6 with commit 9cbb78bb314360a860a8b23723971cb6fcb54176 and fixed in 5.15.179 with commit a9042dbc1ed4bf25a5f5c699d10c3d676abf8ca2 |
| Issue introduced in 3.6 with commit 9cbb78bb314360a860a8b23723971cb6fcb54176 and fixed in 6.1.130 with commit 0a09d56e1682c951046bf15542b3e9553046c9f6 |
| Issue introduced in 3.6 with commit 9cbb78bb314360a860a8b23723971cb6fcb54176 and fixed in 6.6.80 with commit 972486d37169fe85035e81b8c5dff21f70df1173 |
| Issue introduced in 3.6 with commit 9cbb78bb314360a860a8b23723971cb6fcb54176 and fixed in 6.12.13 with commit c3a3741db8c1202aa959c77df3a4c361612d1eb1 |
| Issue introduced in 3.6 with commit 9cbb78bb314360a860a8b23723971cb6fcb54176 and fixed in 6.13.2 with commit 46576834291869457d4772bb7df72d7c2bb3d57f |
| Issue introduced in 3.6 with commit 9cbb78bb314360a860a8b23723971cb6fcb54176 and fixed in 6.14 with commit ade81479c7dda1ce3eedb215c78bc615bbd04f06 |
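
As a rough aid for reading the list above, the following sketch checks an
upstream version number against it. The function is_affected and the table
fixed_releases are hypothetical names; the version numbers are copied from
the list. It assumes an unmodified upstream kernel: vendor kernels may carry
backported fixes that a version comparison cannot see, and the list itself
may grow as fixes are backported.

```c
#include <assert.h>
#include <stddef.h>

/* First fixed release of each stable series, copied from the list above. */
struct fixed_release {
    int major;
    int minor;
    int patch; /* first patch level in this series that contains the fix */
};

static const struct fixed_release fixed_releases[] = {
    {5, 4, 291}, {5, 10, 235}, {5, 15, 179}, {6, 1, 130},
    {6, 6, 80},  {6, 12, 13},  {6, 13, 2},
};

/*
 * Returns 1 if upstream kernel major.minor.patch falls in the affected
 * range according to the list above, 0 otherwise.
 */
static int is_affected(int major, int minor, int patch)
{
    assert(major >= 0 && minor >= 0 && patch >= 0);

    /* The issue was introduced in 3.6; older kernels predate it. */
    if (major < 3 || (major == 3 && minor < 6))
        return 0;

    /* The fix is in mainline 6.14, so 6.14 and later are not affected. */
    if (major > 6 || (major == 6 && minor >= 14))
        return 0;

    for (size_t i = 0;
         i < sizeof(fixed_releases) / sizeof(fixed_releases[0]); i++) {
        const struct fixed_release *f = &fixed_releases[i];

        if (f->major == major && f->minor == minor)
            return patch < f->patch;
    }

    /* In the affected range, but no fixed release is listed for the series. */
    return 1;
}
```

For example, is_affected(5, 4, 290) returns 1 and is_affected(5, 4, 291)
returns 0, while a series with no listed fix, such as 4.19, is reported as
affected at every patch level.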
| |
| Please see https://www.kernel.org for a full list of kernel versions |
| currently supported by the kernel community. |
| |
| Unaffected versions might change over time as fixes are backported to |
| older supported kernel versions. The official CVE entry at |
| https://cve.org/CVERecord/?id=CVE-2024-57977 |
| will be updated if fixes are backported; please check it for the most |
| up-to-date information about this issue. |
| |
| |
| Affected files |
| ============== |
| |
| The file(s) affected by this issue are: |
| mm/memcontrol.c |
| mm/oom_kill.c |
| |
| |
| Mitigation |
| ========== |
| |
| The Linux kernel CVE team recommends that you update to the latest |
| stable kernel version for this, and many other bugfixes. Individual |
| changes are never tested alone, but rather are part of a larger kernel |
| release. Cherry-picking individual commits is not recommended or |
| supported by the Linux kernel community at all. If, however, updating to |
| the latest release is impossible, the individual changes to resolve this |
| issue can be found at these commits: |
| https://git.kernel.org/stable/c/72f2c0b7c152c2983ed51d48c3272cab4f34d965 |
| https://git.kernel.org/stable/c/110399858194c71f11afefad6e7be9e3876b284f |
| https://git.kernel.org/stable/c/a9042dbc1ed4bf25a5f5c699d10c3d676abf8ca2 |
| https://git.kernel.org/stable/c/0a09d56e1682c951046bf15542b3e9553046c9f6 |
| https://git.kernel.org/stable/c/972486d37169fe85035e81b8c5dff21f70df1173 |
| https://git.kernel.org/stable/c/c3a3741db8c1202aa959c77df3a4c361612d1eb1 |
| https://git.kernel.org/stable/c/46576834291869457d4772bb7df72d7c2bb3d57f |
| https://git.kernel.org/stable/c/ade81479c7dda1ce3eedb215c78bc615bbd04f06 |