| From bippy-5f407fcff5a0 Mon Sep 17 00:00:00 2001 |
| From: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| To: <linux-cve-announce@vger.kernel.org> |
| Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org> |
| Subject: CVE-2024-36888: workqueue: Fix selection of wake_cpu in kick_pool() |
| |
| Description |
| =========== |
| |
| In the Linux kernel, the following vulnerability has been resolved: |
| |
| workqueue: Fix selection of wake_cpu in kick_pool() |
| |
| With cpu_possible_mask=0-63 and cpu_online_mask=0-7 the following |
| kernel oops was observed: |
| |
| smp: Bringing up secondary CPUs ... |
| smp: Brought up 1 node, 8 CPUs |
| Unable to handle kernel pointer dereference in virtual kernel address space |
| Failing address: 0000000000000000 TEID: 0000000000000803 |
| [..] |
| Call Trace: |
| arch_vcpu_is_preempted+0x12/0x80 |
| select_idle_sibling+0x42/0x560 |
| select_task_rq_fair+0x29a/0x3b0 |
| try_to_wake_up+0x38e/0x6e0 |
| kick_pool+0xa4/0x198 |
| __queue_work.part.0+0x2bc/0x3a8 |
| call_timer_fn+0x36/0x160 |
| __run_timers+0x1e2/0x328 |
| __run_timer_base+0x5a/0x88 |
| run_timer_softirq+0x40/0x78 |
| __do_softirq+0x118/0x388 |
| irq_exit_rcu+0xc0/0xd8 |
| do_ext_irq+0xae/0x168 |
| ext_int_handler+0xbe/0xf0 |
| psw_idle_exit+0x0/0xc |
| default_idle_call+0x3c/0x110 |
| do_idle+0xd4/0x158 |
| cpu_startup_entry+0x40/0x48 |
| rest_init+0xc6/0xc8 |
| start_kernel+0x3c4/0x5e0 |
| startup_continue+0x3c/0x50 |
| |
| The crash is caused by calling arch_vcpu_is_preempted() for an offline |
| CPU. To avoid this, select the CPU with cpumask_any_and_distribute(), |
| which restricts __pod_cpumask to the CPUs in cpu_online_mask. If no |
| online CPU is left in the pool, skip the assignment. |
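
The selection rule described above can be modelled in userspace C. This is only a sketch under stated assumptions, not the kernel patch: cpumask64_t, pick_online_cpu() and choose_wake_cpu() are hypothetical names, a 64-bit integer stands in for struct cpumask, and the helper returns the lowest matching CPU rather than distributing picks the way cpumask_any_and_distribute() does.

```c
#include <assert.h>
#include <stdint.h>

#define NO_CPU (-1)

/* Hypothetical stand-in for a cpumask: bit n set means CPU n is in the mask. */
typedef uint64_t cpumask64_t;

/* Return the lowest-numbered CPU present in both masks, or NO_CPU if none. */
static int pick_online_cpu(cpumask64_t pod_mask, cpumask64_t online_mask)
{
    cpumask64_t candidates = pod_mask & online_mask;

    for (int cpu = 0; cpu < 64; cpu++) {
        if (candidates & (UINT64_C(1) << cpu))
            return cpu;
    }
    return NO_CPU;
}

/*
 * Model of the fixed behaviour: the wake CPU is only overridden when the
 * pod contains at least one online CPU. Otherwise the assignment is
 * skipped and the caller's current value is kept.
 */
static int choose_wake_cpu(int current_wake_cpu, cpumask64_t pod_mask,
                           cpumask64_t online_mask)
{
    assert(current_wake_cpu >= 0 && current_wake_cpu < 64);

    int cpu = pick_online_cpu(pod_mask, online_mask);

    if (cpu == NO_CPU)
        return current_wake_cpu; /* no online CPU in the pod: skip */
    return cpu;
}
```

With cpu_online_mask=0-7 (0xFF), a pod covering CPUs 8-15 (0xFF00) leaves the wake CPU unchanged, while a pod covering CPUs 4-11 (0x0FF0) yields CPU 4 in this model.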
| |
| tj: This doesn't fully fix the bug as CPUs can still go down between picking |
| the target CPU and the wake call. Fixing that likely requires adding a |
| cpu_online() test to either the sched or s390 arch code. However, regardless |
| of how that is fixed, workqueue shouldn't be picking a CPU which isn't |
| online as that would result in unpredictable and worse behavior. |
| |
| The Linux kernel CVE team has assigned CVE-2024-36888 to this issue. |
| |
| |
| Affected and fixed versions |
| =========================== |
| |
| Issue introduced in 6.6 with commit 8639ecebc9b1796d7074751a350462f5e1c61cd4 and fixed in 6.6.31 with commit c57824d4fe07c2131f8c48687cbd5ee2be60c767 |
| Issue introduced in 6.6 with commit 8639ecebc9b1796d7074751a350462f5e1c61cd4 and fixed in 6.8.10 with commit 6d559e70b3eb6623935cbe7f94c1912c1099777b |
| Issue introduced in 6.6 with commit 8639ecebc9b1796d7074751a350462f5e1c61cd4 and fixed in 6.9 with commit 57a01eafdcf78f6da34fad9ff075ed5dfdd9f420 |
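
The version ranges listed above can be encoded as a small check. This is a rough sketch, not an official tool: struct kver and version_listed_as_affected() are hypothetical names, and treating every 6.7.y release as affected is an assumption based only on the absence of a 6.7 fix in the list. Vendor kernels with backported fixes will not match this logic.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical representation of an upstream kernel version x.y.z. */
struct kver {
    int major;
    int minor;
    int patch;
};

/*
 * Return true if an upstream release falls inside the ranges listed in
 * this announcement: introduced in 6.6, fixed in 6.6.31, 6.8.10 and 6.9.
 */
static bool version_listed_as_affected(struct kver v)
{
    assert(v.major >= 0 && v.minor >= 0 && v.patch >= 0);

    if (v.major != 6)
        return false; /* before 6.6, or later than the 6.9 fix */

    switch (v.minor) {
    case 6:
        return v.patch < 31; /* fixed in 6.6.31 */
    case 7:
        return true;         /* assumption: no 6.7.y fix is listed */
    case 8:
        return v.patch < 10; /* fixed in 6.8.10 */
    default:
        return false;        /* 6.0-6.5 predate the bug; 6.9+ has the fix */
    }
}
```

For example, 6.6.30 and 6.8.9 are reported as affected, while 6.6.31, 6.8.10 and 6.9 are not.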
| |
| Please see https://www.kernel.org for a full list of kernel versions |
| currently supported by the kernel community. |
| |
| Unaffected versions might change over time as fixes are backported to |
| older supported kernel versions. The official CVE entry at |
| https://cve.org/CVERecord/?id=CVE-2024-36888 |
| will be updated if fixes are backported. Please check it for the most |
| up-to-date information about this issue. |
| |
| |
| Affected files |
| ============== |
| |
| The file(s) affected by this issue are: |
| kernel/workqueue.c |
| |
| |
| Mitigation |
| ========== |
| |
| The Linux kernel CVE team recommends that you update to the latest |
| stable kernel version for this and many other bugfixes. Individual |
| changes are never tested alone, but rather are part of a larger kernel |
| release. Cherry-picking individual commits is not recommended or |
| supported by the Linux kernel community at all. If, however, updating to |
| the latest release is impossible, the individual changes to resolve this |
| issue can be found at these commits: |
| https://git.kernel.org/stable/c/c57824d4fe07c2131f8c48687cbd5ee2be60c767 |
| https://git.kernel.org/stable/c/6d559e70b3eb6623935cbe7f94c1912c1099777b |
| https://git.kernel.org/stable/c/57a01eafdcf78f6da34fad9ff075ed5dfdd9f420 |