| From bippy-5f407fcff5a0 Mon Sep 17 00:00:00 2001 |
| From: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| To: <linux-cve-announce@vger.kernel.org> |
| Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org> |
| Subject: CVE-2024-46797: powerpc/qspinlock: Fix deadlock in MCS queue |
| |
| Description |
| =========== |
| |
| In the Linux kernel, the following vulnerability has been resolved: |
| |
| powerpc/qspinlock: Fix deadlock in MCS queue |
| |
| If an interrupt occurs in queued_spin_lock_slowpath() after we increment |
| qnodesp->count and before node->lock is initialized, another CPU might |
| see stale lock values in get_tail_qnode(). If the stale lock value happens |
| to match the lock on that CPU, then we write to the "next" pointer of |
| the wrong qnode. This causes a deadlock as the former CPU, once it becomes |
| the head of the MCS queue, will spin indefinitely until its "next" pointer |
| is set by its successor in the queue. |
| |
| Running stress-ng on a 16-core (16EC/16VP) shared LPAR results in |
| occasional lockups similar to the following: |
| |
| $ stress-ng --all 128 --vm-bytes 80% --aggressive \ |
| --maximize --oomable --verify --syslog \ |
| --metrics --times --timeout 5m |
| |
| watchdog: CPU 15 Hard LOCKUP |
| ...... |
| NIP [c0000000000b78f4] queued_spin_lock_slowpath+0x1184/0x1490 |
| LR [c000000001037c5c] _raw_spin_lock+0x6c/0x90 |
| Call Trace: |
| 0xc000002cfffa3bf0 (unreliable) |
| _raw_spin_lock+0x6c/0x90 |
| raw_spin_rq_lock_nested.part.135+0x4c/0xd0 |
| sched_ttwu_pending+0x60/0x1f0 |
| __flush_smp_call_function_queue+0x1dc/0x670 |
| smp_ipi_demux_relaxed+0xa4/0x100 |
| xive_muxed_ipi_action+0x20/0x40 |
| __handle_irq_event_percpu+0x80/0x240 |
| handle_irq_event_percpu+0x2c/0x80 |
| handle_percpu_irq+0x84/0xd0 |
| generic_handle_irq+0x54/0x80 |
| __do_irq+0xac/0x210 |
| __do_IRQ+0x74/0xd0 |
| 0x0 |
| do_IRQ+0x8c/0x170 |
| hardware_interrupt_common_virt+0x29c/0x2a0 |
| --- interrupt: 500 at queued_spin_lock_slowpath+0x4b8/0x1490 |
| ...... |
| NIP [c0000000000b6c28] queued_spin_lock_slowpath+0x4b8/0x1490 |
| LR [c000000001037c5c] _raw_spin_lock+0x6c/0x90 |
| --- interrupt: 500 |
| 0xc0000029c1a41d00 (unreliable) |
| _raw_spin_lock+0x6c/0x90 |
| futex_wake+0x100/0x260 |
| do_futex+0x21c/0x2a0 |
| sys_futex+0x98/0x270 |
| system_call_exception+0x14c/0x2f0 |
| system_call_vectored_common+0x15c/0x2ec |
| |
| The following code flow illustrates how the deadlock occurs. |
| For the sake of brevity, assume that both locks (A and B) are |
| contended and we call the queued_spin_lock_slowpath() function. |
| |
|      CPU0                                       CPU1 |
|      ----                                       ---- |
|   spin_lock_irqsave(A)                            | |
|   spin_unlock_irqrestore(A)                       | |
|     spin_lock(B)                                  | |
|          |                                        | |
|          ▼                                        | |
|    id = qnodesp->count++;                         | |
|   (Note that nodes[0].lock == A)                  | |
|          |                                        | |
|          ▼                                        | |
|       Interrupt                                   | |
|   (happens before "nodes[0].lock = B")            | |
|          |                                        | |
|          ▼                                        | |
|   spin_lock_irqsave(A)                            | |
|          |                                        | |
|          ▼                                        | |
|    id = qnodesp->count++                          | |
|    nodes[1].lock = A                              | |
|          |                                        | |
|          ▼                                        | |
|   Tail of MCS queue                               | |
|          |                              spin_lock_irqsave(A) |
|          ▼                                        | |
|   Head of MCS queue                               ▼ |
|          |                              CPU0 is previous tail |
|          ▼                                        | |
|    Spin indefinitely                              ▼ |
|   (until "nodes[1].next != NULL")  prev = get_tail_qnode(A, CPU0) |
|                                                   | |
|                                                   ▼ |
|                                    prev == &qnodes[CPU0].nodes[0] |
|                                 (as qnodes[CPU0].nodes[0].lock == A) |
|                                                   | |
|                                                   ▼ |
|                                     WRITE_ONCE(prev->next, node) |
|                                                   | |
|                                                   ▼ |
|                                           Spin indefinitely |
|                                     (until nodes[0].locked == 1) |
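The flow above can be replayed in a small single-threaded userspace model. This is only a sketch: the types and helpers (lock_model, qnode_model, qnodes_model, claim_slot, release_slot, find_tail_node, replay_flow) are hypothetical simplifications, not the code in arch/powerpc/lib/qspinlock.c, and MAX_NODES is an assumed value. The clear_lock option is one assumed way of getting rid of the stale value; this announcement does not describe the actual change, which is in the commits listed under "Mitigation".

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 4                      /* assumed nesting depth for this model */

struct lock_model { int unused; };       /* stand-in for a spinlock */

struct qnode_model {
    struct qnode_model *next;
    const struct lock_model *lock;       /* lock this node is queued on */
};

struct qnodes_model {                    /* per-CPU node array in the model */
    int count;                           /* number of slots currently claimed */
    struct qnode_model nodes[MAX_NODES];
};

/* Claim the next slot. The caller sets ->lock afterwards, which leaves a
 * window where the slot is claimed but still holds its old lock value. */
static int claim_slot(struct qnodes_model *q)
{
    assert(q->count < MAX_NODES);
    return q->count++;
}

/* Release the most recently claimed slot. With clear_lock == 0 the old
 * lock value stays behind in the slot (the stale value in the diagram). */
static void release_slot(struct qnodes_model *q, int clear_lock)
{
    assert(q->count > 0);
    if (clear_lock)
        q->nodes[q->count - 1].lock = NULL;
    q->count--;
}

/* What the other CPU does: return the first node whose lock matches. */
static struct qnode_model *find_tail_node(struct qnodes_model *q,
                                          const struct lock_model *lock)
{
    for (int i = 0; i < MAX_NODES; i++)
        if (q->nodes[i].lock == lock)
            return &q->nodes[i];
    return NULL;
}

/* Replay CPU0's steps from the diagram and return the index of the node
 * that CPU1 would pick as "prev". The node really queued on A is nodes[1]. */
int replay_flow(int clear_lock_on_release)
{
    static struct lock_model A;
    struct qnodes_model cpu0 = { 0 };
    int idx;

    /* spin_lock_irqsave(A); spin_unlock_irqrestore(A) */
    idx = claim_slot(&cpu0);
    cpu0.nodes[idx].lock = &A;
    release_slot(&cpu0, clear_lock_on_release);

    /* spin_lock(B): slot 0 is claimed, then the interrupt arrives
     * before nodes[0].lock is set to B. */
    (void)claim_slot(&cpu0);

    /* Interrupt handler: spin_lock_irqsave(A) uses slot 1. */
    idx = claim_slot(&cpu0);
    cpu0.nodes[idx].lock = &A;

    /* CPU1: prev = get_tail_qnode(A, CPU0) in the diagram. */
    struct qnode_model *prev = find_tail_node(&cpu0, &A);
    assert(prev != NULL);
    return (int)(prev - cpu0.nodes);
}
```

In this model, replay_flow(0) picks nodes[0], the slot that is about to be used for B, so the "next" pointer of the node actually queued on A is never written. replay_flow(1) picks nodes[1] because the stale value was wiped when the slot was released.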
| |
| Thanks to Saket Kumar Bhaskar for help with recreating the issue. |
| |
| The Linux kernel CVE team has assigned CVE-2024-46797 to this issue. |
| |
| |
| Affected and fixed versions |
| =========================== |
| |
| Issue introduced in 6.2 with commit 84990b169557428c318df87b7836cd15f65b62dc and fixed in 6.6.51 with commit d84ab6661e8d09092de9b034b016515ef9b66085 |
| Issue introduced in 6.2 with commit 84990b169557428c318df87b7836cd15f65b62dc and fixed in 6.10.10 with commit f06af737e4be28c0e926dc25d5f0a111da4e2987 |
| Issue introduced in 6.2 with commit 84990b169557428c318df87b7836cd15f65b62dc and fixed in 6.11 with commit 734ad0af3609464f8f93e00b6c0de1e112f44559 |
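As a rough illustration, the three lines above can be turned into a version check. This is a sketch under assumptions: the function name cve_2024_46797_affected is hypothetical, it only encodes the upstream release numbers listed in this announcement, and it knows nothing about vendor kernels that backport fixes under a different version number or about whether the kernel is built for powerpc at all.

```c
#include <stdbool.h>

/* Returns true if an upstream kernel version major.minor.patch falls in
 * the affected range listed in this announcement: introduced in 6.2,
 * fixed in 6.6.51, 6.10.10 and 6.11. */
bool cve_2024_46797_affected(int major, int minor, int patch)
{
    if (major < 6 || (major == 6 && minor < 2))
        return false;            /* older than the introducing release */
    if (major > 6 || minor >= 11)
        return false;            /* 6.11 and later carry the fix */
    if (minor == 6)
        return patch < 51;       /* 6.6.y is fixed from 6.6.51 */
    if (minor == 10)
        return patch < 10;       /* 6.10.y is fixed from 6.10.10 */
    return true;                 /* other 6.2 to 6.10 series: no fix listed */
}
```

For example, 6.6.50 and 6.8.0 are reported as affected, while 6.1.100, 6.6.51 and 6.11.0 are not. The result is only as current as the list it was written from.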
| |
| Please see https://www.kernel.org for a full list of kernel versions |
| currently supported by the kernel community. |
| |
| Unaffected versions might change over time as fixes are backported to |
| older supported kernel versions. The official CVE entry at |
| https://cve.org/CVERecord/?id=CVE-2024-46797 |
| will be updated if fixes are backported; please check it for the most |
| up-to-date information about this issue. |
| |
| |
| Affected files |
| ============== |
| |
| The file(s) affected by this issue are: |
| arch/powerpc/lib/qspinlock.c |
| |
| |
| Mitigation |
| ========== |
| |
| The Linux kernel CVE team recommends that you update to the latest |
| stable kernel version for this and many other bugfixes. Individual |
| changes are never tested alone, but rather are part of a larger kernel |
| release. Cherry-picking individual commits is not recommended or |
| supported by the Linux kernel community at all. If, however, updating to |
| the latest release is impossible, the individual changes to resolve this |
| issue can be found at these commits: |
| https://git.kernel.org/stable/c/d84ab6661e8d09092de9b034b016515ef9b66085 |
| https://git.kernel.org/stable/c/f06af737e4be28c0e926dc25d5f0a111da4e2987 |
| https://git.kernel.org/stable/c/734ad0af3609464f8f93e00b6c0de1e112f44559 |