| From bippy-5f407fcff5a0 Mon Sep 17 00:00:00 2001 |
| From: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| To: <linux-cve-announce@vger.kernel.org> |
| Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org> |
| Subject: CVE-2024-56547: rcu/nocb: Fix missed RCU barrier on deoffloading |
| |
| Description |
| =========== |
| |
| In the Linux kernel, the following vulnerability has been resolved: |
| |
| rcu/nocb: Fix missed RCU barrier on deoffloading |
| |
| Currently, running the rcutorture test with torture_type=rcu fwd_progress=8 |
| n_barrier_cbs=8 nocbs_nthreads=8 nocbs_toggle=100 onoff_interval=60 |
| test_boost=2 triggers the following warning: |
| |
| WARNING: CPU: 19 PID: 100 at kernel/rcu/tree_nocb.h:1061 rcu_nocb_rdp_deoffload+0x292/0x2a0 |
| RIP: 0010:rcu_nocb_rdp_deoffload+0x292/0x2a0 |
| Call Trace: |
| <TASK> |
| ? __warn+0x7e/0x120 |
| ? rcu_nocb_rdp_deoffload+0x292/0x2a0 |
| ? report_bug+0x18e/0x1a0 |
| ? handle_bug+0x3d/0x70 |
| ? exc_invalid_op+0x18/0x70 |
| ? asm_exc_invalid_op+0x1a/0x20 |
| ? rcu_nocb_rdp_deoffload+0x292/0x2a0 |
| rcu_nocb_cpu_deoffload+0x70/0xa0 |
| rcu_nocb_toggle+0x136/0x1c0 |
| ? __pfx_rcu_nocb_toggle+0x10/0x10 |
| kthread+0xd1/0x100 |
| ? __pfx_kthread+0x10/0x10 |
| ret_from_fork+0x2f/0x50 |
| ? __pfx_kthread+0x10/0x10 |
| ret_from_fork_asm+0x1a/0x30 |
| </TASK> |
| |
| CPU0                            CPU2                            CPU3 |
| //rcu_nocb_toggle               //nocb_cb_wait                  //rcutorture |
| |
| // deoffload CPU1               // process CPU1's rdp |
| rcu_barrier() |
|     rcu_segcblist_entrain() |
|         rcu_segcblist_add_len(1); |
|         // len == 2 |
|         // enqueue barrier |
|         // callback to CPU1's |
|         // rdp->cblist |
|                                 rcu_do_batch() |
|                                     // invoke CPU1's rdp->cblist |
|                                     // callback |
|                                     rcu_barrier_callback() |
|                                                                 rcu_barrier() |
|                                                                     mutex_lock(&rcu_state.barrier_mutex); |
|                                                                     // still see len == 2 |
|                                                                     // enqueue barrier callback |
|                                                                     // to CPU1's rdp->cblist |
|                                                                     rcu_segcblist_entrain() |
|                                                                         rcu_segcblist_add_len(1); |
|                                                                         // len == 3 |
|                                     // decrement len |
|                                     rcu_segcblist_add_len(-2); |
|                                 kthread_parkme() |
| |
| // CPU1's rdp->cblist len == 1 |
| // Warn because there is |
| // still a pending barrier |
| // trigger warning |
| WARN_ON_ONCE(rcu_segcblist_n_cbs(&rdp->cblist)); |
| cpus_read_unlock(); |
| |
|                                                                     // wait for CPU1 to come online and |
|                                                                     // invoke the barrier callback on |
|                                                                     // CPU1's rdp->cblist |
|                                                                     wait_for_completion(&rcu_state.barrier_completion); |
| // deoffload CPU4 |
| cpus_read_lock() |
|     rcu_barrier() |
|         mutex_lock(&rcu_state.barrier_mutex); |
|         // block on barrier_mutex: |
|         // wait for rcu_barrier() on |
|         // CPU3 to unlock barrier_mutex, |
|         // but CPU3 can only unlock barrier_mutex |
|         // after CPU1 comes online, and |
|         // CPU1 going online will block on cpus_write_lock |
| |
| The above scenario will not only trigger a WARN_ON_ONCE(), but also |
| trigger a deadlock. |
| |
| Thanks to nocb locking, a second racing rcu_barrier() on an offline CPU |
| will either observe the callback counter decremented down to 0 and skip |
| the callback enqueue, or the rcuo kthread will observe the new callback |
| and keep rdp->nocb_cb_sleep set to false. |
| |
| Therefore check rdp->nocb_cb_sleep before parking to make sure no |
| further rcu_barrier() is waiting on the rdp. |
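
The parking rule described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel patch: `struct rdp_model`, `process_batch()`, `may_park()` and the `park_requested` field are hypothetical names invented for illustration, and the real code runs under the nocb lock, which this model leaves out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the per-CPU state; NOT struct rcu_data. */
struct rdp_model {
    long n_cbs;          /* callbacks still queued on the modelled cblist */
    bool nocb_cb_sleep;  /* true only when the callback thread saw no work left */
    bool park_requested; /* stands in for a pending park request */
};

/*
 * Model one pass of the callback thread: account for 'invoked' callbacks,
 * then recompute the sleep flag from what is still queued.
 */
static void process_batch(struct rdp_model *rdp, long invoked)
{
    assert(invoked >= 0 && invoked <= rdp->n_cbs);
    rdp->n_cbs -= invoked;
    rdp->nocb_cb_sleep = (rdp->n_cbs == 0);
}

/*
 * Park only if parking was requested AND the sleep flag says nothing is
 * pending. A barrier callback enqueued by a racing rcu_barrier() keeps
 * the flag false, so the thread makes another pass instead of parking.
 */
static bool may_park(const struct rdp_model *rdp)
{
    return rdp->park_requested && rdp->nocb_cb_sleep;
}
```

Replaying the race from the diagram in this model: start with `n_cbs = 3` (the second barrier callback has already been entrained) and `park_requested = true`, then call `process_batch(&rdp, 2)`. One callback remains, so `may_park()` returns false; only after a further `process_batch(&rdp, 1)` does it return true.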
| |
| The Linux kernel CVE team has assigned CVE-2024-56547 to this issue. |
| |
| |
| Affected and fixed versions |
| =========================== |
| |
| Issue introduced in 6.12 with commit 1fcb932c8b5ce86219d7dedcd63659351a43291c and fixed in 6.12.2 with commit 224b62028959858294789772d372dcb36cf5f820 |
| Issue introduced in 6.12 with commit 1fcb932c8b5ce86219d7dedcd63659351a43291c and fixed in 6.13 with commit 2996980e20b7a54a1869df15b3445374b850b155 |
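
As a rough illustration of the version ranges above, the following sketch checks an upstream-style release string against them. The function name `cve_2024_56547_affected()` is hypothetical, and the check assumes plain mainline/stable numbering; vendor kernels that backport fixes under other version numbers cannot be judged this way.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: returns true if an upstream-style release string
 * (for example "6.12.1") falls in the range listed above, i.e. 6.12.y
 * with y < 2. A missing patch level is treated as 0. Strings that do
 * not start with "major.minor" are reported as not affected, which
 * really means "unknown" here.
 */
static bool cve_2024_56547_affected(const char *release)
{
    int major = 0, minor = 0, patch = 0;

    if (sscanf(release, "%d.%d.%d", &major, &minor, &patch) < 2)
        return false;

    return major == 6 && minor == 12 && patch < 2;
}
```

For instance, "6.12" and "6.12.1" fall inside the range, while "6.12.2", "6.13" and "6.11.9" fall outside it.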
| |
| Please see https://www.kernel.org for a full list of kernel versions |
| currently supported by the kernel community. |
| |
| Unaffected versions might change over time as fixes are backported to |
| older supported kernel versions. The official CVE entry at |
| https://cve.org/CVERecord/?id=CVE-2024-56547 |
| will be updated if fixes are backported. Please check it for the most |
| up-to-date information about this issue. |
| |
| |
| Affected files |
| ============== |
| |
| The file(s) affected by this issue are: |
| kernel/rcu/tree_nocb.h |
| |
| |
| Mitigation |
| ========== |
| |
| The Linux kernel CVE team recommends that you update to the latest |
| stable kernel version for this and many other bug fixes. Individual |
| changes are never tested alone, but rather are part of a larger kernel |
| release. Cherry-picking individual commits is not recommended or |
| supported by the Linux kernel community at all. If, however, updating to |
| the latest release is impossible, the individual changes to resolve this |
| issue can be found at these commits: |
| https://git.kernel.org/stable/c/224b62028959858294789772d372dcb36cf5f820 |
| https://git.kernel.org/stable/c/2996980e20b7a54a1869df15b3445374b850b155 |