| From bippy-5f407fcff5a0 Mon Sep 17 00:00:00 2001 |
| From: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| To: <linux-cve-announce@vger.kernel.org> |
| Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org> |
| Subject: CVE-2024-27004: clk: Get runtime PM before walking tree during disable_unused |
| |
| Description |
| =========== |
| |
| In the Linux kernel, the following vulnerability has been resolved: |
| |
| clk: Get runtime PM before walking tree during disable_unused |
| |
| Doug reported [1] the following hung task: |
| |
| INFO: task swapper/0:1 blocked for more than 122 seconds. |
| Not tainted 5.15.149-21875-gf795ebc40eb8 #1 |
| "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. |
| task:swapper/0 state:D stack: 0 pid: 1 ppid: 0 flags:0x00000008 |
| Call trace: |
| __switch_to+0xf4/0x1f4 |
| __schedule+0x418/0xb80 |
| schedule+0x5c/0x10c |
| rpm_resume+0xe0/0x52c |
| rpm_resume+0x178/0x52c |
| __pm_runtime_resume+0x58/0x98 |
| clk_pm_runtime_get+0x30/0xb0 |
| clk_disable_unused_subtree+0x58/0x208 |
| clk_disable_unused_subtree+0x38/0x208 |
| clk_disable_unused_subtree+0x38/0x208 |
| clk_disable_unused_subtree+0x38/0x208 |
| clk_disable_unused_subtree+0x38/0x208 |
| clk_disable_unused+0x4c/0xe4 |
| do_one_initcall+0xcc/0x2d8 |
| do_initcall_level+0xa4/0x148 |
| do_initcalls+0x5c/0x9c |
| do_basic_setup+0x24/0x30 |
| kernel_init_freeable+0xec/0x164 |
| kernel_init+0x28/0x120 |
| ret_from_fork+0x10/0x20 |
| INFO: task kworker/u16:0:9 blocked for more than 122 seconds. |
| Not tainted 5.15.149-21875-gf795ebc40eb8 #1 |
| "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. |
| task:kworker/u16:0 state:D stack: 0 pid: 9 ppid: 2 flags:0x00000008 |
| Workqueue: events_unbound deferred_probe_work_func |
| Call trace: |
| __switch_to+0xf4/0x1f4 |
| __schedule+0x418/0xb80 |
| schedule+0x5c/0x10c |
| schedule_preempt_disabled+0x2c/0x48 |
| __mutex_lock+0x238/0x488 |
| __mutex_lock_slowpath+0x1c/0x28 |
| mutex_lock+0x50/0x74 |
| clk_prepare_lock+0x7c/0x9c |
| clk_core_prepare_lock+0x20/0x44 |
| clk_prepare+0x24/0x30 |
| clk_bulk_prepare+0x40/0xb0 |
| mdss_runtime_resume+0x54/0x1c8 |
| pm_generic_runtime_resume+0x30/0x44 |
| __genpd_runtime_resume+0x68/0x7c |
| genpd_runtime_resume+0x108/0x1f4 |
| __rpm_callback+0x84/0x144 |
| rpm_callback+0x30/0x88 |
| rpm_resume+0x1f4/0x52c |
| rpm_resume+0x178/0x52c |
| __pm_runtime_resume+0x58/0x98 |
| __device_attach+0xe0/0x170 |
| device_initial_probe+0x1c/0x28 |
| bus_probe_device+0x3c/0x9c |
| device_add+0x644/0x814 |
| mipi_dsi_device_register_full+0xe4/0x170 |
| devm_mipi_dsi_device_register_full+0x28/0x70 |
| ti_sn_bridge_probe+0x1dc/0x2c0 |
| auxiliary_bus_probe+0x4c/0x94 |
| really_probe+0xcc/0x2c8 |
| __driver_probe_device+0xa8/0x130 |
| driver_probe_device+0x48/0x110 |
| __device_attach_driver+0xa4/0xcc |
| bus_for_each_drv+0x8c/0xd8 |
| __device_attach+0xf8/0x170 |
| device_initial_probe+0x1c/0x28 |
| bus_probe_device+0x3c/0x9c |
| deferred_probe_work_func+0x9c/0xd8 |
| process_one_work+0x148/0x518 |
| worker_thread+0x138/0x350 |
| kthread+0x138/0x1e0 |
| ret_from_fork+0x10/0x20 |
| |
| The first thread is walking the clk tree and calling |
| clk_pm_runtime_get() to power on devices required to read the clk |
| hardware via struct clk_ops::is_enabled(). This thread holds the clk |
| prepare_lock, and is trying to runtime PM resume a device, when it finds |
| that the device is in the process of resuming so the thread schedule()s |
| away waiting for the device to finish resuming before continuing. The |
| second thread is runtime PM resuming the same device, but the runtime |
| resume callback is calling clk_prepare(), trying to grab the |
| prepare_lock waiting on the first thread. |
| |
| This is a classic ABBA deadlock. To properly fix the deadlock, we must |
| never runtime PM resume or suspend a device with the clk prepare_lock |
| held. Actually doing that is near impossible today because the global |
| prepare_lock would have to be dropped in the middle of the tree, the |
| device runtime PM resumed/suspended, and then the prepare_lock grabbed |
| again to ensure consistency of the clk tree topology. If anything |
| changes with the clk tree in the meantime, we've lost and will need to |
| start the operation all over again. |
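
The wait-for cycle described above can be sketched as a small userspace model. This is only an illustration, not kernel code: the names wait_state, is_deadlocked() and reported_hang() are hypothetical, and the two "resources" stand in for the clk prepare_lock and the in-progress runtime resume of the device.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not kernel code: two resources and two tasks. */
enum resource { NO_RESOURCE = -1, PREPARE_LOCK, DEVICE_RESUME, NUM_RESOURCES };
enum task { NO_TASK = -1, SWAPPER, KWORKER, NUM_TASKS };

struct wait_state {
    enum task owner[NUM_RESOURCES];     /* who currently holds each resource */
    enum resource waits_on[NUM_TASKS];  /* what each task is blocked on */
};

/*
 * Follow the chain "task waits on resource, resource is owned by task".
 * Arriving back at the starting task means nobody in the chain can progress.
 */
static bool is_deadlocked(const struct wait_state *s, enum task start)
{
    enum task t = start;

    assert(start >= 0 && start < NUM_TASKS);
    for (int hops = 0; hops < NUM_TASKS; hops++) {
        enum resource r = s->waits_on[t];

        if (r == NO_RESOURCE)
            return false;   /* this task is runnable, the chain ends */
        t = s->owner[r];
        if (t == NO_TASK)
            return false;   /* resource is free, the waiter will get it */
        if (t == start)
            return true;    /* cycle: A waits on B while B waits on A */
    }
    return false;
}

/* State matching the two call traces in the report. */
static struct wait_state reported_hang(void)
{
    struct wait_state s;

    s.owner[PREPARE_LOCK] = SWAPPER;      /* held by the clk tree walk */
    s.owner[DEVICE_RESUME] = KWORKER;     /* runtime resume in progress */
    s.waits_on[SWAPPER] = DEVICE_RESUME;  /* blocked in clk_pm_runtime_get() */
    s.waits_on[KWORKER] = PREPARE_LOCK;   /* blocked in clk_prepare() */
    return s;
}
```

In this model, is_deadlocked() reports a cycle for the state built by reported_hang(). Setting owner[PREPARE_LOCK] to NO_TASK, which corresponds to not holding the prepare_lock while waiting for the resume, breaks the cycle.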
| |
| Luckily, most of the time we're simply incrementing or decrementing the |
| runtime PM count on an active device, so we don't have the chance to |
| schedule away with the prepare_lock held. Let's fix this immediate |
| problem that can be triggered more easily by simply booting on Qualcomm |
| sc7180. |
| |
| Introduce a list of clk_core structures that have been registered, or |
| are in the process of being registered, that require runtime PM to |
| operate. Iterate this list and call clk_pm_runtime_get() on each of them |
| without holding the prepare_lock during clk_disable_unused(). This way |
| we can be certain that the runtime PM state of the devices will be |
| active and resumed so we can't schedule away while walking the clk tree |
| with the prepare_lock held. Similarly, call clk_pm_runtime_put() without |
| the prepare_lock held to properly drop the runtime PM reference. We |
| remove the calls to clk_pm_runtime_{get,put}() in this path because |
| they're superfluous now that we know the devices are runtime resumed. |
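
The ordering described above (get runtime PM references on every listed clk, walk the tree with the lock held, then put the references) can be sketched in userspace as follows. This is a simplified model under stated assumptions, not the code in drivers/clk/clk.c: the model_* names are hypothetical, the prepare_lock is reduced to a flag, and the clk tree is flattened to an array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel objects; names are illustrative only. */
struct model_dev {
    int usage_count;         /* runtime PM usage count */
    int changes_under_lock;  /* 0 <-> 1 transitions seen with the lock held */
};

struct model_clk {
    struct model_dev *dev;       /* NULL when the clk needs no runtime PM */
    bool enabled_in_hw;
    int users;
    struct model_clk *next_rpm;  /* link in the list of clks needing runtime PM */
};

static bool prepare_lock_held;   /* models the global prepare_lock */

static void model_rpm_get(struct model_dev *dev)
{
    if (dev->usage_count == 0 && prepare_lock_held)
        dev->changes_under_lock++;   /* a resume here could sleep */
    dev->usage_count++;
}

static void model_rpm_put(struct model_dev *dev)
{
    assert(dev->usage_count > 0);
    dev->usage_count--;
    if (dev->usage_count == 0 && prepare_lock_held)
        dev->changes_under_lock++;   /* a suspend here could sleep */
}

/* Returns the number of clks turned off. */
static int model_disable_unused(struct model_clk *rpm_list,
                                struct model_clk *clks, size_t n)
{
    int disabled = 0;

    /* Step 1: resume every listed device before taking the lock. */
    for (struct model_clk *c = rpm_list; c; c = c->next_rpm)
        model_rpm_get(c->dev);

    /* Step 2: walk with the lock held; no runtime PM calls in here. */
    prepare_lock_held = true;
    for (size_t i = 0; i < n; i++) {
        /* Reading hardware state requires an active device. */
        assert(!clks[i].dev || clks[i].dev->usage_count > 0);
        if (clks[i].enabled_in_hw && clks[i].users == 0) {
            clks[i].enabled_in_hw = false;
            disabled++;
        }
    }
    prepare_lock_held = false;

    /* Step 3: drop the references only after the lock is released. */
    for (struct model_clk *c = rpm_list; c; c = c->next_rpm)
        model_rpm_put(c->dev);

    return disabled;
}
```

The changes_under_lock counter stays at zero in this model because every transition of the usage count between 0 and 1 happens outside the locked region, which is the property the fix relies on.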
| |
| The Linux kernel CVE team has assigned CVE-2024-27004 to this issue. |
| |
| |
| Affected and fixed versions |
| =========================== |
| |
| Issue introduced in 4.15 with commit 9a34b45397e5a389e25a0c5d39983300d040e5e2 and fixed in 5.4.275 with commit 253ab38d1ee652a596942156978a233970d185ba |
| Issue introduced in 4.15 with commit 9a34b45397e5a389e25a0c5d39983300d040e5e2 and fixed in 5.10.216 with commit 4af115f1a20a3d9093586079206ee37c2ac55123 |
| Issue introduced in 4.15 with commit 9a34b45397e5a389e25a0c5d39983300d040e5e2 and fixed in 5.15.157 with commit a29ec0465dce0b871003698698ac6fa92c9a5034 |
| Issue introduced in 4.15 with commit 9a34b45397e5a389e25a0c5d39983300d040e5e2 and fixed in 6.1.88 with commit a424e713e0cc33d4b969cfda25b9f46df4d7b5bc |
| Issue introduced in 4.15 with commit 9a34b45397e5a389e25a0c5d39983300d040e5e2 and fixed in 6.6.29 with commit 60ff482c4205a5aac3b0595ab794cfd62295dab5 |
| Issue introduced in 4.15 with commit 9a34b45397e5a389e25a0c5d39983300d040e5e2 and fixed in 6.8.8 with commit 115554862294397590088ba02f11f2aba6d5016c |
| Issue introduced in 4.15 with commit 9a34b45397e5a389e25a0c5d39983300d040e5e2 and fixed in 6.9 with commit e581cf5d216289ef292d1a4036d53ce90e122469 |
| |
| Please see https://www.kernel.org for a full list of kernel versions |
| currently supported by the kernel community. |
| |
| Unaffected versions might change over time as fixes are backported to |
| older supported kernel versions. The official CVE entry at |
| https://cve.org/CVERecord/?id=CVE-2024-27004 |
| will be updated if fixes are backported; please check it for the most |
| up-to-date information about this issue. |
| |
| |
| Affected files |
| ============== |
| |
| The file(s) affected by this issue are: |
| drivers/clk/clk.c |
| |
| |
| Mitigation |
| ========== |
| |
| The Linux kernel CVE team recommends that you update to the latest |
| stable kernel version for this, and many other bugfixes. Individual |
| changes are never tested alone, but rather are part of a larger kernel |
| release. Cherry-picking individual commits is not recommended or |
| supported by the Linux kernel community at all. If, however, updating to |
| the latest release is impossible, the individual changes to resolve this |
| issue can be found at these commits: |
| https://git.kernel.org/stable/c/253ab38d1ee652a596942156978a233970d185ba |
| https://git.kernel.org/stable/c/4af115f1a20a3d9093586079206ee37c2ac55123 |
| https://git.kernel.org/stable/c/a29ec0465dce0b871003698698ac6fa92c9a5034 |
| https://git.kernel.org/stable/c/a424e713e0cc33d4b969cfda25b9f46df4d7b5bc |
| https://git.kernel.org/stable/c/60ff482c4205a5aac3b0595ab794cfd62295dab5 |
| https://git.kernel.org/stable/c/115554862294397590088ba02f11f2aba6d5016c |
| https://git.kernel.org/stable/c/e581cf5d216289ef292d1a4036d53ce90e122469 |