| From bippy-5f407fcff5a0 Mon Sep 17 00:00:00 2001 |
| From: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| To: <linux-cve-announce@vger.kernel.org> |
| Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org> |
| Subject: CVE-2024-26873: scsi: hisi_sas: Fix a deadlock issue related to automatic dump |
| |
| Description |
| =========== |
| |
| In the Linux kernel, the following vulnerability has been resolved: |
| |
| scsi: hisi_sas: Fix a deadlock issue related to automatic dump |
| |
| If a PHY is disabled, the device attached to it goes offline. If a 2-bit |
| ECC error occurs at the same time, hung tasks like the following may be |
| reported: |
| |
| [ 4613.652388] INFO: task kworker/u256:0:165233 blocked for more than 120 seconds. |
| [ 4613.666297] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. |
| [ 4613.674809] task:kworker/u256:0 state:D stack: 0 pid:165233 ppid: 2 flags:0x00000208 |
| [ 4613.683959] Workqueue: 0000:74:02.0_disco_q sas_revalidate_domain [libsas] |
| [ 4613.691518] Call trace: |
| [ 4613.694678] __switch_to+0xf8/0x17c |
| [ 4613.698872] __schedule+0x660/0xee0 |
| [ 4613.703063] schedule+0xac/0x240 |
| [ 4613.706994] schedule_timeout+0x500/0x610 |
| [ 4613.711705] __down+0x128/0x36c |
| [ 4613.715548] down+0x240/0x2d0 |
| [ 4613.719221] hisi_sas_internal_abort_timeout+0x1bc/0x260 [hisi_sas_main] |
| [ 4613.726618] sas_execute_internal_abort+0x144/0x310 [libsas] |
| [ 4613.732976] sas_execute_internal_abort_dev+0x44/0x60 [libsas] |
| [ 4613.739504] hisi_sas_internal_task_abort_dev.isra.0+0xbc/0x1b0 [hisi_sas_main] |
| [ 4613.747499] hisi_sas_dev_gone+0x174/0x250 [hisi_sas_main] |
| [ 4613.753682] sas_notify_lldd_dev_gone+0xec/0x2e0 [libsas] |
| [ 4613.759781] sas_unregister_common_dev+0x4c/0x7a0 [libsas] |
| [ 4613.765962] sas_destruct_devices+0xb8/0x120 [libsas] |
| [ 4613.771709] sas_do_revalidate_domain.constprop.0+0x1b8/0x31c [libsas] |
| [ 4613.778930] sas_revalidate_domain+0x60/0xa4 [libsas] |
| [ 4613.784716] process_one_work+0x248/0x950 |
| [ 4613.789424] worker_thread+0x318/0x934 |
| [ 4613.793878] kthread+0x190/0x200 |
| [ 4613.797810] ret_from_fork+0x10/0x18 |
| [ 4613.802121] INFO: task kworker/u256:4:316722 blocked for more than 120 seconds. |
| [ 4613.816026] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. |
| [ 4613.824538] task:kworker/u256:4 state:D stack: 0 pid:316722 ppid: 2 flags:0x00000208 |
| [ 4613.833670] Workqueue: 0000:74:02.0 hisi_sas_rst_work_handler [hisi_sas_main] |
| [ 4613.841491] Call trace: |
| [ 4613.844647] __switch_to+0xf8/0x17c |
| [ 4613.848852] __schedule+0x660/0xee0 |
| [ 4613.853052] schedule+0xac/0x240 |
| [ 4613.856984] schedule_timeout+0x500/0x610 |
| [ 4613.861695] __down+0x128/0x36c |
| [ 4613.865542] down+0x240/0x2d0 |
| [ 4613.869216] hisi_sas_controller_prereset+0x58/0x1fc [hisi_sas_main] |
| [ 4613.876324] hisi_sas_rst_work_handler+0x40/0x8c [hisi_sas_main] |
| [ 4613.883019] process_one_work+0x248/0x950 |
| [ 4613.887732] worker_thread+0x318/0x934 |
| [ 4613.892204] kthread+0x190/0x200 |
| [ 4613.896118] ret_from_fork+0x10/0x18 |
| [ 4613.900423] INFO: task kworker/u256:1:348985 blocked for more than 121 seconds. |
| [ 4613.914341] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. |
| [ 4613.922852] task:kworker/u256:1 state:D stack: 0 pid:348985 ppid: 2 flags:0x00000208 |
| [ 4613.931984] Workqueue: 0000:74:02.0_event_q sas_port_event_worker [libsas] |
| [ 4613.939549] Call trace: |
| [ 4613.942702] __switch_to+0xf8/0x17c |
| [ 4613.946892] __schedule+0x660/0xee0 |
| [ 4613.951083] schedule+0xac/0x240 |
| [ 4613.955015] schedule_timeout+0x500/0x610 |
| [ 4613.959725] wait_for_common+0x200/0x610 |
| [ 4613.964349] wait_for_completion+0x3c/0x5c |
| [ 4613.969146] flush_workqueue+0x198/0x790 |
| [ 4613.973776] sas_porte_broadcast_rcvd+0x1e8/0x320 [libsas] |
| [ 4613.979960] sas_port_event_worker+0x54/0xa0 [libsas] |
| [ 4613.985708] process_one_work+0x248/0x950 |
| [ 4613.990420] worker_thread+0x318/0x934 |
| [ 4613.994868] kthread+0x190/0x200 |
| [ 4613.998800] ret_from_fork+0x10/0x18 |
| |
| This happens because, when the device goes offline, the driver takes the |
| hisi_hba semaphore and sends the ABORT_DEV command to the device. The |
| internal abort then times out because of the 2-bit ECC error, and the |
| timeout triggers an automatic dump. Since the hisi_hba semaphore is already |
| held, the dump cannot run and the controller cannot be reset. |
| |
| The deadlock therefore arises from the following circular dependency: |
| hisi_sas_dev_gone() -> down() -> hisi_sas_internal_task_abort_dev() -> ... |
| -> hisi_sas_internal_abort_timeout() -> down(). |
| |
| The deadlock is triggered only when the timeout occurs while a device is |
| going offline. To fix this issue, use .rst_ha_timeout to distinguish the |
| scenario where a device goes offline from other scenarios. |
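| |
| The dependency described above can be illustrated with a small user-space |
| model. This is only a sketch and not the driver code: struct model_sem and |
| the model_* functions are hypothetical stand-ins, the semaphore is a plain |
| counter whose "try" operation reports where the kernel's down() would |
| sleep, and it is an assumption of this sketch that rst_ha_timeout is false |
| on the device-offline path. |
| |

```c
#include <stdbool.h>

/* Hypothetical stand-in for the hisi_hba semaphore (initial count 1). */
struct model_sem {
	int count;
};

/* Non-blocking down(): returns false where the kernel's down() would sleep. */
static bool model_try_down(struct model_sem *s)
{
	if (s->count == 0)
		return false;
	s->count--;
	return true;
}

static void model_up(struct model_sem *s)
{
	s->count++;
}

/*
 * Models the internal abort timeout handler.
 * Returns true if it completes, false if it would block forever.
 * Assumption: rst_ha_timeout == false marks the device-offline path.
 */
static bool model_abort_timeout(struct model_sem *s, bool rst_ha_timeout)
{
	if (!rst_ha_timeout)
		return true;	/* skip the dump, never touch the semaphore */

	if (!model_try_down(s))
		return false;	/* caller already holds it: deadlock */

	/* the register dump would run here */
	model_up(s);
	return true;
}

/*
 * Models the device-gone path: take the semaphore, then let the abort
 * time out while it is still held. The model always releases the
 * semaphore afterwards so that it can report the result.
 */
static bool model_dev_gone(struct model_sem *s, bool rst_ha_timeout)
{
	bool completed;

	if (!model_try_down(s))
		return false;
	completed = model_abort_timeout(s, rst_ha_timeout);
	model_up(s);
	return completed;
}
```

| |
| With a semaphore initialised to a count of 1, model_dev_gone(&sem, true) |
| returns false (the second down() would never return), whereas |
| model_dev_gone(&sem, false) returns true because the timeout handler does |
| not try to take the semaphore again. |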
| |
| The Linux kernel CVE team has assigned CVE-2024-26873 to this issue. |
| |
| |
| Affected and fixed versions |
| =========================== |
| |
| Issue introduced in 6.7 with commit 2ff07b5c6fe9173e7a7de3b23f300d71ad4d8fde and fixed in 6.7.11 with commit e022dd3b875315a2d2001a512e98d1dc8c991f4a |
| Issue introduced in 6.7 with commit 2ff07b5c6fe9173e7a7de3b23f300d71ad4d8fde and fixed in 6.8.2 with commit 85c98073ffcfe9e46abfb9c66f3364467119d563 |
| Issue introduced in 6.7 with commit 2ff07b5c6fe9173e7a7de3b23f300d71ad4d8fde and fixed in 6.9 with commit 3c4f53b2c341ec6428b98cb51a89a09b025d0953 |
| |
| Please see https://www.kernel.org for a full list of kernel versions |
| currently supported by the kernel community. |
| |
| Unaffected versions might change over time as fixes are backported to |
| older supported kernel versions. The official CVE entry at |
| https://cve.org/CVERecord/?id=CVE-2024-26873 |
| will be updated if fixes are backported; please check it for the most |
| up-to-date information about this issue. |
| |
| |
| Affected files |
| ============== |
| |
| The file(s) affected by this issue are: |
| drivers/scsi/hisi_sas/hisi_sas_main.c |
| |
| |
| Mitigation |
| ========== |
| |
| The Linux kernel CVE team recommends that you update to the latest |
| stable kernel version for this, and many other bugfixes. Individual |
| changes are never tested alone, but rather are part of a larger kernel |
| release. Cherry-picking individual commits is not recommended or |
| supported by the Linux kernel community at all. If, however, updating to |
| the latest release is impossible, the individual changes that resolve this |
| issue can be found in these commits: |
| https://git.kernel.org/stable/c/a47f0b03149af538af4442ff0702eac430ace1cb |
| https://git.kernel.org/stable/c/e022dd3b875315a2d2001a512e98d1dc8c991f4a |
| https://git.kernel.org/stable/c/85c98073ffcfe9e46abfb9c66f3364467119d563 |
| https://git.kernel.org/stable/c/3c4f53b2c341ec6428b98cb51a89a09b025d0953 |