| From bippy-5f407fcff5a0 Mon Sep 17 00:00:00 2001 |
| From: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| To: <linux-cve-announce@vger.kernel.org> |
| Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org> |
| Subject: CVE-2024-35907: mlxbf_gige: call request_irq() after NAPI initialized |
| |
| Description |
| =========== |
| |
| In the Linux kernel, the following vulnerability has been resolved: |
| |
| mlxbf_gige: call request_irq() after NAPI initialized |
| |
| The mlxbf_gige driver encounters a NULL pointer exception in |
| mlxbf_gige_open() when kdump is enabled. The sequence to reproduce |
| the exception is as follows: |
| a) enable kdump |
| b) trigger kdump via "echo c > /proc/sysrq-trigger" |
| c) kdump kernel executes |
| d) kdump kernel loads mlxbf_gige module |
| e) the mlxbf_gige module runs its open() as the |
| "oob_net0" interface is brought up |
| f) mlxbf_gige module will experience an exception |
| during its open(), something like: |
| |
| Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 |
| Mem abort info: |
| ESR = 0x0000000086000004 |
| EC = 0x21: IABT (current EL), IL = 32 bits |
| SET = 0, FnV = 0 |
| EA = 0, S1PTW = 0 |
| FSC = 0x04: level 0 translation fault |
| user pgtable: 4k pages, 48-bit VAs, pgdp=00000000e29a4000 |
| [0000000000000000] pgd=0000000000000000, p4d=0000000000000000 |
| Internal error: Oops: 0000000086000004 [#1] SMP |
| CPU: 0 PID: 812 Comm: NetworkManager Tainted: G OE 5.15.0-1035-bluefield #37-Ubuntu |
| Hardware name: https://www.mellanox.com BlueField-3 SmartNIC Main Card/BlueField-3 SmartNIC Main Card, BIOS 4.6.0.13024 Jan 19 2024 |
| pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) |
| pc : 0x0 |
| lr : __napi_poll+0x40/0x230 |
| sp : ffff800008003e00 |
| x29: ffff800008003e00 x28: 0000000000000000 x27: 00000000ffffffff |
| x26: ffff000066027238 x25: ffff00007cedec00 x24: ffff800008003ec8 |
| x23: 000000000000012c x22: ffff800008003eb7 x21: 0000000000000000 |
| x20: 0000000000000001 x19: ffff000066027238 x18: 0000000000000000 |
| x17: ffff578fcb450000 x16: ffffa870b083c7c0 x15: 0000aaab010441d0 |
| x14: 0000000000000001 x13: 00726f7272655f65 x12: 6769675f6662786c |
| x11: 0000000000000000 x10: 0000000000000000 x9 : ffffa870b0842398 |
| x8 : 0000000000000004 x7 : fe5a48b9069706ea x6 : 17fdb11fc84ae0d2 |
| x5 : d94a82549d594f35 x4 : 0000000000000000 x3 : 0000000000400100 |
| x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000066027238 |
| Call trace: |
| 0x0 |
| net_rx_action+0x178/0x360 |
| __do_softirq+0x15c/0x428 |
| __irq_exit_rcu+0xac/0xec |
| irq_exit+0x18/0x2c |
| handle_domain_irq+0x6c/0xa0 |
| gic_handle_irq+0xec/0x1b0 |
| call_on_irq_stack+0x20/0x2c |
| do_interrupt_handler+0x5c/0x70 |
| el1_interrupt+0x30/0x50 |
| el1h_64_irq_handler+0x18/0x2c |
| el1h_64_irq+0x7c/0x80 |
| __setup_irq+0x4c0/0x950 |
| request_threaded_irq+0xf4/0x1bc |
| mlxbf_gige_request_irqs+0x68/0x110 [mlxbf_gige] |
| mlxbf_gige_open+0x5c/0x170 [mlxbf_gige] |
| __dev_open+0x100/0x220 |
| __dev_change_flags+0x16c/0x1f0 |
| dev_change_flags+0x2c/0x70 |
| do_setlink+0x220/0xa40 |
| __rtnl_newlink+0x56c/0x8a0 |
| rtnl_newlink+0x58/0x84 |
| rtnetlink_rcv_msg+0x138/0x3c4 |
| netlink_rcv_skb+0x64/0x130 |
| rtnetlink_rcv+0x20/0x30 |
| netlink_unicast+0x2ec/0x360 |
| netlink_sendmsg+0x278/0x490 |
| __sock_sendmsg+0x5c/0x6c |
| ____sys_sendmsg+0x290/0x2d4 |
| ___sys_sendmsg+0x84/0xd0 |
| __sys_sendmsg+0x70/0xd0 |
| __arm64_sys_sendmsg+0x2c/0x40 |
| invoke_syscall+0x78/0x100 |
| el0_svc_common.constprop.0+0x54/0x184 |
| do_el0_svc+0x30/0xac |
| el0_svc+0x48/0x160 |
| el0t_64_sync_handler+0xa4/0x12c |
| el0t_64_sync+0x1a4/0x1a8 |
| Code: bad PC value |
| ---[ end trace 7d1c3f3bf9d81885 ]--- |
| Kernel panic - not syncing: Oops: Fatal exception in interrupt |
| Kernel Offset: 0x2870a7a00000 from 0xffff800008000000 |
| PHYS_OFFSET: 0x80000000 |
| CPU features: 0x0,000005c1,a3332a5a |
| Memory Limit: none |
| ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]--- |
| |
| The exception happens because there is a pending RX interrupt before the |
| call to request_irq(RX IRQ) executes. Then, the RX IRQ handler fires |
| immediately after this request_irq() completes. The RX IRQ handler runs |
| "napi_schedule()" before NAPI is fully initialized via "netif_napi_add()" |
| and "napi_enable()", both of which happen later in the open() logic. |
| |
| The logic in mlxbf_gige_open() must fully initialize NAPI before any calls |
| to request_irq() execute. |
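
The ordering problem described above can be illustrated outside the kernel. The following is a minimal userspace sketch in C, not the driver's code: every `sim_*` name is hypothetical, the interrupt controller and softirq path are collapsed into a direct function call, and a flag stands in for the jump to address 0 that the oops shows (`pc : 0x0`).

```c
#include <stdbool.h>
#include <stddef.h>

struct sim_dev;
typedef int (*sim_poll_fn)(struct sim_dev *dev, int budget);

/* Hypothetical stand-in for the driver's private state. */
struct sim_dev {
    sim_poll_fn poll;        /* NULL until sim_napi_add() registers it */
    bool napi_enabled;
    bool rx_irq_pending;     /* RX interrupt already latched, as in the kdump kernel */
    bool irq_requested;
    bool null_poll_called;   /* set where the real kernel would branch to 0x0 */
    int polled;
};

/* Pretend poll routine: handles one packet if the budget allows. */
static int sim_poll(struct sim_dev *dev, int budget)
{
    int done = budget < 1 ? 0 : 1;

    dev->polled += done;
    return done;
}

/* Models netif_napi_add(): registers the poll callback. */
static void sim_napi_add(struct sim_dev *dev, sim_poll_fn poll)
{
    dev->poll = poll;
}

/* Models napi_enable(). */
static void sim_napi_enable(struct sim_dev *dev)
{
    dev->napi_enabled = true;
}

/* Models the RX handler scheduling NAPI and the softirq then polling it. */
static void sim_rx_irq_handler(struct sim_dev *dev)
{
    dev->rx_irq_pending = false;
    if (dev->poll == NULL) {
        dev->null_poll_called = true;   /* real kernel: NULL function pointer call */
        return;
    }
    dev->poll(dev, 64);
}

/* Models request_irq(): a pending interrupt fires as soon as the line is live. */
static void sim_request_irq(struct sim_dev *dev)
{
    dev->irq_requested = true;
    if (dev->rx_irq_pending)
        sim_rx_irq_handler(dev);
}

/* Ordering before the fix: IRQ requested first. Returns false on the simulated oops. */
bool sim_open_buggy(struct sim_dev *dev)
{
    sim_request_irq(dev);
    sim_napi_add(dev, sim_poll);
    sim_napi_enable(dev);
    return !dev->null_poll_called;
}

/* Ordering after the fix: NAPI fully initialized before the IRQ is requested. */
bool sim_open_fixed(struct sim_dev *dev)
{
    sim_napi_add(dev, sim_poll);
    sim_napi_enable(dev);
    sim_request_irq(dev);
    return !dev->null_poll_called;
}
```

With `rx_irq_pending` set, `sim_open_buggy()` reports the simulated fault while `sim_open_fixed()` polls normally. With no pending interrupt both orderings succeed, which is consistent with the report that the crash shows up in the kdump kernel rather than on an ordinary boot.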
| |
| The Linux kernel CVE team has assigned CVE-2024-35907 to this issue. |
| |
| |
| Affected and fixed versions |
| =========================== |
| |
| Issue introduced in 5.14 with commit f92e1869d74e1acc6551256eb084a1c14a054e19 and fixed in 5.15.154 with commit a583117668ddb86e98f2e11c7caa3db0e6df52a3 |
| Issue introduced in 5.14 with commit f92e1869d74e1acc6551256eb084a1c14a054e19 and fixed in 6.1.85 with commit 24444af5ddf729376b90db0f135fa19973cb5dab |
| Issue introduced in 5.14 with commit f92e1869d74e1acc6551256eb084a1c14a054e19 and fixed in 6.6.26 with commit 867a2f598af6a645c865d1101b58c5e070c6dd9e |
| Issue introduced in 5.14 with commit f92e1869d74e1acc6551256eb084a1c14a054e19 and fixed in 6.8.5 with commit 8feb1652afe9c5d019059a55c90f70690dce0f52 |
| Issue introduced in 5.14 with commit f92e1869d74e1acc6551256eb084a1c14a054e19 and fixed in 6.9 with commit f7442a634ac06b953fc1f7418f307b25acd4cfbc |
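
The version ranges above can be turned into a rough check. This sketch assumes plain upstream version numbers (major.minor.patch) and encodes only the fix points listed in this announcement; the function and type names are hypothetical, and vendor kernels such as the 5.15.0-1035-bluefield build in the trace may carry backports that a version number alone does not reveal.

```c
#include <stdbool.h>
#include <stddef.h>

/* One stable series and the first patch level in it that carries the fix. */
struct fixed_series {
    int major;
    int minor;
    int fixed_patch;
};

/* Fix points copied from the list above (6.9 is handled separately). */
static const struct fixed_series fixed_in[] = {
    { 5, 15, 154 },
    { 6, 1, 85 },
    { 6, 6, 26 },
    { 6, 8, 5 },
};

/* Compare two major.minor pairs: negative, zero, or positive like strcmp(). */
static int cmp_series(int maj_a, int min_a, int maj_b, int min_b)
{
    if (maj_a != maj_b)
        return maj_a < maj_b ? -1 : 1;
    if (min_a != min_b)
        return min_a < min_b ? -1 : 1;
    return 0;
}

/* Returns true if an upstream kernel of this version lacks the fix. */
bool cve_2024_35907_affected(int major, int minor, int patch)
{
    size_t i;

    if (cmp_series(major, minor, 5, 14) < 0)
        return false;               /* older than the introducing commit */
    if (cmp_series(major, minor, 6, 9) >= 0)
        return false;               /* 6.9 and later contain the fix */

    for (i = 0; i < sizeof(fixed_in) / sizeof(fixed_in[0]); i++) {
        if (fixed_in[i].major == major && fixed_in[i].minor == minor)
            return patch < fixed_in[i].fixed_patch;
    }

    /* Series between 5.14 and 6.9 with no fix listed above. */
    return true;
}
```

For example, `cve_2024_35907_affected(5, 15, 153)` is true and `cve_2024_35907_affected(5, 15, 154)` is false. Series without a listed fix, such as 6.7, are treated as affected at every patch level.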
| |
| Please see https://www.kernel.org for a full list of kernel versions |
| currently supported by the kernel community. |
| |
| Unaffected versions might change over time as fixes are backported to |
| older supported kernel versions. The official CVE entry at |
| https://cve.org/CVERecord/?id=CVE-2024-35907 |
| will be updated if fixes are backported; please check it for the most |
| up-to-date information about this issue. |
| |
| |
| Affected files |
| ============== |
| |
| The file(s) affected by this issue are: |
| drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c |
| |
| |
| Mitigation |
| ========== |
| |
| The Linux kernel CVE team recommends that you update to the latest |
| stable kernel version for this and many other bug fixes. Individual |
| changes are never tested alone, but rather are part of a larger kernel |
| release. Cherry-picking individual commits is not recommended or |
| supported by the Linux kernel community at all. If, however, updating to |
| the latest release is impossible, the individual changes to resolve this |
| issue can be found at these commits: |
| https://git.kernel.org/stable/c/a583117668ddb86e98f2e11c7caa3db0e6df52a3 |
| https://git.kernel.org/stable/c/24444af5ddf729376b90db0f135fa19973cb5dab |
| https://git.kernel.org/stable/c/867a2f598af6a645c865d1101b58c5e070c6dd9e |
| https://git.kernel.org/stable/c/8feb1652afe9c5d019059a55c90f70690dce0f52 |
| https://git.kernel.org/stable/c/f7442a634ac06b953fc1f7418f307b25acd4cfbc |