| From bippy-5f407fcff5a0 Mon Sep 17 00:00:00 2001 |
| From: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| To: <linux-cve-announce@vger.kernel.org> |
| Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org> |
| Subject: CVE-2024-58018: nvkm: correctly calculate the available space of the GSP cmdq buffer |
| |
| Description |
| =========== |
| |
| In the Linux kernel, the following vulnerability has been resolved: |
| |
| nvkm: correctly calculate the available space of the GSP cmdq buffer |
| |
| r535_gsp_cmdq_push() waits for the available page in the GSP cmdq |
| buffer when handling a large RPC request. When it sees at least one |
| available page in the cmdq, it quits the waiting with the amount of |
| free buffer pages in the queue. |
| |
| Unfortunately, it always takes the [write pointer, buf_size) as |
| available buffer pages before rolling back and wrongly calculates the |
| size of the data should be copied. Thus, it can overwrite the RPC |
| request that GSP is currently reading, which causes GSP hang due |
| to corrupted RPC request: |
| |
| [ 549.209389] ------------[ cut here ]------------ |
| [ 549.214010] WARNING: CPU: 8 PID: 6314 at drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c:116 r535_gsp_msgq_wait+0xd0/0x190 [nvkm] |
| [ 549.225678] Modules linked in: nvkm(E+) gsp_log(E) snd_seq_dummy(E) snd_hrtimer(E) snd_seq(E) snd_timer(E) snd_seq_device(E) snd(E) soundcore(E) rfkill(E) qrtr(E) vfat(E) fat(E) ipmi_ssif(E) amd_atl(E) intel_rapl_msr(E) intel_rapl_common(E) mlx5_ib(E) amd64_edac(E) edac_mce_amd(E) kvm_amd(E) ib_uverbs(E) kvm(E) ib_core(E) acpi_ipmi(E) ipmi_si(E) mxm_wmi(E) ipmi_devintf(E) rapl(E) i2c_piix4(E) wmi_bmof(E) joydev(E) ptdma(E) acpi_cpufreq(E) k10temp(E) pcspkr(E) ipmi_msghandler(E) xfs(E) libcrc32c(E) ast(E) i2c_algo_bit(E) crct10dif_pclmul(E) drm_shmem_helper(E) nvme_tcp(E) crc32_pclmul(E) ahci(E) drm_kms_helper(E) libahci(E) nvme_fabrics(E) crc32c_intel(E) nvme(E) cdc_ether(E) mlx5_core(E) nvme_core(E) usbnet(E) drm(E) libata(E) ccp(E) ghash_clmulni_intel(E) mii(E) t10_pi(E) mlxfw(E) sp5100_tco(E) psample(E) pci_hyperv_intf(E) wmi(E) dm_multipath(E) sunrpc(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E) be2iscsi(E) bnx2i(E) cnic(E) uio(E) cxgb4i(E) cxgb4(E) tls(E) libcxgbi(E) libcxgb(E) qla4xxx(E) |
| [ 549.225752] iscsi_boot_sysfs(E) iscsi_tcp(E) libiscsi_tcp(E) libiscsi(E) scsi_transport_iscsi(E) fuse(E) [last unloaded: gsp_log(E)] |
| [ 549.326293] CPU: 8 PID: 6314 Comm: insmod Tainted: G E 6.9.0-rc6+ #1 |
| [ 549.334039] Hardware name: ASRockRack 1U1G-MILAN/N/ROMED8-NL, BIOS L3.12E 09/06/2022 |
| [ 549.341781] RIP: 0010:r535_gsp_msgq_wait+0xd0/0x190 [nvkm] |
| [ 549.347343] Code: 08 00 00 89 da c1 e2 0c 48 8d ac 11 00 10 00 00 48 8b 0c 24 48 85 c9 74 1f c1 e0 0c 4c 8d 6d 30 83 e8 30 89 01 e9 68 ff ff ff <0f> 0b 49 c7 c5 92 ff ff ff e9 5a ff ff ff ba ff ff ff ff be c0 0c |
| [ 549.366090] RSP: 0018:ffffacbccaaeb7d0 EFLAGS: 00010246 |
| [ 549.371315] RAX: 0000000000000000 RBX: 0000000000000012 RCX: 0000000000923e28 |
| [ 549.378451] RDX: 0000000000000000 RSI: 0000000055555554 RDI: ffffacbccaaeb730 |
| [ 549.385590] RBP: 0000000000000001 R08: ffff8bd14d235f70 R09: ffff8bd14d235f70 |
| [ 549.392721] R10: 0000000000000002 R11: ffff8bd14d233864 R12: 0000000000000020 |
| [ 549.399854] R13: ffffacbccaaeb818 R14: 0000000000000020 R15: ffff8bb298c67000 |
| [ 549.406988] FS: 00007f5179244740(0000) GS:ffff8bd14d200000(0000) knlGS:0000000000000000 |
| [ 549.415076] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 |
| [ 549.420829] CR2: 00007fa844000010 CR3: 00000001567dc005 CR4: 0000000000770ef0 |
| [ 549.427963] PKRU: 55555554 |
| [ 549.430672] Call Trace: |
| [ 549.433126] <TASK> |
| [ 549.435233] ? __warn+0x7f/0x130 |
| [ 549.438473] ? r535_gsp_msgq_wait+0xd0/0x190 [nvkm] |
| [ 549.443426] ? report_bug+0x18a/0x1a0 |
| [ 549.447098] ? handle_bug+0x3c/0x70 |
| [ 549.450589] ? exc_invalid_op+0x14/0x70 |
| [ 549.454430] ? asm_exc_invalid_op+0x16/0x20 |
| [ 549.458619] ? r535_gsp_msgq_wait+0xd0/0x190 [nvkm] |
| [ 549.463565] r535_gsp_msg_recv+0x46/0x230 [nvkm] |
| [ 549.468257] r535_gsp_rpc_push+0x106/0x160 [nvkm] |
| [ 549.473033] r535_gsp_rpc_rm_ctrl_push+0x40/0x130 [nvkm] |
| [ 549.478422] nvidia_grid_init_vgpu_types+0xbc/0xe0 [nvkm] |
| [ 549.483899] nvidia_grid_init+0xb1/0xd0 [nvkm] |
| [ 549.488420] ? srso_alias_return_thunk+0x5/0xfbef5 |
| [ 549.493213] nvkm_device_pci_probe+0x305/0x420 [nvkm] |
| [ 549.498338] local_pci_probe+0x46/0xa0 |
| [ 549.502096] pci_call_probe+0x56/0x170 |
| [ 549.505851] pci_device_probe+0x79/0xf0 |
| [ 549.509690] ? driver_sysfs_add+0x59/0xc0 |
| [ 549.513702] really_probe+0xd9/0x380 |
| [ 549.517282] __driver_probe_device+0x78/0x150 |
| [ 549.521640] driver_probe_device+0x1e/0x90 |
| [ 549.525746] __driver_attach+0xd2/0x1c0 |
| [ 549.529594] ? __pfx___driver_attach+0x10/0x10 |
| [ 549.534045] bus_for_each_dev+0x78/0xd0 |
| [ 549.537893] bus_add_driver+0x112/0x210 |
| [ 549.541750] driver_register+0x5c/0x120 |
| [ 549.545596] ? __pfx_nvkm_init+0x10/0x10 [nvkm] |
| [ 549.550224] do_one_initcall+0x44/0x300 |
| [ 549.554063] ? do_init_module+0x23/0x240 |
| [ 549.557989] do_init_module+0x64/0x240 |
| |
| Calculate the available buffer page before rolling back based on |
| the result from the waiting. |
| |
| The Linux kernel CVE team has assigned CVE-2024-58018 to this issue. |
| |
| |
| Affected and fixed versions |
| =========================== |
| |
| Issue introduced in 6.7 with commit 176fdcbddfd288408ce8571c1760ad618d962096 and fixed in 6.12.14 with commit 56e6c7f6d2a6b4e0aae0528c502e56825bb40598 |
| Issue introduced in 6.7 with commit 176fdcbddfd288408ce8571c1760ad618d962096 and fixed in 6.13.3 with commit 6b6b75728c86f60c1fc596f0d4542427d0e6065b |
| Issue introduced in 6.7 with commit 176fdcbddfd288408ce8571c1760ad618d962096 and fixed in 6.14 with commit 01ed662bdd6fce4f59c1804b334610d710d79fa0 |
| |
| Please see https://www.kernel.org for a full list of currently supported |
| kernel versions by the kernel community. |
| |
| Unaffected versions might change over time as fixes are backported to |
| older supported kernel versions. The official CVE entry at |
| https://cve.org/CVERecord/?id=CVE-2024-58018 |
| will be updated if fixes are backported, please check that for the most |
| up to date information about this issue. |
| |
| |
| Affected files |
| ============== |
| |
| The file(s) affected by this issue are: |
| drivers/gpu/drm/nouveau/nvkm/subdev/gsp/r535.c |
| |
| |
| Mitigation |
| ========== |
| |
| The Linux kernel CVE team recommends that you update to the latest |
| stable kernel version for this, and many other bugfixes. Individual |
| changes are never tested alone, but rather are part of a larger kernel |
| release. Cherry-picking individual commits is not recommended or |
| supported by the Linux kernel community at all. If however, updating to |
| the latest release is impossible, the individual changes to resolve this |
| issue can be found at these commits: |
| https://git.kernel.org/stable/c/56e6c7f6d2a6b4e0aae0528c502e56825bb40598 |
| https://git.kernel.org/stable/c/6b6b75728c86f60c1fc596f0d4542427d0e6065b |
| https://git.kernel.org/stable/c/01ed662bdd6fce4f59c1804b334610d710d79fa0 |