| From bippy-1.1.0 Mon Sep 17 00:00:00 2001 |
| From: Greg Kroah-Hartman <gregkh@kernel.org> |
| To: <linux-cve-announce@vger.kernel.org> |
| Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org> |
| Subject: CVE-2022-49781: perf/x86/amd: Fix crash due to race between amd_pmu_enable_all, perf NMI and throttling |
| |
| Description |
| =========== |
| |
| In the Linux kernel, the following vulnerability has been resolved: |
| |
| perf/x86/amd: Fix crash due to race between amd_pmu_enable_all, perf NMI and throttling |
| |
| amd_pmu_enable_all() does: |
| |
|     if (!test_bit(idx, cpuc->active_mask)) |
|             continue; |
| |
|     amd_pmu_enable_event(cpuc->events[idx]); |
| |
| A perf NMI for another event can arrive between these two steps. The |
| perf NMI handler internally disables and re-enables _all_ events, |
| including the one that the interrupted amd_pmu_enable_all() was in the |
| process of enabling. If that unintentionally enabled event has a very |
| low sampling period, it can trigger immediate successive NMIs and get |
| throttled, in which case cpuc->events[idx] and the corresponding bit in |
| cpuc->active_mask are cleared by x86_pmu_stop(). As a result, |
| amd_pmu_enable_event() is called with event=NULL when |
| amd_pmu_enable_all() resumes after the NMIs have been handled. This |
| causes a kernel crash: |
| |
| BUG: kernel NULL pointer dereference, address: 0000000000000198 |
| #PF: supervisor read access in kernel mode |
| #PF: error_code(0x0000) - not-present page |
| [...] |
| Call Trace: |
| <TASK> |
| amd_pmu_enable_all+0x68/0xb0 |
| ctx_resched+0xd9/0x150 |
| event_function+0xb8/0x130 |
| ? hrtimer_start_range_ns+0x141/0x4a0 |
| ? perf_duration_warn+0x30/0x30 |
| remote_function+0x4d/0x60 |
| __flush_smp_call_function_queue+0xc4/0x500 |
| flush_smp_call_function_queue+0x11d/0x1b0 |
| do_idle+0x18f/0x2d0 |
| cpu_startup_entry+0x19/0x20 |
| start_secondary+0x121/0x160 |
| secondary_startup_64_no_verify+0xe5/0xeb |
| </TASK> |
| |
| The amd_pmu_disable_all()/amd_pmu_enable_all() calls inside the perf NMI |
| handler were recently added as part of BRS enablement, but I'm not sure |
| whether we really need them. We can just disable BRS at the beginning |
| and re-enable it when returning from the NMI. This solves the issue by |
| not enabling those events whose active_mask bits are set but which are |
| not yet enabled in the hardware PMU. |
| |
| The Linux kernel CVE team has assigned CVE-2022-49781 to this issue. |
| |
| |
| Affected and fixed versions |
| =========================== |
| |
| Issue introduced in 5.19 with commit ada543459cab7f653dcacdaba4011a8bb19c627c and fixed in 6.0.10 with commit fd5e454b856ed86b090336e269695d9908609b71 |
| Issue introduced in 5.19 with commit ada543459cab7f653dcacdaba4011a8bb19c627c and fixed in 6.1 with commit baa014b9543c8e5e94f5d15b66abfe60750b8284 |
| |
| Please see https://www.kernel.org for a full list of kernel versions |
| currently supported by the kernel community. |
| |
| Unaffected versions might change over time as fixes are backported to |
| older supported kernel versions. The official CVE entry at |
| https://cve.org/CVERecord/?id=CVE-2022-49781 |
| will be updated if fixes are backported. Please check it for the most |
| up-to-date information about this issue. |
| |
| |
| Affected files |
| ============== |
| |
| The file(s) affected by this issue are: |
| arch/x86/events/amd/core.c |
| |
| |
| Mitigation |
| ========== |
| |
| The Linux kernel CVE team recommends that you update to the latest |
| stable kernel version for this and many other bugfixes. Individual |
| changes are never tested alone, but rather are part of a larger kernel |
| release. Cherry-picking individual commits is not recommended or |
| supported by the Linux kernel community at all. If, however, updating |
| to the latest release is impossible, the individual changes to resolve |
| this issue can be found at these commits: |
| https://git.kernel.org/stable/c/fd5e454b856ed86b090336e269695d9908609b71 |
| https://git.kernel.org/stable/c/baa014b9543c8e5e94f5d15b66abfe60750b8284 |