| From bippy-5f407fcff5a0 Mon Sep 17 00:00:00 2001 |
| From: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| To: <linux-cve-announce@vger.kernel.org> |
| Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org> |
| Subject: CVE-2024-40927: xhci: Handle TD clearing for multiple streams case |
| |
| Description |
| =========== |
| |
| In the Linux kernel, the following vulnerability has been resolved: |
| |
| xhci: Handle TD clearing for multiple streams case |
| |
| When multiple streams are in use, multiple TDs might be in flight when |
| an endpoint is stopped. We need to issue a Set TR Dequeue Pointer for |
| each, to ensure everything is reset properly and the caches cleared. |
| Change the logic so that any N>1 TDs found active for different streams |
| are deferred until after the first one is processed, calling |
| xhci_invalidate_cancelled_tds() again from xhci_handle_cmd_set_deq() to |
| queue another command until we are done with all of them. Also change |
| the error/"should never happen" paths to ensure we at least clear any |
| affected TDs, even if we can't issue a command to clear the hardware |
| cache, and complain loudly with an xhci_warn() if this ever happens. |
| |
| This problem case dates back to commit e9df17eb1408 ("USB: xhci: Correct |
| assumptions about number of rings per endpoint.") early on in the XHCI |
| driver's life, when stream support was first added. |
| It was then identified but not fixed nor made into a warning in commit |
| 674f8438c121 ("xhci: split handling halted endpoints into two steps"), |
| which added a FIXME comment for the problem case (without materially |
| changing the behavior as far as I can tell, though the new logic made |
| the problem more obvious). |
| |
| Then later, in commit 94f339147fc3 ("xhci: Fix failure to give back some |
| cached cancelled URBs."), it was acknowledged again. |
| |
| [Mathias: commit 94f339147fc3 ("xhci: Fix failure to give back some cached |
| cancelled URBs.") was a targeted regression fix to the previously mentioned |
| patch. Users reported issues with usb stuck after unmounting/disconnecting |
| UAS devices. This rolled back the TD clearing of multiple streams to its |
| original state.] |
| |
| Apparently the commit author was aware of the problem (yet still chose |
| to submit it): It was still mentioned as a FIXME, an xhci_dbg() was |
| added to log the problem condition, and the remaining issue was mentioned |
| in the commit description. The choice of making the log type xhci_dbg() |
| for what is, at this point, a completely unhandled and known broken |
| condition is puzzling and unfortunate, as it guarantees that no actual |
| users would see the log in production, thereby making it nigh |
| undebuggable (indeed, even if you turn on DEBUG, the message doesn't |
| really hint at there being a problem at all). |
| |
| It took me *months* of random xHC crashes to finally find a reliable |
| repro and be able to do a deep dive debug session, which could all have |
| been avoided had this unhandled, broken condition been actually reported |
| with a warning, as it should have been as a bug intentionally left in |
| unfixed (never mind that it shouldn't have been left in at all). |
| |
| > Another fix to solve clearing the caches of all stream rings with |
| > cancelled TDs is needed, but not as urgent. |
| |
| 3 years after that statement and 14 years after the original bug was |
| introduced, I think it's finally time to fix it. And maybe next time |
| let's not leave bugs unfixed (that are actually worse than the original |
| bug), and let's actually get people to review kernel commits please. |
| |
| Fixes xHC crashes and IOMMU faults with UAS devices when handling |
| errors/faults. Easiest repro is to use `hdparm` to mark an early sector |
| (e.g. 1024) on a disk as bad, then `cat /dev/sdX > /dev/null` in a loop. |
| At least in the case of JMicron controllers, the read errors end up |
| having to cancel two TDs (for two queued requests to different streams) |
| and the one that didn't get cleared properly ends up faulting the xHC |
| entirely when it tries to access DMA pages that have since been unmapped, |
| referred to by the stale TDs. This normally happens quickly (after two |
| or three loops). After this fix, I left the `cat` in a loop running |
| overnight and experienced no xHC failures, with all read errors |
| recovered properly. Repro'd and tested on an Apple M1 Mac Mini |
| (dwc3 host). |
| |
| On systems without an IOMMU, this bug would instead silently corrupt |
| freed memory, making this a security bug (even on systems with IOMMUs |
| this could silently corrupt memory belonging to other USB devices on the |
| same controller, so it's still a security bug). Given that the kernel |
| autoprobes partition tables, I'm pretty sure a malicious USB device |
| pretending to be a UAS device and reporting an error with the right |
| timing could deliberately trigger a UAF and write to freed memory, with |
| no user action. |
| |
| [Mathias: Commit message and code comment edit, original at:] |
| https://lore.kernel.org/linux-usb/20240524-xhci-streams-v1-1-6b1f13819bea@marcan.st/ |
| |
| The Linux kernel CVE team has assigned CVE-2024-40927 to this issue. |
| |
| |
| Affected and fixed versions |
| =========================== |
| |
| Issue introduced in 2.6.35 with commit e9df17eb1408cfafa3d1844bfc7f22c7237b31b8 and fixed in 5.15.162 with commit 26460c1afa311524f588e288a4941432f0de6228 |
| Issue introduced in 2.6.35 with commit e9df17eb1408cfafa3d1844bfc7f22c7237b31b8 and fixed in 6.1.95 with commit 633f72cb6124ecda97b641fbc119340bd88d51a9 |
| Issue introduced in 2.6.35 with commit e9df17eb1408cfafa3d1844bfc7f22c7237b31b8 and fixed in 6.6.35 with commit 949be4ec5835e0ccb3e2a8ab0e46179cb5512518 |
| Issue introduced in 2.6.35 with commit e9df17eb1408cfafa3d1844bfc7f22c7237b31b8 and fixed in 6.9.6 with commit 61593dc413c3655e4328a351555235bc3089486a |
| Issue introduced in 2.6.35 with commit e9df17eb1408cfafa3d1844bfc7f22c7237b31b8 and fixed in 6.10 with commit 5ceac4402f5d975e5a01c806438eb4e554771577 |
| |
| Please see https://www.kernel.org for a full list of currently supported |
| kernel versions by the kernel community. |
| |
| Unaffected versions might change over time as fixes are backported to |
| older supported kernel versions. The official CVE entry at |
| https://cve.org/CVERecord/?id=CVE-2024-40927 |
| will be updated if fixes are backported, please check that for the most |
| up to date information about this issue. |
| |
| |
| Affected files |
| ============== |
| |
| The file(s) affected by this issue are: |
| drivers/usb/host/xhci-ring.c |
| drivers/usb/host/xhci.h |
| |
| |
| Mitigation |
| ========== |
| |
| The Linux kernel CVE team recommends that you update to the latest |
| stable kernel version for this, and many other bugfixes. Individual |
| changes are never tested alone, but rather are part of a larger kernel |
| release. Cherry-picking individual commits is not recommended or |
| supported by the Linux kernel community at all. If however, updating to |
| the latest release is impossible, the individual changes to resolve this |
| issue can be found at these commits: |
| https://git.kernel.org/stable/c/26460c1afa311524f588e288a4941432f0de6228 |
| https://git.kernel.org/stable/c/633f72cb6124ecda97b641fbc119340bd88d51a9 |
| https://git.kernel.org/stable/c/949be4ec5835e0ccb3e2a8ab0e46179cb5512518 |
| https://git.kernel.org/stable/c/61593dc413c3655e4328a351555235bc3089486a |
| https://git.kernel.org/stable/c/5ceac4402f5d975e5a01c806438eb4e554771577 |