 From bippy-1.2.0 Mon Sep 17 00:00:00 2001
 From: Greg Kroah-Hartman <gregkh@kernel.org>
 To: <linux-cve-announce@vger.kernel.org>
 Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org>
 Subject: CVE-2025-22022: usb: xhci: Apply the link chain quirk on NEC isoc endpoints

 Description
 ===========

 In the Linux kernel, the following vulnerability has been resolved:

 usb: xhci: Apply the link chain quirk on NEC isoc endpoints

 Two clearly different specimens of NEC uPD720200 (one with the start/stop
 bug, one without) were seen to cause IOMMU faults after some Missed
 Service Errors (MSEs). The faulting address is immediately after a
 transfer ring segment, and patched dynamic debug messages revealed that
 the MSE was received when waiting for a TD near the end of that segment:

 [ 1.041954] xhci_hcd: Miss service interval error for slot 1 ep 2 expected TD DMA ffa08fe0
 [ 1.042120] xhci_hcd: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0005 address=0xffa09000 flags=0x0000]
 [ 1.042146] xhci_hcd: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0005 address=0xffa09040 flags=0x0000]
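
The addresses in this log can be checked with a little arithmetic. The sketch below is only an illustration, not driver code. It assumes 16-byte TRBs and 256 TRBs per ring segment (one 4 KiB, page-aligned segment, with the Link TRB in the last slot). The helper names are hypothetical.

```c
#include <stdint.h>

#define TRB_SIZE         16u   /* bytes per TRB */
#define TRBS_PER_SEGMENT 256u  /* assumed: one 4 KiB page per segment */
#define SEGMENT_SIZE     (TRB_SIZE * TRBS_PER_SEGMENT)

/* Base address of the (page-aligned) segment that contains dma. */
static uint64_t segment_base(uint64_t dma)
{
	return dma & ~(uint64_t)(SEGMENT_SIZE - 1);
}

/* Address of the last TRB slot in that segment, where the Link TRB sits. */
static uint64_t link_trb_addr(uint64_t dma)
{
	return segment_base(dma) + SEGMENT_SIZE - TRB_SIZE;
}

/* First address past the segment; a fetch here means the link was ignored. */
static uint64_t segment_end(uint64_t dma)
{
	return segment_base(dma) + SEGMENT_SIZE;
}
```

Under these assumptions, the expected TD at ffa08fe0 occupies the slot just before the Link TRB at ffa08ff0, and the first faulting address, ffa09000, is exactly the first byte past the segment. That is consistent with the HC walking over the Link TRB instead of following it.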

 It gets even funnier if the next page is a ring segment accessible to
 the HC. Below, it reports MSE in segment at ff1e8000, plows through a
 zero-filled page at ff1e9000 and starts reporting events for TRBs in
 page at ff1ea000 every microframe, instead of jumping to seg ff1e6000.

 [ 7.041671] xhci_hcd: Miss service interval error for slot 1 ep 2 expected TD DMA ff1e8fe0
 [ 7.041999] xhci_hcd: Miss service interval error for slot 1 ep 2 expected TD DMA ff1e8fe0
 [ 7.042011] xhci_hcd: WARN: buffer overrun event for slot 1 ep 2 on endpoint
 [ 7.042028] xhci_hcd: All TDs skipped for slot 1 ep 2. Clear skip flag.
 [ 7.042134] xhci_hcd: WARN: buffer overrun event for slot 1 ep 2 on endpoint
 [ 7.042138] xhci_hcd: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 31
 [ 7.042144] xhci_hcd: Looking for event-dma 00000000ff1ea040 trb-start 00000000ff1e6820 trb-end 00000000ff1e6820
 [ 7.042259] xhci_hcd: WARN: buffer overrun event for slot 1 ep 2 on endpoint
 [ 7.042262] xhci_hcd: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 31
 [ 7.042266] xhci_hcd: Looking for event-dma 00000000ff1ea050 trb-start 00000000ff1e6820 trb-end 00000000ff1e6820

 At some point completion events change from Isoch Buffer Overrun to
 Short Packet and the HC finally finds cycle bit mismatch in ff1ec000.

 [ 7.098130] xhci_hcd: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
 [ 7.098132] xhci_hcd: Looking for event-dma 00000000ff1ecc50 trb-start 00000000ff1e6820 trb-end 00000000ff1e6820
 [ 7.098254] xhci_hcd: ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13
 [ 7.098256] xhci_hcd: Looking for event-dma 00000000ff1ecc60 trb-start 00000000ff1e6820 trb-end 00000000ff1e6820
 [ 7.098379] xhci_hcd: Overrun event on slot 1 ep 2

 It's possible that data from the isochronous device were written to
 random buffers of pending TDs on other endpoints (either IN or OUT),
 other devices or even other HCs in the same IOMMU domain.

 Lastly, an error from a different USB device on another HC. Was it
 caused by the above? I don't know, but it may have been. The disk
 was working without any other issues and generated PCIe traffic to
 starve the NEC of upstream BW and trigger those MSEs. The two HCs
 shared one x1 slot by means of a commercial "PCIe splitter" board.

 [ 7.162604] usb 10-2: reset SuperSpeed USB device number 3 using xhci_hcd
 [ 7.178990] sd 9:0:0:0: [sdb] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x07 driverbyte=DRIVER_OK cmd_age=0s
 [ 7.179001] sd 9:0:0:0: [sdb] tag#0 CDB: opcode=0x28 28 00 04 02 ae 00 00 02 00 00
 [ 7.179004] I/O error, dev sdb, sector 67284480 op 0x0:(READ) flags 0x80700 phys_seg 5 prio class 0

 Fortunately, it appears that this ridiculous bug is avoided by setting
 the chain bit of Link TRBs on isochronous rings. Other ancient HCs are
 known which also expect the bit to be set and ignore Link TRBs if
 it's not. Reportedly, the 0.95 spec guaranteed that the bit is set.
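
The idea of the fix can be sketched as a small predicate that decides whether the chain bit must be forced on a Link TRB. This is a simplified, self-contained illustration and not the actual change to drivers/usb/host/xhci.h. The flag names, flag values, enum and function names are hypothetical, and the chain bit is assumed to be bit 4 of the TRB control word.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical quirk flags, for illustration only. */
#define QUIRK_LINK_CHAIN_ALL (1u << 0) /* assumed: host needs it on all rings */
#define QUIRK_NEC_HOST       (1u << 1) /* NEC host controller detected */

#define TRB_CHAIN (1u << 4) /* assumed position of the chain bit */

enum ring_type { RING_CTRL, RING_BULK, RING_INTR, RING_ISOC };

/* True when Link TRBs on this ring must carry the chain bit. */
static bool link_chain_quirk(uint32_t quirks, enum ring_type type)
{
	if (quirks & QUIRK_LINK_CHAIN_ALL)
		return true;
	/* The case described here: NEC hosts, isochronous rings only. */
	return type == RING_ISOC && (quirks & QUIRK_NEC_HOST);
}

/* Force the chain bit when the quirk applies; otherwise leave it alone. */
static uint32_t link_trb_control(uint32_t control, uint32_t quirks,
				 enum ring_type type)
{
	if (link_chain_quirk(quirks, type))
		control |= TRB_CHAIN;
	return control;
}
```

In this sketch an NEC host gets the chain bit only on isochronous rings, which matches the scope of the quirk as described, while bulk, control and interrupt rings on the same host are left untouched.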

 The bandwidth-starved NEC HC running a 32KB/uframe UVC endpoint reports
 tens of MSEs per second and runs into the bug within seconds. Chaining
 Link TRBs allows the same workload to run for many minutes, many times.

 No negative side effects were seen in UVC recording and UAC playback
 with a few devices at full speed, high speed and SuperSpeed.

 The problem doesn't reproduce on the newer Renesas uPD720201/uPD720202
 or on the old Etron EJ168 and VIA VL805 (but the VL805 has another bug).

 [shorten line length of log snippets in commit message -Mathias]

 The Linux kernel CVE team has assigned CVE-2025-22022 to this issue.


 Affected and fixed versions
 ===========================

 	Fixed in 6.12.22 with commit a4931d9fb99eb5462f3eaa231999d279c40afb21
 	Fixed in 6.13.10 with commit 43a18225150ce874d23b37761c302a5dffee1595
 	Fixed in 6.14.1 with commit 061a1683bae6ef56ab8fa392725ba7495515cd1d
 	Fixed in 6.15 with commit bb0ba4cb1065e87f9cc75db1fa454e56d0894d01

 Please see https://www.kernel.org for a full list of kernel versions
 currently supported by the kernel community.

 Unaffected versions might change over time as fixes are backported to
 older supported kernel versions.  The official CVE entry at
 	https://cve.org/CVERecord/?id=CVE-2025-22022
 will be updated if fixes are backported; please check it for the most
 up-to-date information about this issue.


 Affected files
 ==============

 The file(s) affected by this issue are:
 	drivers/usb/host/xhci.h


 Mitigation
 ==========

 The Linux kernel CVE team recommends that you update to the latest
 stable kernel version for this, and many other bugfixes.  Individual
 changes are never tested alone, but rather are part of a larger kernel
 release.  Cherry-picking individual commits is not recommended or
 supported by the Linux kernel community at all.  If, however, updating to
 the latest release is impossible, the individual changes to resolve this
 issue can be found at these commits:
 	https://git.kernel.org/stable/c/a4931d9fb99eb5462f3eaa231999d279c40afb21
 	https://git.kernel.org/stable/c/43a18225150ce874d23b37761c302a5dffee1595
 	https://git.kernel.org/stable/c/061a1683bae6ef56ab8fa392725ba7495515cd1d
 	https://git.kernel.org/stable/c/bb0ba4cb1065e87f9cc75db1fa454e56d0894d01