| From bippy-5f407fcff5a0 Mon Sep 17 00:00:00 2001 |
| From: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| To: <linux-cve-announce@vger.kernel.org> |
| Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org> |
| Subject: CVE-2024-40918: parisc: Try to fix random segmentation faults in package builds |
| |
| Description |
| =========== |
| |
| In the Linux kernel, the following vulnerability has been resolved: |
| |
| parisc: Try to fix random segmentation faults in package builds |
| |
| PA-RISC systems with PA8800 and PA8900 processors have had problems |
| with random segmentation faults for many years. Systems with earlier |
| processors are much more stable. |
| |
| Systems with PA8800 and PA8900 processors have a large L2 cache which |
| needs per page flushing for decent performance when a large range is |
| flushed. The combined cache in these systems is also more sensitive to |
| non-equivalent aliases than the caches in earlier systems. |
| |
| The majority of random segmentation faults that I have looked at |
| appear to be memory corruption in memory allocated using mmap and |
| malloc. |
| |
| My first attempt at fixing the random faults didn't work. On |
| reviewing the cache code, I realized that there were two issues |
| which the existing code didn't handle correctly. Both relate |
| to cache move-in. Another issue is that the present bit in PTEs |
| is racy. |
| |
| 1) PA-RISC caches have a mind of their own and they can speculatively |
| load data and instructions for a page as long as there is an entry in |
| the TLB for the page which allows move-in. TLBs are local to each |
| CPU. Thus, the TLB entry for a page must be purged before flushing |
| the page. This is particularly important on SMP systems. |
| |
| In some of the flush routines, the flush routine would be called |
| and then the TLB entry would be purged. This was because the flush |
| routine needed the TLB entry to do the flush. |
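| The ordering constraint in (1) can be modeled in plain C. This is an |
| editor's userspace sketch, not the parisc code: the function names are |
| stand-ins for the real routines, and only the purge-before-flush |
| ordering is modeled. |
|
| ```c
| #include <assert.h>
| #include <string.h>
|
| /* Record the order of operations so the two orderings can be compared. */
| static char oplog[64];
|
| /* Stand-in for purging the per-CPU TLB entry for a page. */
| static void purge_tlb_entry(unsigned long addr)
| {
|     (void)addr;
|     strcat(oplog, "purge;");
| }
|
| /* Stand-in for flushing the cache page (via a tmpalias mapping that
|  * inhibits move-in, so no TLB entry for the page is needed). */
| static void flush_cache_page_phys(unsigned long addr)
| {
|     (void)addr;
|     strcat(oplog, "flush;");
| }
|
| /* Problematic order: while the flush runs, the still-valid TLB entry
|  * lets any CPU speculatively move lines back into the cache. */
| static const char *flush_then_purge(unsigned long addr)
| {
|     oplog[0] = '\0';
|     flush_cache_page_phys(addr);
|     purge_tlb_entry(addr);
|     return oplog;
| }
|
| /* Fixed order: purge the TLB entry first so move-in is impossible,
|  * then flush through the tmpalias mapping. */
| static const char *purge_then_flush(unsigned long addr)
| {
|     oplog[0] = '\0';
|     purge_tlb_entry(addr);
|     flush_cache_page_phys(addr);
|     return oplog;
| }
| ```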
| |
| 2) My initial approach to fixing the random faults was to use |
| flush_cache_page_if_present for all flush operations. |
| This actually made things worse and led to a couple of hardware |
| lockups. It finally dawned on me that some lines weren't being |
| flushed because the pte check code was racy. This resulted in |
| random inequivalent mappings to physical pages. |
| |
| The __flush_cache_page tmpalias flush sets up its own TLB entry |
| and it doesn't need the existing TLB entry. As long as we can find |
| the pte pointer for the vm page, we can get the pfn and physical |
| address of the page. We can also purge the TLB entry for the page |
| before doing the flush. Further, __flush_cache_page uses a special |
| TLB entry that inhibits cache move-in. |
| |
| When switching page mappings, we need to ensure that lines are |
| removed from the cache. It is not sufficient to just flush the |
| lines to memory as they may come back. |
| |
| This made it clear that we needed to implement all the required |
| flush operations using tmpalias routines. This includes flushes |
| for user and kernel pages. |
| |
| After modifying the code to use tmpalias flushes, it became clear |
| that the random segmentation faults were not fully resolved. The |
| frequency of faults was worse on systems with a 64 MB L2 (PA8900) |
| and systems with more CPUs (rp4440). |
| |
| The warning that I added to flush_cache_page_if_present to detect |
| pages that couldn't be flushed triggered frequently on some systems. |
| |
| Helge and I looked at the pages that couldn't be flushed and found |
| that the PTE was either cleared or for a swap page. Ignoring pages |
| that were swapped out seemed okay but pages with cleared PTEs seemed |
| problematic. |
| |
| I looked at routines related to pte_clear and noticed ptep_clear_flush. |
| The default implementation just flushes the TLB entry. However, it was |
| obvious that on parisc we need to flush the cache page as well. If |
| we don't flush the cache page, stale lines will be left in the cache |
| and cause random corruption. Once a PTE is cleared, there is no way |
| to find the physical address associated with the PTE and flush the |
| associated page at a later time. |
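| The constraint above can be illustrated with a toy model. This is an |
| editor's simplified userspace sketch, not the parisc implementation; |
| the real signature is ptep_clear_flush(vma, addr, ptep) and the real |
| flush routines live in arch/parisc/kernel/cache.c. |
|
| ```c
| #include <assert.h>
| #include <stdbool.h>
|
| /* Toy model: a PTE mapping to a physical frame (pfn), and a flag per
|  * frame marking stale cache lines. All names are illustrative. */
| struct toy_pte { bool present; unsigned long pfn; };
| static bool cache_has_stale_lines[16];
|
| static void flush_cache_for_pfn(unsigned long pfn)
| {
|     cache_has_stale_lines[pfn] = false;
| }
|
| /* Default-style ptep_clear_flush: clear the PTE, flush only the TLB.
|  * Once the PTE is cleared, the pfn needed for a cache flush is gone,
|  * so the stale lines stay behind. Returns whether stale lines remain. */
| static bool clear_without_cache_flush(struct toy_pte *ptep)
| {
|     unsigned long lost_pfn = ptep->pfn;  /* only the model remembers it */
|     ptep->present = false;
|     ptep->pfn = 0;
|     /* flush_tlb_page() only -- no cache flush is possible any more */
|     return cache_has_stale_lines[lost_pfn];
| }
|
| /* parisc-style fix: flush the cache page while the PTE still holds
|  * the pfn, then clear it. */
| static bool clear_with_cache_flush(struct toy_pte *ptep)
| {
|     flush_cache_for_pfn(ptep->pfn);  /* pfn still reachable here */
|     ptep->present = false;
|     ptep->pfn = 0;
|     return false;  /* lines for that frame were just flushed */
| }
| ```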
| |
| I implemented an updated change with a parisc specific version of |
| ptep_clear_flush. It fixed the random data corruption on Helge's rp4440 |
| and rp3440, as well as on my c8000. |
| |
| At this point, I realized that I could restore the code where we only |
| flush in flush_cache_page_if_present if the page has been accessed. |
| However, for this, we also need to flush the cache when the accessed |
| bit is cleared in ptep_clear_flush_young to keep things synchronized. |
| The default implementation only flushes the TLB entry. |
| |
| Other changes in this version are: |
| |
| 1) Implement a parisc specific version of ptep_get. It's identical to |
| the default but needed in arch/parisc/include/asm/pgtable.h. |
| 2) Revise parisc implementation of ptep_test_and_clear_young to use |
| ptep_get (READ_ONCE). |
| 3) Drop parisc implementation of ptep_get_and_clear. We can use default. |
| 4) Revise flush_kernel_vmap_range and invalidate_kernel_vmap_range to |
| use full data cache flush. |
| 5) Move flush_cache_vmap and flush_cache_vunmap to cache.c. Handle |
| VM_IOREMAP case in flush_cache_vmap. |
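| The ptep_get helper in (1) mirrors the kernel's generic one: a single |
| tear-free read of the PTE via READ_ONCE. A self-contained sketch, with |
| userspace stand-ins for the kernel's pte_t type and READ_ONCE macro: |
|
| ```c
| #include <assert.h>
|
| /* Stand-ins for kernel type plumbing, for illustration only. */
| typedef struct { unsigned long pte; } pte_t;
| #define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))
|
| /* One consistent snapshot of the PTE, so racy checks such as the one
|  * in ptep_test_and_clear_young never re-read *ptep mid-decision. */
| static inline pte_t ptep_get(pte_t *ptep)
| {
|     return READ_ONCE(*ptep);
| }
| ```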
| |
| At this time, I don't know whether it is better to always flush when |
| the PTE present bit is set or when both the accessed and present bits |
| are set. The latter saves flushing pages that haven't been accessed, |
| but we need to flush in ptep_clear_flush_young. It also needs a page |
| table lookup to find the PTE pointer. The lpa instruction only needs |
| a page table lookup when the PTE entry isn't in the TLB. |
| |
| We don't atomically handle setting and clearing the _PAGE_ACCESSED bit. |
| If we miss an update, we may miss a flush and the cache may get corrupted. |
| Whether the current code is effectively atomic depends on process control. |
| |
| When CONFIG_FLUSH_PAGE_ACCESSED is set to zero, the page will eventually |
| be flushed when the PTE is cleared or in flush_cache_page_if_present. The |
| _PAGE_ACCESSED bit is not used, so the problem is avoided. |
| |
| The flush method can be selected using the CONFIG_FLUSH_PAGE_ACCESSED |
| define in cache.c. The default is 0. I didn't see a large difference |
| in performance. |
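| The selection described above amounts to a compile-time choice between |
| the two flush conditions. A simplified sketch of that knob (the pte |
| bit parameters here are stand-ins; the real define and checks are in |
| arch/parisc/kernel/cache.c): |
|
| ```c
| #include <assert.h>
|
| /* Default is 0: ignore the accessed bit and flush whenever the
|  * present bit is set. */
| #define CONFIG_FLUSH_PAGE_ACCESSED 0
|
| static int should_flush(int pte_present, int pte_accessed)
| {
| #if CONFIG_FLUSH_PAGE_ACCESSED
|     /* Flush only present pages that have been accessed; this mode
|      * requires the extra cache flush in ptep_clear_flush_young. */
|     return pte_present && pte_accessed;
| #else
|     /* Flush whenever the present bit is set; the accessed bit is
|      * not consulted, avoiding the non-atomic update problem. */
|     return pte_present;
| #endif
| }
| ```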
| |
| The Linux kernel CVE team has assigned CVE-2024-40918 to this issue. |
| |
| |
| Affected and fixed versions |
| =========================== |
| |
| Fixed in 6.6.35 with commit 5bf196f1936bf93df31112fbdfb78c03537c07b0 |
| Fixed in 6.9.6 with commit d66f2607d89f760cdffed88b22f309c895a2af20 |
| Fixed in 6.10 with commit 72d95924ee35c8cd16ef52f912483ee938a34d49 |
| |
| Please see https://www.kernel.org for a full list of currently supported |
| kernel versions by the kernel community. |
| |
| Unaffected versions might change over time as fixes are backported to |
| older supported kernel versions. The official CVE entry at |
| https://cve.org/CVERecord/?id=CVE-2024-40918 |
| will be updated if fixes are backported, please check that for the most |
| up to date information about this issue. |
| |
| |
| Affected files |
| ============== |
| |
| The file(s) affected by this issue are: |
| arch/parisc/include/asm/cacheflush.h |
| arch/parisc/include/asm/pgtable.h |
| arch/parisc/kernel/cache.c |
| |
| |
| Mitigation |
| ========== |
| |
| The Linux kernel CVE team recommends that you update to the latest |
| stable kernel version for this, and many other bugfixes. Individual |
| changes are never tested alone, but rather are part of a larger kernel |
| release. Cherry-picking individual commits is not recommended or |
| supported by the Linux kernel community at all. If however, updating to |
| the latest release is impossible, the individual changes to resolve this |
| issue can be found at these commits: |
| https://git.kernel.org/stable/c/5bf196f1936bf93df31112fbdfb78c03537c07b0 |
| https://git.kernel.org/stable/c/d66f2607d89f760cdffed88b22f309c895a2af20 |
| https://git.kernel.org/stable/c/72d95924ee35c8cd16ef52f912483ee938a34d49 |