| From bippy-5f407fcff5a0 Mon Sep 17 00:00:00 2001 |
| From: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| To: <linux-cve-announce@vger.kernel.org> |
| Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org> |
| Subject: CVE-2024-40918: parisc: Try to fix random segmentation faults in package builds |
| |
| Description |
| =========== |
| |
| In the Linux kernel, the following vulnerability has been resolved: |
| |
| parisc: Try to fix random segmentation faults in package builds |
| |
| PA-RISC systems with PA8800 and PA8900 processors have had problems |
| with random segmentation faults for many years. Systems with earlier |
| processors are much more stable. |
| |
| Systems with PA8800 and PA8900 processors have a large L2 cache which |
| needs per page flushing for decent performance when a large range is |
| flushed. The combined cache in these systems is also more sensitive to |
| non-equivalent aliases than the caches in earlier systems. |
| |
| The majority of random segmentation faults that I have looked at |
| appear to be memory corruption in memory allocated using mmap and |
| malloc. |
| |
| My first attempt at fixing the random faults didn't work. On |
| reviewing the cache code, I realized that there were two issues |
| which the existing code didn't handle correctly. Both relate |
| to cache move-in. Another issue is that the present bit in PTEs |
| is racy. |
| |
| 1) PA-RISC caches have a mind of their own and they can speculatively |
| load data and instructions for a page as long as there is an entry in |
| the TLB for the page which allows move-in. TLBs are local to each |
| CPU. Thus, the TLB entry for a page must be purged before flushing |
| the page. This is particularly important on SMP systems. |
| |
| In some of the flush routines, the flush routine would be called |
| and then the TLB entry would be purged. This was because the flush |
| routine needed the TLB entry to do the flush. |
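| The ordering constraint in (1) can be modeled in plain C. This is an |
| editor's userspace sketch, not the parisc code: the function names are |
| stand-ins for the real routines, and only the purge-before-flush |
| ordering is modeled. |
|
| ```c
| #include <assert.h>
| #include <string.h>
|
| /* Record the order of operations so the two orderings can be compared. */
| static char oplog[64];
|
| /* Stand-in for purging the per-CPU TLB entry for a page. */
| static void purge_tlb_entry(unsigned long addr)
| {
|     (void)addr;
|     strcat(oplog, "purge;");
| }
|
| /* Stand-in for flushing the cache page (via a tmpalias mapping that
|  * inhibits move-in, so no TLB entry for the page is needed). */
| static void flush_cache_page_phys(unsigned long addr)
| {
|     (void)addr;
|     strcat(oplog, "flush;");
| }
|
| /* Problematic order: while the flush runs, the still-valid TLB entry
|  * lets any CPU speculatively move lines back into the cache. */
| static const char *flush_then_purge(unsigned long addr)
| {
|     oplog[0] = '\0';
|     flush_cache_page_phys(addr);
|     purge_tlb_entry(addr);
|     return oplog;
| }
|
| /* Fixed order: purge the TLB entry first so move-in is impossible,
|  * then flush through the tmpalias mapping. */
| static const char *purge_then_flush(unsigned long addr)
| {
|     oplog[0] = '\0';
|     purge_tlb_entry(addr);
|     flush_cache_page_phys(addr);
|     return oplog;
| }
| ```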
| |
| 2) My initial approach to fixing the random faults was to use |
| flush_cache_page_if_present for all flush operations. |
| This actually made things worse and led to a couple of hardware |
| lockups. It finally dawned on me that some lines weren't being |
| flushed because the pte check code was racy. This resulted in |
| random inequivalent mappings to physical pages. |
| |
| The __flush_cache_page tmpalias flush sets up its own TLB entry |
| and it doesn't need the existing TLB entry. As long as we can find |
| the pte pointer for the vm page, we can get the pfn and physical |
| address of the page. We can also purge the TLB entry for the page |
| before doing the flush. Further, __flush_cache_page uses a special |
| TLB entry that inhibits cache move-in. |
| |
| When switching page mappings, we need to ensure that lines are |
| removed from the cache. It is not sufficient to just flush the |
| lines to memory as they may come back. |
| |
| This made it clear that we needed to implement all the required |
| flush operations using tmpalias routines. This includes flushes |
| for user and kernel pages. |
| |
| After modifying the code to use tmpalias flushes, it became clear |
| that the random segmentation faults were not fully resolved. The |
| frequency of faults was worse on systems with a 64 MB L2 (PA8900) |
| and systems with more CPUs (rp4440). |
| |
| The warning that I added to flush_cache_page_if_present to detect |
| pages that couldn't be flushed triggered frequently on some systems. |
| |
| Helge and I looked at the pages that couldn't be flushed and found |
| that the PTE was either cleared or for a swap page. Ignoring pages |
| that were swapped out seemed okay but pages with cleared PTEs seemed |
| problematic. |
| |
| I looked at routines related to pte_clear and noticed ptep_clear_flush. |
| The default implementation just flushes the TLB entry. However, it was |
| obvious that on parisc we need to flush the cache page as well. If |
| we don't flush the cache page, stale lines will be left in the cache |
| and cause random corruption. Once a PTE is cleared, there is no way |
| to find the physical address associated with the PTE and flush the |
| associated page at a later time. |
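| The constraint above can be illustrated with a toy model. This is an |
| editor's simplified userspace sketch, not the parisc implementation; |
| the real signature is ptep_clear_flush(vma, addr, ptep) and the real |
| flush routines live in arch/parisc/kernel/cache.c. |
|
| ```c
| #include <assert.h>
| #include <stdbool.h>
|
| /* Toy model: a PTE mapping to a physical frame (pfn), and a flag per
|  * frame marking stale cache lines. All names are illustrative. */
| struct toy_pte { bool present; unsigned long pfn; };
| static bool cache_has_stale_lines[16];
|
| static void flush_cache_for_pfn(unsigned long pfn)
| {
|     cache_has_stale_lines[pfn] = false;
| }
|
| /* Default-style ptep_clear_flush: clear the PTE, flush only the TLB.
|  * Once the PTE is cleared, the pfn needed for a cache flush is gone,
|  * so the stale lines stay behind. Returns whether stale lines remain. */
| static bool clear_without_cache_flush(struct toy_pte *ptep)
| {
|     unsigned long lost_pfn = ptep->pfn;  /* only the model remembers it */
|     ptep->present = false;
|     ptep->pfn = 0;
|     /* flush_tlb_page() only -- no cache flush is possible any more */
|     return cache_has_stale_lines[lost_pfn];
| }
|
| /* parisc-style fix: flush the cache page while the PTE still holds
|  * the pfn, then clear it. */
| static bool clear_with_cache_flush(struct toy_pte *ptep)
| {
|     flush_cache_for_pfn(ptep->pfn);  /* pfn still reachable here */
|     ptep->present = false;
|     ptep->pfn = 0;
|     return false;  /* lines for that frame were just flushed */
| }
| ```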
| |
| I implemented an updated change with a parisc specific version of |
| ptep_clear_flush. It fixed the random data corruption on Helge's rp4440 |
| and rp3440, as well as on my c8000. |
| |
| At this point, I realized that I could restore the code where we only |
| flush in flush_cache_page_if_present if the page has been accessed. |
| However, for this, we also need to flush the cache when the accessed |
| bit is cleared in ptep_clear_flush_young to keep things synchronized. |
| The default implementation only flushes the TLB entry. |
| |
| Other changes in this version are: |
| |
| 1) Implement a parisc specific version of ptep_get. It's identical to |
| the default but needed in arch/parisc/include/asm/pgtable.h. |
| 2) Revise parisc implementation of ptep_test_and_clear_young to use |
| ptep_get (READ_ONCE). |
| 3) Drop parisc implementation of ptep_get_and_clear. We can use default. |
| 4) Revise flush_kernel_vmap_range and invalidate_kernel_vmap_range to |
| use full data cache flush. |
| 5) Move flush_cache_vmap and flush_cache_vunmap to cache.c. Handle |
| VM_IOREMAP case in flush_cache_vmap. |
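| The ptep_get helper in (1) mirrors the kernel's generic one: a single |
| tear-free read of the PTE via READ_ONCE. A self-contained sketch, with |
| userspace stand-ins for the kernel's pte_t type and READ_ONCE macro: |
|
| ```c
| #include <assert.h>
|
| /* Stand-ins for kernel type plumbing, for illustration only. */
| typedef struct { unsigned long pte; } pte_t;
| #define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))
|
| /* One consistent snapshot of the PTE, so racy checks such as the one
|  * in ptep_test_and_clear_young never re-read *ptep mid-decision. */
| static inline pte_t ptep_get(pte_t *ptep)
| {
|     return READ_ONCE(*ptep);
| }
| ```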
| |
| At this time, I don't know whether it is better to always flush when |
| the PTE present bit is set or when both the accessed and present bits |
| are set. The latter saves flushing pages that haven't been accessed, |
| but we need to flush in ptep_clear_flush_young. It also needs a page |
| table lookup to find the PTE pointer. The lpa instruction only needs |
| a page table lookup when the PTE entry isn't in the TLB. |
| |
| We don't atomically handle setting and clearing the _PAGE_ACCESSED bit. |
| If we miss an update, we may miss a flush and the cache may get corrupted. |
| Whether the current code is effectively atomic depends on process control. |
| |
| When CONFIG_FLUSH_PAGE_ACCESSED is set to zero, the page will eventually |
| be flushed when the PTE is cleared or in flush_cache_page_if_present. The |
| _PAGE_ACCESSED bit is not used, so the problem is avoided. |
| |
| The flush method can be selected using the CONFIG_FLUSH_PAGE_ACCESSED |
| define in cache.c. The default is 0. I didn't see a large difference |
| in performance. |
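| The selection described above amounts to a compile-time choice between |
| the two flush conditions. A simplified sketch of that knob (the pte |
| bit parameters here are stand-ins; the real define and checks are in |
| arch/parisc/kernel/cache.c): |
|
| ```c
| #include <assert.h>
|
| /* Default is 0: ignore the accessed bit and flush whenever the
|  * present bit is set. */
| #define CONFIG_FLUSH_PAGE_ACCESSED 0
|
| static int should_flush(int pte_present, int pte_accessed)
| {
| #if CONFIG_FLUSH_PAGE_ACCESSED
|     /* Flush only present pages that have been accessed; this mode
|      * requires the extra cache flush in ptep_clear_flush_young. */
|     return pte_present && pte_accessed;
| #else
|     /* Flush whenever the present bit is set; the accessed bit is
|      * not consulted, avoiding the non-atomic update problem. */
|     return pte_present;
| #endif
| }
| ```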
| |
| The Linux kernel CVE team has assigned CVE-2024-40918 to this issue. |
| |
| |
| Affected and fixed versions |
| =========================== |
| |
| Fixed in 6.6.35 with commit 5bf196f1936bf93df31112fbdfb78c03537c07b0 |
| Fixed in 6.9.6 with commit d66f2607d89f760cdffed88b22f309c895a2af20 |
| Fixed in 6.10 with commit 72d95924ee35c8cd16ef52f912483ee938a34d49 |
| |
| Please see https://www.kernel.org for a full list of currently supported |
| kernel versions by the kernel community. |
| |
| Unaffected versions might change over time as fixes are backported to |
| older supported kernel versions. The official CVE entry at |
| https://cve.org/CVERecord/?id=CVE-2024-40918 |
| will be updated if fixes are backported, please check that for the most |
| up to date information about this issue. |
| |
| |
| Affected files |
| ============== |
| |
| The file(s) affected by this issue are: |
| arch/parisc/include/asm/cacheflush.h |
| arch/parisc/include/asm/pgtable.h |
| arch/parisc/kernel/cache.c |
| |
| |
| Mitigation |
| ========== |
| |
| The Linux kernel CVE team recommends that you update to the latest |
| stable kernel version for this, and many other bugfixes. Individual |
| changes are never tested alone, but rather are part of a larger kernel |
| release. Cherry-picking individual commits is not recommended or |
| supported by the Linux kernel community at all. If however, updating to |
| the latest release is impossible, the individual changes to resolve this |
| issue can be found at these commits: |
| https://git.kernel.org/stable/c/5bf196f1936bf93df31112fbdfb78c03537c07b0 |
| https://git.kernel.org/stable/c/d66f2607d89f760cdffed88b22f309c895a2af20 |
| https://git.kernel.org/stable/c/72d95924ee35c8cd16ef52f912483ee938a34d49 |