| From bippy-5f407fcff5a0 Mon Sep 17 00:00:00 2001 |
| From: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| To: <linux-cve-announce@vger.kernel.org> |
| Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org> |
| Subject: CVE-2024-50022: device-dax: correct pgoff align in dax_set_mapping() |
| |
| Description |
| =========== |
| |
| In the Linux kernel, the following vulnerability has been resolved: |
| |
| device-dax: correct pgoff align in dax_set_mapping() |
| |
| pgoff should be aligned using ALIGN_DOWN() instead of ALIGN(). Otherwise, |
| a vmf->address that is not aligned to fault_size is rounded up to the |
| next alignment boundary, which can result in memory failure handling |
| getting the wrong address. |
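|
The difference between rounding up and rounding down can be illustrated with a small userspace sketch. Nothing below is kernel code: align_up(), align_down(), page_index() and the pgoff_* helpers are hypothetical stand-ins for ALIGN(), ALIGN_DOWN() and linear_page_index(), and the 4 KiB base page size is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB base pages */

/* Round x up to the next multiple of a (a must be a power of two). */
static inline uint64_t align_up(uint64_t x, uint64_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (x + a - 1) & ~(a - 1);
}

/* Round x down to the previous multiple of a (a must be a power of two). */
static inline uint64_t align_down(uint64_t x, uint64_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return x & ~(a - 1);
}

/* Simplified stand-in for linear_page_index(): page offset of addr
 * within a mapping that starts at vm_start with file offset vm_pgoff. */
static inline uint64_t page_index(uint64_t vm_start, uint64_t vm_pgoff,
                                  uint64_t addr)
{
    return ((addr - vm_start) >> PAGE_SHIFT) + vm_pgoff;
}

/* Behaviour described as buggy: the fault address is rounded up. */
static inline uint64_t pgoff_round_up(uint64_t vm_start, uint64_t addr,
                                      uint64_t fault_size)
{
    return page_index(vm_start, 0, align_up(addr, fault_size));
}

/* Behaviour described as the fix: the fault address is rounded down. */
static inline uint64_t pgoff_round_down(uint64_t vm_start, uint64_t addr,
                                        uint64_t fault_size)
{
    return page_index(vm_start, 0, align_down(addr, fault_size));
}
```

With a hypothetical mapping starting at 0x7f0000000000, a 2 MiB fault_size and a fault at 0x7f0000345000, pgoff_round_up() yields page index 1024, while pgoff_round_down() yields 512, the start of the 2 MiB range that actually contains the faulting address. For an address that is already aligned, both helpers agree, which is consistent with the bug only showing up on unaligned accesses.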
| |
| This is a subtle situation that can only be observed in |
| page_mapped_in_vma() after the page fault has been handled by |
| dev_dax_huge_fault(). Generally, there is little chance that |
| page_mapped_in_vma() runs on a dev-dax page, except when an error is |
| deliberately injected into the dax device to trigger an MCE and the |
| resulting memory-failure handling. In that case, page_mapped_in_vma() |
| is called to determine which task is accessing the failing address so |
| that the task can be killed. |
| |
| |
| We used a self-developed dax device (with 2M-aligned mappings) to |
| inject errors at random addresses. It turned out that an error injected |
| at a non-2M-aligned address caused endless MCEs until the kernel |
| panicked, because page_mapped_in_vma() kept returning the wrong address |
| and the task accessing the failing address was never killed properly: |
| |
| |
| [ 3783.719419] Memory failure: 0x200c9742: recovery action for dax page: Recovered |
| [ 3784.049006] mce: Uncorrected hardware memory error in user-access at 200c9742380 |
| [ 3784.049190] Memory failure: 0x200c9742: recovery action for dax page: Recovered |
| [ 3784.448042] mce: Uncorrected hardware memory error in user-access at 200c9742380 |
| [ 3784.448186] Memory failure: 0x200c9742: recovery action for dax page: Recovered |
| [ 3784.792026] mce: Uncorrected hardware memory error in user-access at 200c9742380 |
| [ 3784.792179] Memory failure: 0x200c9742: recovery action for dax page: Recovered |
| [ 3785.162502] mce: Uncorrected hardware memory error in user-access at 200c9742380 |
| [ 3785.162633] Memory failure: 0x200c9742: recovery action for dax page: Recovered |
| [ 3785.461116] mce: Uncorrected hardware memory error in user-access at 200c9742380 |
| [ 3785.461247] Memory failure: 0x200c9742: recovery action for dax page: Recovered |
| [ 3785.764730] mce: Uncorrected hardware memory error in user-access at 200c9742380 |
| [ 3785.764859] Memory failure: 0x200c9742: recovery action for dax page: Recovered |
| [ 3786.042128] mce: Uncorrected hardware memory error in user-access at 200c9742380 |
| [ 3786.042259] Memory failure: 0x200c9742: recovery action for dax page: Recovered |
| [ 3786.464293] mce: Uncorrected hardware memory error in user-access at 200c9742380 |
| [ 3786.464423] Memory failure: 0x200c9742: recovery action for dax page: Recovered |
| [ 3786.818090] mce: Uncorrected hardware memory error in user-access at 200c9742380 |
| [ 3786.818217] Memory failure: 0x200c9742: recovery action for dax page: Recovered |
| [ 3787.085297] mce: Uncorrected hardware memory error in user-access at 200c9742380 |
| [ 3787.085424] Memory failure: 0x200c9742: recovery action for dax page: Recovered |
| |
| It took us several weeks to pinpoint this problem, but we eventually |
| used bpftrace to trace the page fault and mce address and successfully |
| identified the issue. |
| |
| |
| Joao added: |
| |
| : Likely we never reproduce in production because we always pin |
| : device-dax regions in the region align they provide (Qemu does |
| : similarly with prealloc in hugetlb/file backed memory). I think this |
| : bug requires that we touch *unpinned* device-dax regions unaligned to |
| : the device-dax selected alignment (page size i.e. 4K/2M/1G) |
| |
| The Linux kernel CVE team has assigned CVE-2024-50022 to this issue. |
| |
| |
| Affected and fixed versions |
| =========================== |
| |
| Issue introduced in 5.17 with commit b9b5777f09be84d0de472ded2253d2f5101427f2 and fixed in 6.1.113 with commit 9c4198dfdca818c5ce19c764d90eabd156bbc6da |
| Issue introduced in 5.17 with commit b9b5777f09be84d0de472ded2253d2f5101427f2 and fixed in 6.6.57 with commit b822007e8db341d6f175c645ed79866db501ad86 |
| Issue introduced in 5.17 with commit b9b5777f09be84d0de472ded2253d2f5101427f2 and fixed in 6.11.4 with commit e877427d218159ac29c9326100920d24330c9ee6 |
| Issue introduced in 5.17 with commit b9b5777f09be84d0de472ded2253d2f5101427f2 and fixed in 6.12 with commit 7fcbd9785d4c17ea533c42f20a9083a83f301fa6 |
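|
The version list above can be turned into a rough check, sketched here under stated assumptions: struct kver, kver_cmp() and cve_2024_50022_affected() are hypothetical names, the input is assumed to be an upstream release version (not a -rc tag), and vendor kernels that backport the fix under a different version number are not handled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical representation of an upstream kernel version. */
struct kver {
    int major, minor, patch;
};

/* Three-way comparison: negative, zero or positive like strcmp(). */
static inline int kver_cmp(struct kver a, struct kver b)
{
    if (a.major != b.major)
        return a.major < b.major ? -1 : 1;
    if (a.minor != b.minor)
        return a.minor < b.minor ? -1 : 1;
    if (a.patch != b.patch)
        return a.patch < b.patch ? -1 : 1;
    return 0;
}

/* Returns true if v falls inside the affected range listed above. */
static inline bool cve_2024_50022_affected(struct kver v)
{
    const struct kver introduced = {5, 17, 0};
    const struct kver mainline_fix = {6, 12, 0};
    /* First fixed release of each stable branch named in the list. */
    static const struct kver stable_fixes[] = {
        {6, 1, 113}, {6, 6, 57}, {6, 11, 4},
    };
    size_t i;

    if (kver_cmp(v, introduced) < 0)
        return false; /* predates the introducing commit */
    if (kver_cmp(v, mainline_fix) >= 0)
        return false; /* 6.12 and later carry the fix */

    for (i = 0; i < sizeof(stable_fixes) / sizeof(stable_fixes[0]); i++) {
        if (v.major == stable_fixes[i].major &&
            v.minor == stable_fixes[i].minor)
            return v.patch < stable_fixes[i].patch;
    }

    /* In range, but on a branch with no fix listed above. */
    return true;
}
```

For example, 6.1.112 and 6.8.3 are reported as affected, while 5.16.20, 6.1.113, 6.6.57 and 6.12 are not. The result only reflects the list as it stands here; the official CVE entry remains the source of truth if further backports appear.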
| |
| Please see https://www.kernel.org for a full list of kernel versions |
| currently supported by the kernel community. |
| |
| Unaffected versions might change over time as fixes are backported to |
| older supported kernel versions. The official CVE entry at |
| https://cve.org/CVERecord/?id=CVE-2024-50022 |
| will be updated if fixes are backported; please check it for the most |
| up-to-date information about this issue. |
| |
| |
| Affected files |
| ============== |
| |
| The file(s) affected by this issue are: |
| drivers/dax/device.c |
| |
| |
| Mitigation |
| ========== |
| |
| The Linux kernel CVE team recommends that you update to the latest |
| stable kernel version for this, and many other bugfixes. Individual |
| changes are never tested alone, but rather are part of a larger kernel |
| release. Cherry-picking individual commits is not recommended or |
| supported by the Linux kernel community at all. If, however, updating to |
| the latest release is impossible, the individual changes to resolve this |
| issue can be found at these commits: |
| https://git.kernel.org/stable/c/9c4198dfdca818c5ce19c764d90eabd156bbc6da |
| https://git.kernel.org/stable/c/b822007e8db341d6f175c645ed79866db501ad86 |
| https://git.kernel.org/stable/c/e877427d218159ac29c9326100920d24330c9ee6 |
| https://git.kernel.org/stable/c/7fcbd9785d4c17ea533c42f20a9083a83f301fa6 |