| From bippy-1.2.0 Mon Sep 17 00:00:00 2001 |
| From: Greg Kroah-Hartman <gregkh@kernel.org> |
| To: <linux-cve-announce@vger.kernel.org> |
| Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org> |
| Subject: CVE-2025-37931: btrfs: adjust subpage bit start based on sectorsize |
| |
| Description |
| =========== |
| |
| In the Linux kernel, the following vulnerability has been resolved: |
| |
| btrfs: adjust subpage bit start based on sectorsize |
| |
| When running machines with 64k page size and a 16k nodesize we started |
| seeing tree log corruption in production. This turned out to be because |
| we were not writing out dirty blocks sometimes, so this in fact affects |
| all metadata writes. |
| |
| When writing out a subpage EB we scan the subpage bitmap for a dirty |
| range. If the range isn't dirty we do |
| |
| bit_start++; |
| |
| to move onto the next bit. The problem is the bitmap is based on the |
| number of sectors that an EB has. So in this case, we have a 64k |
| pagesize, 16k nodesize, but a 4k sectorsize. This means our bitmap is 4 |
| bits for every node. With a 64k page size we end up with 4 nodes per |
| page. |
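
A minimal userspace sketch of the arithmetic above. The struct and function names are hypothetical and exist only for illustration; they are not kernel identifiers.

```c
#include <stdint.h>

/* Hypothetical container for the sizes discussed above. */
struct subpage_geometry {
	uint32_t sectors_per_node; /* bitmap bits covered by one extent buffer */
	uint32_t nodes_per_page;   /* extent buffers that fit in one page */
	uint32_t bits_per_page;    /* total bitmap bits for one page */
};

/* Derive the bitmap layout from the three sizes (all in bytes). */
static struct subpage_geometry compute_geometry(uint32_t page_size,
						uint32_t nodesize,
						uint32_t sectorsize)
{
	struct subpage_geometry g;

	g.sectors_per_node = nodesize / sectorsize;
	g.nodes_per_page = page_size / nodesize;
	g.bits_per_page = page_size / sectorsize;
	return g;
}
```

With a 64k page, 16k nodesize and 4k sectorsize this gives 4 bits per node, 4 nodes per page and 16 bits per page.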
| |
| To make this easier to follow, this is how everything looks: |
| |
| [0         16k       32k       48k     ] logical address |
| [0         4         8         12      ] radix tree offset |
| [               64k page               ] folio |
| [ 16k eb ][ 16k eb ][ 16k eb ][ 16k eb ] extent buffers |
| [ | | | |  | | | |   | | | |   | | | | ] bitmap |
| |
| Now we use all of our addressing based on fs_info->sectorsize_bits, so |
| as you can see above, our 16k eb->start turns into radix entry 4. |
| |
| When we find a dirty range for our eb, we correctly do bit_start += |
| sectors_per_node, because if we start at bit 0, the next bit for the |
| next eb is 4, to correspond to eb->start 16k. |
| |
| However, if our range is clean, we will do bit_start++, which leaves |
| bit_start misaligned with our radix tree entries. |
| |
| In our case, assume that the first time we check the bitmap the block is |
| not dirty, we increment bit_start so now it == 1, and then we loop |
| around and check again. This time it is dirty, and we go to find that |
| start using the following equation |
| |
| start = folio_start + bit_start * fs_info->sectorsize; |
| |
| so in the case above, eb->start 0 is now dirty, and we calculate start |
| as |
| |
| 0 + 1 * fs_info->sectorsize = 4096 |
| 4096 >> 12 = 1 |
| |
| Now we're looking up the radix tree for 1, and we won't find an eb. |
| What's worse is now we're using bit_start == 1, so we do bit_start += |
| sectors_per_node, which is now 5. If that eb is dirty we will run into |
| the same thing: we will look at an offset that is not populated in the |
| radix tree, and now we're skipping the writeout of dirty extent buffers. |
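
The address arithmetic above can be checked with a small userspace sketch. The helper names are hypothetical, and index_is_eb_aligned() assumes the layout from the diagram: a folio starting at logical address 0, with extent buffers only at radix offsets that are multiples of sectors_per_node.

```c
#include <stdbool.h>
#include <stdint.h>

#define SECTORSIZE_BITS  12u                     /* 4k sectors */
#define SECTORSIZE       (1u << SECTORSIZE_BITS)
#define SECTORS_PER_NODE 4u                      /* 16k nodesize / 4k sectorsize */

/* start = folio_start + bit_start * sectorsize, as in the description. */
static uint64_t lookup_start(uint64_t folio_start, unsigned int bit_start)
{
	return folio_start + (uint64_t)bit_start * SECTORSIZE;
}

/* Radix tree offset derived from a logical address. */
static uint64_t radix_index(uint64_t start)
{
	return start >> SECTORSIZE_BITS;
}

/* Assumption: only offsets 0, 4, 8, 12 hold an extent buffer. */
static bool index_is_eb_aligned(uint64_t index)
{
	return index % SECTORS_PER_NODE == 0;
}
```

For bit_start == 1 this yields start 4096 and radix offset 1, which is not a multiple of 4 and so matches no extent buffer in the diagram. For bit_start == 4 it yields offset 4, the second 16k eb.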
| |
| The best fix for this is to not use sectorsize_bits to address nodes, |
| but that's a larger change. Since this is a fs corruption problem, fix |
| it simply by always using sectors_per_node to increment the start bit. |
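
The effect of the fix can be illustrated with a toy userspace model of the scan loop. This is a sketch under stated assumptions, not the code in fs/btrfs/extent_io.c: the function name and the check_dirty array are hypothetical, folio_start is taken to be 0, and check_dirty[i] stands for the result of the i-th bitmap test, so that "clean on the first check, dirty afterwards" from the description can be reproduced.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { SECTORSIZE = 4096, SECTORS_PER_NODE = 4, SECTORS_PER_PAGE = 16 };

/*
 * Walk the bitmap the way the description outlines and count how many
 * lookups use a start address that is not nodesize-aligned.
 * clean_step is how far bit_start advances after a clean test:
 * 1 models the old behaviour, SECTORS_PER_NODE models the fix.
 */
static unsigned int count_misaligned_lookups(const bool *check_dirty,
					     size_t nr_checks,
					     unsigned int clean_step)
{
	unsigned int bit_start = 0;
	unsigned int misaligned = 0;
	size_t check = 0;

	while (bit_start < SECTORS_PER_PAGE && check < nr_checks) {
		if (!check_dirty[check++]) {
			bit_start += clean_step;
			continue;
		}

		/* folio_start is assumed to be 0 in this model. */
		uint64_t start = (uint64_t)bit_start * SECTORSIZE;

		bit_start += SECTORS_PER_NODE;
		if (start % (SECTORS_PER_NODE * SECTORSIZE) != 0)
			misaligned++;
	}
	return misaligned;
}
```

With the check results {clean, dirty, dirty, dirty, dirty}, a clean_step of 1 produces lookups at 4096, 20480, 36864 and 53248, none of which is the start of an extent buffer. A clean_step of SECTORS_PER_NODE produces lookups at 16384, 32768 and 49152, all of which are.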
| |
| The Linux kernel CVE team has assigned CVE-2025-37931 to this issue. |
| |
| |
| Affected and fixed versions |
| =========================== |
| |
| Issue introduced in 5.13 with commit c4aec299fa8f73f0fd10bc556f936f0da50e3e83 and fixed in 6.12.28 with commit b80db09b614cb7edec5bada1bc7c7b0eb3b453ea |
| Issue introduced in 5.13 with commit c4aec299fa8f73f0fd10bc556f936f0da50e3e83 and fixed in 6.14.6 with commit 396f4002710030ea1cfd4c789ebaf0a6969ab34f |
| Issue introduced in 5.13 with commit c4aec299fa8f73f0fd10bc556f936f0da50e3e83 and fixed in 6.15 with commit e08e49d986f82c30f42ad0ed43ebbede1e1e3739 |
| |
| Please see https://www.kernel.org for a full list of kernel versions |
| currently supported by the kernel community. |
| |
| Unaffected versions might change over time as fixes are backported to |
| older supported kernel versions. The official CVE entry at |
| https://cve.org/CVERecord/?id=CVE-2025-37931 |
| will be updated if fixes are backported; please check it for the most |
| up-to-date information about this issue. |
| |
| |
| Affected files |
| ============== |
| |
| The file(s) affected by this issue are: |
| fs/btrfs/extent_io.c |
| |
| |
| Mitigation |
| ========== |
| |
| The Linux kernel CVE team recommends that you update to the latest |
| stable kernel version for this and many other bugfixes. Individual |
| changes are never tested alone, but rather are part of a larger kernel |
| release. Cherry-picking individual commits is not recommended or |
| supported by the Linux kernel community at all. If, however, updating to |
| the latest release is impossible, the individual changes to resolve this |
| issue can be found at these commits: |
| https://git.kernel.org/stable/c/b80db09b614cb7edec5bada1bc7c7b0eb3b453ea |
| https://git.kernel.org/stable/c/396f4002710030ea1cfd4c789ebaf0a6969ab34f |
| https://git.kernel.org/stable/c/e08e49d986f82c30f42ad0ed43ebbede1e1e3739 |