| From bippy-5f407fcff5a0 Mon Sep 17 00:00:00 2001 |
| From: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| To: <linux-cve-announce@vger.kernel.org> |
| Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org> |
| Subject: CVE-2024-42266: btrfs: make cow_file_range_inline() honor locked_page on error |
| |
| Description |
| =========== |
| |
| In the Linux kernel, the following vulnerability has been resolved: |
| |
| btrfs: make cow_file_range_inline() honor locked_page on error |
| |
| The btrfs buffered write path runs through __extent_writepage() which |
| has some tricky return value handling for writepage_delalloc(). |
| Specifically, when that returns 1, we exit, but for other return values |
| we continue and end up calling btrfs_folio_end_all_writers(). If the |
| folio has been unlocked (note that we check the PageLocked bit at the |
| start of __extent_writepage()), this results in an assert panic like |
| this one from syzbot: |
| |
| BTRFS: error (device loop0 state EAL) in free_log_tree:3267: errno=-5 IO failure |
| BTRFS warning (device loop0 state EAL): Skipping commit of aborted transaction. |
| BTRFS: error (device loop0 state EAL) in cleanup_transaction:2018: errno=-5 IO failure |
| assertion failed: folio_test_locked(folio), in fs/btrfs/subpage.c:871 |
| ------------[ cut here ]------------ |
| kernel BUG at fs/btrfs/subpage.c:871! |
| Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI |
| CPU: 1 PID: 5090 Comm: syz-executor225 Not tainted |
| 6.10.0-syzkaller-05505-gb1bc554e009e #0 |
| Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS |
| Google 06/27/2024 |
| RIP: 0010:btrfs_folio_end_all_writers+0x55b/0x610 fs/btrfs/subpage.c:871 |
| Code: e9 d3 fb ff ff e8 25 22 c2 fd 48 c7 c7 c0 3c 0e 8c 48 c7 c6 80 3d |
| 0e 8c 48 c7 c2 60 3c 0e 8c b9 67 03 00 00 e8 66 47 ad 07 90 <0f> 0b e8 |
| 6e 45 b0 07 4c 89 ff be 08 00 00 00 e8 21 12 25 fe 4c 89 |
| RSP: 0018:ffffc900033d72e0 EFLAGS: 00010246 |
| RAX: 0000000000000045 RBX: 00fff0000000402c RCX: 663b7a08c50a0a00 |
| RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000 |
| RBP: ffffc900033d73b0 R08: ffffffff8176b98c R09: 1ffff9200067adfc |
| R10: dffffc0000000000 R11: fffff5200067adfd R12: 0000000000000001 |
| R13: dffffc0000000000 R14: 0000000000000000 R15: ffffea0001cbee80 |
| FS: 0000000000000000(0000) GS:ffff8880b9500000(0000) |
| knlGS:0000000000000000 |
| CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 |
| CR2: 00007f5f076012f8 CR3: 000000000e134000 CR4: 00000000003506f0 |
| DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 |
| DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 |
| Call Trace: |
| <TASK> |
| __extent_writepage fs/btrfs/extent_io.c:1597 [inline] |
| extent_write_cache_pages fs/btrfs/extent_io.c:2251 [inline] |
| btrfs_writepages+0x14d7/0x2760 fs/btrfs/extent_io.c:2373 |
| do_writepages+0x359/0x870 mm/page-writeback.c:2656 |
| filemap_fdatawrite_wbc+0x125/0x180 mm/filemap.c:397 |
| __filemap_fdatawrite_range mm/filemap.c:430 [inline] |
| __filemap_fdatawrite mm/filemap.c:436 [inline] |
| filemap_flush+0xdf/0x130 mm/filemap.c:463 |
| btrfs_release_file+0x117/0x130 fs/btrfs/file.c:1547 |
| __fput+0x24a/0x8a0 fs/file_table.c:422 |
| task_work_run+0x24f/0x310 kernel/task_work.c:222 |
| exit_task_work include/linux/task_work.h:40 [inline] |
| do_exit+0xa2f/0x27f0 kernel/exit.c:877 |
| do_group_exit+0x207/0x2c0 kernel/exit.c:1026 |
| __do_sys_exit_group kernel/exit.c:1037 [inline] |
| __se_sys_exit_group kernel/exit.c:1035 [inline] |
| __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1035 |
| x64_sys_call+0x2634/0x2640 |
| arch/x86/include/generated/asm/syscalls_64.h:232 |
| do_syscall_x64 arch/x86/entry/common.c:52 [inline] |
| do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 |
| entry_SYSCALL_64_after_hwframe+0x77/0x7f |
| RIP: 0033:0x7f5f075b70c9 |
| Code: Unable to access opcode bytes at |
| 0x7f5f075b709f. |
| |
| I was hitting the same issue by doing hundreds of accelerated runs of |
| generic/475, which also hits IO errors by design. |
| |
| I instrumented that reproducer with bpftrace and found that the |
| undesirable folio_unlock was coming from the following callstack: |
| |
| folio_unlock+5 |
| __process_pages_contig+475 |
| cow_file_range_inline.constprop.0+230 |
| cow_file_range+803 |
| btrfs_run_delalloc_range+566 |
| writepage_delalloc+332 |
| __extent_writepage # inlined in my stacktrace, but I added it here |
| extent_write_cache_pages+622 |
| |
| Looking at the bisected-to patch in the syzbot report, Josef realized |
| that the logic of the cow_file_range_inline() error path had subtly |
| changed. In the past, on error, it jumped to out_unlock in |
| cow_file_range(), which honors the locked_page, so when we ultimately |
| call btrfs_folio_end_all_writers(), the folio of interest is still |
| locked. After the change, we always unlocked, ignoring the locked_page, |
| on both success and error. On the success path, this results in |
| returning 1 to __extent_writepage(), which skips the |
| btrfs_folio_end_all_writers() call, so it is OK to have unlocked. |
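| |
| As a rough illustration of the control flow described above, here is a |
| userspace C sketch, not kernel code. The names toy_folio, |
| toy_inline_always_unlock() and toy_writepage() are made up for this |
| example, and the model assumes only what the text states: a return of |
| 1 skips the end-all-writers step, any other value reaches it, and that |
| step requires the folio to still be locked. |
| |
```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a folio: only the lock bit matters here. */
struct toy_folio {
    bool locked;
};

/*
 * Models the regressed inline path: the folio is unlocked whether or
 * not the inline extent could be created.
 * Returns 1 on success and -5 (an I/O error) on failure.
 */
static int toy_inline_always_unlock(struct toy_folio *folio, bool io_error)
{
    folio->locked = false;
    return io_error ? -5 : 1;
}

/*
 * Models the caller. A return of 1 means the folio was fully handled,
 * so the lock check is never reached. Any other value falls through to
 * the step that, in the kernel, asserts the folio is still locked.
 * Returns true if that requirement holds or is never checked.
 */
static bool toy_writepage(bool io_error)
{
    struct toy_folio folio = { .locked = true };
    int ret = toy_inline_always_unlock(&folio, io_error);

    if (ret == 1)
        return true;

    return folio.locked;
}
```
| |
| In this model toy_writepage(false) returns true because the check is |
| skipped, while toy_writepage(true) returns false, which corresponds to |
| the failed assertion in the trace above. |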
| |
| Fix the bug by wiring the locked_page into cow_file_range_inline() and |
| only setting locked_page to NULL on success. |
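| |
| The shape of the fix can be sketched the same way in userspace C. This |
| is an assumption-laden toy model, not the patch itself: toy_page, |
| toy_unlock_range() and toy_inline_fixed() are hypothetical names, and |
| "honoring" the locked_page is modeled as skipping that one page when |
| unlocking a range. The toy helper returns 0 on success and does not |
| model how the callers turn that into the 1 seen by |
| __extent_writepage(). |
| |
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page: only the lock bit matters here. */
struct toy_page {
    bool locked;
};

/* Unlock every page in the range except `skip` (NULL skips nothing). */
static void toy_unlock_range(struct toy_page *pages, size_t n,
                             const struct toy_page *skip)
{
    for (size_t i = 0; i < n; i++) {
        if (&pages[i] != skip)
            pages[i].locked = false;
    }
}

/*
 * Models the fixed helper. On error the caller's locked_page is left
 * locked, because the caller will unlock it later. On success nobody
 * else will unlock it, so locked_page is cleared and the whole range is
 * unlocked here.
 * Returns 0 on success and -5 (an I/O error) on failure.
 */
static int toy_inline_fixed(struct toy_page *pages, size_t n,
                            struct toy_page *locked_page, bool io_error)
{
    int ret = io_error ? -5 : 0;

    if (ret == 0)
        locked_page = NULL;

    toy_unlock_range(pages, n, locked_page);
    return ret;
}
```
| |
| With three locked pages and the middle one passed as locked_page, the |
| error case leaves only the middle page locked, and the success case |
| unlocks all three. |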
| |
| The Linux kernel CVE team has assigned CVE-2024-42266 to this issue. |
| |
| |
| Affected and fixed versions |
| =========================== |
| |
| Issue introduced in 6.10 with commit 0586d0a89e77d717da14df42648ace4a9fd67981 and fixed in 6.10.4 with commit 061e41581606000a83ce0f0f01d6ad338f3704e9 |
| Issue introduced in 6.10 with commit 0586d0a89e77d717da14df42648ace4a9fd67981 and fixed in 6.11 with commit 478574370bef7951fbd9ef5155537d6cbed49472 |
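| |
| Read as a version range, the two lines above mean that upstream |
| releases 6.10 through 6.10.3 carry the issue, while 6.10.4 and 6.11 |
| contain the fix. A minimal C sketch of that check follows. The helper |
| names are hypothetical, release candidates are not modeled, and the |
| check only makes sense for upstream version numbers, since vendor |
| kernels may backport the fix without changing the base version. |
| |
```c
#include <assert.h>
#include <stdbool.h>

/*
 * Compare two (major, minor, patch) triples.
 * Returns a negative, zero, or positive value, like strcmp().
 */
static int toy_vercmp(unsigned a_major, unsigned a_minor, unsigned a_patch,
                      unsigned b_major, unsigned b_minor, unsigned b_patch)
{
    if (a_major != b_major)
        return a_major < b_major ? -1 : 1;
    if (a_minor != b_minor)
        return a_minor < b_minor ? -1 : 1;
    if (a_patch != b_patch)
        return a_patch < b_patch ? -1 : 1;
    return 0;
}

/*
 * True for upstream versions v with 6.10.0 <= v < 6.10.4, the range
 * between the introducing release and the first fixed stable release.
 */
static bool toy_in_affected_range(unsigned major, unsigned minor,
                                  unsigned patch)
{
    return toy_vercmp(major, minor, patch, 6, 10, 0) >= 0 &&
           toy_vercmp(major, minor, patch, 6, 10, 4) < 0;
}
```
| |
| For example, toy_in_affected_range(6, 10, 3) is true, while 6.9.12, |
| 6.10.4 and 6.11.0 all fall outside the range. |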
| |
| Please see https://www.kernel.org for a full list of kernel versions |
| currently supported by the kernel community. |
| |
| Unaffected versions might change over time as fixes are backported to |
| older supported kernel versions. The official CVE entry at |
| https://cve.org/CVERecord/?id=CVE-2024-42266 |
| will be updated if fixes are backported; please check it for the most |
| up-to-date information about this issue. |
| |
| |
| Affected files |
| ============== |
| |
| The file(s) affected by this issue are: |
| fs/btrfs/inode.c |
| |
| |
| Mitigation |
| ========== |
| |
| The Linux kernel CVE team recommends that you update to the latest |
| stable kernel version for this, and many other bugfixes. Individual |
| changes are never tested alone, but rather are part of a larger kernel |
| release. Cherry-picking individual commits is not recommended or |
| supported by the Linux kernel community at all. If, however, updating to |
| the latest release is impossible, the individual changes to resolve this |
| issue can be found at these commits: |
| https://git.kernel.org/stable/c/061e41581606000a83ce0f0f01d6ad338f3704e9 |
| https://git.kernel.org/stable/c/478574370bef7951fbd9ef5155537d6cbed49472 |