| From bippy-5f407fcff5a0 Mon Sep 17 00:00:00 2001 |
| From: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| To: <linux-cve-announce@vger.kernel.org> |
| Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org> |
| Subject: CVE-2022-48903: btrfs: fix relocation crash due to premature return from btrfs_commit_transaction() |
| |
| Description |
| =========== |
| |
| In the Linux kernel, the following vulnerability has been resolved: |
| |
| btrfs: fix relocation crash due to premature return from btrfs_commit_transaction() |
| |
| We are seeing crashes similar to the following trace: |
| |
| [38.969182] WARNING: CPU: 20 PID: 2105 at fs/btrfs/relocation.c:4070 btrfs_relocate_block_group+0x2dc/0x340 [btrfs] |
| [38.973556] CPU: 20 PID: 2105 Comm: btrfs Not tainted 5.17.0-rc4 #54 |
| [38.974580] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 |
| [38.976539] RIP: 0010:btrfs_relocate_block_group+0x2dc/0x340 [btrfs] |
| [38.980336] RSP: 0000:ffffb0dd42e03c20 EFLAGS: 00010206 |
| [38.981218] RAX: ffff96cfc4ede800 RBX: ffff96cfc3ce0000 RCX: 000000000002ca14 |
| [38.982560] RDX: 0000000000000000 RSI: 4cfd109a0bcb5d7f RDI: ffff96cfc3ce0360 |
| [38.983619] RBP: ffff96cfc309c000 R08: 0000000000000000 R09: 0000000000000000 |
| [38.984678] R10: ffff96cec0000001 R11: ffffe84c80000000 R12: ffff96cfc4ede800 |
| [38.985735] R13: 0000000000000000 R14: 0000000000000000 R15: ffff96cfc3ce0360 |
| [38.987146] FS: 00007f11c15218c0(0000) GS:ffff96d6dfb00000(0000) knlGS:0000000000000000 |
| [38.988662] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 |
| [38.989398] CR2: 00007ffc922c8e60 CR3: 00000001147a6001 CR4: 0000000000370ee0 |
| [38.990279] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 |
| [38.991219] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 |
| [38.992528] Call Trace: |
| [38.992854] <TASK> |
| [38.993148] btrfs_relocate_chunk+0x27/0xe0 [btrfs] |
| [38.993941] btrfs_balance+0x78e/0xea0 [btrfs] |
| [38.994801] ? vsnprintf+0x33c/0x520 |
| [38.995368] ? __kmalloc_track_caller+0x351/0x440 |
| [38.996198] btrfs_ioctl_balance+0x2b9/0x3a0 [btrfs] |
| [38.997084] btrfs_ioctl+0x11b0/0x2da0 [btrfs] |
| [38.997867] ? mod_objcg_state+0xee/0x340 |
| [38.998552] ? seq_release+0x24/0x30 |
| [38.999184] ? proc_nr_files+0x30/0x30 |
| [38.999654] ? call_rcu+0xc8/0x2f0 |
| [39.000228] ? __x64_sys_ioctl+0x84/0xc0 |
| [39.000872] ? btrfs_ioctl_get_supported_features+0x30/0x30 [btrfs] |
| [39.001973] __x64_sys_ioctl+0x84/0xc0 |
| [39.002566] do_syscall_64+0x3a/0x80 |
| [39.003011] entry_SYSCALL_64_after_hwframe+0x44/0xae |
| [39.003735] RIP: 0033:0x7f11c166959b |
| [39.007324] RSP: 002b:00007fff2543e998 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 |
| [39.008521] RAX: ffffffffffffffda RBX: 00007f11c1521698 RCX: 00007f11c166959b |
| [39.009833] RDX: 00007fff2543ea40 RSI: 00000000c4009420 RDI: 0000000000000003 |
| [39.011270] RBP: 0000000000000003 R08: 0000000000000013 R09: 00007f11c16f94e0 |
| [39.012581] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff25440df3 |
| [39.014046] R13: 0000000000000000 R14: 00007fff2543ea40 R15: 0000000000000001 |
| [39.015040] </TASK> |
| [39.015418] ---[ end trace 0000000000000000 ]--- |
| [43.131559] ------------[ cut here ]------------ |
| [43.132234] kernel BUG at fs/btrfs/extent-tree.c:2717! |
| [43.133031] invalid opcode: 0000 [#1] PREEMPT SMP PTI |
| [43.133702] CPU: 1 PID: 1839 Comm: btrfs Tainted: G W 5.17.0-rc4 #54 |
| [43.134863] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 |
| [43.136426] RIP: 0010:unpin_extent_range+0x37a/0x4f0 [btrfs] |
| [43.139913] RSP: 0000:ffffb0dd4216bc70 EFLAGS: 00010246 |
| [43.140629] RAX: 0000000000000000 RBX: ffff96cfc34490f8 RCX: 0000000000000001 |
| [43.141604] RDX: 0000000080000001 RSI: 0000000051d00000 RDI: 00000000ffffffff |
| [43.142645] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff96cfd07dca50 |
| [43.143669] R10: ffff96cfc46e8a00 R11: fffffffffffec000 R12: 0000000041d00000 |
| [43.144657] R13: ffff96cfc3ce0000 R14: ffffb0dd4216bd08 R15: 0000000000000000 |
| [43.145686] FS: 00007f7657dd68c0(0000) GS:ffff96d6df640000(0000) knlGS:0000000000000000 |
| [43.146808] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 |
| [43.147584] CR2: 00007f7fe81bf5b0 CR3: 00000001093ee004 CR4: 0000000000370ee0 |
| [43.148589] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 |
| [43.149581] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 |
| [43.150559] Call Trace: |
| [43.150904] <TASK> |
| [43.151253] btrfs_finish_extent_commit+0x88/0x290 [btrfs] |
| [43.152127] btrfs_commit_transaction+0x74f/0xaa0 [btrfs] |
| [43.152932] ? btrfs_attach_transaction_barrier+0x1e/0x50 [btrfs] |
| [43.153786] btrfs_ioctl+0x1edc/0x2da0 [btrfs] |
| [43.154475] ? __check_object_size+0x150/0x170 |
| [43.155170] ? preempt_count_add+0x49/0xa0 |
| [43.155753] ? __x64_sys_ioctl+0x84/0xc0 |
| [43.156437] ? btrfs_ioctl_get_supported_features+0x30/0x30 [btrfs] |
| [43.157456] __x64_sys_ioctl+0x84/0xc0 |
| [43.157980] do_syscall_64+0x3a/0x80 |
| [43.158543] entry_SYSCALL_64_after_hwframe+0x44/0xae |
| [43.159231] RIP: 0033:0x7f7657f1e59b |
| [43.161819] RSP: 002b:00007ffda5cd1658 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 |
| [43.162702] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f7657f1e59b |
| [43.163526] RDX: 0000000000000000 RSI: 0000000000009408 RDI: 0000000000000003 |
| [43.164358] RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000 |
| [43.165208] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 |
| [43.166029] R13: 00005621b91c3232 R14: 00005621b91ba580 R15: 00007ffda5cd1800 |
| [43.166863] </TASK> |
| [43.167125] Modules linked in: btrfs blake2b_generic xor pata_acpi ata_piix libata raid6_pq scsi_mod libcrc32c virtio_net virtio_rng net_failover rng_core failover scsi_common |
| [43.169552] ---[ end trace 0000000000000000 ]--- |
| [43.171226] RIP: 0010:unpin_extent_range+0x37a/0x4f0 [btrfs] |
| [43.174767] RSP: 0000:ffffb0dd4216bc70 EFLAGS: 00010246 |
| [43.175600] RAX: 0000000000000000 RBX: ffff96cfc34490f8 RCX: 0000000000000001 |
| [43.176468] RDX: 0000000080000001 RSI: 0000000051d00000 RDI: 00000000ffffffff |
| [43.177357] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff96cfd07dca50 |
| [43.178271] R10: ffff96cfc46e8a00 R11: fffffffffffec000 R12: 0000000041d00000 |
| [43.179178] R13: ffff96cfc3ce0000 R14: ffffb0dd4216bd08 R15: 0000000000000000 |
| [43.180071] FS: 00007f7657dd68c0(0000) GS:ffff96d6df800000(0000) knlGS:0000000000000000 |
| [43.181073] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 |
| [43.181808] CR2: 00007fe09905f010 CR3: 00000001093ee004 CR4: 0000000000370ee0 |
| [43.182706] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 |
| [43.183591] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 |
| |
| We first hit the WARN_ON(rc->block_group->pinned > 0) in |
| btrfs_relocate_block_group() and then the BUG_ON(!cache) in |
| unpin_extent_range(). This tells us that we are exiting relocation and |
| removing the block group with bytes still pinned for that block group. |
| This is supposed to be impossible: the last thing relocate_block_group() |
| does is commit the transaction to get rid of pinned extents. |
| |
| Commit d0c2f4fa555e ("btrfs: make concurrent fsyncs wait less when |
| waiting for a transaction commit") introduced an optimization so that |
| commits from fsync don't have to wait for the previous commit to unpin |
| extents. This was only intended to affect fsync, but it inadvertently |
| made it possible for any commit to skip waiting for the previous commit |
| to unpin. This is because if a call to btrfs_commit_transaction() finds |
| that another thread is already committing the transaction, it waits for |
| the other thread to complete the commit and then returns. If that other |
| thread was in fsync, then it completes the commit without completing the |
| previous commit. This makes the following sequence of events possible: |
| |
| Thread 1____________________|Thread 2 (fsync)_____________________|Thread 3 (balance)___________________ |
| btrfs_commit_transaction(N) | | |
| btrfs_run_delayed_refs | | |
| pin extents | | |
| ... | | |
| state = UNBLOCKED |btrfs_sync_file | |
| | btrfs_start_transaction(N + 1) |relocate_block_group |
| | | btrfs_join_transaction(N + 1) |
| | btrfs_commit_transaction(N + 1) | |
| ... | trans->state = COMMIT_START | |
| | | btrfs_commit_transaction(N + 1) |
| | | wait_for_commit(N + 1, COMPLETED) |
| | wait_for_commit(N, SUPER_COMMITTED)| |
| state = SUPER_COMMITTED | ... | |
| btrfs_finish_extent_commit| | |
| unpin_extent_range() | trans->state = COMPLETED | |
| | | return |
| | | |
| ... | |Thread 1 isn't done, so pinned > 0 |
| | |and we WARN |
| | | |
| | |btrfs_remove_block_group |
| unpin_extent_range() | | |
| Thread 3 removed the | | |
| block group, so we BUG| | |
| |
| There are other sequences involving SUPER_COMMITTED transactions that |
| can cause a similar outcome. |
| |
| We could fix this by making relocation explicitly wait for unpinning, |
| but there may be other cases that need it. Josef mentioned ENOSPC |
| flushing and the free space cache inode as other potential victims. |
| Rather than playing whack-a-mole, this fix is conservative and makes all |
| commits not in fsync wait for all previous transactions, which is what |
| the optimization intended. |
| |
| The Linux kernel CVE team has assigned CVE-2022-48903 to this issue. |
| |
| |
| Affected and fixed versions |
| =========================== |
| |
| Issue introduced in 5.12 with commit d0c2f4fa555e70324ec2a129b822ab58f172cc62 and fixed in 5.15.27 with commit 725a6ac389b182261af174176e561a36b0f39ffc |
| Issue introduced in 5.12 with commit d0c2f4fa555e70324ec2a129b822ab58f172cc62 and fixed in 5.16.13 with commit a4378947ae39f08c6ae4c6a87ccdebc981a7bbcb |
| Issue introduced in 5.12 with commit d0c2f4fa555e70324ec2a129b822ab58f172cc62 and fixed in 5.17 with commit 5fd76bf31ccfecc06e2e6b29f8c809e934085b99 |
| |
| Please see https://www.kernel.org for a full list of currently supported |
| kernel versions by the kernel community. |
| |
| Unaffected versions might change over time as fixes are backported to |
| older supported kernel versions. The official CVE entry at |
| https://cve.org/CVERecord/?id=CVE-2022-48903 |
| will be updated if fixes are backported, please check that for the most |
| up to date information about this issue. |
| |
| |
| Affected files |
| ============== |
| |
| The file(s) affected by this issue are: |
| fs/btrfs/transaction.c |
| |
| |
| Mitigation |
| ========== |
| |
| The Linux kernel CVE team recommends that you update to the latest |
| stable kernel version for this, and many other bugfixes. Individual |
| changes are never tested alone, but rather are part of a larger kernel |
| release. Cherry-picking individual commits is not recommended or |
| supported by the Linux kernel community at all. If however, updating to |
| the latest release is impossible, the individual changes to resolve this |
| issue can be found at these commits: |
| https://git.kernel.org/stable/c/725a6ac389b182261af174176e561a36b0f39ffc |
| https://git.kernel.org/stable/c/a4378947ae39f08c6ae4c6a87ccdebc981a7bbcb |
| https://git.kernel.org/stable/c/5fd76bf31ccfecc06e2e6b29f8c809e934085b99 |