| From b53cf7796169f16bc4770c597cd8d88a59a63a7e Mon Sep 17 00:00:00 2001 |
| From: Qu Wenruo <wqu@suse.com> |
| Date: Tue, 16 Jun 2020 10:17:34 +0800 |
| Subject: [PATCH] btrfs: don't allocate anonymous block device for user |
| invisible roots |
| |
| commit 851fd730a743e072badaf67caf39883e32439431 upstream. |
| |
| [BUG] |
| When a lot of subvolumes are created, there is a user report about |
| transaction aborted: |
| |
| BTRFS: Transaction aborted (error -24) |
| WARNING: CPU: 17 PID: 17041 at fs/btrfs/transaction.c:1576 create_pending_snapshot+0xbc4/0xd10 [btrfs] |
| RIP: 0010:create_pending_snapshot+0xbc4/0xd10 [btrfs] |
| Call Trace: |
| create_pending_snapshots+0x82/0xa0 [btrfs] |
| btrfs_commit_transaction+0x275/0x8c0 [btrfs] |
| btrfs_mksubvol+0x4b9/0x500 [btrfs] |
| btrfs_ioctl_snap_create_transid+0x174/0x180 [btrfs] |
| btrfs_ioctl_snap_create_v2+0x11c/0x180 [btrfs] |
| btrfs_ioctl+0x11a4/0x2da0 [btrfs] |
| do_vfs_ioctl+0xa9/0x640 |
| ksys_ioctl+0x67/0x90 |
| __x64_sys_ioctl+0x1a/0x20 |
| do_syscall_64+0x5a/0x110 |
| entry_SYSCALL_64_after_hwframe+0x44/0xa9 |
| ---[ end trace 33f2f83f3d5250e9 ]--- |
| BTRFS: error (device sda1) in create_pending_snapshot:1576: errno=-24 unknown |
| BTRFS info (device sda1): forced readonly |
| BTRFS warning (device sda1): Skipping commit of aborted transaction. |
| BTRFS: error (device sda1) in cleanup_transaction:1831: errno=-24 unknown |
| |
| [CAUSE] |
| The error is EMFILE (Too many files open) and comes from the anonymous |
| block device allocation. The ids are in a shared pool of size 1<<20. |
| |
| The ids are assigned to live subvolumes, ie. the root structure exists |
| in memory (eg. after creation or after the root appears in some path). |
| The pool could be exhausted if the numbers are not reclaimed fast |
| enough, after subvolume deletion or if other system component uses the |
| anon block devices. |
| |
| [WORKAROUND] |
| Since it's not possible to completely solve the problem, we can only |
| minimize the time the id is allocated to a subvolume root. |
| |
| Firstly, we can reduce the use of anon_dev by trees that are not |
| subvolume roots, like data reloc tree. |
| |
| This patch will do extra check on root objectid, to skip roots that |
| don't need anon_dev. Currently it's only data reloc tree and orphan |
| roots. |
| |
| Reported-by: Greed Rong <greedrong@gmail.com> |
| Link: https://lore.kernel.org/linux-btrfs/CA+UqX+NTrZ6boGnWHhSeZmEY5J76CTqmYjO2S+=tHJX7nb9DPw@mail.gmail.com/ |
| CC: stable@vger.kernel.org # 4.4+ |
| Reviewed-by: Josef Bacik <josef@toxicpanda.com> |
| Signed-off-by: Qu Wenruo <wqu@suse.com> |
| Reviewed-by: David Sterba <dsterba@suse.com> |
| Signed-off-by: David Sterba <dsterba@suse.com> |
| Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> |
| |
| diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c |
| index 54d9af9ddb1a..fc5c9b797e31 100644 |
| --- a/fs/btrfs/disk-io.c |
| +++ b/fs/btrfs/disk-io.c |
| @@ -1511,9 +1511,16 @@ int btrfs_init_fs_root(struct btrfs_root *root) |
| spin_lock_init(&root->ino_cache_lock); |
| init_waitqueue_head(&root->ino_cache_wait); |
| |
| - ret = get_anon_bdev(&root->anon_dev); |
| - if (ret) |
| - goto fail; |
| + /* |
| + * Don't assign anonymous block device to roots that are not exposed to |
| + * userspace, the id pool is limited to 1M |
| + */ |
| + if (is_fstree(root->root_key.objectid) && |
| + btrfs_root_refs(&root->root_item) > 0) { |
| + ret = get_anon_bdev(&root->anon_dev); |
| + if (ret) |
| + goto fail; |
| + } |
| |
| mutex_lock(&root->objectid_mutex); |
| ret = btrfs_find_highest_objectid(root, |
| -- |
| 2.27.0 |
| |