| From: Boris Burkov <boris@bur.io> |
| Subject: mm/filemap: add AS_UNCHARGED |
| Date: Mon, 18 Aug 2025 17:36:53 -0700 |
| |
| Patch series "introduce uncharged file mapped folios", v3. |
| |
| I would like to revisit Qu's proposal to not charge btrfs extent_buffer |
| allocations to the user's cgroup. |
| |
| https://lore.kernel.org/linux-mm/b5fef5372ae454a7b6da4f2f75c427aeab6a07d6.1727498749.git.wqu@suse.com/ |
| |
| I believe it is detrimental to randomly account these global pages to the |
| cgroup using them, basically at random. More justification and |
| explanation may be found in the patches themselves. |
| |
| |
| This patch (of 4): |
| |
| Btrfs currently tracks its metadata pages in the page cache, using a fake |
| inode (fs_info->btree_inode) with offsets corresponding to where the |
| metadata is stored in the filesystem's full logical address space. |
| |
| A consequence of this is that when btrfs uses filemap_add_folio(), this |
| usage is charged to the cgroup of whichever task happens to be running at |
| the time. These folios don't belong to any particular user cgroup, so I |
| don't think it makes much sense for them to be charged in that way. Some |
| negative consequences as a result: |
| |
| - A task can be holding some important btrfs locks, then need to lookup |
| some metadata and go into reclaim, extending the duration it holds that |
| lock for, and unfairly pushing its own reclaim pain onto other cgroups. |
| |
| - If that cgroup goes into reclaim, it might reclaim these folios a |
| different non-reclaiming cgroup might need soon. This is naturally |
| offset by LRU reclaim, but still. |
| |
| A very similar proposal to use the root cgroup was previously made by Qu, |
| where he eventually proposed the idea of setting it per address_space. |
| This makes good sense for the btrfs use case, as the uncharged behavior |
| should apply to all use of the address_space, not select allocations. |
| I.e., if someone adds another filemap_add_folio() call using btrfs's |
| btree_inode, we would almost certainly want the uncharged behavior. |
| |
| Link: https://lkml.kernel.org/r/cover.1755562487.git.boris@bur.io |
| Link: https://lore.kernel.org/linux-mm/b5fef5372ae454a7b6da4f2f75c427aeab6a07d6.1727498749.git.wqu@suse.com/ |
| Link: https://lkml.kernel.org/r/43fed53d45910cd4fa7a71d2e92913e53eb28774.1755562487.git.boris@bur.io |
| Signed-off-by: Boris Burkov <boris@bur.io> |
| Suggested-by: Qu Wenruo <wqu@suse.com> |
| Acked-by: Shakeel Butt <shakeel.butt@linux.dev> |
| Tested-by: syzbot@syzkaller.appspotmail.com |
| Cc: Johannes Weiner <hannes@cmpxchg.org> |
| Cc: Matthew Wilcox (Oracle) <willy@infradead.org> |
| Cc: Michal Hocko <mhocko@kernel.org> |
| Cc: Muchun Song <muchun.song@linux.dev> |
| Cc: Roman Gushchin <roman.gushchin@linux.dev> |
| Cc: Chris Mason <clm@fb.com> |
| Cc: David Sterba <dsterba@suse.com> |
| Cc: Josef Bacik <josef@toxicpanda.com> |
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
| --- |
| |
| include/linux/pagemap.h | 1 + |
| mm/filemap.c | 12 ++++++++---- |
| 2 files changed, 9 insertions(+), 4 deletions(-) |
| |
| --- a/include/linux/pagemap.h~mm-filemap-add-as_uncharged |
| +++ a/include/linux/pagemap.h |
| @@ -211,6 +211,7 @@ enum mapping_flags { |
| folio contents */ |
| AS_INACCESSIBLE = 8, /* Do not attempt direct R/W access to the mapping */ |
| AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM = 9, |
| + AS_UNCHARGED = 10, /* Do not charge usage to a cgroup */ |
| /* Bits 16-25 are used for FOLIO_ORDER */ |
| AS_FOLIO_ORDER_BITS = 5, |
| AS_FOLIO_ORDER_MIN = 16, |
| --- a/mm/filemap.c~mm-filemap-add-as_uncharged |
| +++ a/mm/filemap.c |
| @@ -960,15 +960,19 @@ int filemap_add_folio(struct address_spa |
| { |
| void *shadow = NULL; |
| int ret; |
| + bool charge_mem_cgroup = !test_bit(AS_UNCHARGED, &mapping->flags); |
| |
| - ret = mem_cgroup_charge(folio, NULL, gfp); |
| - if (ret) |
| - return ret; |
| + if (charge_mem_cgroup) { |
| + ret = mem_cgroup_charge(folio, NULL, gfp); |
| + if (ret) |
| + return ret; |
| + } |
| |
| __folio_set_locked(folio); |
| ret = __filemap_add_folio(mapping, folio, index, gfp, &shadow); |
| if (unlikely(ret)) { |
| - mem_cgroup_uncharge(folio); |
| + if (charge_mem_cgroup) |
| + mem_cgroup_uncharge(folio); |
| __folio_clear_locked(folio); |
| } else { |
| /* |
| _ |