| From: Huan Yang <link@vivo.com> |
| Subject: mm/memcg: use kmem_cache when alloc memcg |
| Date: Fri, 25 Apr 2025 11:19:24 +0800 |
| |
| When tracing mem_cgroup_alloc() with kmalloc ftrace, we observe: |
| |
| kmalloc: call_site=mem_cgroup_css_alloc+0xd8/0x5b4 ptr=000000003e4c3799 |
| bytes_req=2312 bytes_alloc=4096 gfp_flags=GFP_KERNEL|__GFP_ZERO node=-1 |
| accounted=false |
| |
| The output shows that although struct mem_cgroup needs 2312 bytes, the |
| slab allocator actually hands out a 4096-byte object. This occurs because: |
| |
| 1. The slab allocator predefines kmalloc bucket sizes from 64B to 8192B |
| 2. The mem_cgroup allocation size (2312B) falls between the 2KB and 4KB |
| buckets |
| 3. The allocator rounds up to the next larger bucket (4KB), wasting |
| 1784 bytes (~1.7KB) per allocation |
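
The rounding described in the list can be reproduced in userspace. This is only a sketch under simplifying assumptions: it models the kmalloc buckets as pure powers of two starting at 64B (real kernels also have 96B and 192B caches, which do not matter for a 2312-byte request), and pow2_bucket() and bucket_waste() are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the smallest power-of-two bucket, starting
 * at 64 bytes, that can hold a request of 'size' bytes.
 */
static size_t pow2_bucket(size_t size)
{
	size_t bucket = 64;

	assert(size > 0);
	while (bucket < size)
		bucket <<= 1;
	return bucket;
}

/* Bytes lost to rounding when 'size' is served from its bucket. */
static size_t bucket_waste(size_t size)
{
	return pow2_bucket(size) - size;
}
```

With size = 2312, pow2_bucket() yields 4096, matching bytes_alloc in the trace, and bucket_waste() yields 4096 - 2312 = 1784 bytes.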
| |
| This patch introduces a dedicated kmem_cache for mem_cgroup structs, |
| achieving precise memory allocation. Post-patch ftrace verification shows: |
| |
| kmem_cache_alloc: call_site=mem_cgroup_css_alloc+0xbc/0x5d4 |
| ptr=00000000695c1806 bytes_req=2312 bytes_alloc=2368 |
| gfp_flags=GFP_KERNEL|__GFP_ZERO node=-1 accounted=false |
| |
| Each memcg allocation now takes 2368 bytes (including hardware cacheline |
| alignment) instead of 4096, saving 1728 bytes per memcg. |
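
The 2368-byte figure is consistent with rounding 2312 up to a 64-byte boundary. The sketch below assumes a 64-byte cache line (inferred from the trace, since 2368 = 37 * 64) and uses a hypothetical align_up() helper; it illustrates the arithmetic only and is not how the slab allocator computes object size internally.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: round 'size' up to the next multiple of 'align'.
 * 'align' must be a non-zero power of two.
 */
static size_t align_up(size_t size, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (size + align - 1) & ~(align - 1);
}
```

Here align_up(2312, 64) gives 2368, so a dedicated cache spends 56 bytes on padding per object instead of the 1784 bytes lost in the 4KB kmalloc bucket.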
| |
| Link: https://lkml.kernel.org/r/20250425031935.76411-3-link@vivo.com |
| Signed-off-by: Huan Yang <link@vivo.com> |
| Acked-by: Shakeel Butt <shakeel.butt@linux.dev> |
| Acked-by: Johannes Weiner <hannes@cmpxchg.org> |
| Cc: Francesco Valla <francesco@valla.it> |
| Cc: guoweikang <guoweikang.kernel@gmail.com> |
| Cc: Huang Shijie <shijie@os.amperecomputing.com> |
| Cc: KP Singh <kpsingh@kernel.org> |
| Cc: Michal Hocko <mhocko@kernel.org> |
| Cc: Muchun Song <muchun.song@linux.dev> |
| Cc: "Paul E . McKenney" <paulmck@kernel.org> |
| Cc: Petr Mladek <pmladek@suse.com> |
| Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> |
| Cc: Raul E Rangel <rrangel@chromium.org> |
| Cc: Roman Gushchin <roman.gushchin@linux.dev> |
| Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com> |
| Cc: Vlastimil Babka <vbabka@suse.cz> |
| Cc: Matthew Wilcox <willy@infradead.org> |
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
| --- |
| |
| mm/memcontrol.c | 9 ++++++++- |
| 1 file changed, 8 insertions(+), 1 deletion(-) |
| |
| --- a/mm/memcontrol.c~mm-memcg-use-kmem_cache-when-alloc-memcg |
| +++ a/mm/memcontrol.c |
| @@ -96,6 +96,8 @@ static bool cgroup_memory_nokmem __ro_af |
| /* BPF memory accounting disabled? */ |
| static bool cgroup_memory_nobpf __ro_after_init; |
| |
| +static struct kmem_cache *memcg_cachep; |
| + |
| #ifdef CONFIG_CGROUP_WRITEBACK |
| static DECLARE_WAIT_QUEUE_HEAD(memcg_cgwb_frn_waitq); |
| #endif |
| @@ -3665,7 +3667,7 @@ static struct mem_cgroup *mem_cgroup_all |
| int __maybe_unused i; |
| long error; |
| |
| - memcg = kzalloc(struct_size(memcg, nodeinfo, nr_node_ids), GFP_KERNEL); |
| + memcg = kmem_cache_zalloc(memcg_cachep, GFP_KERNEL); |
| if (!memcg) |
| return ERR_PTR(-ENOMEM); |
| |
| @@ -5051,6 +5053,7 @@ __setup("cgroup.memory=", cgroup_memory) |
| */ |
| int __init mem_cgroup_init(void) |
| { |
| + unsigned int memcg_size; |
| int cpu; |
| |
| /* |
| @@ -5068,6 +5071,10 @@ int __init mem_cgroup_init(void) |
| INIT_WORK(&per_cpu_ptr(&memcg_stock, cpu)->work, |
| drain_local_stock); |
| |
| + memcg_size = struct_size_t(struct mem_cgroup, nodeinfo, nr_node_ids); |
| + memcg_cachep = kmem_cache_create("mem_cgroup", memcg_size, 0, |
| + SLAB_PANIC | SLAB_HWCACHE_ALIGN, NULL); |
| + |
| return 0; |
| } |
| |
| _ |