| From bippy-5f407fcff5a0 Mon Sep 17 00:00:00 2001 |
| From: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| To: <linux-cve-announce@vger.kernel.org> |
| Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org> |
| Subject: CVE-2021-47011: mm: memcontrol: slab: fix obtain a reference to a freeing memcg |
| |
| Description |
| =========== |
| |
| In the Linux kernel, the following vulnerability has been resolved: |
| |
| mm: memcontrol: slab: fix obtain a reference to a freeing memcg |
| |
| Patch series "Use obj_cgroup APIs to charge kmem pages", v5. |
| |
| Since Roman's series "The new cgroup slab memory controller" was |
| applied, all slab objects are charged with the new obj_cgroup APIs. The |
| new APIs introduce a struct obj_cgroup to charge slab objects, which |
| prevents long-living objects from pinning the original memory cgroup in |
| memory. But there are still some corner-case objects (e.g. allocations |
| larger than an order-1 page on SLUB) which are not charged with the new |
| APIs. Those objects (including pages allocated directly from the buddy |
| allocator) are charged as kmem pages, which still hold a reference to |
| the memory cgroup. |
| |
| For example, the kernel stack is charged as kmem pages because the size |
| of the kernel stack can be greater than 2 pages (e.g. 16KB on x86_64 or |
| arm64). Suppose we create a thread whose stack is charged to memory |
| cgroup A and then move it from memory cgroup A to memory cgroup B. |
| Because the kernel stack of the thread holds a reference to memory |
| cgroup A, the thread can pin memory cgroup A in memory even after |
| cgroup A is removed. The following script reproduces this scenario; |
| after running it, the system has gained 500 dying cgroups. (This is not |
| a real-world issue, just a script to show that large kmallocs are |
| charged as kmem pages, which can pin the memory cgroup in memory.) |
| |
| #!/bin/bash |
| |
| cat /proc/cgroups | grep memory |
| |
| cd /sys/fs/cgroup/memory |
| echo 1 > memory.move_charge_at_immigrate |
| |
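| for i in {1..500} |
| do |
|     mkdir kmem_test |
|     # Move this shell into kmem_test so the child's kernel stack is charged there. |
|     echo $$ > kmem_test/cgroup.procs |
|     sleep 3600 & |
|     # Move the shell, then the sleeping child, back to the root memory cgroup. |
|     echo $$ > cgroup.procs |
|     echo $(cat kmem_test/cgroup.procs) > cgroup.procs |
|     rmdir kmem_test |
| done |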
| |
| cat /proc/cgroups | grep memory |
| |
| This patchset makes those kmem pages drop their reference to the memory |
| cgroup by using the obj_cgroup APIs. With it applied, the number of |
| dying cgroups no longer increases when the above test script is run. |
| |
| This patch (of 7): |
| |
| rcu_read_lock()/rcu_read_unlock() only guarantee that the memcg will not |
| be freed; they cannot guarantee that css_get() on the memcg (called in |
| refill_stock() when the cached memcg changes) succeeds. |
| |
|   rcu_read_lock() |
|   memcg = obj_cgroup_memcg(old) |
|   __memcg_kmem_uncharge(memcg) |
|       refill_stock(memcg) |
|           if (stock->cached != memcg) |
|               // css_get can change the ref counter from 0 back to 1. |
|               css_get(&memcg->css) |
|   rcu_read_unlock() |
| |
| This fix is very similar to commit: |
| |
| eefbfa7fd678 ("mm: memcg/slab: fix use after free in obj_cgroup_charge") |
| |
| Fix this by taking a reference to the memcg before passing it to |
| __memcg_kmem_uncharge(). |
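| |
| The difference between an unconditional get and a conditional one can be |
| sketched in user space as follows. This is only a model under stated |
| assumptions, not the kernel code: a plain atomic counter stands in for |
| the css reference count, and struct ref, ref_get(), ref_tryget(), |
| ref_put() and uncharge_with_ref() are hypothetical names that loosely |
| mirror css_get(), the kernel's css_tryget(), css_put() and a call site |
| that takes its own reference first. |
| |

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a css reference counter (assumption). */
struct ref {
    atomic_int count;
};

/* Unconditional get, modelling css_get(): it will move 0 back to 1. */
void ref_get(struct ref *r)
{
    atomic_fetch_add(&r->count, 1);
}

/* Conditional get, modelling a tryget: fails once the count is 0. */
bool ref_tryget(struct ref *r)
{
    int old = atomic_load(&r->count);

    while (old > 0) {
        /* On failure, old is refreshed with the current value; retry. */
        if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop one reference. */
void ref_put(struct ref *r)
{
    int prev = atomic_fetch_sub(&r->count, 1);

    assert(prev > 0); /* dropping a reference that was never held is a bug */
}

/*
 * Model of the fixed call site: take a reference before doing the work
 * that may call ref_get() internally, then drop it afterwards.
 * Returns false if the object was already dying and was skipped.
 */
bool uncharge_with_ref(struct ref *r)
{
    if (!ref_tryget(r))
        return false;

    ref_get(r);  /* stands in for the css_get() inside refill_stock() */
    ref_put(r);  /* ...and its matching put when the stock is drained */

    ref_put(r);  /* drop the reference taken above */
    return true;
}
```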
| |
| The Linux kernel CVE team has assigned CVE-2021-47011 to this issue. |
| |
| |
| Affected and fixed versions |
| =========================== |
| |
| Issue introduced in 5.10.11 with commit 26f54dac15640c65ec69867e182de7be708ea389 and fixed in 5.10.37 with commit 31df8bc4d3feca9f9c6b2cd06fd64a111ae1a0e6 |
| Issue introduced in 5.11 with commit 3de7d4f25a7438f09fef4e71ef111f1805cd8e7c and fixed in 5.11.21 with commit 89b1ed358e01e1b0417f5d3b0082359a23355552 |
| Issue introduced in 5.11 with commit 3de7d4f25a7438f09fef4e71ef111f1805cd8e7c and fixed in 5.12.4 with commit c3ae6a3f3ca4f02f6ccddf213c027302586580d0 |
| Issue introduced in 5.11 with commit 3de7d4f25a7438f09fef4e71ef111f1805cd8e7c and fixed in 5.13 with commit 9f38f03ae8d5f57371b71aa6b4275765b65454fd |
| |
| Please see https://www.kernel.org for a full list of kernel versions |
| currently supported by the kernel community. |
| |
| Unaffected versions might change over time as fixes are backported to |
| older supported kernel versions. The official CVE entry at |
| https://cve.org/CVERecord/?id=CVE-2021-47011 |
| will be updated if fixes are backported; please check it for the most |
| up-to-date information about this issue. |
| |
| |
| Affected files |
| ============== |
| |
| The file(s) affected by this issue are: |
| mm/memcontrol.c |
| |
| |
| Mitigation |
| ========== |
| |
| The Linux kernel CVE team recommends that you update to the latest |
| stable kernel version for this, and many other bugfixes. Individual |
| changes are never tested alone, but rather are part of a larger kernel |
| release. Cherry-picking individual commits is not recommended or |
| supported by the Linux kernel community at all. If, however, updating to |
| the latest release is impossible, the individual changes to resolve this |
| issue can be found at these commits: |
| https://git.kernel.org/stable/c/31df8bc4d3feca9f9c6b2cd06fd64a111ae1a0e6 |
| https://git.kernel.org/stable/c/89b1ed358e01e1b0417f5d3b0082359a23355552 |
| https://git.kernel.org/stable/c/c3ae6a3f3ca4f02f6ccddf213c027302586580d0 |
| https://git.kernel.org/stable/c/9f38f03ae8d5f57371b71aa6b4275765b65454fd |