| From bippy-5f407fcff5a0 Mon Sep 17 00:00:00 2001 |
| From: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| To: <linux-cve-announce@vger.kernel.org> |
| Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org> |
| Subject: CVE-2024-41010: bpf: Fix too early release of tcx_entry |
| |
| Description |
| =========== |
| |
| In the Linux kernel, the following vulnerability has been resolved: |
| |
| bpf: Fix too early release of tcx_entry |
| |
| Pedro Pinto and, later and independently, Hyunwoo Kim and Wongi Lee reported |
| an issue where the tcx_entry can be released too early, leading to a |
| use-after-free (UAF) when an active old-style ingress or clsact qdisc with a |
| shared tc block is later replaced by another ingress or clsact instance. |
| |
| Essentially, the sequence to trigger the UAF (one example) can be as follows: |
| |
| 1. A network namespace is created |
| 2. An ingress qdisc is created. This allocates a tcx_entry, and |
| &tcx_entry->miniq is stored in the qdisc's miniqp->p_miniq. At the |
| same time, a tcf block with index 1 is created. |
| 3. chain0 is attached to the tcf block. chain0 must be connected to |
| the block linked to the ingress qdisc to later reach the function |
| tcf_chain0_head_change_cb_del() which triggers the UAF. |
| 4. Create and graft a clsact qdisc. This causes the ingress qdisc |
| created in step 2 to be removed, thus freeing the previously linked |
| tcx_entry: |
| |
| rtnetlink_rcv_msg() |
| => tc_modify_qdisc() |
| => qdisc_create() |
| => clsact_init() [a] |
| => qdisc_graft() |
| => qdisc_destroy() |
| => __qdisc_destroy() |
| => ingress_destroy() [b] |
| => tcx_entry_free() |
| => kfree_rcu() // tcx_entry freed |
| |
| 5. Finally, the network namespace is closed. This registers the |
| cleanup_net worker, and during the process of releasing the |
| remaining clsact qdisc, it accesses the tcx_entry that was |
| already freed in step 4, causing the UAF to occur: |
| |
| cleanup_net() |
| => ops_exit_list() |
| => default_device_exit_batch() |
| => unregister_netdevice_many() |
| => unregister_netdevice_many_notify() |
| => dev_shutdown() |
| => qdisc_put() |
| => clsact_destroy() [c] |
| => tcf_block_put_ext() |
| => tcf_chain0_head_change_cb_del() |
| => tcf_chain_head_change_item() |
| => clsact_chain_head_change() |
| => mini_qdisc_pair_swap() // UAF |
| |
| There are also other variants. The gist is to add an ingress (or clsact) |
| qdisc with a specific shared block, then replace that qdisc, wait for |
| the tcx_entry kfree_rcu() to be executed, and subsequently access the |
| currently active qdisc's miniq one way or another. |
| |
| The correct fix is to turn the miniq_active boolean into a counter. As |
| can be observed, at step 2 above the counter transitions from 0->1, at |
| step [a] from 1->2 (in order for the miniq object to remain active during |
| the replacement), then in [b] from 2->1 and finally in [c] from 1->0 with |
| the eventual release. The reference counter in general ranges over [0,2] |
| and does not need to be atomic, since all access to the counter is |
| protected by the rtnl mutex. With this in place, the UAF no longer occurs |
| and the tcx_entry is freed at the correct time. |
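| |
| The counter transitions described above can be replayed in a small |
| user-space model. This is only a sketch under stated assumptions, not |
| the kernel patch: struct model_entry, model_miniq_inc(), |
| model_miniq_dec() and model_replay() are hypothetical names, a "freed" |
| flag stands in for kfree_rcu(), and the model tracks nothing but the |
| counter. No locking is modelled, mirroring the statement that the rtnl |
| mutex serializes all access. |
| |

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the kernel's tcx_entry. */
struct model_entry {
	unsigned int miniq_active;	/* counter replacing the old boolean */
	bool freed;			/* set where the kernel would release */
};

/* Qdisc init path (step 2 and [a]): take a reference. */
static void model_miniq_inc(struct model_entry *e)
{
	e->miniq_active++;
}

/* Qdisc destroy path ([b] and [c]): drop a reference, release on last. */
static void model_miniq_dec(struct model_entry *e)
{
	assert(e->miniq_active > 0);	/* guard against underflow */
	if (--e->miniq_active == 0)
		e->freed = true;
}

/*
 * Replay step 2, [a], [b] and [c]. trace[i] receives the counter value
 * after each step. Returns true if the entry was still live after [b]
 * and released only after [c].
 */
static bool model_replay(unsigned int trace[4])
{
	struct model_entry e = { 0, false };
	bool live_after_b;

	model_miniq_inc(&e);		/* step 2: ingress created, 0->1 */
	trace[0] = e.miniq_active;
	model_miniq_inc(&e);		/* [a]: clsact_init(), 1->2 */
	trace[1] = e.miniq_active;
	model_miniq_dec(&e);		/* [b]: ingress_destroy(), 2->1 */
	trace[2] = e.miniq_active;
	live_after_b = !e.freed;
	model_miniq_dec(&e);		/* [c]: clsact_destroy(), 1->0 */
	trace[3] = e.miniq_active;

	return live_after_b && e.freed;
}
```

| With a plain boolean, the destroy in [b] would clear the flag while the |
| replacement qdisc still refers to the entry. With the counter, the |
| entry outlives [b] and is released only in [c]. |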
| |
| The Linux kernel CVE team has assigned CVE-2024-41010 to this issue. |
| |
| |
| Affected and fixed versions |
| =========================== |
| |
| Issue introduced in 6.6 with commit e420bed025071a623d2720a92bc2245c84757ecb and fixed in 6.6.41 with commit 230bb13650b0f186f540500fd5f5f7096a822a2a |
| Issue introduced in 6.6 with commit e420bed025071a623d2720a92bc2245c84757ecb and fixed in 6.9.10 with commit f61ecf1bd5b562ebfd7d430ccb31619857e80857 |
| Issue introduced in 6.6 with commit e420bed025071a623d2720a92bc2245c84757ecb and fixed in 6.10 with commit 1cb6f0bae50441f4b4b32a28315853b279c7404e |
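| |
| As a rough illustration of how the ranges above combine, the following |
| sketch classifies an upstream release number. The helper name |
| cve_2024_41010_affected() is hypothetical. The listing names no fixed |
| release for the 6.7 and 6.8 series, so treating them as affected is an |
| inference, and distribution kernels that backport fixes cannot be |
| judged by version number alone. |
| |

```c
#include <stdbool.h>

/*
 * Hypothetical helper: returns true if upstream release
 * major.minor.patch falls inside the affected ranges listed above
 * (introduced in 6.6; fixed in 6.6.41, 6.9.10 and 6.10).
 */
static bool cve_2024_41010_affected(int major, int minor, int patch)
{
	if (major != 6)
		return false;	/* before 6.6, or after the 6.10 fix */
	if (minor < 6 || minor >= 10)
		return false;	/* not yet introduced, or already fixed */
	if (minor == 6)
		return patch < 41;	/* fixed in 6.6.41 */
	if (minor == 9)
		return patch < 10;	/* fixed in 6.9.10 */
	return true;		/* 6.7.y, 6.8.y: no fixed release listed */
}
```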
| |
| Please see https://www.kernel.org for a full list of kernel versions |
| currently supported by the kernel community. |
| |
| Unaffected versions might change over time as fixes are backported to |
| older supported kernel versions. The official CVE entry at |
| https://cve.org/CVERecord/?id=CVE-2024-41010 |
| will be updated if fixes are backported; please check that for the most |
| up-to-date information about this issue. |
| |
| |
| Affected files |
| ============== |
| |
| The file(s) affected by this issue are: |
| include/net/tcx.h |
| net/sched/sch_ingress.c |
| |
| |
| Mitigation |
| ========== |
| |
| The Linux kernel CVE team recommends that you update to the latest |
| stable kernel version for this, and many other bugfixes. Individual |
| changes are never tested alone, but rather are part of a larger kernel |
| release. Cherry-picking individual commits is not recommended or |
| supported by the Linux kernel community at all. If, however, updating to |
| the latest release is impossible, the individual changes to resolve this |
| issue can be found at these commits: |
| https://git.kernel.org/stable/c/230bb13650b0f186f540500fd5f5f7096a822a2a |
| https://git.kernel.org/stable/c/f61ecf1bd5b562ebfd7d430ccb31619857e80857 |
| https://git.kernel.org/stable/c/1cb6f0bae50441f4b4b32a28315853b279c7404e |