| From bippy-5f407fcff5a0 Mon Sep 17 00:00:00 2001 |
| From: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| To: <linux-cve-announce@vger.kernel.org> |
| Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org> |
| Subject: CVE-2021-47303: bpf: Track subprog poke descriptors correctly and fix use-after-free |
| |
| Description |
| =========== |
| |
| In the Linux kernel, the following vulnerability has been resolved: |
| |
| bpf: Track subprog poke descriptors correctly and fix use-after-free |
| |
| Subprograms are calling map_poke_track(), but on program release there is no |
| hook to call map_poke_untrack(). However, on program release, the aux memory |
| (and poke descriptor table) is freed even though we still have a reference to |
| it in the element list of the map aux data. When we run map_poke_run(), we then |
| end up accessing free'd memory, triggering KASAN in prog_array_map_poke_run(): |
| |
| [...] |
| [ 402.824689] BUG: KASAN: use-after-free in prog_array_map_poke_run+0xc2/0x34e |
| [ 402.824698] Read of size 4 at addr ffff8881905a7940 by task hubble-fgs/4337 |
| [ 402.824705] CPU: 1 PID: 4337 Comm: hubble-fgs Tainted: G I 5.12.0+ #399 |
| [ 402.824715] Call Trace: |
| [ 402.824719] dump_stack+0x93/0xc2 |
| [ 402.824727] print_address_description.constprop.0+0x1a/0x140 |
| [ 402.824736] ? prog_array_map_poke_run+0xc2/0x34e |
| [ 402.824740] ? prog_array_map_poke_run+0xc2/0x34e |
| [ 402.824744] kasan_report.cold+0x7c/0xd8 |
| [ 402.824752] ? prog_array_map_poke_run+0xc2/0x34e |
| [ 402.824757] prog_array_map_poke_run+0xc2/0x34e |
| [ 402.824765] bpf_fd_array_map_update_elem+0x124/0x1a0 |
| [...] |
| |
| The elements concerned are walked as follows: |
| |
| for (i = 0; i < elem->aux->size_poke_tab; i++) { |
| poke = &elem->aux->poke_tab[i]; |
| [...] |
| |
| The access to size_poke_tab is a 4 byte read, verified by checking offsets |
| in the KASAN dump: |
| |
| [ 402.825004] The buggy address belongs to the object at ffff8881905a7800 |
| which belongs to the cache kmalloc-1k of size 1024 |
| [ 402.825008] The buggy address is located 320 bytes inside of |
| 1024-byte region [ffff8881905a7800, ffff8881905a7c00) |
| |
| The pahole output of bpf_prog_aux: |
| |
| struct bpf_prog_aux { |
| [...] |
| /* --- cacheline 5 boundary (320 bytes) --- */ |
| u32 size_poke_tab; /* 320 4 */ |
| [...] |
| |
| In general, subprograms do not necessarily manage their own data structures. |
| For example, BTF func_info and linfo are just pointers to the main program |
| structure. This allows reference counting and cleanup to be done on the latter |
| which simplifies their management a bit. The aux->poke_tab struct, however, |
| did not follow this logic. The initial proposed fix for this use-after-free |
| bug further embedded poke data tracking into the subprogram with proper |
| reference counting. However, Daniel and Alexei questioned why we were treating |
| these objects special; I agree, its unnecessary. The fix here removes the per |
| subprogram poke table allocation and map tracking and instead simply points |
| the aux->poke_tab pointer at the main programs poke table. This way, map |
| tracking is simplified to the main program and we do not need to manage them |
| per subprogram. |
| |
| This also means, bpf_prog_free_deferred(), which unwinds the program reference |
| counting and kfrees objects, needs to ensure that we don't try to double free |
| the poke_tab when free'ing the subprog structures. This is easily solved by |
| NULL'ing the poke_tab pointer. The second detail is to ensure that per |
| subprogram JIT logic only does fixups on poke_tab[] entries it owns. To do |
| this, we add a pointer in the poke structure to point at the subprogram value |
| so JITs can easily check while walking the poke_tab structure if the current |
| entry belongs to the current program. The aux pointer is stable and therefore |
| suitable for such comparison. On the jit_subprogs() error path, we omit |
| cleaning up the poke->aux field because these are only ever referenced from |
| the JIT side, but on error we will never make it to the JIT, so its fine to |
| leave them dangling. Removing these pointers would complicate the error path |
| for no reason. However, we do need to untrack all poke descriptors from the |
| main program as otherwise they could race with the freeing of JIT memory from |
| the subprograms. Lastly, a748c6975dea3 ("bpf: propagate poke descriptors to |
| subprograms") had an off-by-one on the subprogram instruction index range |
| check as it was testing 'insn_idx >= subprog_start && insn_idx <= subprog_end'. |
| However, subprog_end is the next subprogram's start instruction. |
| |
| The Linux kernel CVE team has assigned CVE-2021-47303 to this issue. |
| |
| |
| Affected and fixed versions |
| =========================== |
| |
| Issue introduced in 5.10 with commit a748c6975dea325da540610c2ba9b5f332c603e6 and fixed in 5.10.53 with commit a9f36bf3613c65cb587c70fac655c775d911409b |
| Issue introduced in 5.10 with commit a748c6975dea325da540610c2ba9b5f332c603e6 and fixed in 5.13.5 with commit 599148d40366bd5d1d504a3a8fcd65e21107e500 |
| Issue introduced in 5.10 with commit a748c6975dea325da540610c2ba9b5f332c603e6 and fixed in 5.14 with commit f263a81451c12da5a342d90572e317e611846f2c |
| |
| Please see https://www.kernel.org for a full list of currently supported |
| kernel versions by the kernel community. |
| |
| Unaffected versions might change over time as fixes are backported to |
| older supported kernel versions. The official CVE entry at |
| https://cve.org/CVERecord/?id=CVE-2021-47303 |
| will be updated if fixes are backported, please check that for the most |
| up to date information about this issue. |
| |
| |
| Affected files |
| ============== |
| |
| The file(s) affected by this issue are: |
| arch/x86/net/bpf_jit_comp.c |
| include/linux/bpf.h |
| kernel/bpf/core.c |
| kernel/bpf/verifier.c |
| |
| |
| Mitigation |
| ========== |
| |
| The Linux kernel CVE team recommends that you update to the latest |
| stable kernel version for this, and many other bugfixes. Individual |
| changes are never tested alone, but rather are part of a larger kernel |
| release. Cherry-picking individual commits is not recommended or |
| supported by the Linux kernel community at all. If however, updating to |
| the latest release is impossible, the individual changes to resolve this |
| issue can be found at these commits: |
| https://git.kernel.org/stable/c/a9f36bf3613c65cb587c70fac655c775d911409b |
| https://git.kernel.org/stable/c/599148d40366bd5d1d504a3a8fcd65e21107e500 |
| https://git.kernel.org/stable/c/f263a81451c12da5a342d90572e317e611846f2c |