| From: Thomas Gleixner <tglx@linutronix.de> |
| Date: Mon, 24 Feb 2020 15:01:32 +0100 |
| Subject: [PATCH 01/22] bpf: Tighten the requirements for preallocated hash |
| maps |
| |
| The assumption that only programs attached to perf NMI events can deadlock |
| on memory allocators is wrong. Assume the following simplified callchain: |
| |
| kmalloc() from regular non BPF context |
| cache empty |
| freelist empty |
| lock(zone->lock); |
| tracepoint or kprobe |
| BPF() |
| update_elem() |
| lock(bucket) |
| kmalloc() |
| cache empty |
| freelist empty |
| lock(zone->lock); <- DEADLOCK |
| |
| There are other ways which do not involve locking to create wreckage: |
| |
| kmalloc() from regular non BPF context |
| local_irq_save(); |
| ... |
| obj = slab_first(); |
| kprobe() |
| BPF() |
| update_elem() |
| lock(bucket) |
| kmalloc() |
| local_irq_save(); |
| ... |
| obj = slab_first(); <- Same object as above ... |
| |
| So preallocation _must_ be enforced for all variants of intrusive |
| instrumentation. |
| |
| Unfortunately immediate enforcement would break backwards compatibility, so |
| for now such programs still are allowed to run, but a one time warning is |
| emitted in dmesg and the verifier emits a warning in the verifier log as |
| well so developers are made aware about this and can fix their programs |
| before the enforcement becomes mandatory. |
| |
| Signed-off-by: Thomas Gleixner <tglx@linutronix.de> |
| Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> |
| --- |
| kernel/bpf/verifier.c | 39 ++++++++++++++++++++++++++++----------- |
| 1 file changed, 28 insertions(+), 11 deletions(-) |
| |
| --- a/kernel/bpf/verifier.c |
| +++ b/kernel/bpf/verifier.c |
| @@ -8205,26 +8205,43 @@ static bool is_tracing_prog_type(enum bp |
| } |
| } |
| |
| +static bool is_preallocated_map(struct bpf_map *map) |
| +{ |
| + if (!check_map_prealloc(map)) |
| + return false; |
| + if (map->inner_map_meta && !check_map_prealloc(map->inner_map_meta)) |
| + return false; |
| + return true; |
| +} |
| + |
| static int check_map_prog_compatibility(struct bpf_verifier_env *env, |
| struct bpf_map *map, |
| struct bpf_prog *prog) |
| |
| { |
| - /* Make sure that BPF_PROG_TYPE_PERF_EVENT programs only use |
| - * preallocated hash maps, since doing memory allocation |
| - * in overflow_handler can crash depending on where nmi got |
| - * triggered. |
| + /* |
| + * Validate that trace type programs use preallocated hash maps. |
| + * |
| + * For programs attached to PERF events this is mandatory as the |
| + * perf NMI can hit any arbitrary code sequence. |
| + * |
| + * All other trace types using preallocated hash maps are unsafe as |
| + * well because tracepoint or kprobes can be inside locked regions |
| + * of the memory allocator or at a place where a recursion into the |
| + * memory allocator would see inconsistent state. |
| + * |
| + * For now running such programs is allowed for backwards |
| + * compatibility reasons, but warnings are emitted so developers are |
| + * made aware of the unsafety and can fix their programs before this |
| + * is enforced. |
| */ |
| - if (prog->type == BPF_PROG_TYPE_PERF_EVENT) { |
| - if (!check_map_prealloc(map)) { |
| + if (is_tracing_prog_type(prog->type) && !is_preallocated_map(map)) { |
| + if (prog->type == BPF_PROG_TYPE_PERF_EVENT) { |
| verbose(env, "perf_event programs can only use preallocated hash map\n"); |
| return -EINVAL; |
| } |
| - if (map->inner_map_meta && |
| - !check_map_prealloc(map->inner_map_meta)) { |
| - verbose(env, "perf_event programs can only use preallocated inner hash map\n"); |
| - return -EINVAL; |
| - } |
| + WARN_ONCE(1, "trace type BPF program uses run-time allocation\n"); |
| + verbose(env, "trace type programs with run-time allocated hash maps are unsafe. Switch to preallocated hash maps.\n"); |
| } |
| |
| if ((is_tracing_prog_type(prog->type) || |