bpf: reduce verifier memory consumption

The verifier got progressively smarter over time and the size of its
internal state grew as well. Time to reduce the memory consumption.

Before:
sizeof(struct bpf_verifier_state) = 6520
After:
sizeof(struct bpf_verifier_state) = 896

This is done by observing that the majority of BPF programs use little to
no stack, whereas the verifier always kept all 512 stack slots ready.
Instead, dynamically reallocate struct bpf_verifier_state when stack
access is detected.
Besides the memory savings, this approach gives a few percent runtime
performance improvement and a small reduction in the number of
processed insns:
                     before  after
bpf_lb-DLB_L3.o        2285   2043
bpf_lb-DLB_L4.o        3723   3570
bpf_lb-DUNKNOWN.o      1110   1109
bpf_lxc-DDROP_ALL.o   27954  27849
bpf_lxc-DUNKNOWN.o    38954  38724
bpf_netdev.o          16943  16131
bpf_overlay.o          7929   7733
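
The grow-on-demand idea described above can be illustrated with a minimal
userspace sketch. It is not the code from this patch: the names
sketch_state, slot_state, grow_stack and SLOT_SIZE are hypothetical, the
per-slot bookkeeping is reduced to a byte array, and realloc() stands in
for whatever allocator the kernel code uses. The only point is that the
state starts with no stack tracking at all and grows when a stack access
is seen.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_STACK 512	/* upper bound on tracked stack, in bytes */
#define SLOT_SIZE 8	/* assumption: tracking grows in 8-byte slots */

/* Hypothetical per-slot bookkeeping; a real verifier tracks more. */
struct slot_state {
	unsigned char type[SLOT_SIZE];
};

/* Hypothetical state: stack tracking is allocated separately. */
struct sketch_state {
	int allocated_stack;		/* bytes of stack currently tracked */
	struct slot_state *stack;	/* NULL until the first stack access */
};

/*
 * Make sure the state tracks at least `size` bytes of stack.
 * The tracked area only grows. Returns 0 on success, -1 on a bad
 * size or allocation failure (the state is left unchanged then).
 */
static int grow_stack(struct sketch_state *st, int size)
{
	int old_n = st->allocated_stack / SLOT_SIZE;
	struct slot_state *p;
	int new_n;

	if (size <= 0 || size > MAX_STACK)
		return -1;

	/* Round up to a whole number of slots. */
	size = (size + SLOT_SIZE - 1) / SLOT_SIZE * SLOT_SIZE;
	if (size <= st->allocated_stack)
		return 0;	/* already large enough */

	new_n = size / SLOT_SIZE;
	p = realloc(st->stack, new_n * sizeof(*p));
	if (!p)
		return -1;

	/* Newly added slots start out zeroed (untracked). */
	memset(p + old_n, 0, (new_n - old_n) * sizeof(*p));
	st->stack = p;
	st->allocated_stack = size;
	return 0;
}
```

A program that never touches the stack never calls grow_stack(), so its
state carries only the fixed-size part. A program that accesses 12 bytes
of stack ends up tracking 16 bytes, not the full 512.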

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
3 files changed