Merge branch 'bpf-flow-dissector'

Petar Penkov says:

====================
This patch series hardens the RX stack by allowing flow dissection in BPF,
as previously discussed [1]. Because of the rigorous checks of the BPF
verifier, this provides significant security guarantees. In particular, the
BPF flow dissector cannot get stuck in an infinite loop, as happened with
CVE-2013-4348, because BPF programs are guaranteed to terminate. It cannot
read outside of packet bounds, because all memory accesses are checked.
Also, with BPF the administrator can decide which protocols to support,
reducing the potential attack surface. Rarely encountered protocols can be
excluded from dissection, and the program can be updated without a kernel
recompile or reboot if a bug is discovered.

Patch 1 adds the infrastructure to execute a BPF program in
__skb_flow_dissect. This includes a new BPF program type and attach type.

Patch 2 adds the new BPF flow dissector definitions to tools/uapi.
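
For reference, the UAPI struct that dissector programs fill in looks roughly
like the following. This is a sketch based on this series; the exact field
order and widths are whatever patches 1 and 2 define:

	struct bpf_flow_keys {
		__u16	nhoff;		/* initial header offset, set by the kernel */
		__u16	thoff;		/* transport header offset, set by the program */
		__u16	addr_proto;	/* ETH_P_* of the valid address fields */
		__u8	is_frag;
		__u8	is_first_frag;
		__u8	is_encap;
		__u8	ip_proto;
		__be16	n_proto;
		__be16	sport;
		__be16	dport;
		union {
			struct {
				__be32	ipv4_src;
				__be32	ipv4_dst;
			};
			struct {
				__u32	ipv6_src[4];	/* in6_addr, network order */
				__u32	ipv6_dst[4];	/* in6_addr, network order */
			};
		};
	};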

Patch 3 adds support for the new BPF program type to libbpf and bpftool.
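
To illustrate the userspace side, loading and attaching a dissector with
libbpf could look roughly like this. This is a sketch only: the object file
name, includes and error handling are illustrative, and it assumes libbpf's
bpf_prog_load()/bpf_prog_attach() together with the new program and attach
types from patches 1 and 2:

	#include <stdio.h>
	#include <bpf/bpf.h>
	#include <bpf/libbpf.h>

	int attach_flow_dissector(void)
	{
		struct bpf_object *obj;
		int prog_fd, err;

		err = bpf_prog_load("bpf_flow.o", BPF_PROG_TYPE_FLOW_DISSECTOR,
				    &obj, &prog_fd);
		if (err) {
			fprintf(stderr, "failed to load bpf_flow.o\n");
			return err;
		}

		/* target_fd is not meaningful for this hook in this sketch */
		err = bpf_prog_attach(prog_fd, 0, BPF_FLOW_DISSECTOR, 0);
		if (err)
			fprintf(stderr, "failed to attach flow dissector\n");
		return err;
	}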

Patch 4 adds a flow dissector program written in BPF. It parses most of the
protocols handled by __skb_flow_dissect, for a subset of flow keys (basic,
control, ports, and address types).
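
A much smaller program of the same shape, handling only un-encapsulated
IPv4/UDP, could look roughly like this. It is a sketch, not the patch 4
program: the flow_keys access, BPF_OK/BPF_DROP return codes and section name
follow this series, while the headers, helpers and program name are
illustrative:

	/* Minimal flow dissector sketch: IPv4 without options, UDP only,
	 * no fragments, no encapsulation.
	 */
	#include <linux/bpf.h>
	#include <linux/if_ether.h>
	#include <linux/in.h>
	#include <linux/ip.h>
	#include <linux/udp.h>
	#include "bpf_helpers.h"
	#include "bpf_endian.h"

	SEC("flow_dissector")
	int dissect(struct __sk_buff *skb)
	{
		struct bpf_flow_keys *keys = skb->flow_keys;
		void *data_end = (void *)(long)skb->data_end;
		void *data = (void *)(long)skb->data;
		struct iphdr *iph;
		struct udphdr *udph;

		if (skb->protocol != bpf_htons(ETH_P_IP))
			return BPF_DROP;

		iph = data + keys->nhoff;
		if ((void *)(iph + 1) > data_end || iph->ihl != 5 ||
		    iph->protocol != IPPROTO_UDP)
			return BPF_DROP;

		keys->addr_proto = ETH_P_IP;
		keys->n_proto = bpf_htons(ETH_P_IP);
		keys->ip_proto = IPPROTO_UDP;
		keys->ipv4_src = iph->saddr;
		keys->ipv4_dst = iph->daddr;
		keys->thoff = keys->nhoff + sizeof(*iph);

		udph = data + keys->thoff;
		if ((void *)(udph + 1) > data_end)
			return BPF_DROP;

		keys->sport = udph->source;
		keys->dport = udph->dest;
		return BPF_OK;
	}

	char _license[] SEC("license") = "GPL";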

Patch 5 adds a selftest that attaches the BPF program to the flow dissector
and sends traffic with different levels of encapsulation.

Performance Evaluation:
The in-kernel implementation was compared against the demo program from
patch 4 using the test in patch 5 with IPv4/UDP traffic over 10 seconds.
	$perf record -a -C 4 taskset -c 4 ./test_flow_dissector -i 4 -f 8 \
		-t 10

In-kernel Dissector:
	__skb_flow_dissect overhead: 2.12%
	Total Packets: 3,272,597 (from output of ./test_flow_dissector)

BPF Dissector:
	__skb_flow_dissect overhead: 1.63%
	Total Packets: 3,232,356 (from output of ./test_flow_dissector)

No-op BPF Dissector:
	__skb_flow_dissect overhead: 1.52%
	Total Packets: 3,330,635 (from output of ./test_flow_dissector)

Changes since v3:
1/ struct bpf_flow_keys reorganized in patches 1 and 2 to remove holes.

Changes since v2:
1/ Changes to tools/include/uapi pulled into a separate patch 2
2/ Changes to tools/lib and tools/bpftool pulled into a separate patch 3
3/ Changed flow_keys in __sk_buff from __u32 to struct bpf_flow_keys *
4/ Added nhoff field in struct bpf_flow_keys to pass initial offset
5/ Saving all of the modified control block, rather than just the qdisc cb
6/ Sample BPF program in patch 4 modified to use the changes above

Changes since v1:
1/ LD_ABS instructions now disallowed for the new BPF prog type
2/ Now checks if skb is NULL in __skb_flow_dissect()
3/ Fixed incorrect accesses in flow_dissector_is_valid_access()
	- writes to the flow_keys field now disallowed
	- reads/writes to tc_classid and data_meta now disallowed
4/ Headers now pulled with bpf_skb_load_bytes if direct packet access
fails (see the sketch below)
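
The fallback in 4/ follows the usual BPF pattern of trying direct packet
access first and pulling the bytes with a helper otherwise. A minimal
sketch, assuming bpf_helpers.h and a caller-supplied scratch buffer (the
get_header() name is illustrative, not part of the series):

	#include <linux/bpf.h>
	#include "bpf_helpers.h"

	/* Return a pointer to 'len' bytes of header at offset 'off':
	 * directly into the packet if the bytes lie within
	 * [data, data_end), otherwise copied into 'buf' via
	 * bpf_skb_load_bytes().
	 */
	static inline void *get_header(struct __sk_buff *skb, __u32 off,
				       void *buf, __u32 len)
	{
		void *data_end = (void *)(long)skb->data_end;
		void *data = (void *)(long)skb->data;

		if (data + off + len <= data_end)
			return data + off;	/* direct access works */

		if (bpf_skb_load_bytes(skb, off, buf, len))
			return NULL;		/* could not pull the header */
		return buf;
	}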

Changes since RFC:
1/ Flow dissector hook changed from global to per-netns
2/ Defined struct bpf_flow_keys to be used in BPF flow dissector
programs instead of exposing the internal flow keys layout. Added a
function to translate from bpf_flow_keys to the internal layout after BPF
dissection is complete. The pointer to this struct is stored in
qdisc_skb_cb rather than inside the 20-byte control block visible to BPF,
which simplifies verification and allows the program access to all 20 bytes
of the cb.
3/ Removed GUE parsing as it relied on a hardcoded port
4/ MPLS parsing now stops at the first label, which is consistent
with the in-kernel flow dissector
5/ Refactored to use direct packet access and to write out to
struct bpf_flow_keys

[1] http://vger.kernel.org/netconf2017_files/rx_hardening_and_udp_gso.pdf
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>