| From 9b2b73fff3f99320a0b7423ddfbf9edabbe69b2e Mon Sep 17 00:00:00 2001 |
| From: Sasha Levin <sashal@kernel.org> |
| Date: Thu, 25 Jun 2020 16:13:18 -0700 |
| Subject: bpf, sockmap: RCU dereferenced psock may be used outside RCU block |
| |
| From: John Fastabend <john.fastabend@gmail.com> |
| |
| [ Upstream commit 8025751d4d55a2f32be6bdf825b6a80c299875f5 ] |
| |
| If an ingress verdict program specifies message sizes greater than |
| skb->len and there is an ENOMEM error due to memory pressure we |
| may call the rcv_msg handler outside the strp_data_ready() caller |
| context. This is because on an ENOMEM error the strparser will |
| retry from a workqueue. The caller currently protects the use of |
| psock by calling the strp_data_ready() inside a rcu_read_lock/unlock |
| block. |
| |
| But, in above workqueue error case the psock is accessed outside |
| the read_lock/unlock block of the caller. So instead of using |
| psock directly we must do a look up against the sk again to |
| ensure the psock is available. |
| |
| There is an an ugly piece here where we must handle |
| the case where we paused the strp and removed the psock. On |
| psock removal we first pause the strparser and then remove |
| the psock. If the strparser is paused while an skb is |
| scheduled on the workqueue the skb will be dropped on the |
| flow and kfree_skb() is called. If the workqueue manages |
| to get called before we pause the strparser but runs the rcvmsg |
| callback after the psock is removed we will hit the unlikely |
| case where we run the sockmap rcvmsg handler but do not have |
| a psock. For now we will follow strparser logic and drop the |
| skb on the floor with skb_kfree(). This is ugly because the |
| data is dropped. To date this has not caused problems in practice |
| because either the application controlling the sockmap is |
| coordinating with the datapath so that skbs are "flushed" |
| before removal or we simply wait for the sock to be closed before |
| removing it. |
| |
| This patch fixes the describe RCU bug and dropping the skb doesn't |
| make things worse. Future patches will improve this by allowing |
| the normal case where skbs are not merged to skip the strparser |
| altogether. In practice many (most?) use cases have no need to |
| merge skbs so its both a code complexity hit as seen above and |
| a performance issue. For example, in the Cilium case we always |
| set the strparser up to return sbks 1:1 without any merging and |
| have avoided above issues. |
| |
| Fixes: e91de6afa81c1 ("bpf: Fix running sk_skb program types with ktls") |
| Signed-off-by: John Fastabend <john.fastabend@gmail.com> |
| Signed-off-by: Alexei Starovoitov <ast@kernel.org> |
| Acked-by: Martin KaFai Lau <kafai@fb.com> |
| Link: https://lore.kernel.org/bpf/159312679888.18340.15248924071966273998.stgit@john-XPS-13-9370 |
| Signed-off-by: Sasha Levin <sashal@kernel.org> |
| --- |
| net/core/skmsg.c | 10 +++++++++- |
| 1 file changed, 9 insertions(+), 1 deletion(-) |
| |
| diff --git a/net/core/skmsg.c b/net/core/skmsg.c |
| index 70ea352e3a3b6..118cf1ace43a6 100644 |
| --- a/net/core/skmsg.c |
| +++ b/net/core/skmsg.c |
| @@ -785,11 +785,18 @@ static void sk_psock_verdict_apply(struct sk_psock *psock, |
| |
| static void sk_psock_strp_read(struct strparser *strp, struct sk_buff *skb) |
| { |
| - struct sk_psock *psock = sk_psock_from_strp(strp); |
| + struct sk_psock *psock; |
| struct bpf_prog *prog; |
| int ret = __SK_DROP; |
| + struct sock *sk; |
| |
| rcu_read_lock(); |
| + sk = strp->sk; |
| + psock = sk_psock(sk); |
| + if (unlikely(!psock)) { |
| + kfree_skb(skb); |
| + goto out; |
| + } |
| prog = READ_ONCE(psock->progs.skb_verdict); |
| if (likely(prog)) { |
| skb_orphan(skb); |
| @@ -798,6 +805,7 @@ static void sk_psock_strp_read(struct strparser *strp, struct sk_buff *skb) |
| ret = sk_psock_map_verd(ret, tcp_skb_bpf_redirect_fetch(skb)); |
| } |
| sk_psock_verdict_apply(psock, skb, ret); |
| +out: |
| rcu_read_unlock(); |
| } |
| |
| -- |
| 2.25.1 |
| |