| From 29a73f112fd2f55a69087a8c31e60128ee2af18c Mon Sep 17 00:00:00 2001 |
| From: Sasha Levin <sashal@kernel.org> |
| Date: Fri, 15 Oct 2021 06:37:39 -0700 |
| Subject: net: stream: don't purge sk_error_queue in sk_stream_kill_queues() |
| |
| From: Jakub Kicinski <kuba@kernel.org> |
| |
| [ Upstream commit 24bcbe1cc69fa52dc4f7b5b2456678ed464724d8 ] |
| |
| sk_stream_kill_queues() can be called on close when there are |
| still outstanding skbs to transmit. Those skbs may try to queue |
| notifications to the error queue (e.g. timestamps). |
| If sk_stream_kill_queues() purges the queue without taking |
| its lock the queue may get corrupted, and skbs leaked. |
| |
| This shows up as a warning about an rmem leak: |
| |
| WARNING: CPU: 24 PID: 0 at net/ipv4/af_inet.c:154 inet_sock_destruct+0x... |
| |
| The leak is always a multiple of 0x300 bytes (the value is in |
| %rax on my builds, so RAX: 0000000000000300). 0x300 is truesize of |
| an empty sk_buff. Indeed if we dump the socket state at the time |
| of the warning the sk_error_queue is often (but not always) |
| corrupted. The ->next pointer points back at the list head, |
| but not the ->prev pointer. Indeed we can find the leaked skb |
| by scanning the kernel memory for something that looks like |
| an skb with ->sk = socket in question, and ->truesize = 0x300. |
| The contents of ->cb[] of the skb confirms the suspicion that |
| it is indeed a timestamp notification (as generated in |
| __skb_complete_tx_timestamp()). |
| |
| Removing purging of sk_error_queue should be okay, since |
| inet_sock_destruct() does it again once all socket refs |
| are gone. Eric suggests this may cause sockets that go |
| thru disconnect() to maintain notifications from the |
| previous incarnations of the socket, but that should be |
| okay since the race was there anyway, and disconnect() |
| is not exactly dependable. |
| |
| Thanks to Jonathan Lemon and Omar Sandoval for help at various |
| stages of tracing the issue. |
| |
| Fixes: cb9eff097831 ("net: new user space API for time stamping of incoming and outgoing packets") |
| Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
| Reviewed-by: Eric Dumazet <edumazet@google.com> |
| Signed-off-by: David S. Miller <davem@davemloft.net> |
| Signed-off-by: Sasha Levin <sashal@kernel.org> |
| --- |
| net/core/stream.c | 3 --- |
| 1 file changed, 3 deletions(-) |
| |
| diff --git a/net/core/stream.c b/net/core/stream.c |
| index 4f1d4aa5fb38d..a166a32b411fa 100644 |
| --- a/net/core/stream.c |
| +++ b/net/core/stream.c |
| @@ -195,9 +195,6 @@ void sk_stream_kill_queues(struct sock *sk) |
| /* First the read buffer. */ |
| __skb_queue_purge(&sk->sk_receive_queue); |
| |
| - /* Next, the error queue. */ |
| - __skb_queue_purge(&sk->sk_error_queue); |
| - |
| /* Next, the write queue. */ |
| WARN_ON(!skb_queue_empty(&sk->sk_write_queue)); |
| |
| -- |
| 2.33.0 |
| |