releases/3.10.23/tcp-don-t-update-snd_nxt-when-a-socket-is-switched-from-repair-mode.patch - pub/scm/linux/kernel/git/stable/stable-queue - Git at Google

 From acbcf8b7cc753939940407a421c98dba9739aa67 Mon Sep 17 00:00:00 2001
 From: Andrey Vagin <avagin@openvz.org>
 Date: Tue, 19 Nov 2013 22:10:06 +0400
 Subject: tcp: don't update snd_nxt, when a socket is switched from repair mode

 From: Andrey Vagin <avagin@openvz.org>

 [ Upstream commit dbde497966804e63a38fdedc1e3815e77097efc2 ]

 snd_nxt must be updated synchronously with sk_send_head.  Otherwise
 tp->packets_out may be updated incorrectly, which may lead to a kernel panic.

 Here is a kernel panic from my host.
 [  103.043194] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
 [  103.044025] IP: [<ffffffff815aaaaf>] tcp_rearm_rto+0xcf/0x150
 ...
 [  146.301158] Call Trace:
 [  146.301158]  [<ffffffff815ab7f0>] tcp_ack+0xcc0/0x12c0

 Before this panic a TCP socket was restored. This socket had sent and
 unsent data in the write queue. Sent data was restored in repair mode,
 then the socket was switched out of repair mode and unsent data was
 restored. After that the socket was switched back into repair mode.

 At that moment the socket's write queue looked like this:
 snd_una    snd_nxt   write_seq
    |_________|________|
              |
 	  sk_send_head

 After the socket was switched out of repair mode a second time, its
 state changed to:

 snd_una          snd_nxt, write_seq
    |_________ ________|
              |
 	  sk_send_head

 This state is inconsistent, because snd_nxt and sk_send_head are not
 synchronized.
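
 The two diagrams above can be expressed as a small check. This is only
 a sketch under an assumption read from the text: snd_nxt should sit at
 the start of the skb that sk_send_head points to (or at write_seq when
 nothing is unsent). struct queue_state, snd_nxt_in_sync() and
 diagram_state_in_sync() are hypothetical names for illustration, not
 kernel code, and the sequence numbers are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the fields in the diagrams; not a kernel struct. */
struct queue_state {
	uint32_t snd_una;
	uint32_t snd_nxt;
	uint32_t write_seq;
	const uint32_t *send_head_seq;	/* start seq of first unsent skb, or null */
};

/* Assumed invariant: snd_nxt points at the first unsent byte. */
static bool snd_nxt_in_sync(const struct queue_state *q)
{
	uint32_t first_unsent = q->send_head_seq ? *q->send_head_seq
						 : q->write_seq;

	return q->snd_nxt == first_unsent;
}

/* Builds the first diagram, or the second one when after_second_switch is set. */
static bool diagram_state_in_sync(bool after_second_switch)
{
	static const uint32_t head_seq = 200;	/* made-up value */
	struct queue_state q = {
		.snd_una = 100,
		.snd_nxt = 200,
		.write_seq = 300,
		.send_head_seq = &head_seq,
	};

	assert(q.snd_una <= q.snd_nxt && q.snd_nxt <= q.write_seq);
	if (after_second_switch)
		q.snd_nxt = q.write_seq;	/* the assignment this patch removes */
	return snd_nxt_in_sync(&q);
}
```

 In this model the first state passes the check and the second does not,
 because snd_nxt has jumped to write_seq while sk_send_head stayed put.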

 Below is a call trace showing how packets_out can be incremented
 twice for one skb if snd_nxt and sk_send_head are not synchronized.
 In this case packets_out will always be positive, even when
 sk_write_queue is empty.

 tcp_write_wakeup
 	skb = tcp_send_head(sk);
 	tcp_fragment
 		if (!before(tp->snd_nxt, TCP_SKB_CB(buff)->end_seq))
 			tcp_adjust_pcount(sk, skb, diff);
 	tcp_event_new_data_sent
 		tp->packets_out += tcp_skb_pcount(skb);
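
 The accounting in the trace above can be replayed with a toy model.
 This is a simplified sketch, not the kernel functions: all toy_* names,
 the struct layouts, the sequence numbers and the MSS of 1000 are
 assumptions chosen for illustration. It keeps only the snd_nxt test from
 tcp_fragment and the packets_out update from tcp_event_new_data_sent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins carrying only the fields needed for the accounting. */
struct toy_skb { uint32_t seq, end_seq; int pcount; };
struct toy_tp  { uint32_t snd_nxt; int packets_out; };

/* Wraparound-safe sequence comparison, in the style of before(). */
static bool seq_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

static int toy_pcount(uint32_t len, uint32_t mss)
{
	return (int)((len + mss - 1) / mss);
}

/* Split skb at len bytes; adjust packets_out if buff looks already sent. */
static void toy_fragment(struct toy_tp *tp, struct toy_skb *skb,
			 struct toy_skb *buff, uint32_t len, uint32_t mss)
{
	int old = skb->pcount;

	assert(len > 0 && len < skb->end_seq - skb->seq);
	buff->seq = skb->seq + len;
	buff->end_seq = skb->end_seq;
	skb->end_seq = buff->seq;
	skb->pcount = toy_pcount(len, mss);
	buff->pcount = toy_pcount(buff->end_seq - buff->seq, mss);

	if (!seq_before(tp->snd_nxt, buff->end_seq))
		tp->packets_out -= old - skb->pcount - buff->pcount;
}

static void toy_new_data_sent(struct toy_tp *tp, const struct toy_skb *skb)
{
	tp->snd_nxt = skb->end_seq;
	tp->packets_out += skb->pcount;
}

/* Returns packets_out left over after both halves are sent and acked. */
static int toy_residual_packets_out(bool clobber_snd_nxt)
{
	const uint32_t mss = 1000;
	struct toy_skb skb = { .seq = 2000, .end_seq = 3000, .pcount = 1 };
	struct toy_skb buff;
	struct toy_tp tp = {
		/* clobbered: snd_nxt == write_seq although nothing was sent */
		.snd_nxt = clobber_snd_nxt ? skb.end_seq : skb.seq,
		.packets_out = 0,
	};

	toy_fragment(&tp, &skb, &buff, 400, mss);
	toy_new_data_sent(&tp, &skb);
	toy_new_data_sent(&tp, &buff);

	/* Both segments are acked, so the write queue is now empty. */
	tp.packets_out -= skb.pcount + buff.pcount;
	return tp.packets_out;
}
```

 With snd_nxt in sync, toy_residual_packets_out(false) returns 0. With
 snd_nxt moved to write_seq, toy_residual_packets_out(true) returns 1:
 the split is counted once in toy_fragment and again when the halves are
 sent, so the counter stays positive with an empty queue.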

 I think updating snd_nxt isn't required when a socket is switched from
 repair mode, because it is initialized in tcp_connect_init. Then, when
 the write queue is restored, snd_nxt is advanced in
 tcp_event_new_data_sent, so it is always in a consistent state.

 I have checked that the bug does not reproduce with this patch and
 that all tests for restoring TCP connections pass.

 Signed-off-by: Andrey Vagin <avagin@openvz.org>
 Cc: Pavel Emelyanov <xemul@parallels.com>
 Cc: Eric Dumazet <edumazet@google.com>
 Cc: "David S. Miller" <davem@davemloft.net>
 Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
 Cc: James Morris <jmorris@namei.org>
 Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
 Cc: Patrick McHardy <kaber@trash.net>
 Acked-by: Pavel Emelyanov <xemul@parallels.com>
 Acked-by: Eric Dumazet <edumazet@google.com>
 Signed-off-by: David S. Miller <davem@davemloft.net>
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 ---
  net/ipv4/tcp_output.c |    1 -
  1 file changed, 1 deletion(-)

 --- a/net/ipv4/tcp_output.c
 +++ b/net/ipv4/tcp_output.c
 @@ -3102,7 +3102,6 @@ void tcp_send_window_probe(struct sock *
  {
  	if (sk->sk_state == TCP_ESTABLISHED) {
  		tcp_sk(sk)->snd_wl1 = tcp_sk(sk)->rcv_nxt - 1;
 -		tcp_sk(sk)->snd_nxt = tcp_sk(sk)->write_seq;
  		tcp_xmit_probe_skb(sk, 0);
  	}
  }