| From foo@baz Fri Jan 15 10:15:43 AM CET 2021 |
| From: Florian Westphal <fw@strlen.de> |
| Date: Wed, 6 Jan 2021 00:15:23 +0100 |
| Subject: net: ip: always refragment ip defragmented packets |
| |
| From: Florian Westphal <fw@strlen.de> |
| |
| [ Upstream commit bb4cc1a18856a73f0ff5137df0c2a31f4c50f6cf ] |
| |
| Conntrack reassembly records the largest fragment size seen in IPCB. |
| However, when this gets forwarded/transmitted, fragmentation will only |
| be forced if one of the fragmented packets had the DF bit set. |
| |
| In that case, a flag in IPCB will force fragmentation even if the |
| MTU is large enough. |
| |
| This should work fine, but this breaks with ip tunnels. |
| Consider client that sends a UDP datagram of size X to another host. |
| |
| The client fragments the datagram, so two packets, of size y and z, are |
| sent. DF bit is not set on any of these packets. |
| |
| Middlebox netfilter reassembles those packets back to single size-X |
| packet, before routing decision. |
| |
| packet-size-vs-mtu checks in ip_forward are irrelevant, because DF bit |
| isn't set. At output time, ip refragmentation is skipped as well |
| because x is still smaller than the mtu of the output device. |
| |
| If ttransmit device is an ip tunnel, the packet size increases to |
| x+overhead. |
| |
| Also, tunnel might be configured to force DF bit on outer header. |
| |
| In this case, packet will be dropped (exceeds MTU) and an ICMP error is |
| generated back to sender. |
| |
| But sender already respects the announced MTU, all the packets that |
| it sent did fit the announced mtu. |
| |
| Force refragmentation as per original sizes unconditionally so ip tunnel |
| will encapsulate the fragments instead. |
| |
| The only other solution I see is to place ip refragmentation in |
| the ip_tunnel code to handle this case. |
| |
| Fixes: d6b915e29f4ad ("ip_fragment: don't forward defragmented DF packet") |
| Reported-by: Christian Perle <christian.perle@secunet.com> |
| Signed-off-by: Florian Westphal <fw@strlen.de> |
| Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> |
| Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
| Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| --- |
| net/ipv4/ip_output.c | 2 +- |
| 1 file changed, 1 insertion(+), 1 deletion(-) |
| |
| --- a/net/ipv4/ip_output.c |
| +++ b/net/ipv4/ip_output.c |
| @@ -300,7 +300,7 @@ static int ip_finish_output(struct net * |
| if (skb_is_gso(skb)) |
| return ip_finish_output_gso(net, sk, skb, mtu); |
| |
| - if (skb->len > mtu || (IPCB(skb)->flags & IPSKB_FRAG_PMTU)) |
| + if (skb->len > mtu || IPCB(skb)->frag_max_size) |
| return ip_fragment(net, sk, skb, mtu, ip_finish_output2); |
| |
| return ip_finish_output2(net, sk, skb); |