| From 97684f0970f6e112926de631fdd98d9693c7e5c1 Mon Sep 17 00:00:00 2001 |
| From: Jonathon Reinhart <jonathon.reinhart@gmail.com> |
| Date: Tue, 13 Apr 2021 03:08:48 -0400 |
| Subject: net: Make tcp_allowed_congestion_control readonly in non-init netns |
| |
| From: Jonathon Reinhart <jonathon.reinhart@gmail.com> |
| |
| commit 97684f0970f6e112926de631fdd98d9693c7e5c1 upstream. |
| |
| Currently, tcp_allowed_congestion_control is global and writable; |
| writing to it in any net namespace will leak into all other net |
| namespaces. |
| |
| tcp_available_congestion_control and tcp_allowed_congestion_control are |
| the only sysctls in ipv4_net_table (the per-netns sysctl table) with a |
| NULL data pointer; their handlers (proc_tcp_available_congestion_control |
| and proc_allowed_congestion_control) have no other way of referencing a |
| struct net. Thus, they operate globally. |
| |
| Because ipv4_net_table does not use designated initializers, there is no |
| easy way to fix up this one "bad" table entry. However, the data pointer |
| updating logic shouldn't be applied to NULL pointers anyway, so we |
| instead force these entries to be read-only. |
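
To illustrate the approach, the fix-up loop can be sketched in userspace C.
This is only a rough model under stated assumptions: struct fake_net,
struct fake_ctl and fixup_table() are hypothetical stand-ins invented for
this sketch, not the kernel's struct net, struct ctl_table or any real
kernel function. The offset is computed relative to the initial namespace
object, which is the same arithmetic as the kernel's
"data += net - &init_net" written so it stays well-defined in portable C.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct net: one per-namespace tunable. */
struct fake_net {
	int some_tunable;
};

/* Hypothetical stand-in for struct ctl_table. */
struct fake_ctl {
	const char *procname;
	void *data;		/* NULL: the handler operates on global state */
	unsigned short mode;	/* permission bits, e.g. 0644 */
};

/*
 * Fix up a copied table of n entries for namespace 'net'.
 * Entries with a data pointer are assumed to point into 'init'.
 */
static void fixup_table(struct fake_ctl *table, size_t n,
			struct fake_net *net, struct fake_net *init)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (table[i].data) {
			/* Re-point the entry at the same field inside 'net'. */
			ptrdiff_t off = (char *)table[i].data - (char *)init;

			table[i].data = (char *)net + off;
		} else {
			/* No per-namespace storage: clear every write bit. */
			table[i].mode &= ~0222;
		}
	}
}
```

In this model an entry that starts with mode 0644 and a NULL data pointer
ends up with mode 0444, while entries that carry a data pointer keep their
mode and are only re-pointed.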
| |
| These sysctls used to exist in ipv4_table (init-net only), but they were |
| moved to the per-net ipv4_net_table, presumably without realizing that |
| tcp_allowed_congestion_control was writable, thus introducing a leak. |
| |
| Because the intent of that commit was only to know (i.e. read) "which |
| congestion algorithms are available or allowed", this read-only solution |
| should be sufficient. |
| |
| The logic added in recent commit |
| 31c4d2f160eb ("net: Ensure net namespace isolation of sysctls") |
| does not and cannot check for NULL data pointers, because |
| other table entries (e.g. /proc/sys/net/netfilter/nf_log/) have |
| .data=NULL but use other methods (.extra2) to access the struct net. |
| |
| Fixes: 9cb8e048e5d9 ("net/ipv4/sysctl: show tcp_{allowed, available}_congestion_control in non-initial netns") |
| Signed-off-by: Jonathon Reinhart <jonathon.reinhart@gmail.com> |
| Signed-off-by: David S. Miller <davem@davemloft.net> |
| Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| --- |
| net/ipv4/sysctl_net_ipv4.c | 16 +++++++++++++--- |
| 1 file changed, 13 insertions(+), 3 deletions(-) |
| |
| --- a/net/ipv4/sysctl_net_ipv4.c |
| +++ b/net/ipv4/sysctl_net_ipv4.c |
| @@ -1369,9 +1369,19 @@ static __net_init int ipv4_sysctl_init_n |
| if (!table) |
| goto err_alloc; |
| |
| - /* Update the variables to point into the current struct net */ |
| - for (i = 0; i < ARRAY_SIZE(ipv4_net_table) - 1; i++) |
| - table[i].data += (void *)net - (void *)&init_net; |
| + for (i = 0; i < ARRAY_SIZE(ipv4_net_table) - 1; i++) { |
| + if (table[i].data) { |
| + /* Update the variables to point into |
| + * the current struct net |
| + */ |
| + table[i].data += (void *)net - (void *)&init_net; |
| + } else { |
| + /* Entries without data pointer are global; |
| + * Make them read-only in non-init_net ns |
| + */ |
| + table[i].mode &= ~0222; |
| + } |
| + } |
| } |
| |
| net->ipv4.ipv4_hdr = register_net_sysctl(net, "net/ipv4", table); |