| From 2339130550f06c7e443c40a68af16ea16a8143aa Mon Sep 17 00:00:00 2001 |
| From: Michal Hocko <mhocko@suse.com> |
| Date: Sun, 6 Oct 2019 17:58:19 -0700 |
| Subject: [PATCH] kernel/sysctl.c: do not override max_threads provided by |
| userspace |
| |
| commit b0f53dbc4bc4c371f38b14c391095a3bb8a0bb40 upstream. |
| |
| Partially revert 16db3d3f1170 ("kernel/sysctl.c: threads-max observe |
| limits") because the patch causes a regression for any workload that |
| needs to override the kernel's auto-tuning of the limit. |
| |
| set_max_threads implements a boot-time guesstimate that provides a |
| sensible limit on the number of concurrently running threads so that |
| runaways will not deplete all the memory. This is a good thing in |
| general, but some workloads need to increase this limit for an |
| application to run (reportedly WebSphere MQ is affected), and that is |
| simply not possible after the mentioned change. It is also very dubious |
| to override an admin decision with an estimate that has no direct |
| relation to the correctness of kernel operation. |
| |
| Fix this by dropping set_max_threads from sysctl_max_threads so that any |
| value is accepted as long as it fits into MAX_THREADS. That check is |
| important because allowing more threads could break the internal robust |
| futex restriction. While at it, do not use MIN_THREADS as the lower |
| boundary, because it is also only a heuristic for automatic estimation |
| and an admin might have a good reason to prevent new threads from being |
| created even when below this limit. |
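
The resulting write path can be modeled in userspace as follows. This is only a sketch: the `model_*` names are hypothetical, the starting value of 200000 is taken from the 4.4 example in this message, and the upper bound assumes MAX_THREADS equals FUTEX_TID_MASK (0x3fffffff). The `-EINVAL` return mirrors how a min/max-checked sysctl handler is expected to reject out-of-range writes.

```c
#include <errno.h>

/* Assumption: MAX_THREADS is FUTEX_TID_MASK, i.e. 0x3fffffff. */
#define MODEL_MAX_THREADS 0x3fffffff

/* Stand-in for the kernel's max_threads; the value is illustrative. */
static int model_max_threads = 200000;

/*
 * Model of a write to kernel.threads-max after this patch: the
 * requested value is only range-checked against [1, MAX_THREADS]
 * and then stored as-is, with no re-run of the boot-time estimate.
 */
int model_write_threads_max(int requested)
{
	const int min = 1;			/* was MIN_THREADS */
	const int max = MODEL_MAX_THREADS;

	if (requested < min || requested > max)
		return -EINVAL;			/* out of range: value unchanged */

	model_max_threads = requested;		/* was set_max_threads(threads) */
	return 0;
}
```

In this model, a write of 5 succeeds even though it is below the old MIN_THREADS heuristic, while 0 or anything above MODEL_MAX_THREADS is rejected and leaves the stored value untouched.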
| |
| This became more severe when we switched x86 from 8k to 16k kernel |
| stacks. Since 6538b8ea886e ("x86_64: expand kernel stack to 16K") |
| (3.16) we use THREAD_SIZE_ORDER = 2, and that halved the auto-tuned |
| value. |
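
To see why a larger stack halves the estimate, here is a rough userspace sketch. It assumes the boot-time estimate is total RAM divided by (THREAD_SIZE * 8); that formula is not stated in this message, and `model_auto_threads` is a hypothetical name. The clamping to MIN_THREADS/MAX_THREADS is left out.

```c
#include <stdint.h>

/*
 * Assumed shape of the boot-time estimate: thread stacks may use at
 * most one eighth of RAM, so the limit is ram / (thread_size * 8).
 * Clamping to MIN_THREADS/MAX_THREADS is intentionally omitted.
 */
uint64_t model_auto_threads(uint64_t ram_bytes, uint64_t thread_size)
{
	return ram_bytes / (thread_size * 8);
}
```

Under that assumption, a nominal 32 GiB machine yields 524288 with 8 KiB stacks and 262144 with 16 KiB stacks. Real values depend on how much RAM the kernel actually counts, so they will not match these round numbers.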
| |
| In the particular case |
| |
| 3.12 |
| kernel.threads-max = 515561 |
| |
| 4.4 |
| kernel.threads-max = 200000 |
| |
| Neither of the two values is really insane on a 32GB machine. |
| |
| I am not sure we want/need to tune the max_threads value further. If |
| anything, the tuning should be removed altogether if it proves not to |
| be useful in general. But we definitely need a way to override this |
| auto-tuning. |
| |
| Link: http://lkml.kernel.org/r/20190922065801.GB18814@dhcp22.suse.cz |
| Fixes: 16db3d3f1170 ("kernel/sysctl.c: threads-max observe limits") |
| Signed-off-by: Michal Hocko <mhocko@suse.com> |
| Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> |
| Cc: Heinrich Schuchardt <xypron.glpk@gmx.de> |
| Cc: <stable@vger.kernel.org> |
| Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
| Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
| Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> |
| |
| diff --git a/kernel/fork.c b/kernel/fork.c |
| index d3f006ed2f9d..d4c26c3052f3 100644 |
| --- a/kernel/fork.c |
| +++ b/kernel/fork.c |
| @@ -2773,7 +2773,7 @@ int sysctl_max_threads(struct ctl_table *table, int write, |
| struct ctl_table t; |
| int ret; |
| int threads = max_threads; |
| - int min = MIN_THREADS; |
| + int min = 1; |
| int max = MAX_THREADS; |
| |
| t = *table; |
| @@ -2785,7 +2785,7 @@ int sysctl_max_threads(struct ctl_table *table, int write, |
| if (ret || !write) |
| return ret; |
| |
| - set_max_threads(threads); |
| + max_threads = threads; |
| |
| return 0; |
| } |
| -- |
| 2.7.4 |
| |