| From 01cfcde9c26d8555f0e6e9aea9d6049f87683998 Mon Sep 17 00:00:00 2001 |
| From: Vincent Guittot <vincent.guittot@linaro.org> |
| Date: Fri, 10 Jul 2020 17:24:26 +0200 |
| Subject: sched/fair: handle case of task_h_load() returning 0 |
| |
| From: Vincent Guittot <vincent.guittot@linaro.org> |
| |
| commit 01cfcde9c26d8555f0e6e9aea9d6049f87683998 upstream. |
| |
| task_h_load() can return 0 in some situations like running stress-ng |
| mmapfork, which forks thousands of threads, in a sched group on a 224-core |
| system. The load balance doesn't handle this correctly because |
| env->imbalance never decreases and it will stop pulling tasks only after |
| reaching loop_max, which can be equal to the number of running tasks of |
| the cfs. Make sure that imbalance will be decreased by at least 1. |
| |
| Misfit task handling is the other feature that does not handle this |
| situation correctly, although the problem is probably harder to hit there |
| because heterogeneous systems have fewer CPUs and running tasks. |
| |
| We can't simply make task_h_load() return at least one, because that |
| would require handling underflow in other places. |
| |
| Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> |
| Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> |
| Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> |
| Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> |
| Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com> |
| Cc: <stable@vger.kernel.org> # v4.4+ |
| Link: https://lkml.kernel.org/r/20200710152426.16981-1-vincent.guittot@linaro.org |
| Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| |
| |
| --- |
| kernel/sched/fair.c | 10 +++++++++- |
| 1 file changed, 9 insertions(+), 1 deletion(-) |
| |
| --- a/kernel/sched/fair.c |
| +++ b/kernel/sched/fair.c |
| @@ -6561,7 +6561,15 @@ static int detach_tasks(struct lb_env *e |
| if (!can_migrate_task(p, env)) |
| goto next; |
| |
| - load = task_h_load(p); |
| + /* |
| + * Depending on the number of CPUs and tasks and the
| + * cgroup hierarchy, task_h_load() can return a null |
| + * value. Make sure that env->imbalance decreases |
| + * otherwise detach_tasks() will stop only after |
| + * detaching up to loop_max tasks. |
| + */ |
| + load = max_t(unsigned long, task_h_load(p), 1); |
| + |
| |
| if (sched_feat(LB_MIN) && load < 16 && !env->sd->nr_balance_failed) |
| goto next; |