From ad02531b3d8d76144bdc5c13525da103981487e3 Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx@linutronix.de>
Date: Thu, 21 Jan 2010 00:00:32 +0100
Subject: [PATCH] sched: Hack to make prio ceiling posix compliant
commit 28bbb2398ebbeac1d4c1fd109938441842c8f89e in tip.
POSIX scheduling semantics for SCHED_FIFO require that a thread whose
priority is changed via pthread_setschedprio() is enqueued to the head
of the new priority list when the thread is running and the new
priority is lower than the old priority. This is required to implement
user space driven priority ceiling.
The sys_sched_setscheduler() and sys_sched_setparam() semantics are
POSIX compliant as they move the thread to the tail of the priority
list.
Because there is no sys_sched_setprio syscall, user space has to use
sys_sched_setscheduler() or sys_sched_setparam() instead, which results
in the following non-POSIX-compliant scenario:
    Tasks A and B are runnable and in the same priority list X
    task A runs and boosts itself for priority ceiling to prio Y
    task A unboosts itself to its original priority X
        -> task A gets dequeued from priority list Y
        -> task A gets enqueued at the tail of priority list X
    task B runs (instead of task A continuing to run)
Work around that to make prio ceiling work as expected:
Queue task to head when task is running and task is lowering its
priority. This works around the non-availability of a sched_setprio
syscall which was tinkered into the posix spec to make prio ceiling
work correctly.
This workaround violates the posix scheduling semantics of tail
queueing in the case that the priority was changed by anything other
than sched_setprio, but there is no other breakage lurking than some
specification fetishists going berserk on me.
Fixing this in mainline needs more thought.
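The scenario and the workaround described above can be illustrated with a
small user-space model. This is only a sketch, not kernel code: struct
prio_list, enqueue(), dequeue() and run_after_unboost() are hypothetical
names made up for the illustration, and the prio values 50 and 10 are
arbitrary. The only part taken from the patch is the head-queueing
condition "running && oldprio < p->prio", where a numerically larger prio
value means a lower priority.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_TASKS 8

/* Hypothetical model of one FIFO run list for a single priority level. */
struct prio_list {
	char task[MAX_TASKS];	/* task names, index 0 is the head */
	int len;
};

/* Add a task at the head or at the tail of the list. */
static void enqueue(struct prio_list *l, char name, bool head)
{
	assert(l->len < MAX_TASKS);
	if (head) {
		memmove(&l->task[1], &l->task[0], (size_t)l->len);
		l->task[0] = name;
	} else {
		l->task[l->len] = name;
	}
	l->len++;
}

/* Remove a task from the list, keeping the order of the others. */
static void dequeue(struct prio_list *l, char name)
{
	for (int i = 0; i < l->len; i++) {
		if (l->task[i] == name) {
			memmove(&l->task[i], &l->task[i + 1],
				(size_t)(l->len - i - 1));
			l->len--;
			return;
		}
	}
}

/*
 * Task A boosts itself from list X to the ceiling list Y and back.
 * Returns the task at the head of X afterwards, i.e. the one that
 * would run next at priority X.
 */
static char run_after_unboost(bool workaround)
{
	struct prio_list x = { .len = 0 }, y = { .len = 0 };
	int prio_x = 50, prio_y = 10;	/* assumed values, lower = higher prio */

	enqueue(&x, 'A', false);
	enqueue(&x, 'B', false);

	/* Boost: A moves from list X to list Y. */
	dequeue(&x, 'A');
	enqueue(&y, 'A', false);

	/* Unboost: A is running and lowers its priority back to X. */
	bool running = true;
	int oldprio = prio_y, newprio = prio_x;
	bool head = workaround && running && oldprio < newprio;

	dequeue(&y, 'A');
	enqueue(&x, 'A', head);

	return x.task[0];
}
```

In this model, run_after_unboost(false) leaves task B at the head of list
X (the tail-queueing behaviour of the scenario), while
run_after_unboost(true) leaves task A at the head, which is what the
workaround is meant to achieve.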
Reported-by: Mathias Weber <mathias.weber.mw1@roche.com>
Reported-by: Carsten Emde <cbe@osadl.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
diff --git a/kernel/sched.c b/kernel/sched.c
index a94db59..2c3d872 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -6701,7 +6701,25 @@ recheck:
 	if (running)
 		p->sched_class->set_curr_task(rq);
 	if (on_rq) {
-		activate_task(rq, p, 0, false);
+		/*
+		 * Workaround to make prio ceiling work as expected:
+		 *
+		 * Queue task to head when task is running and task is
+		 * lowering its priority. This works around the non-
+		 * availability of a sched_setprio syscall which was
+		 * tinkered into the posix spec to make prio ceiling
+		 * work correctly.
+		 *
+		 * This workaround violates the posix scheduling
+		 * semantics of tail queueing in the case that the
+		 * priority was changed by anything else than
+		 * sched_setprio, but there is no other breakage
+		 * lurking than some specification fetishists going
+		 * berserk on me.
+		 *
+		 * Fixing this in mainline needs more thoughts.
+		 */
+		activate_task(rq, p, 0, running && oldprio < p->prio);
 
 		check_class_changed(rq, p, prev_class, oldprio, running);
 	}
--
1.7.1.1