It looks like commit 81a44c5 (sched: Queue RT tasks to head when prio drop)
made the behavior when a task's priority is lowered (userspace view) more
sensible, but I believe the behavior is still incorrect according to POSIX.
POSIX (Volume 2, Section 2.8.4, Process Scheduling) specifies two different
semantics for where the thread is placed in the thread list for its new
priority, depending on which function made the change:
    8. If a thread whose policy or priority has been modified by
       pthread_setschedprio() is a running thread or is runnable, the effect
       on its position in the thread list depends on the direction of the
       modification, as follows:
         a. If the priority is raised, the thread becomes the tail of the
            thread list.
         b. If the priority is unchanged, the thread does not change position
            in the thread list.
         c. If the priority is lowered, the thread becomes the head of the
            thread list.

    7. If a thread whose policy or priority has been modified other than by
       pthread_setschedprio() is a running thread or is runnable, it then
       becomes the tail of the thread list for its new priority.
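
To make the two rules concrete, here is a small sketch of the placement
decision as I read the quoted text. It is only a model of the rules, not
kernel code; the names (enum placement, posix_placement) are made up for
illustration, and priorities use the userspace view where a larger number
means a higher priority.

```c
#include <stdbool.h>

/* Where a running/runnable thread lands in the list for its new priority. */
enum placement { PLACE_HEAD, PLACE_TAIL, PLACE_UNCHANGED };

/*
 * Illustrative model only (hypothetical helper, not a kernel or libc API).
 * via_setschedprio: true if the change was made by pthread_setschedprio().
 */
enum placement posix_placement(bool via_setschedprio, int old_prio, int new_prio)
{
    if (!via_setschedprio)
        return PLACE_TAIL;       /* item 7: any other interface -> tail */
    if (new_prio > old_prio)
        return PLACE_TAIL;       /* item 8a: raised -> tail */
    if (new_prio < old_prio)
        return PLACE_HEAD;       /* item 8c: lowered -> head */
    return PLACE_UNCHANGED;      /* item 8b: unchanged -> keeps position */
}
```

For example, posix_placement(false, 10, 5) gives PLACE_TAIL, while
posix_placement(true, 10, 5) gives PLACE_HEAD.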
Commit 81a44c5 made all of the priority change functions behave according to
the pthread_setschedprio semantics.
It appears that commit ff77e46 (sched/rt: Fix PI handling vs.
sched_setscheduler()) causes a task to be requeued at the tail when its
priority is set to its existing value.
So a task setting its own priority to its current priority would have the same
effect as a sched_yield().
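
For reference, the case described here is roughly the following. This is an
untested sketch: the helper name reapply_own_priority is made up, and whether
the call actually requeues the task depends on the kernel version and on the
task running under an RT policy such as SCHED_FIFO.

```c
#define _GNU_SOURCE
#include <sched.h>

/*
 * Hypothetical helper: read the calling task's current scheduling
 * parameters and write them back unchanged (pid 0 means "the caller").
 * Returns 0 on success, -1 with errno set on failure.
 */
int reapply_own_priority(void)
{
    struct sched_param sp;

    if (sched_getparam(0, &sp) != 0)
        return -1;

    /* Same priority as before; per item 7 this should requeue at the tail. */
    return sched_setparam(0, &sp);
}
```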
I believe the correct behavior is to have the existing priority change syscalls
(sched_setscheduler and sched_setparam) always move the changed task to the
back of the queue for the new priority.
But as far as I can tell the kernel provides no way to implement
pthread_setschedprio with the correct semantics.
It seems the best way to implement this would be adding a flag
(SCHED_SETSCHEDPRIO) to the existing sched_setattr syscall.
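
To show what I have in mind from the userspace side, here is a rough sketch.
Everything specific in it is an assumption: the flag does not exist in any
kernel, the bit value is arbitrary, and the demo_ names are made up. The
struct is a local copy of the 48-byte version-0 sched_attr layout so the
sketch does not depend on which headers define struct sched_attr.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* HYPOTHETICAL flag: not in any kernel; bit value chosen arbitrarily. */
#define DEMO_SCHED_FLAG_SETSCHEDPRIO (1ULL << 63)

/* Local copy of the version-0 sched_attr layout (48 bytes). */
struct demo_sched_attr {
    uint32_t size;
    uint32_t sched_policy;
    uint64_t sched_flags;
    int32_t  sched_nice;
    uint32_t sched_priority;
    uint64_t sched_runtime;
    uint64_t sched_deadline;
    uint64_t sched_period;
};

/* Build the attr a libc pthread_setschedprio() might pass down. */
struct demo_sched_attr demo_make_attr(uint32_t policy, uint32_t prio)
{
    struct demo_sched_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.sched_policy = policy;
    attr.sched_priority = prio;
    attr.sched_flags = DEMO_SCHED_FLAG_SETSCHEDPRIO; /* request item 8 semantics */
    return attr;
}

/* Issue the syscall; current kernels should reject the unknown flag. */
long demo_setschedprio(pid_t tid, uint32_t policy, uint32_t prio)
{
    struct demo_sched_attr attr = demo_make_attr(policy, prio);

    return syscall(SYS_sched_setattr, tid, &attr, 0U);
}
```

Without the flag, sched_setattr would keep the "always tail" behavior of
item 7, so existing callers would not be affected.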
Any thoughts?
Thanks,
--Matt Grochowalski