* Daniel Hazelton <[EMAIL PROTECTED]> wrote:

> [...] In a fair scheduler I'd expect all tasks to get the exact same
> amount of time on the processor. So if there are 10 tasks running at
> nice 0 and the current task has run for 20msecs before a new task is
> swapped onto the CPU, the new task and *all* other tasks waiting to
> get onto the CPU should get the same 20msecs. [...]
What happens in CFS is that in exchange for this task's 20 msecs the
other tasks get 2 msecs each (and not only the one that gets onto the
CPU next), so every task is treated equally. What I described was only
the first step - the same step happens for every task whenever it gets
onto the CPU, accounted and weighted for the precise period it spent
waiting - so the second task would get +4 msecs credited, the third
task +6 msecs, etc., etc.

But really - nothing beats first-hand experience: please just boot
into a CFS kernel and test its precision a bit. You can pick it up
from the usual place:

  http://people.redhat.com/mingo/cfs-scheduler/

For example, start 10 CPU hogs at once from a shell:

  for (( N=0; N < 10; N++ )); do ( while :; do :; done ) & done

[ type 'killall bash' in the same shell to get rid of them. ]

Then watch their CPU usage via 'top'. While the system is otherwise
idle you should get something like this after half a minute of
runtime:

   PID USER   PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
  2689 mingo  20   0  5968  560  276 R 10.0  0.1  0:03.45 bash
  2692 mingo  20   0  5968  564  280 R 10.0  0.1  0:03.45 bash
  2693 mingo  20   0  5968  564  280 R 10.0  0.1  0:03.45 bash
  2694 mingo  20   0  5968  564  280 R 10.0  0.1  0:03.45 bash
  2695 mingo  20   0  5968  564  280 R 10.0  0.1  0:03.45 bash
  2698 mingo  20   0  5968  564  280 R 10.0  0.1  0:03.45 bash
  2690 mingo  20   0  5968  564  280 R  9.9  0.1  0:03.45 bash
  2691 mingo  20   0  5968  564  280 R  9.9  0.1  0:03.45 bash
  2696 mingo  20   0  5968  564  280 R  9.9  0.1  0:03.45 bash
  2697 mingo  20   0  5968  564  280 R  9.9  0.1  0:03.45 bash

with each task having exactly the same 'TIME+' field in top. (The more
equal those fields, the more precise/fair the scheduler is. In the
above output each task got its precise share of 3.45 seconds of CPU
time.)

Then, as a next phase of testing, please run various things on the
system (without stopping these loops) and try to get CFS "out of
balance" - you'll have succeeded if you manage to get an unequal
'TIME+' field for them. Please try _really_ hard to break it. You can
run any workload.

Or try massive_intr.c from:

  http://lkml.org/lkml/2007/3/26/319

which uses a much less trivial scheduling pattern to test a
scheduler's precision:

  $ ./massive_intr 9 10
  002765  00000125
  002767  00000125
  002762  00000125
  002769  00000125
  002768  00000126
  002761  00000126
  002763  00000126
  002766  00000126
  002764  00000126

(The second column is runtime - the more equal, the more precise/fair
the scheduler.)

	Ingo
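P.S.

To make the crediting arithmetic above concrete, here is a minimal
user-space toy model of the idea. This is not the actual CFS kernel
code - the task array, the SLICE_MS constant and the 'credit'
bookkeeping are made up purely for illustration of the principle that
while one task runs for T msecs, each waiting task is credited
T/nr_running msecs:

  /*
   * toy_fair.c - toy model of the fairness crediting described above.
   * NOT the real CFS code; names and constants are illustrative only.
   */
  #include <stdio.h>

  #define NR_TASKS  10
  #define SLICE_MS  20

  int main(void)
  {
          /* per-task credit in msecs: how much CPU time each task is
             owed relative to a perfectly fair machine */
          double credit[NR_TASKS] = { 0.0 };

          /* simulate one full rotation of 20 msec slices */
          for (int turn = 0; turn < NR_TASKS; turn++) {
                  /* pick the task that is owed the most CPU time */
                  int next = 0;
                  for (int i = 1; i < NR_TASKS; i++)
                          if (credit[i] > credit[next])
                                  next = i;

                  /* the runner consumes its fair share of the slice... */
                  credit[next] -= SLICE_MS * (NR_TASKS - 1) / (double)NR_TASKS;

                  /* ...and every waiting task is credited its share
                     (2 msecs each for a 20 msec slice and 10 tasks) */
                  for (int i = 0; i < NR_TASKS; i++)
                          if (i != next)
                                  credit[i] += SLICE_MS / (double)NR_TASKS;
          }

          /* after a full rotation every credit is back to zero */
          for (int i = 0; i < NR_TASKS; i++)
                  printf("task %d: credit %+.1f msecs\n", i, credit[i]);

          return 0;
  }

Over a full rotation every task runs one 20 msec slice and is credited
for the nine slices it spent waiting, so all credits come back to zero
- the same equal-share behaviour the 'TIME+' column above shows.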