Re: [RFC] Scheduler recorder and playback

Dmitry Antipov Mon, 02 Apr 2012 03:09:30 -0700

On 03/08/2012 05:20 PM, Pantelis Antoniou wrote:

> The current issue is that scheduler development is not easily shared between
> developers. Each developer has their own 'itch', be it Android use cases, server
> workloads, VM, etc. The risk is high of optimizing for one's own use case and
> causing severe degradation on most other use cases.
>
> One way to fix this problem would be the development of a method with which one
> could perform a given use-case workload on a host, record the activity in an
> interchangeable portable trace format file, and then play it back on another
> host via a playback application that generates a load approximately similar to
> the one observed during recording.
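
To make the record-and-playback idea concrete, here is a minimal sketch of the
playback half. It assumes a hypothetical trace made of (run, sleep) bursts for a
single task; the trace layout and the names TRACE, busy_for and replay are
illustrative only and do not come from the RFC.

```python
import time

# Hypothetical trace format: one (run_seconds, sleep_seconds) burst per entry.
# Nothing here is taken from the RFC; it only illustrates the idea.
TRACE = [(0.002, 0.005), (0.001, 0.003), (0.003, 0.004)]


def busy_for(seconds):
    """Spin on the CPU for roughly `seconds` to imitate a recorded run period."""
    deadline = time.perf_counter() + seconds
    while time.perf_counter() < deadline:
        pass


def replay(trace):
    """Replay each burst: burn CPU, then sleep. Returns elapsed wall time."""
    start = time.perf_counter()
    for run_s, sleep_s in trace:
        busy_for(run_s)
        time.sleep(sleep_s)
    return time.perf_counter() - start


if __name__ == "__main__":
    recorded = sum(r + s for r, s in TRACE)
    print(f"recorded {recorded:.3f}s, replayed {replay(TRACE):.3f}s")
```

A real playback tool would need one such loop per recorded task plus the wakeup
dependencies between tasks; this sketch only shows how a recorded run/sleep
pattern can be turned back into a comparable load.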


Have you investigated whether the 'perf' tool, with its 'sched record' and
'sched replay' features, might be useful for such a purpose?

I tried to record and replay various commonly used benchmarks, including CPU,
I/O and network intensive workloads, and I have to say that the recording and
(especially) replaying overhead is quite high, at least for the default Panda
board configuration (where main I/O is slow because the root file system is on
an SD card). Simple things like 'perf sched record sleep 10' work in most cases
(but may still lose up to 10-20% of the samples). But when I tried to add some
I/O, for example with 'find /', the total workload became too high and the
system (almost) hung with a lot of messages like:

INFO: task kjournald:512 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
INFO: rcu_preempt detected stalls on CPUs/tasks: 8055ec64     0   512      2 0x00000000
INFO: Stall ended before state dump start
INFO: task kjournald:512 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
INFO: task flush-179:0:511 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
INFO: task kjournald:512 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

Now I'm checking whether it's possible to do some partial recording (by skipping
some kinds of unrelated samples) and so take load off the kernel tracing
subsystem, leaving more CPU time for the user-space tasks.
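
As a rough illustration of what such partial recording could look like, the
sketch below builds a plain 'perf record' command line that enables only a few
scheduler tracepoints. The event subset and the helper name perf_record_cmd are
assumptions made for this example; whether 'perf sched replay' can work from a
trace limited to these events would have to be verified, and older perf versions
may also need raw sample collection (-R) enabled explicitly.

```python
import shlex

# Tracepoint subset chosen for illustration only; whether 'perf sched replay'
# accepts a trace limited to these events is an assumption to verify.
EVENTS = ["sched:sched_switch", "sched:sched_wakeup", "sched:sched_process_fork"]


def perf_record_cmd(events, workload, output="perf.data"):
    """Build a system-wide 'perf record' argv that enables only `events`."""
    cmd = ["perf", "record", "-a", "-o", output]
    for ev in events:
        cmd += ["-e", ev]
    # '--' separates perf's own options from the workload command.
    return cmd + ["--"] + list(workload)


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(perf_record_cmd(EVENTS, ["sleep", "10"])))
```

Printing the command rather than executing it keeps the sketch harmless on a
machine without perf; the resulting line can be pasted into a shell on the
target board.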

Do you have any thoughts about this?

Thanks,
Dmitry

_______________________________________________
linaro-dev mailing list
linaro-dev@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-dev
