* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> It does include one very interesting new feature that deserves to be 
> mentioned outside of the shortlog: 'dynamic ftrace' - which is a 
> transparent kernel-image-patcher mechanism that lazily patches out 
> mcount callsites from all functions that get executed. [if tracing is 
> disabled] These patched out callsites are remembered, and are patched 
> back in when tracing is enabled.
> 
> This technique does not just accelelerate the "tracing disabled" case 
> enormously, we were in fact unable to measure _any_ performance 
> difference (within noise) between an mcount-enabled dyn-ftrace [but 
> tracing-disabled] and a vanilla kernel (!), on modern CPUs.

hot off the presses: Steve has just finished doing a complete set of 
lmbench and hbackbench runs (on 2.8GHz Xeons, dual socket). The 
comparison results are at:

    http://rostedt.homelinux.com/dyn-ftrace-lmbench/lmbench-table

the results show zero lmbench/hackbench overhead from dyn-ftrace (with 
thousands of functions instrumented) apart of the usual lmbench jitter.

The core kernel results for example [these used to be rather sensitive 
on mcount overhead]:

Host                 OS  Mhz null null      open slct sig  sig  fork exec sh
                             call  I/O stat clos TCP  inst hndl proc proc proc
--------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
ftrace-jm  Linux 2.6.24 2783 0.65 0.74 4.74 6.96 5.28 0.76 4.29 290. 866. 2626
ftrace-no  Linux 2.6.24 2784 0.65 0.74 4.75 7.14 5.31 0.76 3.92 282. 837. 2591
no-patche  Linux 2.6.24 2786 0.64 0.71 4.86 7.10 5.34 0.78 3.89 283. 829. 2572
patched-n  Linux 2.6.24 2784 0.64 0.71 4.72 6.98 5.35 0.78 3.94 284. 832. 2582

"ftrace-jm": dyn-ftrace, 2-byte JMPs patched in the mcount callsite 
"ftrace-no": dyn-ftrace, 5-byte NOPs patched in (devel patch, not in 
             this pull request)
"no-patche": no patches [vanilla Linus -git kernel]
"patched-n": patched with the ftrace tree, all tracer options off

i can see an outlier in the "Create" results - that's noise i think 
because the patched+options-off kernel is in essence the same as 
vanilla. "File reread" is noisy and unreliable as ever. 16-task 
ctx-switch results are noisy too (especially the 64k working set ones) - 
this is a HT system hence the multi-threaded results are fundamentally 
noisy due to sibling interaction.

The hackbench results look good too.

enabling full tracing of all the tens of thousands of kernel functions 
makes things measurably slower - but that is expected. (filtered 
patching of a user-selected list of function names is an upcoming 
feature)

the full raw results are at:

   http://rostedt.homelinux.com/dyn-ftrace-lmbench/

        Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to