Aurelien Jarno <aurel...@aurel32.net> writes:

> On Fri, May 10, 2013 at 02:16:34PM +0200, Paolo Bonzini wrote:
>> Cast debugging can have a substantial cost (20% or more, measured by
>> Aurelien on qemu-system-ppc64).  Instead of adding special-cased "fast
>> casts" in the hot paths, we can just disable it in releases.  At the
>> same time, add tracing facilities that simplify the analysis of those
>> problems that cast debugging would reveal.
>>
>> At least patches 1-7 are for 1.5.
>>
>> Paolo Bonzini (9):
>>   qom: improve documentation of cast functions
>>   qom: allow casting of a NULL class
>>   qom: add a fast path to object_class_dynamic_cast
>>   qom: pass file/line/function to asserting casts
>>   qom: trace asserting casts
>>   qom: allow turning cast debugging off
>>   build: disable QOM cast debugging for official releases
>>   qom: simplify object_class_dynamic_cast, part 1
>>   qom: simplify object_class_dynamic_cast, part 2
>>
>>  configure            | 20 ++++++++------
>>  include/qom/object.h | 40 ++++++++++++++++++++++-----
>>  qom/object.c         | 77 ++++++++++++++++++++++++++++++++++------------------
>>  trace-events         |  3 ++
>>  4 files changed, 99 insertions(+), 41 deletions(-)
>>
>
> I have tested this series with qemu-system-ppc64, on a Core i7 2600 CPU.
> The process was pinned to a single core using taskset. Inside the guest
> (Debian ppc64 from debian-ports), I ran the following command three times:
>
>   lintian g++-4.8_4.8.0-6_ppc64.deb
>
> I used lintian because it is Perl code, which is what triggered the
> discussion about the sparc/ppc comparison.
>
> First of all, with this patch series the object_class_dynamic_cast calls
> went down to below 0.1% in perf top. Before the patch series was
> applied, the command took 142.4s on average over 3 runs. With the patch
> series, it went down to 129.8s, so almost 9% faster.
I just posted another patch which I believe will also reduce this overhead
without eliminating the checks.

We do a staggering number of casts...  The patch I posted makes the
overwhelming majority of them nothing more than a single pointer comparison
and a couple of derefs.

Regards,

Anthony Liguori

>
> To improve the performance a bit more, and get back to the same kind of
> code as before, we should move simple accessors from qom/*.c to
> include/qom/*.h and mark them as inline, so that the calls can be removed
> by the compiler. Currently, even if the function is simple, it's still a
> call/ret in the hot path instead of a simple pointer addition.
>
> -- 
> Aurelien Jarno                          GPG: 1024D/F1BCDB73
> aurel...@aurel32.net                 http://www.aurel32.net
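
For readers following the thread, here is a minimal sketch of the three ideas discussed above: a dynamic cast whose common case is a single pointer comparison, an asserting cast whose check can be compiled out, and a trivial accessor kept in a header as an inline function. It is not the code from either patch. DemoClass, DemoObject, the demo_* functions, and the DEMO_CAST_DEBUG macro are all hypothetical names invented for illustration, and the real QOM structures are considerably richer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for a class hierarchy.
 * These are NOT QEMU's real ObjectClass/Object definitions. */
typedef struct DemoClass DemoClass;
struct DemoClass {
    const char *name;         /* type name, for debugging only */
    const DemoClass *parent;  /* NULL for a root class */
};

typedef struct DemoObject {
    const DemoClass *klass;
} DemoObject;

/* Counts how many casts had to walk the parent chain (slow path). */
static unsigned long demo_slow_lookups;

/* Returns klass if it is 'target' or descends from it, otherwise NULL. */
static const DemoClass *demo_class_dynamic_cast(const DemoClass *klass,
                                                const DemoClass *target)
{
    if (klass == NULL) {
        return NULL;            /* casting a NULL class is allowed */
    }
    if (klass == target) {
        return klass;           /* fast path: one pointer comparison */
    }
    demo_slow_lookups++;        /* slow path: walk up the hierarchy */
    for (const DemoClass *c = klass->parent; c != NULL; c = c->parent) {
        if (c == target) {
            return klass;
        }
    }
    return NULL;
}

/* Asserting cast: the check exists only when DEMO_CAST_DEBUG is defined,
 * so a release build pays nothing beyond returning the pointer. */
static inline const DemoClass *demo_class_cast_assert(const DemoClass *klass,
                                                      const DemoClass *target)
{
#ifdef DEMO_CAST_DEBUG
    assert(klass == NULL || demo_class_dynamic_cast(klass, target) != NULL);
#else
    (void)target;               /* unused when checks are compiled out */
#endif
    return klass;
}

/* A trivial accessor defined inline (as it would be in a header), so the
 * compiler can reduce it to a load instead of emitting a call/ret. */
static inline const DemoClass *demo_object_get_class(const DemoObject *obj)
{
    return obj->klass;
}
```

Building with -DDEMO_CAST_DEBUG keeps the check in the asserting cast. Without it the function reduces to returning its argument, which is the trade-off the series makes for release builds.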