Sorry, I was responding to Karl Magdsick's point about the cost of
switch statements as it relates to Nathaniel G.H.'s point about the cost
of translation/generation. In my experience FDO does work for
interpreters and translators, because code sequences are pretty
predictable things (you just don't tend to choose instructions at
random). Improving the dynamically generated code is the best way to
improve performance, as that's where you spend your time. However, the
time spent compiling means you only want to optimise hot spots. A
well-thought-out, simple just-in-time compiler is often very hard to
beat with an optimising compiler. For example, knowing your target
architecture well means you can get a really good static mapping of its
registers onto the host architecture. These lessons are well known in
the literature. I'd still be interested to know whether the translator's
profiled performance improved using FDO. I have results from writing
things in Java, where FDO is par for the course, but they hardly apply
to QEMU/GCC.
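
To make the switch-cost point concrete, here is a minimal sketch of the
kind of switch-dispatched interpreter loop that FDO could be tried on.
The bytecode, the opcode names and the run() function are all made up
for illustration and have nothing to do with QEMU's actual code; only
the two GCC options mentioned afterwards are real.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy bytecode, invented for this example (not QEMU's). */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* Interpret a program; return the top of the stack at OP_HALT. */
int32_t run(const uint8_t *code, size_t len)
{
    int32_t stack[64];
    size_t sp = 0, pc = 0;

    while (pc < len) {
        /* The dispatch switch: with profile data the compiler knows
         * which cases are actually hot for the profiled workload. */
        switch (code[pc++]) {
        case OP_PUSH:                 /* operand: one signed byte */
            assert(pc < len && sp < 64);
            stack[sp++] = (int8_t)code[pc++];
            break;
        case OP_ADD:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:
            assert(sp >= 1);
            return stack[sp - 1];
        default:
            assert(!"unknown opcode");
            break;
        }
    }
    /* Ran off the end without OP_HALT: return top of stack, or 0. */
    return sp ? stack[sp - 1] : 0;
}
```

Given a main() that feeds run() a representative program, one would
build with gcc -O2 -fprofile-generate, run that workload, and then
rebuild with gcc -O2 -fprofile-use so GCC can use the recorded branch
and case frequencies. Whether that gives a measurable win is exactly
the open question.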
Regards,
Ian Rogers
-- http://www.binarytranslator.org/
Daniel Egger wrote:
On 18.04.2005, at 11:51, Ian Rogers wrote:
I'm not sure whether you can get GCC to generate code sequences like
this, but you probably at least need to use the -fprofile-generate and
-fprofile-use options:
http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
Feedback-directed optimisation (FDO) will not work, for two reasons:
a) qemu itself is something like a realtime compiler, so FDO
will only speed up the compiler but not the generated code
b) FDO will only provide speed boosts if the feedback phase
has a chance to analyse a representative work pattern that
is hopefully also repetitive
After all, FDO is mostly about trading off size against speed
and rearranging code (mostly branches) to avoid branch
mispredictions by the CPU.
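
As a rough illustration of the branch-rearranging part, the sketch
below marks a branch as rarely taken by hand using GCC's
__builtin_expect, which is the kind of information FDO would otherwise
derive from a profile run. The likely/unlikely macros and the
sum_valid() function are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hand-written branch hints (GCC builtin). */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Sum the non-negative entries of v; count the negative ones.
 * Assumption for the example: negative entries are rare. */
long sum_valid(const int *v, size_t n, size_t *rejected)
{
    long total = 0;

    assert(rejected != NULL && (v != NULL || n == 0));
    *rejected = 0;
    for (size_t i = 0; i < n; i++) {
        if (unlikely(v[i] < 0)) {
            /* Hinted as cold: the compiler may move this block out
             * of the straight-line path of the loop. */
            (*rejected)++;
            continue;
        }
        total += v[i];
    }
    return total;
}
```

The hint is only as good as the assumption behind it. If the workload
is not representative (point b above), a hint or a profile can just as
well make the common path slower.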
Servus,
Daniel