On Tue, Jul 25, 2017 at 9:37 AM, Bruce Hoult <br...@hoult.org> wrote:
> Do you have any good estimates for how much of the execution time is
> typically spent in instruction decode?
>
> RISC-V qemu is twice as fast as ARM or Aarch64 qemu, so it's doing
> something right!
>
> (I suspect it's probably mostly the lack of needing to emulate condition
> codes)
The last time I tried to profile qemu (system mode, running the Go bootstrap, I think), I didn't get very far: there was no JIT map and I couldn't get frame pointers working. But as far as I got, none of the translate functions showed up. Most time was spent in translated code, in the trampolines for entering and exiting translated code, in TLB maintenance, and in the code that chooses which basic block to run next.

Making the instruction decoder a bit slower is therefore not likely to have much effect (but do take before-and-after measurements to be sure). Significant wins would come from reducing the number of switches between translations (e.g. by translating larger units: all the code on a page at once, whole functions, or traces), from making switches between translations cheaper (e.g. with inline caches), or from reducing the cost of access translation (e.g. two accesses relative to the same base register in the same translation often hit the same virtual page and can share translation effort; more speculatively, by using the host's translation hardware).

(I am willing to discuss any of these further OFF-LIST ONLY.)

-s