On Mon, Sep 18, 2017 at 18:42:55 +0100, Peter Maydell wrote:
> On 18 September 2017 at 18:09, Lluís Vilanova <vilan...@ac.upc.edu> wrote:
> > It also means we won't be able to "conditionally" instrument instructions
> > (e.g., based on their opcode, address range, etc.).
> 
> You can still do that, it's just less efficient (your
> condition-check happens in the callout to the instrumentation
> plugin). We can add "filter" options later if we need them
> (which I would rather do than have translate-time callbacks).

My initial reaction to not having translation-time hooks was that it'd
be too slow. However, having thought about it a bit more, I suspect it
might not matter much. Other tools such as Pin/DynamoRIO have these
hooks, and in their case it makes sense because one can choose to get
one callback per executed "trace" (a trace is a chain of TBs) instead
of one callback per executed instruction. Crucially, these tools do
not go through an intermediate IR, and as a result are about 10X faster
than QEMU.

Therefore, given that QEMU is comparatively slow, we might find that
per-executed-instruction callbacks do not end up slowing things down
significantly.
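
To make the comparison concrete, here is a rough sketch of what
filtering inside a per-executed-instruction callback could look like.
Every name in it (insn_event, range_filter, on_insn_exec,
count_in_range) is made up for illustration; none of this is an
existing QEMU interface, and the loop only stands in for the emulator
calling out to a plugin.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-instruction event; not an existing QEMU structure. */
typedef struct {
    uint64_t pc;      /* guest address of the executed instruction */
    uint32_t opcode;  /* raw instruction encoding */
} insn_event;

/* Hypothetical plugin state: the address range of interest and a counter. */
typedef struct {
    uint64_t lo, hi;  /* half-open range [lo, hi) */
    uint64_t hits;
} range_filter;

/* Called once per executed instruction. The condition check runs here,
 * at execution time, rather than at translation time. */
static void on_insn_exec(const insn_event *ev, void *opaque)
{
    range_filter *f = opaque;
    assert(ev != NULL && f != NULL);
    if (ev->pc < f->lo || ev->pc >= f->hi) {
        return;  /* uninteresting: the cost is the call plus this check */
    }
    f->hits++;
}

/* Stand-in for the emulator: fires the callback for every instruction. */
static uint64_t count_in_range(const insn_event *evs, size_t n,
                               uint64_t lo, uint64_t hi)
{
    range_filter f = { .lo = lo, .hi = hi, .hits = 0 };
    for (size_t i = 0; i < n; i++) {
        on_insn_exec(&evs[i], &f);
    }
    return f.hits;
}
```

The point of the sketch is only that every executed instruction pays
for the call and the range check, even when it is filtered out. Whether
that overhead is noticeable next to QEMU's own translation cost is
exactly what needs measuring.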

I like the idea of discussing an API alone, but there is value in having
an implementation to go with it, so that we can explore the perf
trade-offs involved. I've been busy with other things for the last 10
days or so, but by the end of this week I hope to be able to share some
perf numbers on the per-TB vs. per-instruction callback issue.

Thanks,

                Emilio
