On Mon, Sep 18, 2017 at 18:42:55 +0100, Peter Maydell wrote:
> On 18 September 2017 at 18:09, Lluís Vilanova <vilan...@ac.upc.edu> wrote:
> > It also means we won't be able to "conditionally" instrument instructions
> > (e.g., based on their opcode, address range, etc.).
>
> You can still do that, it's just less efficient (your
> condition-check happens in the callout to the instrumentation
> plugin). We can add "filter" options later if we need them
> (which I would rather do than have translate-time callbacks).

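To make the quoted point concrete, here is a rough sketch of what
"condition-check in the callout" could look like. This is not QEMU code
and not a proposed API: insn_info, plugin_state, on_insn_exec() and
run_tb() are all made-up names, and run_tb() is only a stand-in for the
emulator executing one TB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one executed guest instruction. */
typedef struct {
    uint64_t pc;      /* guest address of the instruction */
    uint8_t  opcode;  /* first opcode byte (illustrative only) */
} insn_info;

/* Hypothetical plugin state: the filter plus the counters it maintains. */
typedef struct {
    uint64_t lo, hi;     /* instrument only pc in [lo, hi) */
    uint8_t  opcode;     /* ...and only this opcode */
    uint64_t callbacks;  /* how many times the callout ran */
    uint64_t matched;    /* how many instructions passed the filter */
} plugin_state;

/*
 * Per-executed-instruction callout. Without translate-time hooks the
 * filter cannot be applied once at translation time, so it is
 * re-evaluated here on every executed instruction.
 */
static void on_insn_exec(plugin_state *st, const insn_info *insn)
{
    assert(st != NULL && insn != NULL);
    st->callbacks++;
    if (insn->pc < st->lo || insn->pc >= st->hi) {
        return;  /* outside the address range of interest */
    }
    if (insn->opcode != st->opcode) {
        return;  /* not the opcode of interest */
    }
    st->matched++;  /* the actual instrumentation work would go here */
}

/* Stand-in for the emulator executing a TB: one callout per instruction. */
static void run_tb(plugin_state *st, const insn_info *insns, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        on_insn_exec(st, &insns[i]);
    }
}
```

The callout runs for every executed instruction, including the ones the
filter rejects. That is the inefficiency being discussed, and it is what
a "filter" option or a translate-time hook would avoid.
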
My initial reaction to not having translation-time hooks was that it'd
be too slow. However, thinking about it a bit more I suspect it might
not matter much.

Other tools such as Pin/DynamoRIO have these hooks, and in their case
it makes sense because one can choose to, instead of having one callback
per executed instruction, get a callback per executed "trace" (a trace
is a chain of TBs). Crucially, these tools do not go through an
intermediate IR, and as a result are about 10X faster than QEMU.

Therefore, given that QEMU is comparatively slow, we might find that
per-executed-instruction callbacks do not end up slowing things down
significantly.

I like the idea of discussing an API alone, but there is value in
having an implementation to go with it, so that we can explore the
perf trade-offs involved. I was busy with other things the last 10 days
or so, but by the end of this week I hope I'll be able to share some
perf numbers on the per-TB vs. per-instruction callback issue.

Thanks,

		Emilio