On Fri, Jul 28, 2017 at 19:05:43 +0300, Lluís Vilanova wrote:
> As for the (minimum) requirements I've collected:
>
> * Peek at register values and guest memory.
> * Enumerate guest cpus.
> * Control when events are raised to minimize overheads (e.g., avoid generating
>   TCG code to trace a guest code event we don't care about just now).
> * When a guest code event is raised at translation time, let instrumentor decide
>   if it needs to be raised at execution time too (e.g., might be decided on
>   instruction opcode)
> * Minimum overhead when instrumenting (I think the proposed fifo/socket approach
>   does not meet this; point 2 helps a lot here, which is what the current
>   tracing code does)
> * Let instrumentor know when TBs are being freed (i.e., to free internal per-TB
>   structures; I have a patch queued adding this event).
>
> Nice to have:
>
> * Avoid recompilation for each new instrumentation logic.
> * Poke at register values and guest memory.
> * [synchronous for sure] Let user annotate guest programs to raise additional
>   events with guest-specified information (the hypertrace series I sent).
> * [synchronous for sure] Make guest breakpoints/watchpoints raise an event (for
>   when we cannot annotate the guest code; I have some patches).
> * Trigger QEMU's event tracing routine when the instrumentation code
>   decides. This is what this series does now, as then lazy instrumentors don't
>   need to write their own tracing-to-disk code. Otherwise, per-event tracing
>   states would need to be independent of instrumentation states (but the logic
>   is exactly the same).
> * Let instrumentor define additional arguments to execution-time events during
>   translation-time (e.g., to quickly index into an instrumentor struct when the
>   execution-time event comes that references info collected during
>   translation).
> * Attach an instrumentor-specified value to each guest CPU (e.g., pointing to
>   the instrumentor's cpu state). Might be less necessary if the point above is
>   met (if we only look at overall performance).

Agreed.

An additional "nice to have" would be:

* Allow inlining of TCG code by the instrumenter. Example use case:
  the instrumenter wants to increment a counter every time a basic
  block is executed. Instead of calling a callback function on every
  execution of the block, we could just have a translation-time callback
  that emits the counter increment at the beginning of the translated
  block. This would be much faster, and is something that all other
  tools (e.g. DynamoRIO/Pin) implement.

		E.
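
To make the inlining idea concrete, here is a toy sketch in plain C. It is not QEMU or TCG code: `struct block`, `emit()`, `instrument_with_callback()`, `instrument_inline()` and `execute()` are all hypothetical names invented for illustration, and the "translated block" is just an array of ops run by a small interpreter. It only shows the shape of the two approaches: the translation-time hook either emits a call to a helper, or emits the increment itself so that no function call happens at execution time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: none of these names are QEMU or TCG APIs. */

enum op_kind { OP_CALL, OP_INC64 };

struct op {
    enum op_kind kind;
    union {
        struct { void (*fn)(void *); void *arg; } call; /* OP_CALL  */
        uint64_t *counter;                              /* OP_INC64 */
    } u;
};

#define MAX_OPS 8

/* Stand-in for a translated block: a short list of ops. */
struct block {
    struct op ops[MAX_OPS];
    size_t n_ops;
};

/* Append one op to the block at "translation time". */
static void emit(struct block *b, struct op op)
{
    assert(b->n_ops < MAX_OPS);
    b->ops[b->n_ops++] = op;
}

/* Helper invoked at execution time in the callback variant. */
static void count_cb(void *arg)
{
    ++*(uint64_t *)arg;
}

/* Variant 1: the block calls back into the instrumenter on each run. */
static void instrument_with_callback(struct block *b, uint64_t *counter)
{
    struct op op = { .kind = OP_CALL };
    op.u.call.fn = count_cb;
    op.u.call.arg = counter;
    emit(b, op);
}

/* Variant 2: the increment itself is emitted into the block. */
static void instrument_inline(struct block *b, uint64_t *counter)
{
    struct op op = { .kind = OP_INC64 };
    op.u.counter = counter;
    emit(b, op);
}

/* "Execute" a block by interpreting its ops in order. */
static void execute(const struct block *b)
{
    for (size_t i = 0; i < b->n_ops; i++) {
        const struct op *op = &b->ops[i];
        switch (op->kind) {
        case OP_CALL:
            op->u.call.fn(op->u.call.arg); /* indirect call per run */
            break;
        case OP_INC64:
            ++*op->u.counter;              /* no call, just the add */
            break;
        }
    }
}
```

Both variants leave the counter with the same value after the same number of `execute()` calls. The difference is only in what runs per execution: in a real translator the inline variant would become a couple of host instructions inside the generated code, while the callback variant pays for a helper call each time.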