I think I'll need some help with this. I'm by no means a kernel programmer, so I'm feeling my way in the dark with this.
I want to design an interface so I can synchronise my GPU idle flags polling with batchbuffer execution. I'm imagining, at a high level, doing something like this in my application (or mesa).

(Hand-wavey pseudocode)

    expose_event_handler ()
    {
      static bool one_shot_trace = true;

      if (one_shot_trace)
        mesa_debug_i915_trace_idle (TRUE);

      /* RENDERING COMMANDS IN HERE */
      SwapBuffers();

      if (one_shot_trace)
        mesa_debug_i915_trace_idle (FALSE);

      one_shot_trace = false;
    }

I was imagining adding a flag to the EXECBUFFER2 IOCTL, or perhaps adding a new EXECBUFFER3 IOCTL (which I'm playing with locally now). Basically I just want to flag execbuffers which I'm interested in seeing profiling data for.

In order to get really high-resolution profiling, it would be advantageous to confine it to the time period of interest, otherwise the data rate is too high. I guesstimated about 10MB/s for a binary representation of the data I'm currently polling in user-space. More spatial resolution would be nice too, so this could increase.

I think I have a vague idea how to do the GPU and logging parts, even if I end up having to start the polling before the batchbuffer starts executing.

What I've got little / no clue how to do is manage allocation of memory to store the results in.

Should userspace (mesa?) be passing buffers for the kernel to return profiling data in, then retrieving it somehow when it "knows" the batchbuffer is finished? This will probably require over-allocation, with a guesstimate of the memory space required to log the given batchbuffer.

What about exporting via debugfs? Assuming the above code fragment, we could leave the last "frame" of polled data available, with the data being overwritten when the next request to start logging comes in. (That would perhaps require some kind of sequence number assigned if we have multiple batches which come under the same request... or a separate IOCTL to turn logging on / off.)
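To make the "leave the last frame available, overwrite on the next request" idea a bit more concrete, here is a rough sketch in plain userspace C. It is not kernel code and not an existing i915 or mesa interface: every name in it (trace_log, trace_begin, trace_sample, trace_end, trace_last, TRACE_CAPACITY) is made up for illustration. It assumes two fixed-size frames, a sequence number per logging request, and a single thread of control, so it shows the bookkeeping only and none of the real locking.

```c
#include <stddef.h>
#include <stdint.h>

/* Samples per frame. This is the over-allocation guess, not a measured value. */
#define TRACE_CAPACITY 4096

/* One "frame" of polled idle-flag samples (hypothetical layout). */
struct trace_frame {
    uint32_t seqno;                   /* logging request that produced this frame */
    size_t   count;                   /* samples actually stored */
    size_t   dropped;                 /* samples lost because the guess was too small */
    uint32_t samples[TRACE_CAPACITY];
};

/* Two frames: one being filled, one left available for readers. */
struct trace_log {
    struct trace_frame frames[2];
    int      write_idx;               /* index of the frame currently being filled */
    int      have_published;          /* non-zero once a frame has been completed */
    uint32_t next_seqno;
};

/* Start logging: reset the write frame and give it a new sequence number. */
static void trace_begin(struct trace_log *tlog)
{
    struct trace_frame *f = &tlog->frames[tlog->write_idx];

    f->seqno = tlog->next_seqno++;
    f->count = 0;
    f->dropped = 0;
}

/* Record one poll of the idle flags, counting overflow instead of overrunning. */
static void trace_sample(struct trace_log *tlog, uint32_t idle_flags)
{
    struct trace_frame *f = &tlog->frames[tlog->write_idx];

    if (f->count < TRACE_CAPACITY)
        f->samples[f->count++] = idle_flags;
    else
        f->dropped++;
}

/* Stop logging: publish the finished frame, switch writing to the other one. */
static void trace_end(struct trace_log *tlog)
{
    tlog->have_published = 1;
    tlog->write_idx ^= 1;
}

/* Reader side: the last completed frame, or NULL if there is none yet. */
static const struct trace_frame *trace_last(const struct trace_log *tlog)
{
    if (!tlog->have_published)
        return NULL;
    return &tlog->frames[tlog->write_idx ^ 1];
}
```

A zero-initialised struct trace_log is a valid starting state. A reader can compare seqno before and after copying a frame out to notice that it was overwritten mid-read. Anything real would still need proper synchronisation between the writer and the reader, which this sketch deliberately leaves out.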
I'm not sure how the locking would work if userspace is reading out the debugfs file whilst another frame is being executed. (We'd probably need a secondary logging buffer allocating in that case.)

Thoughts?

-- 
Peter Clifton

Electrical Engineering Division,
Engineering Department,
University of Cambridge,
9, JJ Thomson Avenue,
Cambridge
CB3 0FA

Tel: +44 (0)7729 980173 - (No signal in the lab!)
Tel: +44 (0)1223 748328 - (Shared lab phone, ask for me)