On Sat, 30 Oct 2010 14:04:11 +0100 Peter Clifton <pc...@cam.ac.uk> wrote:
> I think I'll need some help with this. I'm by no means a kernel
> programmer, so I'm feeling my way in the dark with this.
>
> I want to design an interface so I can synchronise my GPU idle-flag
> polling with batchbuffer execution. At a high level, I imagine doing
> something like this in my application (or Mesa). (Hand-wavy
> pseudocode.)
>
> expose_event_handler ()
> {
>     static bool one_shot_trace = true;
>
>     if (one_shot_trace)
>         mesa_debug_i915_trace_idle (true);
>
>     /* RENDERING COMMANDS IN HERE */
>     SwapBuffers ();
>
>     if (one_shot_trace)
>         mesa_debug_i915_trace_idle (false);
>
>     one_shot_trace = false;
> }
>
> I was imagining adding a flag to the EXECBUFFER2 ioctl, or perhaps
> adding a new EXECBUFFER3 ioctl (which I'm playing with locally now).
> Basically I just want to flag the execbuffers I'm interested in
> seeing profiling data for.
>
> In order to get really high-resolution profiling, it would be
> advantageous to confine logging to the time period of interest;
> otherwise the data rate is too high. I guesstimated about 10MB/s for
> a binary representation of the data I'm currently polling in
> user-space. More spatial resolution would be nice too, so this could
> increase.

Would be very cool to be able to correlate the data...

> I think I have a vague idea how to do the GPU and logging parts, even
> if I end up having to start the polling before the batchbuffer starts
> executing.
>
> What I've got little to no clue about is how to manage allocation of
> memory to store the results in.
>
> Should userspace (Mesa?) pass buffers for the kernel to return
> profiling data in, then retrieve them somehow when it "knows" the
> batchbuffer is finished? That will probably require over-allocation,
> with a guesstimate of the memory needed to log the given batchbuffer.
>
> What about exporting via debugfs? Assuming the above code fragment,
> we could leave the last "frame" of polled data available, with the
> data being overwritten when the next request to start logging comes
> in. (That would perhaps require some kind of sequence number being
> assigned if we have multiple batches which come under the same
> request... or a separate ioctl to turn logging on and off.)

There's also relayfs, which is made for high-bandwidth kernel->user
communication. I'm not sure whether it will make this any easier, but
there's some documentation about it in the kernel tree.

A ring buffer with the last N timestamps might also be a good way of
exposing things. Having more than one entry available means that if
userspace didn't get scheduled at the right time, it would still have
a good chance of getting all the data it missed since the last read.

> Also, I'm not sure how the locking would work if userspace is reading
> out the debugfs file whilst another frame is being executed. (We'd
> probably need a secondary logging buffer allocated in that case.)

The kernel implementation of the read() side of the file could do some
locking to prevent new data from corrupting a read in progress.
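To make some of the above concrete, a few rough, untested sketches
follow. First, flagging batches through the existing EXECBUFFER2 ioctl
could be as simple as one new flag bit; the name and bit position here
are invented and would need to avoid the I915_EXEC_* bits already
defined in i915_drm.h:

/* Hypothetical flag: "log idle data while this batch runs". */
#define I915_EXEC_TRACE_IDLE	(1 << 13)

struct drm_i915_gem_execbuffer2 execbuf;

memset(&execbuf, 0, sizeof(execbuf));
/* ... buffers_ptr, buffer_count, batch_len etc. as usual ... */
execbuf.flags |= I915_EXEC_TRACE_IDLE;
drmIoctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, &execbuf);

That would avoid a whole new EXECBUFFER3 ioctl, as long as a spare
flag bit is available.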
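For the high-bandwidth path, the kernel-side relay setup (see
Documentation/filesystems/relay.txt) might look roughly like the
fragment below. Everything named idle_* is made up for illustration;
relay_open(), relay_write() and the callback boilerplate are the real
API:

#include <linux/relay.h>
#include <linux/debugfs.h>

/* One sample per poll; the record layout is invented here. */
struct idle_sample {
	u64 timestamp;		/* ktime, in ns */
	u32 idle_flags;		/* the polled GPU idle bits */
	u32 seqno;		/* which request/batch this belongs to */
};

static struct rchan *idle_chan;

/* Standard relay boilerplate: expose the per-cpu buffers in debugfs. */
static struct dentry *idle_create_buf_file(const char *filename,
					   struct dentry *parent, int mode,
					   struct rchan_buf *buf,
					   int *is_global)
{
	return debugfs_create_file(filename, mode, parent, buf,
				   &relay_file_operations);
}

static int idle_remove_buf_file(struct dentry *dentry)
{
	debugfs_remove(dentry);
	return 0;
}

static struct rchan_callbacks idle_relay_callbacks = {
	.create_buf_file	= idle_create_buf_file,
	.remove_buf_file	= idle_remove_buf_file,
};

static int idle_trace_init(struct dentry *i915_debugfs_root)
{
	/* 8 sub-buffers of 256KiB each; sized with the ~10MB/s
	 * estimate in mind. */
	idle_chan = relay_open("i915_idle", i915_debugfs_root,
			       256 * 1024, 8, &idle_relay_callbacks, NULL);
	return idle_chan ? 0 : -ENOMEM;
}

/* Called from wherever the polling happens. */
static void idle_trace_sample(u32 flags, u32 seqno)
{
	struct idle_sample s = {
		.timestamp	= ktime_to_ns(ktime_get()),
		.idle_flags	= flags,
		.seqno		= seqno,
	};

	relay_write(idle_chan, &s, sizeof(s));
}

Userspace then just read()s the per-cpu files relay creates, and the
flagged batches turn sampling on and off around their execution.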
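The last-N ring could be as dumb as the following (single writer
assumed, linux/spinlock.h included, reusing the idle_sample struct
from above). A reader remembers the head value from its previous read;
anything newer than head - IDLE_RING_SIZE is still valid, so missing a
wakeup only loses samples the ring has genuinely wrapped past:

#define IDLE_RING_SIZE 1024	/* power of two */

struct idle_ring {
	struct idle_sample slots[IDLE_RING_SIZE];
	u64 head;		/* total samples ever written */
	spinlock_t lock;
};

static void idle_ring_push(struct idle_ring *ring,
			   const struct idle_sample *s)
{
	spin_lock(&ring->lock);
	ring->slots[ring->head & (IDLE_RING_SIZE - 1)] = *s;
	ring->head++;
	spin_unlock(&ring->lock);
}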
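And for the plain debugfs-file variant, the read() side could hold a
mutex across the copy-out, with the logging code taking the same mutex
before recycling the buffer for a new frame (usual linux/fs.h and
linux/mutex.h includes assumed; idle_log_buf/_len are placeholders for
however the last frame ends up stored):

static DEFINE_MUTEX(idle_log_mutex);
static void *idle_log_buf;	/* last completed frame of samples */
static size_t idle_log_len;

static ssize_t idle_log_read(struct file *file, char __user *ubuf,
			     size_t count, loff_t *ppos)
{
	ssize_t ret;

	mutex_lock(&idle_log_mutex);
	ret = simple_read_from_buffer(ubuf, count, ppos,
				      idle_log_buf, idle_log_len);
	mutex_unlock(&idle_log_mutex);
	return ret;
}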