Hi guys, Would a utility to produce views such as the attached help debug performance bottlenecks?
I took intel_gpu_top, ripped out all the bargraph and sorting code, then taught it to spit out a CSV file.

For the graph below, I took data for a fraction of a second, sampling with no delays (just IO delays in the redirected printf), with 50 samples per data-point working out a percentage figure.

For my benchmark application, I inserted a delay of approx 10ms into the benchmark loop, so frames could easily be identified in the output trace.

The pdf attached shows a zoomed-in portion of the rendering time of one particular frame. I'll note that outside this view, there are some short stalls in the frame, which probably indicate my app could do better at keeping the GPU busy with data.

What I'm seeing is:

 - a pretty busy / stalled render cache (not always at 100%)
 - a pretty busy / stalled windowizer (iz) (not always at 100%)
 - a pretty busy / stalled GS unit (nearly always 100%)
 - a pretty busy / stalled CLIP unit (nearly always 100%)
 - a fully busy / stalled CS unit (almost always 100%)

Regarding the other units, I'd _really_ love it if someone at Intel could identify the meanings / functions of the other, less well-known flags in the instdone registers that are identified only by acronym. What is "DM", for instance?

Sadly, OpenOffice is awful at graphing this large a quantity of data, so it is really slow. If anyone wants the data-file, the tool's source-code or the OO spreadsheet, let me know and I'll send them.

Regards

-- 
Peter Clifton

Electrical Engineering Division,
Engineering Department,
University of Cambridge,
9, JJ Thomson Avenue,
Cambridge
CB3 0FA

Tel: +44 (0)7729 980173 - (No signal in the lab!)
Tel: +44 (0)1223 748328 - (Shared lab phone, ask for me)
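
For readers who want to try something similar, the reduction described above (50 raw samples collapsed into one percentage per data-point, written out as CSV) might look roughly like the sketch below. It is in Python for brevity and is not the modified intel_gpu_top source. The bit names and positions in EXAMPLE_BITS are hypothetical and do not reflect the real instdone register layout, and the sketch assumes a unit counts as busy / stalled whenever its "done" bit is clear.

```python
import csv

# HYPOTHETICAL bit layout, for illustration only. This is NOT the real
# instdone register map; substitute the real bit definitions for your chipset.
EXAMPLE_BITS = {"CS": 0, "CLIP": 1, "GS": 2, "IZ": 3, "RC": 4}

# The post uses 50 raw samples per plotted data-point.
SAMPLES_PER_POINT = 50


def busy_percentages(samples, bits=EXAMPLE_BITS, window=SAMPLES_PER_POINT):
    """Yield one row per full window of raw register samples.

    Each row maps a unit name to the percentage of samples in that window
    where the unit's 'done' bit was clear (assumed to mean busy / stalled).
    A trailing partial window is dropped.
    """
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        row = {}
        for name, bit in bits.items():
            busy = sum(1 for s in chunk if not ((s >> bit) & 1))
            row[name] = 100.0 * busy / window
        yield row


def write_csv(rows, out, bits=EXAMPLE_BITS):
    """Write the rows as CSV with one column per unit, header first."""
    writer = csv.DictWriter(out, fieldnames=list(bits))
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
```

Given a list of raw register values, something like write_csv(busy_percentages(samples), sys.stdout) would then produce one CSV row per 50 samples, ready to load into a spreadsheet or plotting tool.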
gpu_top2.pdf
Description: Adobe PDF document
_______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx