Jason, Curro, do you have any opinion if this is worth pursuing? I need something for blorp blits at least - using blorp for texture uploads on top of current excessive flushing regresses perf.
When working on gpu hangs on SKL we also identified compute flushing caches it shouldn't. I think those could be addressed nicely here as well. I can respin if this is something we'd like to have. On Fri, Feb 17, 2017 at 09:32:03PM +0200, Topi Pohjolainen wrote: > Currently: > > 1) Blorp color clears and resolves emit unconditional render target > flush + command stream after every clear/resolve (including > regular non-fast clears). > > 2) Blorp color clears, resolves and blits emit texture and constant > cache resolves even in case only destination is dirty. This is > because brw_render_cache_set_check_flush() does both render target > flush as well as the top-of-pipe read cache flushes. > > 3) Similarly to item 2, 3D and compute paths also flush texture and > constant caches even if none of the texture surfaces are dirty. > > 4) In case of multiple surfaces needing resolves, all render paths > (blorp, 3D and compute) emit render target, texture and constant > cache flushes after each resolve instead of just once after all > resolves. > > This series addresses all four cases. Good news are that even though > the current setup isn't optimal, it doesn't actually get any better in > most cases performance wise. There is modest gain in OglDrvRes which > does heavy blorp blitting. I'm expecting this series also to make > blorp tex uploads and blorp mipmap generation more competitive. > > Bad news are in the final patch - it looks that current unconditional > flushing/stalling has been hiding bugs elsewhere. There are cases > which rely on the flushes after non-fast clears. Hunting the real > cause is, however, difficult. I only saw them in CI system within > full runs and was not able to reproduce them myself. > > As first steps the series introduces end-of-pipe synchronization. > This is a flush combined with stall and post-sync operation of > writing a double word (32 bits). Until now this wasn't really > needed as there was in many cases double flushing which in turn > looks to take long enough to hide the need for the sync. I also > noticed that one needs to be rather careful with it - performance > gets decreased noticeably when used unneeded. > > I don't really know if we want to go this way myself even. Current > logic - while not ideal - is rather simple. > > Topi Pohjolainen (16): > i965/miptree: Tell if anything got resolved > i965/gen6+: Implement end-of-pipe sync > i965: Hook end-of-pipe-sync after texture resolves > i965: Hook end-of-pipe-sync after image resolves > i965: Hook end-of-pipe-sync after framebuffer resolves > i965: Consider layered rt resolves along with other > i965: Add color resolve end-of-pipe-sync before switch to blit ring > i965/dri2: Add end-of-pipe-sync after color resolves > i965/miptree: Add color resolve end-of-pipe-sync before sharing > i965: Add end-of-pipe sync before non-gpu read of color resolves > i965/blorp: Do more fine grained flushing/syncing > i965/blorp/blit: Refactor hiz/ccs prep for blits > i965/blorp: Use conditional end-of-pipe-sync > i965: Consider surface resolves and sync after blorp ops > i965: Check if fast color clear state transition needs sync > i965/blorp: Drop unnecessary flushes after clear/resolve > > src/mesa/drivers/dri/i965/brw_blorp.c | 187 ++++++++++---- > src/mesa/drivers/dri/i965/brw_compute.c | 2 + > src/mesa/drivers/dri/i965/brw_context.c | 333 > +++++++++++++++++++------ > src/mesa/drivers/dri/i965/brw_context.h | 3 + > src/mesa/drivers/dri/i965/brw_draw.c | 36 +-- > src/mesa/drivers/dri/i965/brw_pipe_control.c | 91 +++++++ > src/mesa/drivers/dri/i965/genX_blorp_exec.c | 11 - > src/mesa/drivers/dri/i965/intel_blit.c | 16 +- > src/mesa/drivers/dri/i965/intel_copy_image.c | 10 +- > src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 25 +- > src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 2 +- > src/mesa/drivers/dri/i965/intel_pixel.c | 4 + > src/mesa/drivers/dri/i965/intel_pixel_bitmap.c | 5 +- > src/mesa/drivers/dri/i965/intel_pixel_read.c | 7 +- > src/mesa/drivers/dri/i965/intel_tex_image.c | 10 +- > src/mesa/drivers/dri/i965/intel_tex_subimage.c | 11 +- > 16 files changed, 557 insertions(+), 196 deletions(-) > > -- > 2.5.5 > _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev