On Thu, Jul 16, 2015 at 01:45:56PM -0700, Chad Versace wrote: > On Tue 14 Jul 2015, Ben Widawsky wrote: > > This patch rewrites the logic for determining which method we using for > > mapping > > a miptree. It is my intention that that this patch, the required patches > > before > > this do not change functionality, or if they do, it's in very obscure an > > unobservable cases. > > > > I have two reasons why I decided to write this patch. The existing logic > > was way > > too tricky. In particular, the way in which it evaluated which operation to > > use > > was out of order - specifically when it checked to use the blitter in > > use_intel_mipree_map_blit(), part of the check is to determine if it will > > later > > be unable to use the GTT. The other reason is to make playing with the > > various > > operations much easier. For example, there are some theories being thrown > > around > > that we might actually want to use the blitter where we use the GTT today, > > and > > vice versa. After this patch, benchmarking those changes is much more > > straightforward. > > > > It's pretty difficult for me to prove there is no real change going on. I > > ran a > > subset of my benchmarks on this though. The following benchmarks show no > > perf > > difference on BDW with ministat with n=5 and CI=.95: > > OglBatch7 > > OglDeferred > > OglFillPixel > > OglGeomPoint > > OglGeomTriList > > OglHdrBloom > > OglPSBump2 > > OglPSPhong > > OglPSPom > > OglShMapPcf > > OglTerrainFlyInst > > OglTexMem512 > > OglVSDiffuse8 > > OglVSInstancing > > OglZBuffer > > plot3d > > trex > > > > It's important to point out that much of the changes effect non-LLC > > platform, > > and I do not yet have data for that. I'll be collecting it over the next few > > days, but I figure this patch can get some comments meanwhile. > > > > Signed-off-by: Ben Widawsky <b...@bwidawsk.net> > > --- > > src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 76 > > +++++++++++++-------------- > > 1 file changed, 37 insertions(+), 39 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > index 2788270..545fbf3 100644 > > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > @@ -2283,6 +2283,8 @@ intel_miptree_unmap_movntdqa(struct brw_context *brw, > > map->buffer = NULL; > > map->ptr = NULL; > > } > > +#else > > +#define intel_miptree_map_movntdqa(x,y,z,w,a) abort() > > #endif > > > > static void > > @@ -2621,36 +2623,6 @@ can_blit_slice(struct brw_context *brw, > > return true; > > } > > > > -static bool > > -use_intel_mipree_map_blit(struct brw_context *brw, > > - struct intel_mipmap_tree *mt, > > - GLbitfield mode, > > - unsigned int level, > > - unsigned int slice) > > -{ > > - if (brw->has_llc && > > - !(mode & GL_MAP_WRITE_BIT) && > > - can_blit_slice(brw, mt, level, slice)) > > - return true; > > - > > - if (mt->tiling != I915_TILING_NONE && > > - mt->bo->size >= brw->max_gtt_map_object_size) { > > - /* XXX: This assertion is actually the final condition for platforms > > - * without SSE4.1. Returning false is not the right thing to do with > > - * the current code. On those platforms, the goal of this function > > is to give > > - * preference to the GTT, and at this point we've determined we > > cannot use > > - * the GTT, and we cannot blit, so we are out of options. > > - * > > - * NOTE: It should be possible to actually handle the case, but > > AFAIK, we > > - * never get this assertion. > > - */ > > - assert(can_blit_slice(brw, mt, level, slice)); > > - return true; > > - } > > - > > - return false; > > -} > > - > > /** > > * Parameter \a out_stride has type ptrdiff_t not because the buffer > > stride may > > * exceed 32 bits but to diminish the likelihood subtle bugs in pointer > > @@ -2706,18 +2678,44 @@ intel_miptree_map(struct brw_context *brw, > > goto done; > > } > > > > - if (use_intel_mipree_map_blit(brw, mt, mode, level, slice)) { > > - intel_miptree_map_blit(brw, mt, map, level, slice); > > + /* First determine what the available option are, then pick from the > > best > > + * option based on the platform. > > + */ > > + bool can_hw_blit = can_blit_slice(brw, mt, level, slice); > > can_hw_blit is too weak of a condition. > > "can" is very different from "should". Before, this function chose to > call intel_miptree_map_blit() if use_intel_mipree_map_blit() recommended > it (because "use" really means "should use" in that function name). The > set of conditions that satisfies "can" are much larger. > > For example, can_blit_slice() should return true for linear buffers > (they are blittable, after all). However, intel_miptree_map() should > mmap those buffers instead of blitting them. > > > + bool can_use_gtt = mt->bo->size < brw->max_gtt_map_object_size; > > #if defined(USE_SSE41) > > - } else if (!(mode & GL_MAP_WRITE_BIT) && > > - !mt->compressed && cpu_has_sse4_1 && > > - (mt->pitch % 16 == 0)) { > > - intel_miptree_map_movntdqa(brw, mt, map, level, slice); > > + bool can_stream_map = cpu_has_sse4_1 && mt->pitch % 16 == 0; > > +#else > > + bool can_stream_map = false; > > #endif > > - } else { > > - assert(mode & GL_MAP_WRITE_BIT == 0); > > - assert(!mt->compressed); > > + > > + if (can_stream_map) { > > + /* BENCHMARK_ME: GTT maps for non-llc */ > > + intel_miptree_map_movntdqa(brw, mt, map, level, slice); > > + goto done; > > + } > > + > > + /* > > + * Hopefully we've been able to use the streaming copy, but if we really > > + * can't, make a decision based on the two things we know to matter: > > tiling, > > + * and LLC. > > + * > > + * The general thinking is that with shared cache, doing software > > detiling > > + * is always cheaper than what we'd get from the GTT, and we always > > want a > > + * cached mapping, (so we use the blitter). > > + * > > + * On non-LLC, since we don't get any benefit from using the blitter wrt > > + * caching, and both mechanism can do our detile (support was determined > > + * already), opt for GTT first to match the legacy behavior. > > + * > > + * BENCHMARK_ME: non-llc + tiled + blitter > > + */ > > + if (brw->has_llc && can_hw_blit) { > > + intel_miptree_map_blit(brw, mt, map, level, slice); > > + } else if (can_use_gtt) { > > intel_miptree_map_gtt(brw, mt, map, level, slice); > > + } else { > > + unreachable("We don't yet support slice blits"); > > } > > I find the new logic, with multiple if-trees and gotos, more difficult > to follow and verify for correctness that old style of: > > if (a) { > do_stuff; > } else if (b) { > do_stuff; > } else if (c) { > do_stuff; > } else { > do_stuff; > } > > With the old style, given a miptree and the map's readwrite mode, > I could easily walk through the if tree until I reached "true". With the > new style in this patch, in theory I could do the same, but I no longer > have confidence that my logic-walking would be correct. > > Could you accomplish a similar set of cleanups, and still preserve the > monolithic 'if' tree, by moving all the helper variables (such as > can_use_gtt) to the top of the function, before the 'if' tree begins?
Forgive my bluntness but I read what you just said as the old way is easier to read and verify (something I completely disagree with, but it's certainly a subjective thing). Moving more of the conditions into a single variable is a benefit assuming you can do it well, but it doesn't solve the complexity that I see here. I see no reason, and have no motivation to continue down this path if this is your impression of the patch. _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev