Francisco Jerez <curroje...@riseup.net> writes:

> Chris Wilson <ch...@chris-wilson.co.uk> writes:
>
>> On Sat, Oct 03, 2015 at 05:57:05PM +0300, Francisco Jerez wrote:
>>> Jordan Justen <jordan.l.jus...@intel.com> writes:
>>>
>>> > From: Francisco Jerez <curroje...@riseup.net>
>>> >
>>> > Fixes
>>> > arb_shader_image_load_store/execution/load-from-cleared-image.shader_test
>>> >
>>> > Cc: Chris Wilson <ch...@chris-wilson.co.uk>
>>> > Cc: Jason Ekstrand <jason.ekstr...@intel.com>
>>> > Tested-by: Jordan Justen <jordan.l.jus...@intel.com>
>>> > ---
>>> > RE: i965: Perform an explicit flush after doing
>>> > _mesa_meta_pbo_TexSubImage
>>> >
>>> > curro has some concerns about potential perf impact by this and
>>> > wanted it to be checked on small-core w/CPU bound apps.
>>> > Unfortunately, he is on vacation now.
>>>
>>> I've benchmarked this on VLV and none of the CPU-bound tests in the
>>> Finnish benchmarking system regress significantly, with n=6 and 95%
>>> confidence level, so s/RFC/PATCH/. I'll CC mesa-stable so it probably
>>> makes sense to keep this independent from Chris' VBO resolve series.
>>
>> I ran patch this on bsw (and repeated it afresh just to be sure) using
>> synmark:Ogl*:
>>
>> 6994ca2 glsl: fix whitespace
>> synmark:OglBatch3: 277.02 (+0.00%): min/p50/90/95/99/max/std =
>> 271.902 / 277.024 / 277.952 / 278.117 / 278.233 / 278.409 / 1.27475 n=30
>> synmark:OglBatch3:cpu: 434.92 (+0.00%): min/p50/90/95/99/max/std =
>> 429.869 / 434.755 / 437.818 / 438.236 / 439.022 / 439.251 / 2.19871 n=30
>> synmark:OglBatch4: 154.76 (+0.00%): min/p50/90/95/99/max/std =
>> 153.394 / 154.75 / 155.636 / 155.643 / 156.089 / 156.172 / 0.721397 n=30
>> synmark:OglBatch4:cpu: 176.84 (+0.00%): min/p50/90/95/99/max/std =
>> 176.239 / 176.838 / 177.068 / 177.394 / 177.451 / 177.46 / 0.273547 n=30
>> synmark:OglBatch5: 46.59 (+0.00%): min/p50/90/95/99/max/std =
>> 45.9918 / 46.6053 / 46.819 / 46.842 / 46.8459 / 46.8538 / 0.26363 n=30
>> synmark:OglBatch5:cpu: 52.79 (+0.00%): min/p50/90/95/99/max/std =
>> 52.1812 / 52.6714 / 53.3544 / 53.3605 / 53.3726 / 53.4059 / 0.402148 n=30
>> synmark:OglBatch6: 11.95 (+0.00%): min/p50/90/95/99/max/std =
>> 11.7449 / 11.9523 / 12.0025 / 12.0026 / 12.0097 / 12.0304 / 0.0771611 n=30
>> synmark:OglBatch6:cpu: 14.17 (+0.00%): min/p50/90/95/99/max/std =
>> 14.0292 / 14.169 / 14.1863 / 14.1889 / 14.1963 / 14.1999 / 0.0387371 n=30
>> synmark:OglBatch7: 3.04 (+0.00%): min/p50/90/95/99/max/std =
>> 3.00939 / 3.03578 / 3.05513 / 3.05555 / 3.05582 / 3.05847 / 0.0148493 n=30
>> synmark:OglBatch7:cpu: 3.66 (+0.00%): min/p50/90/95/99/max/std =
>> 3.63355 / 3.66219 / 3.66685 / 3.66706 / 3.66789 / 3.66943 / 0.00683119 n=30
>> Patched
>> synmark:OglBatch3: 276.06 (-0.35%): min/p50/90/95/99/max/std =
>> 269.608 / 276.098 / 277.25 / 277.354 / 277.967 / 278.013 / 2.0933 n=30
>> synmark:OglBatch3:cpu: 415.49 (-4.47%): min/p50/90/95/99/max/std =
>> 412.316 / 415.554 / 417.828 / 417.9 / 418.471 / 419.22 / 1.84629 n=30
>> synmark:OglBatch4: 144.26 (-6.78%): min/p50/90/95/99/max/std =
>> 143.126 / 144.188 / 145.026 / 145.114 / 145.126 / 145.356 / 0.527859 n=30
>> synmark:OglBatch4:cpu: 161.82 (-8.49%): min/p50/90/95/99/max/std =
>> 161.247 / 161.82 / 162.12 / 162.169 / 162.172 / 162.222 / 0.254633 n=30
>> synmark:OglBatch5: 42.44 (-8.91%): min/p50/90/95/99/max/std =
>> 41.856 / 42.4856 / 42.7209 / 42.8424 / 42.8436 / 42.8441 / 0.287101 n=30
>> synmark:OglBatch5:cpu: 47.90 (-9.27%): min/p50/90/95/99/max/std =
>> 47.4268 / 47.7758 / 48.4164 / 48.4775 / 48.5086 / 48.5284 / 0.341355 n=30
>> synmark:OglBatch6: 10.86 (-9.09%): min/p50/90/95/99/max/std =
>> 10.7564 / 10.8818 / 10.9238 / 10.926 / 10.9279 / 10.9535 / 0.0619808 n=30
>> synmark:OglBatch6:cpu: 12.80 (-9.62%): min/p50/90/95/99/max/std =
>> 12.7149 / 12.8064 / 12.8179 / 12.8228 / 12.8249 / 12.8255 / 0.0235037 n=30
>> synmark:OglBatch7: 2.76 (-9.02%): min/p50/90/95/99/max/std =
>> 2.74078 / 2.7634 / 2.77936 / 2.78025 / 2.78254 / 2.7827 / 0.0126239 n=30
>> synmark:OglBatch7:cpu: 3.29 (-10.12%): min/p50/90/95/99/max/std
>> = 3.26676 / 3.29127 / 3.29445 / 3.29464 / 3.29472 / 3.29616 / 0.0072781 n=30
>>
>>
>> 6994ca2 glsl: fix whitespace
>> synmark:OglBatch3: 276.90 (+0.00%): min/p50/90/95/99/max/std =
>> 274.104 / 276.81 / 277.697 / 278.063 / 278.067 / 278.505 / 0.914328 n=30
>> synmark:OglBatch3:cpu: 434.96 (+0.00%): min/p50/90/95/99/max/std =
>> 429.492 / 434.784 / 437.174 / 437.482 / 437.548 / 439.812 / 2.09205 n=30
>> synmark:OglBatch4: 154.06 (+0.00%): min/p50/90/95/99/max/std =
>> 152.336 / 153.995 / 155.37 / 155.446 / 155.544 / 155.636 / 0.919322 n=30
>> synmark:OglBatch4:cpu: 176.45 (+0.00%): min/p50/90/95/99/max/std =
>> 175.959 / 176.435 / 176.686 / 176.718 / 176.892 / 176.9 / 0.247188 n=30
>> synmark:OglBatch5: 45.88 (+0.00%): min/p50/90/95/99/max/std =
>> 45.2706 / 45.7631 / 46.6576 / 46.6662 / 46.6929 / 46.7339 / 0.485474 n=30
>> synmark:OglBatch5:cpu: 52.65 (+0.00%): min/p50/90/95/99/max/std =
>> 52.025 / 52.5863 / 53.0849 / 53.0974 / 53.1337 / 53.1717 / 0.306497 n=30
>> synmark:OglBatch6: 11.88 (+0.00%): min/p50/90/95/99/max/std =
>> 11.7448 / 11.8725 / 11.9216 / 11.9337 / 11.935 / 11.9444 / 0.0534648 n=30
>> synmark:OglBatch6:cpu: 14.13 (+0.00%): min/p50/90/95/99/max/std =
>> 14.0528 / 14.1313 / 14.1533 / 14.1584 / 14.1585 / 14.1721 / 0.0227182 n=30
>> synmark:OglBatch7: 3.02 (+0.00%): min/p50/90/95/99/max/std =
>> 2.99852 / 3.02145 / 3.04142 / 3.04273 / 3.04422 / 3.04584 / 0.0141605 n=30
>> synmark:OglBatch7:cpu: 3.66 (+0.00%): min/p50/90/95/99/max/std =
>> 3.65238 / 3.66158 / 3.66565 / 3.66662 / 3.66722 / 3.66726 / 0.00423393 n=30
>> Patched
>> synmark:OglBatch3: 275.73 (-0.42%): min/p50/90/95/99/max/std =
>> 269.989 / 275.696 / 277.249 / 277.262 / 277.306 / 277.577 / 1.59707 n=30
>> synmark:OglBatch3:cpu: 412.83 (-5.09%): min/p50/90/95/99/max/std =
>> 410.283 / 412.828 / 415.52 / 415.523 / 416.087 / 416.527 / 1.66123 n=30
>> synmark:OglBatch4: 144.29 (-6.34%): min/p50/90/95/99/max/std =
>> 142.661 / 144.363 / 145.052 / 145.105 / 145.147 / 145.276 / 0.763492 n=30
>> synmark:OglBatch4:cpu: 161.53 (-8.45%): min/p50/90/95/99/max/std =
>> 160.928 / 161.522 / 161.847 / 162.003 / 162.048 / 162.116 / 0.263058 n=30
>> synmark:OglBatch5: 41.75 (-9.01%): min/p50/90/95/99/max/std =
>> 41.3497 / 41.7404 / 41.9044 / 41.9791 / 42.0902 / 42.1262 / 0.166136 n=30
>> synmark:OglBatch5:cpu: 47.91 (-9.02%): min/p50/90/95/99/max/std =
>> 47.53 / 47.8143 / 48.3434 / 48.3721 / 48.4509 / 48.4631 / 0.292788 n=30
>> synmark:OglBatch6: 10.89 (-8.27%): min/p50/90/95/99/max/std =
>> 10.7444 / 10.8922 / 10.9323 / 10.9384 / 10.9477 / 10.9482 / 0.0535149 n=30
>> synmark:OglBatch6:cpu: 12.78 (-9.56%): min/p50/90/95/99/max/std =
>> 12.6762 / 12.7798 / 12.7961 / 12.7967 / 12.7981 / 12.8059 / 0.0355064 n=30
>> synmark:OglBatch7: 2.76 (-8.65%): min/p50/90/95/99/max/std =
>> 2.74121 / 2.76015 / 2.77461 / 2.77479 / 2.77892 / 2.77992 / 0.0111074 n=30
>> synmark:OglBatch7:cpu: 3.29 (-10.07%): min/p50/90/95/99/max/std
>> = 3.26976 / 3.29293 / 3.29758 / 3.29878 / 3.29917 / 3.29936 / 0.00586701 n=30
>>
>> nothing else stood from the noise.
>> (The cpu variants are with INTEL_NO_HW=1.)
>> -Chris
>
> I don't see anything like that on VLV even after going up to n=30. What
> compiler options did you use to build mesa? Does the attached change
> have any effect on the results? Do you see a comparable regression
> after moving the image resolves to the VBO hook you added in your other
> series?
>
> BTW please send any absolute BSW FPS results to me in private rather
> than to the public mailing list...
>
I've had the chance to test this on BSW and couldn't reproduce any
regression. Could you send me a branch and SHA hashes of the exact
revisions you tested, and the compiler options you used to build them?

>>
>> --
>> Chris Wilson, Intel Open Source Technology Centre
>
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
> index b6b8262..50788fd 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -191,17 +191,20 @@ intel_update_state(struct gl_context * ctx, GLuint new_state)
>
>     /* Resolve color for each active shader image. */
>     for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
> -      const struct gl_shader *shader = ctx->_Shader->CurrentProgram[i] ?
> -         ctx->_Shader->CurrentProgram[i]->_LinkedShaders[i] : NULL;
> +      const struct gl_shader_program *prog = ctx->_Shader->CurrentProgram[i];
>
> -      if (unlikely(shader && shader->NumImages)) {
> -         for (unsigned j = 0; j < shader->NumImages; j++) {
> -            struct gl_image_unit *u = &ctx->ImageUnits[shader->ImageUnits[j]];
> -            tex_obj = intel_texture_object(u->TexObj);
> +      if (unlikely(prog)) {
> +         const struct gl_shader *shader = prog->_LinkedShaders[i];
>
> -            if (tex_obj && tex_obj->mt) {
> -               intel_miptree_resolve_color(brw, tex_obj->mt);
> -               brw_render_cache_set_check_flush(brw, tex_obj->mt->bo);
> +         if (unlikely(shader && shader->NumImages)) {
> +            for (unsigned j = 0; j < shader->NumImages; j++) {
> +               struct gl_image_unit *u = &ctx->ImageUnits[shader->ImageUnits[j]];
> +               tex_obj = intel_texture_object(u->TexObj);
> +
> +               if (tex_obj && tex_obj->mt) {
> +                  intel_miptree_resolve_color(brw, tex_obj->mt);
> +                  brw_render_cache_set_check_flush(brw, tex_obj->mt->bo);
> +               }
>              }
>           }
>        }
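
For anyone following along without a Mesa tree at hand, here is a rough
standalone sketch of what the quoted change does to the shape of the loop.
It is not the driver code: the mock_* types, NUM_STAGES and the
count_images_* helpers are made up for illustration, and the real resolve
and flush calls are replaced by a counter. As far as I can tell the two
shapes visit the same images; the difference is only that the
_LinkedShaders load now sits behind its own unlikely() branch.

```c
#include <assert.h>
#include <stddef.h>

/* Branch hint; plain expression on compilers without __builtin_expect. */
#if defined(__GNUC__) || defined(__clang__)
#define unlikely(x) __builtin_expect(!!(x), 0)
#else
#define unlikely(x) (x)
#endif

#define NUM_STAGES 3 /* hypothetical stand-in for MESA_SHADER_STAGES */

/* Hypothetical, heavily reduced stand-ins for the Mesa structures. */
struct mock_shader  { unsigned num_images; };
struct mock_program { struct mock_shader *linked[NUM_STAGES]; };
struct mock_context { struct mock_program *current[NUM_STAGES]; };

/* Loop shape before the change: ternary load, then a single test. */
static unsigned
count_images_old(const struct mock_context *ctx)
{
   unsigned visited = 0;

   for (unsigned i = 0; i < NUM_STAGES; i++) {
      const struct mock_shader *shader = ctx->current[i] ?
         ctx->current[i]->linked[i] : NULL;

      if (unlikely(shader && shader->num_images)) {
         for (unsigned j = 0; j < shader->num_images; j++)
            visited++; /* the driver would resolve and flush here */
      }
   }

   return visited;
}

/* Loop shape after the change: test the program first, then the shader. */
static unsigned
count_images_new(const struct mock_context *ctx)
{
   unsigned visited = 0;

   for (unsigned i = 0; i < NUM_STAGES; i++) {
      const struct mock_program *prog = ctx->current[i];

      if (unlikely(prog)) {
         const struct mock_shader *shader = prog->linked[i];

         if (unlikely(shader && shader->num_images)) {
            for (unsigned j = 0; j < shader->num_images; j++)
               visited++; /* the driver would resolve and flush here */
         }
      }
   }

   return visited;
}

/* Convenience check: both shapes must visit the same number of images. */
static void
check_variants_agree(const struct mock_context *ctx)
{
   assert(count_images_old(ctx) == count_images_new(ctx));
}
```

Calling check_variants_agree() on a context with a mix of NULL programs,
NULL linked shaders and image-less shaders is a quick way to convince
yourself that the change should be behaviour-neutral, so any difference in
the numbers would have to come from code generation.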