On Thu, Aug 27, 2015 at 10:05:14AM -0700, Ben Widawsky wrote: > On Thu, Aug 27, 2015 at 10:51:59AM +0300, Pohjolainen, Topi wrote: > > On Wed, Aug 26, 2015 at 03:46:05PM -0700, Ben Widawsky wrote: > > > This reverts commit 1bba29ed403e735ba0bf04ed8aa2e571884fcaaf > > > Author: Topi Pohjolainen <topi.pohjolai...@intel.com> > > > Date: Thu Jun 25 14:00:41 2015 +0300 > > > > > > i965: Stop aux data compare preventing program binary re-use > > > > > > This fixes an intermittent failure in > > > piglit.spec.arb_pixel_buffer_object.texsubimage pbo.sklm64 (maybe other > > > platforms as well, but it is harder to reproduce). I can usually hit the > > > failure > > > within 10 runs of the test. This is a very hairy commit to debug. I'll > > > let Topi > > > handle it, or else we should go with the revert. I am open to either. I > > > got > > > lucky that Jenkins caught this on a run. > > > > > > Here was the script I used for bisect: > > > > > > i=0 > > > while [ $i -lt 40 ] ; do > > > ./bin/texsubimage pbo -auto -fbo > /dev/null 2>&1 > > > [[ $? != 0 ]] && echo fail && exit 1 > > > ((i++)) > > > done > > > > > > exit 0 > > > > Should I use different piglit version than the current master? I'm asking > > because I get this with both Mesa master and my patch reverted. > > My piglit was pretty old. I just updated that, but mesa was master as of > yesterday (output is below) > > The one bit of advice I can add is that you make sure your system is updated > to > the very latest. > > > > > testrunner@skl-y:~/topi/piglit$ ./bin/texsubimage pbo -auto -fbo > > Using test set: Core formats > > texsubimage failed > > target: GL_TEXTURE_2D > > internal format: GL_COMPRESSED_RGB_S3TC_DXT1_EXT > > region: 68, 12 32 x 48 > > texsubimage failed > > target: GL_TEXTURE_2D > > internal format: GL_COMPRESSED_RGBA_S3TC_DXT1_EXT > > region: 0, 28 116 x 20 > > texsubimage failed > > target: GL_TEXTURE_2D > > internal format: GL_COMPRESSED_RGBA_S3TC_DXT3_EXT > > region: 16, 4 60 x 36 > > texsubimage failed > > target: GL_TEXTURE_2D > > internal format: GL_COMPRESSED_RGBA_S3TC_DXT5_EXT > > region: 8, 0 104 x 60 > > Mesa: User error: GL_INVALID_OPERATION in glTexSubImage2D(out of bounds PBO > > access) > > PIGLIT: {"result": "fail" } > > Interesting. It so happens I have a patch that purports to fix some of these > things. I did not try this patch myself for this issue. > http://patchwork.freedesktop.org/patch/54025/ > > (I've been hanging on to this since I needed to do a bit of research to > address > Matt's feedback, specifically regarding render compression). > > > > > > > Could you include the console output of the failure you get? > > > > Here are two sample failures (occurred in 7 runs) > > Using test set: Core formats > texsubimage failed > target: GL_TEXTURE_2D > internal format: GL_INTENSITY > region: 27, 1 13 x 61 > Mesa: User error: GL_INVALID_OPERATION in glTexSubImage3D(out of bounds PBO > access) > > Using test set: Core formats > texsubimage failed > target: GL_TEXTURE_2D > internal format: GL_INTENSITY > region: 80, 38 24 x 25 > texsubimage failed > target: GL_TEXTURE_2D > internal format: GL_LUMINANCE8 > region: 50, 5 26 x 58 > Mesa: User error: GL_INVALID_OPERATION in glTexSubImage2D(out of bounds PBO > access) > PIGLIT: {"result": "fail" } > > > > > > > > Cc: <mesa-sta...@lists.freedesktop.org> > > > Cc: Kenneth Graunke <kenn...@whitecape.org> > > > Cc: Topi Pohjolainen <topi.pohjolai...@intel.com> > > > Reported-by: Mark Janes <mark.a.ja...@intel.com> (jenkins) > > > Signed-off-by: Ben Widawsky <b...@bwidawsk.net> > > > --- > > > src/mesa/drivers/dri/i965/brw_state_cache.c | 52 > > > ++++++++++++++++++----------- > > > 1 file changed, 32 insertions(+), 20 deletions(-) > > > > > > diff --git a/src/mesa/drivers/dri/i965/brw_state_cache.c > > > b/src/mesa/drivers/dri/i965/brw_state_cache.c > > > index fbc0419..e50d6a0 100644 > > > --- a/src/mesa/drivers/dri/i965/brw_state_cache.c > > > +++ b/src/mesa/drivers/dri/i965/brw_state_cache.c > > > @@ -200,23 +200,36 @@ brw_cache_new_bo(struct brw_cache *cache, uint32_t > > > new_size) > > > } > > > > > > /** > > > - * Attempts to find an item in the cache with identical data. > > > + * Attempts to find an item in the cache with identical data and aux > > > + * data to use > > > */ > > > -static const struct brw_cache_item * > > > -brw_lookup_prog(const struct brw_cache *cache, > > > - enum brw_cache_id cache_id, > > > - const void *data, unsigned data_size) > > > +static bool > > > +brw_try_upload_using_copy(struct brw_cache *cache, > > > + struct brw_cache_item *result_item, > > > + const void *data, > > > + const void *aux) > > > { > > > - const struct brw_context *brw = cache->brw; > > > + struct brw_context *brw = cache->brw; > > > unsigned i; > > > - const struct brw_cache_item *item; > > > + struct brw_cache_item *item; > > > > > > for (i = 0; i < cache->size; i++) { > > > for (item = cache->items[i]; item; item = item->next) { > > > + const void *item_aux = item->key + item->key_size; > > > int ret; > > > > > > - if (item->cache_id != cache_id || item->size != data_size) > > > + if (item->cache_id != result_item->cache_id || > > > + item->size != result_item->size || > > > + item->aux_size != result_item->aux_size) { > > > + continue; > > > + } > > > + > > > + if (cache->aux_compare[result_item->cache_id]) { > > > + if (!cache->aux_compare[result_item->cache_id](item_aux, > > > aux)) > > > + continue; > > > + } else if (memcmp(item_aux, aux, item->aux_size) != 0) { > > > continue; > > > + } > > > > > > if (!brw->has_llc) > > > drm_intel_bo_map(cache->bo, false); > > > @@ -226,11 +239,13 @@ brw_lookup_prog(const struct brw_cache *cache, > > > if (ret) > > > continue; > > > > > > - return item;
I'm getting stronger feeling that this is somehow timing related. I forced the cache lookup to always miss by replacing the line above by return NULL; And with Ben's machine against gbm backend the error observed by Ben still happens. This should be the safest mechanism as nothing is re-used. I'll keep looking. > > > + result_item->offset = item->offset; > > > + > > > + return true; > > > } > > > } > > > > > > - return NULL; > > > + return false; > > > } > > > > > > static uint32_t > > > @@ -279,8 +294,6 @@ brw_upload_cache(struct brw_cache *cache, > > > { > > > struct brw_context *brw = cache->brw; > > > struct brw_cache_item *item = CALLOC_STRUCT(brw_cache_item); > > > - const struct brw_cache_item *matching_data = > > > - brw_lookup_prog(cache, cache_id, data, data_size); > > > GLuint hash; > > > void *tmp; > > > > > > @@ -292,15 +305,14 @@ brw_upload_cache(struct brw_cache *cache, > > > hash = hash_key(item); > > > item->hash = hash; > > > > > > - /* If we can find a matching prog in the cache already, then reuse the > > > - * existing stuff without creating new copy into the underlying buffer > > > - * object. This is notably useful for programs generating shaders at > > > - * runtime, where multiple shaders may compile to the same thing in > > > our > > > - * backend. > > > + /* If we can find a matching prog/prog_data combo in the cache > > > + * already, then reuse the existing stuff. This will mean not > > > + * flagging CACHE_NEW_* when transitioning between the two > > > + * equivalent hash keys. This is notably useful for programs > > > + * generating shaders at runtime, where multiple shaders may > > > + * compile to the thing in our backend. > > > */ > > > - if (matching_data) { > > > - item->offset = matching_data->offset; > > > - } else { > > > + if (!brw_try_upload_using_copy(cache, item, data, aux)) { > > > item->offset = brw_alloc_item_data(cache, data_size); > > > > > > /* Copy data to the buffer */ > > > -- > > > 2.5.0 > > > _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev