On Wed, Jun 03, 2015 at 09:32:55PM +0300, Pohjolainen, Topi wrote: > On Wed, Jun 03, 2015 at 09:21:11PM +0300, Topi Pohjolainen wrote: > > Shader programs are cached per stage (FS, VS, GS) using the > > corresponding shader source identifier and compile time choices > > as key. However, one not only stores the program binary but > > a pair consisting of program binary and program data. The latter > > represents the store of constants (such as uniforms) used by > > the program. > > > > However, when programs are searched in the cache for reloading > > only the program key representing the binary is considered > > (see for example, brw_upload_wm_prog() and brw_search_cache()). > > Hence, when programs are re-loaded from cache the first program > > binary, program data pair is extracted without considering if > > the program data matches the currently in use uniform storage > > as well. > > > > My reasoning Why this actually works is because the key > > contains the identifier of the corresponding gl_program that > > represents the source code for the shader program. Hence, > > two programs having identical source code still have unique > > keys. > > And therefore brw_try_upload_using_copy() never encounters > > a case where a matching binary is found but the program data > > doesn't match. > > In fact, thinking some more I think this is possible when the > same, say fragment shader, is used with two different vertex > shaders. This results into there being matching binaries but > program data pointing to different storage. Looking at > brw_upload_cache() I still can't see how failing > brw_try_upload_using_copy() makes a difference. We only upload > the program binary again (even though that is the part that > actually matches). And then proceed the same way regardless > of the result of brw_try_upload_using_copy(). The program data > gets augmented with the key. > > But the point remains that when a program is reloaded through > the brw_search_cache() only the key (and not the program data) > is considered returning the first matching pair. > > I probably need to write a piglit test for this. > > > > > My ultimate goal is to stop storing pointers to the individual > > components of a uniform but to store only a pointer to the > > "struct gl_uniform_storage" instead, and allow > > gen6_upload_push_constants() to iterate over individual > > components and array elements. This is needed to be able to > > convert 32-bits floats to fp16 - otherwise there is only > > pointer to 32-bits without knowing its type (int, float, etc) > > let alone its target precision. > > > > No regression in jenkins. However, we talked about this with > > Ken and this doesn't really tell much as piglit doesn't really > > re-use shader sources during one execution. > > > > Signed-off-by: Topi Pohjolainen <topi.pohjolai...@intel.com> > > CC: Kenneth Graunke <kenn...@whitecape.org> > > CC: Tapani P\344lli <tapani.pa...@intel.com> > > --- > > src/mesa/drivers/dri/i965/brw_program.c | 6 ------ > > 1 file changed, 6 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_program.c > > b/src/mesa/drivers/dri/i965/brw_program.c > > index e5c0d3c..7f5fde8 100644 > > --- a/src/mesa/drivers/dri/i965/brw_program.c > > +++ b/src/mesa/drivers/dri/i965/brw_program.c > > @@ -576,12 +576,6 @@ brw_stage_prog_data_compare(const struct > > brw_stage_prog_data *a, > > if (memcmp(a, b, offsetof(struct brw_stage_prog_data, param))) > > return false; > > > > - if (memcmp(a->param, b->param, a->nr_params * sizeof(void *))) > > - return false; > > - > > - if (memcmp(a->pull_param, b->pull_param, a->nr_pull_params * > > sizeof(void *))) > > - return false; > > - > > return true; > > } > >
I am looking at a lot of this code for the first time, and I have a kind of wild guess. The first time you upload a program, the program (kinda annoying that brw_upload_item_data doesn't seem to actually do that). Malloc a pointer (tmp, item->key), store the program and aux there. Set that pointer as the key. The aux data lives at key + key_size. Indeed search_cache() only checks the key. For WM it does contain the urb_entry data that I think would change if number of uniforms differed. So for your example above with 2 VS sharing an FS, if the number of uniforms are the same, then the program should be identical in the FS, right? Similarly for the GS with input_varyings. I think generally this is the behavior you'd want. brw_try_upload_using_copy() seems correct to me as it does do the aux_compare (and falls back to memcmp). I didn't verify aux is really only uniforms and/or varyings though. Hopefully this wasn't too stupid. _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev