On Friday, September 29, 2017 4:08:17 AM PDT Iago Toral wrote: > On Fri, 2017-09-29 at 03:53 -0700, Kenneth Graunke wrote: > > On Friday, September 29, 2017 3:36:34 AM PDT Iago Toral Quiroga > > wrote: > > > We can start reading the URB at the first offset that contains > > > varyings > > > that are actually read in the URB. We still need to make sure that > > > we > > > read at least one varying to honor hardware requirements. > > > > > > This helps alleviate a problem introduced with 99df02ca26f61 for > > > separate shader objects: without separate shader objects we assign > > > locations sequentially, however, since that commit we have changed > > > the > > > method for SSO so that the VUE slot assigned depends on the number > > > of > > > builtin slots plus the location assigned to the varying. This fixed > > > layout is intended to help SSO programs by avoiding on-the-fly > > > recompiles > > > when swapping out shaders, however, it also means that if a varying > > > uses > > > a large location number close to the maximum allowed by the SF/FS > > > units > > > (31), then the offset introduced by the number of builtin slots can > > > push > > > the location outside the range and trigger an assertion. > > > > > > This problem is affecting at least the following CTS tests for > > > enhanced layouts: > > > > > > KHR-GL45.enhanced_layouts.varying_array_components > > > KHR-GL45.enhanced_layouts.varying_array_locations > > > KHR-GL45.enhanced_layouts.varying_components > > > KHR-GL45.enhanced_layouts.varying_locations > > > > > > which use SSO and the the location layout qualifier to select such > > > location numbers explicitly. > > > > > > This change helps these tests because for SSO we always have to > > > include > > > things such as VARYING_SLOT_CLIP_DIST{0,1} even if the fragment > > > shader is > > > very unlikely to read them, so by doing this we free builtin slots > > > from > > > the fixed VUE layout and we avoid the tests to crash in this > > > scenario. > > > > > > Of course, this is not a proper fix, we'd still run into problems > > > if someone > > > tries to use an explicit max location and read gl_ViewportIndex, > > > gl_LayerID or > > > gl_CullDistancein in the FS, but that would be a much less common > > > bug and we > > > can probably wait to see if anyone actually runs into that > > > situation in a real > > > world scenario before making the decision that more aggresive > > > changes are > > > required to support this without reverting 99df02ca26f61. > > > > > > v2: > > > - Add a debug message when we skip clip distances (Ilia) > > > - we also need to account for this when we compute the urb setup > > > for the fragment shader stage, so add a compiler util to compute > > > the first slot that we need to read from the URB instead of > > > replicating the logic in both places. > > > > > > v3: > > > - Make the util more generic so it can account for all unused slots > > > at the beginning of the URB, that will make it more useful (Ken). > > > - Drop the debug message, it was not what Ilia was asking for. > > > > > > Suggested-by: Kenneth Graunke <kenn...@whitecape.org> > > > --- > > > src/intel/compiler/brw_compiler.h | 29 > > > +++++++++++++++++++++++++++ > > > src/intel/compiler/brw_fs.cpp | 7 +++---- > > > src/mesa/drivers/dri/i965/genX_state_upload.c | 16 ++++++--------- > > > 3 files changed, 38 insertions(+), 14 deletions(-) > > > > > > diff --git a/src/intel/compiler/brw_compiler.h > > > b/src/intel/compiler/brw_compiler.h > > > index 6753a8daf0..038f3f9551 100644 > > > --- a/src/intel/compiler/brw_compiler.h > > > +++ b/src/intel/compiler/brw_compiler.h > > > @@ -1222,6 +1222,35 @@ brw_stage_has_packed_dispatch(const struct > > > gen_device_info *devinfo, > > > } > > > } > > > > > > +/** > > > + * Computes the first varying slot in the URB produced by the > > > previous stage > > > + * that is used in the next stage. We do this by testing the > > > varying slots in > > > + * the previous stage's vue map against the inputs read in the > > > next stage. > > > + * > > > + * Note that: > > > + * > > > + * - Each URB offset contains two varying slots and we can only > > > skip a > > > + * full offset if both slots are unused, so the value we return > > > here is always > > > + * rounded down to the closest multiple of two. > > > + * > > > + * - gl_Layer and gl_ViewportIndex don't have their own varying > > > slots, they are > > > + * part of the vue header, so if these are read we can't skip > > > anything. > > > + */ > > > +static inline int > > > +brw_compute_first_urb_slot_required(uint64_t inputs_read, > > > + const struct brw_vue_map > > > *prev_stage_vue_map) > > > +{ > > > + if ((inputs_read & (VARYING_BIT_LAYER | VARYING_BIT_VIEWPORT)) > > > == 0) { > > > > You don't actually need to do this, they will be in slot 0, so the > > first > > iteration of the loop will pick them up. :) > > Actually, we need this, there are regressions without this condition. > > The problem is that these two varyings correspond to varying slots 22 > and 23, but we have them in the vue header, so slot 0, so we remove the > slots when we setup the vue map and we need to check for them > explicitly, see this code from brw_vue_map.c: > > /* gl_Layer and gl_ViewportIndex don't get their own varying slots -- > they > * are stored in the first VUE slot (VARYING_SLOT_PSIZ). > */ > slots_valid &= ~(VARYING_BIT_LAYER | VARYING_BIT_VIEWPORT);
Gah. That makes sense. As is, Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>
signature.asc
Description: This is a digitally signed message part.
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev