----- Original Message ----- > On 12/14/2011 12:58 AM, Ian Romanick wrote: > > On 12/13/2011 01:25 PM, Jose Fonseca wrote: > >> > >> > >> ----- Original Message ----- > >>> On 12/13/2011 03:09 PM, Jose Fonseca wrote: > >>>> > >>>> ----- Original Message ----- > >>>>> On 12/13/2011 12:26 PM, Bryan Cain wrote: > >>>>>> On 12/13/2011 02:11 PM, Jose Fonseca wrote: > >>>>>>> ----- Original Message ----- > >>>>>>>> This is an updated version of the patch set I sent to the > >>>>>>>> list > >>>>>>>> a > >>>>>>>> few > >>>>>>>> hours > >>>>>>>> ago. > >>>>>>>> There is now a TGSI property called > >>>>>>>> TGSI_PROPERTY_NUM_CLIP_DISTANCES > >>>>>>>> that drivers can use to determine how many of the 8 > >>>>>>>> available > >>>>>>>> clip > >>>>>>>> distances > >>>>>>>> are actually used by a shader. > >>>>>>> Can't the info in TGSI_PROPERTY_NUM_CLIP_DISTANCES be easily > >>>>>>> derived from the shader, and queried through > >>>>>>> src/gallium/auxiliary/tgsi/tgsi_scan.h ? > >>>>>> No. The clip distances can be indirectly addressed (there are > >>>>>> up > >>>>>> to 2 > >>>>>> of them in vec4 form for a total of 8 floats), which makes it > >>>>>> impossible > >>>>>> to determine which ones are used by analyzing the shader. > >>>>> The description is almost complete. :) The issue is that the > >>>>> shader > >>>>> may > >>>>> declare > >>>>> > >>>>> out float gl_ClipDistance[4]; > >>>>> > >>>>> the use non-constant addressing of the array. The compiler > >>>>> knows > >>>>> that > >>>>> gl_ClipDistance has at most 4 elements, but post-hoc analysis > >>>>> would > >>>>> not > >>>>> be able to determine that. Often the fixed-function hardware > >>>>> (see > >>>>> below) needs to know which clip distance values are actually > >>>>> written. > >>>> But don't all the clip distances written by the shader need to > >>>> be > >>>> declared? > >>>> > >>>> E.g.: > >>>> > >>>> DCL OUT[0], CLIPDIST[0] > >>>> DCL OUT[1], CLIPDIST[1] > >>>> DCL OUT[2], CLIPDIST[2] > >>>> DCL OUT[3], CLIPDIST[3] > >>>> > >>>> therefore a trivial analysis of the declarations convey that? > >>> > >>> No. Clip distance is an array of up to 8 floats in GLSL, but > >>> it's > >>> represented in the hardware as 2 vec4s. You can tell by > >>> analyzing > >>> the > >>> declarations whether there are more than 4 clip distances in use, > >>> but > >>> not which components the shader writes to. > >>> TGSI_PROPERTY_NUM_CLIP_DISTANCES is the number of components in > >>> use, > >>> not > >>> the number of full vectors. > >> > >> Lets imagine > >> > >> out float gl_ClipDistance[6]; > >> > >> Each a clip distance is a scalar float. > >> > >> Either all hardware represents the 8 clip distances as two 4 > >> vectors, > >> and we do: > >> > >> DCL OUT[0].xywz, CLIPDIST[0] > >> DCL OUT[1].xy, CLIPDIST[1] > >> > >> using the full range of struct tgsi_declaration::UsageMask [1] or > >> we > >> represent them as as scalars: > >> > >> DCL OUT[0].x, CLIPDIST[0] > >> DCL OUT[1].x, CLIPDIST[1] > >> DCL OUT[2].x, CLIPDIST[2] > >> DCL OUT[3].x, CLIPDIST[3] > >> DCL OUT[4].x, CLIPDIST[4] > >> DCL OUT[5].x, CLIPDIST[5] > >> > >> If indirect addressing is allowed as I read bore, then maybe the > >> later > >> is better. > > > > As far as I'm aware, all hardware represents it as the former, and > > we > > have a lowering pass to fix-up the float[] accesses to be vec4[] > > accesses. > > GeForce8+ = scalar architecture, no vectors, addresses are byte > based, > can access individual components just fine.
Ok. So we should avoid baking this vec4 assumption in TGSI semantics. > Something like: > > gl_ClipDistance[i - 12] = some_value; > > DCL OUT[0].xyzw, POSITION > DCL OUT[1-8].x, CLIPDIST[0-7] > > MOV OUT<1>[ADDR[0].x - 12].x, TEMP[0].xxxx > * ** > > * - tgsi_dimension.Index specifying the base address by referencing > a > declaration > ** - tgsi_src_register.Index > > is the only way I see to make this work nicely on all hardware. > (This is also needed if OUT[i] and OUT[i + 1] cannot be assigned to > contiguous hardware resources because of semantic.) I think that having indexable temps, like D3D10, would be a cleaner: DCL OUT[0].xyzw, POSITION DCL OUT[1][0-7].x, CLIPDIST[0-7] MOV OUT[1][ADDR[0].x - 12].x, TEMP[0].xxxx I propose we first add this new kind of temp at a first stage, then prohibit indirect addressing of all but this kind of temps. > For constrained hardware the driver can build the clunky > > c := ADDR[0].x % 4 > i := ADDR[0].x / 4 > IF [c == 0] > MOV OUT[i].x, TEMP[0].xxxx > ELSE > IF [c == 1] > MOV OUT[i].y, TEMP[0].xxxx > ELSE > IF [c == 2] > MOV OUT[i].z, TEMP[0].xxxx > ELSE > MOV OUT[i].w, TEMP[0].xxxx > ENDIF > > itself. Sounds good plan to me. BTW, I took a look at inputs/outputs UsageMasks and although we don't use them, I really think we really should, as having that info readily accessible would allow to avoid wasting time/bandwidth copying attributes which are not needed. Jose _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev