On 12/13/2011 05:08 PM, Christoph Bumiller wrote:
On 12/14/2011 12:58 AM, Ian Romanick wrote:
On 12/13/2011 01:25 PM, Jose Fonseca wrote:
----- Original Message -----
On 12/13/2011 03:09 PM, Jose Fonseca wrote:
----- Original Message -----
On 12/13/2011 12:26 PM, Bryan Cain wrote:
On 12/13/2011 02:11 PM, Jose Fonseca wrote:
----- Original Message -----
This is an updated version of the patch set I sent to the list a few hours ago.
There is now a TGSI property called TGSI_PROPERTY_NUM_CLIP_DISTANCES
that drivers can use to determine how many of the 8 available clip distances
are actually used by a shader.
Can't the info in TGSI_PROPERTY_NUM_CLIP_DISTANCES be easily
derived from the shader, and queried through
src/gallium/auxiliary/tgsi/tgsi_scan.h ?
No. The clip distances can be indirectly addressed (there are up to 2
of them in vec4 form for a total of 8 floats), which makes it impossible
to determine which ones are used by analyzing the shader.
The description is almost complete. :) The issue is that the shader may
declare
out float gl_ClipDistance[4];
and then use non-constant addressing of the array. The compiler knows that
gl_ClipDistance has at most 4 elements, but post-hoc analysis would not
be able to determine that. Often the fixed-function hardware (see
below) needs to know which clip distance values are actually written.
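To illustrate why post-hoc analysis falls short, here is a rough sketch (not code from the patch set; the struct and function names are hypothetical). A scan over the shader's clip-distance writes can build an exact mask only while every write uses a constant index. As soon as one write is indirect, the scan has to assume that all 8 values may be written, even if the shader declared only 4.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical record of one write to gl_ClipDistance[]. */
struct clip_write {
    bool indirect;   /* true if the index is only known at run time */
    unsigned index;  /* constant index, meaningful when !indirect */
};

/* Return a conservative 8-bit mask of clip distances that may be written. */
static unsigned scan_clip_writes(const struct clip_write *w, unsigned n)
{
    unsigned mask = 0;
    for (unsigned k = 0; k < n; k++) {
        if (w[k].indirect)
            return 0xFF;            /* cannot tell: assume all 8 */
        assert(w[k].index < 8);
        mask |= 1u << w[k].index;
    }
    return mask;
}
```

A property that carries the declared array size avoids this pessimistic fallback.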
But don't all the clip distances written by the shader need to be
declared?
E.g.:
DCL OUT[0], CLIPDIST[0]
DCL OUT[1], CLIPDIST[1]
DCL OUT[2], CLIPDIST[2]
DCL OUT[3], CLIPDIST[3]
Wouldn't a trivial analysis of the declarations therefore convey that?
No. Clip distance is an array of up to 8 floats in GLSL, but it's
represented in the hardware as 2 vec4s. You can tell by analyzing the
declarations whether there are more than 4 clip distances in use, but
not which components the shader writes to.
TGSI_PROPERTY_NUM_CLIP_DISTANCES is the number of components in use, not
the number of full vectors.
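As a rough sketch of how a driver might consume such a component count (the helper name and constant below are hypothetical, not Mesa API), the per-vector write masks follow directly from the count, assuming the clip distances in use are packed contiguously starting at element 0:

```c
#include <assert.h>

/* Hypothetical constant for this sketch; not taken from Mesa headers. */
#define MAX_CLIP_DISTANCES 8

/* Return the 4-bit write mask (bit 0 = x ... bit 3 = w) for clip-distance
 * vector `vec` (0 or 1), given the total number of scalar clip distances. */
static unsigned clipdist_writemask(unsigned num_clip_distances, unsigned vec)
{
    assert(num_clip_distances <= MAX_CLIP_DISTANCES);
    assert(vec < 2);

    unsigned first = vec * 4;       /* first scalar held by this vector */
    if (num_clip_distances <= first)
        return 0x0;                 /* vector not used at all */

    unsigned used = num_clip_distances - first;
    if (used > 4)
        used = 4;
    return (1u << used) - 1u;       /* low `used` bits set */
}
```

With 6 clip distances this gives .xyzw for the first vector and .xy for the second.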
Let's imagine
out float gl_ClipDistance[6];
Each clip distance is a scalar float.
Either all hardware represents the 8 clip distances as two 4-vectors,
and we do:
DCL OUT[0].xyzw, CLIPDIST[0]
DCL OUT[1].xy, CLIPDIST[1]
using the full range of struct tgsi_declaration::UsageMask [1], or we
represent them as scalars:
DCL OUT[0].x, CLIPDIST[0]
DCL OUT[1].x, CLIPDIST[1]
DCL OUT[2].x, CLIPDIST[2]
DCL OUT[3].x, CLIPDIST[3]
DCL OUT[4].x, CLIPDIST[4]
DCL OUT[5].x, CLIPDIST[5]
If indirect addressing is allowed, as I read before, then maybe the latter
is better.
As far as I'm aware, all hardware represents it as the former, and we
have a lowering pass to fix-up the float[] accesses to be vec4[] accesses.
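The index arithmetic behind such a float[] to vec4[] lowering can be sketched as follows. This is only an illustration with hypothetical names, not the actual Mesa lowering pass: scalar element i lands in vector i / 4, component i % 4.

```c
#include <assert.h>

/* Hypothetical result type: which vec4 and which component (0 = x ... 3 = w). */
struct clipdist_slot {
    unsigned vec;
    unsigned comp;
};

/* Map an index into float gl_ClipDistance[8] onto the two-vec4 layout. */
static struct clipdist_slot lower_clipdist_index(unsigned i)
{
    struct clipdist_slot s;
    assert(i < 8);
    s.vec = i / 4;
    s.comp = i % 4;
    return s;
}
```

For a constant index both results fold at compile time; for a non-constant index the component select has to be resolved at run time.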
GeForce8+ = scalar architecture, no vectors, addresses are byte based,
can access individual components just fine.
Something like:
gl_ClipDistance[i - 12] = some_value;
DCL OUT[0].xyzw, POSITION
DCL OUT[1-8].x, CLIPDIST[0-7]
MOV OUT<1>[ADDR[0].x - 12].x, TEMP[0].xxxx
        *              **
*  - tgsi_dimension.Index specifying the base address by referencing a
     declaration
** - tgsi_src_register.Index
is the only way I see to make this work nicely on all hardware.
(This is also needed if OUT[i] and OUT[i + 1] cannot be assigned to
contiguous hardware resources because of their semantics.)
For constrained hardware the driver can build the clunky
c := ADDR[0].x % 4
i := ADDR[0].x / 4
IF [c == 0]
  MOV OUT[i].x, TEMP[0].xxxx
ELSE
  IF [c == 1]
    MOV OUT[i].y, TEMP[0].xxxx
  ELSE
    IF [c == 2]
      MOV OUT[i].z, TEMP[0].xxxx
    ELSE
      MOV OUT[i].w, TEMP[0].xxxx
    ENDIF
  ENDIF
ENDIF
itself.
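For readers who prefer C to TGSI pseudocode, the same run-time component select can be modeled like this. It is a sketch only: out[] stands in for the two vec4 outputs, idx for the run-time value of ADDR[0].x, and the function name is hypothetical.

```c
#include <assert.h>

/* Store `value` into scalar clip distance `idx` of a two-vec4 layout, using
 * only constant component selects, as constrained hardware would need. */
static void store_clipdist(float out[2][4], unsigned idx, float value)
{
    assert(idx < 8);
    unsigned c = idx % 4;   /* component within the vector */
    unsigned i = idx / 4;   /* which vector */

    switch (c) {
    case 0:  out[i][0] = value; break;  /* .x */
    case 1:  out[i][1] = value; break;  /* .y */
    case 2:  out[i][2] = value; break;  /* .z */
    default: out[i][3] = value; break;  /* .w */
    }
}
```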
Doing it at that low level has a number of significant drawbacks. The
worst is that it's long after any high-level optimizations can be done
on the code. It also means that it has to be reimplemented in every
driver that needs it. This really belongs at a higher level in the code.
Note that the lowering pass that already exists changes the accesses to
'float gl_ClipDistance[8]' to 'vec4 gl_ClipDistanceMESA[2]'. Is there a
compelling reason to not do the same at the lower level?
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev