On 11/14/2011 07:16 AM, Marek Olšák wrote:
On Mon, Nov 14, 2011 at 2:49 PM, Vadim Girlin<vadimgir...@gmail.com> wrote:
On Mon, 2011-11-14 at 05:22 -0800, Jose Fonseca wrote:
----- Original Message -----
Hi,
I found some problem with glsl_to_tgsi: remove_output_reads function.
It's replacing outputs with temps, producing incorrect results with
relative addressing. You can see it e.g. with
"vs-varying-array-mat2-col-rd.shader_test". Here is a dump:
VERT
DCL IN[0]
DCL OUT[0], POSITION
DCL OUT[1], GENERIC[12]
DCL OUT[2], GENERIC[13]
DCL OUT[3], GENERIC[14]
DCL OUT[4], GENERIC[15]
DCL OUT[5], GENERIC[16]
DCL OUT[6], GENERIC[17]
DCL OUT[7], GENERIC[18]
DCL CONST[0..5]
DCL TEMP[0..1]
DCL ADDR[0]
IMM FLT32 { 1.0000, 2.0000, 3.0000, 4.0000}
IMM FLT32 { 5.0000, 6.0000, 7.0000, 8.0000}
IMM FLT32 { 9.0000, 10.0000, 11.0000, 12.0000}
IMM FLT32 { 0.0000, 1.0000, 0.0000, 0.0000}
0: MUL TEMP[0], CONST[2], IN[0].xxxx
1: MAD TEMP[1], CONST[3], IN[0].yyyy, TEMP[0]
2: MAD TEMP[1], CONST[4], IN[0].zzzz, TEMP[1]
3: MAD OUT[0], CONST[5], IN[0].wwww, TEMP[1]
4: MOV OUT[2], IMM[0].xyyy
5: MOV OUT[3], IMM[0].zwww
6: MOV TEMP[0], IMM[1].xyyy
OUT[2-7] is a "varying mat2x2[3] m;", OUT[4] is replaced with the
temp
in the instruction 6.
7: MOV OUT[5], IMM[1].zwww
8: MOV OUT[6], IMM[2].xyyy
9: MOV OUT[7], IMM[2].zwww
10: ARL ADDR[0].x, CONST[1].xxxx
11: SNE TEMP[1], TEMP[ADDR[0].x].xyyy, CONST[0].xyyy
Instruction 11 contains the read with the relative addressing using
this temp, which is incorrect.
I'm not sure how to fix it,
The way to fix this is to allocate a consecutive range of temps at start, when
there are indirect writes to output registers and at least one read.
but AFAICS at least for r600g this step
could be skipped completely - r600 can read outputs without any
problem, they are located in the general-purpose registers. Removing
calls to remove_output_reads and assert(src.File != TGSI_FILE_OUTPUT)
in the ureg_emit_src produces correct result and test passes on
evergreen (total number of fixed tests is about 60).
Probably it makes sense to make this step optional and ask the driver
whether to use it, if I'm not missing something?
The drawback of doing this is that TGSI will look even more different between
drivers. This means that when somebody makes a change to the state tracker, and
tested with one driver, it may break other drivers. This also means that
comparing a driver to llvmpipe/softpipe will be less meaningfull, as different
paths will be taken in the state tracker.
I think if it's needed to compare the drivers, than it's possible to
switch of the the cap for debugging. I see the drawbacks, but this is
also about performance. Currently with r600g (and possibly with other
drivers) we have to spend some time for unneccessary shader
modification, to get less efficient shader code as a result.
I am 100% sure nobody will turn on/off CAPs just to test something.
I gotta agree with José on this one.
I guess I don't follow. Different hardware can do different things, and
the code for that hardware will look different. What's the problem? It
seems silly to spend CPU time rearranging the code and then hoping the
driver will spend more CPU time to put it back the way it was. People
using these drivers *do* care about CPU performance, after all. :)
The shader backend in a driver is primarily responsible for the
quality of final shader code, not the state tracker, although the
state tracker can help a lot.
r600g already allocates outputs in the temporary area, so it's not
like remove_output_reads would make any difference for that driver,
right?* Why not fix remove_output_reads such that it works with
indirect addressing? That would also fix _all the other drivers_ which
don't support output reads.
* Lack of proper register allocation is not an excuse, especially if
new temporaries are allocated by a driver for whatever reasons.
Marek
By the way, which drivers do not support reading outputs? I haven't done
a full piglit run with llvmpipe, but IIRR the single test mentioned
above was also fixed for llvmpipe without this output replacement.
Vadim
I'm not sure if for this particular issue a new cap is worth or not. Just
pointing out that there are downsides of breaking orthogonality between state
tracker and driver. It should not be done lightly.
Jose
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev