On 02/24/2013 11:25 PM, Vadim Girlin wrote:
On 02/24/2013 11:01 PM, Henri Verbeet wrote:
On 24 February 2013 17:07, Vadim Girlin <vadimgir...@gmail.com> wrote:
If you'd like to help me with debugging the issues on cayman, then
please
read the "regression debugging" section in the r600-sb branch notes
[1] (or
possibly more up-to-date source here - [2]), it explains how to find the
exact shader that causes regression. After locating the guilty
shader, you
only need to prepend

     R600_DUMP_SHADERS=2 R600_SB_DUMP=3

to the command line to produce the full dump for that shader, then
please
send it to me, and I'll do my best to fix the issue.

I briefly looked at it yesterday, but ran out of time. I did find out
that all of the failures except one (in loop_index_test()) go away if
I skip the if-conversion pass. I have the impression that the
PRED_SET* instructions like PRED_SETNE_INT don't behave quite the way
the ISA docs claim they do wrt. the value written to the destination
register, but instead seem to behave more like the regular SET*
instructions in that regard. I didn't properly test this, so it's more
of a theory at this point, and may just be wrong. Of course we don't
really want to use the PRED_SET* instructions to begin with if we're
not going to actually use the predicate, so we'll probably just want
to convert them to SET* instructions anyway.

Yeah, I see now, I missed the fact that PRED_SETcc instructions on
cayman can't be used as simple ALU instructions without side effects,
they always update exec_mask.

Well, it seems I misunderstood the docs, in fact PRED_SET instructions can be used when UPDATE_EXEC_MASK=0. I just thought that ALU_WORD1_OP2_EXECUTE_MASK encoding is always used with them, and in the description of this encoding value 0 for UPDATE_EXEC_MASK is marked as "reserved", so I thought that they should always have UPDATE_EXEC_MASK=1.

Now I think you are right and probably the cause is that the value written to dst register is not what I expect. Anyway, I think the best solution is still the same as you mentioned - get rid of PRED_SETcc instructions when the update of exec mask or predicate is not required.

Vadim


Somewhat related to that, wined3d generates GLSL of the form "dst =
(src0 < src1) ? 1.0 : 0.0;" for the D3D "slt" instruction (and similar
code for instructions like sge, etc.). The TGSI for that looks pretty
awful, first doing a SLT, converting the result to an integer, and
then branching on that to assign either 1.0 or 0.0 again. The
if-conversion pass is fairly helpful there in the sense that it at
least gets rid of the branches, but you still end up with a sequence
like SETGE, FLT_TO_INT, PRED_SETNE_INT, CNDE_INT, while all that's
really needed is the SETGE. That's probably best addressed in either
the GLSL compiler or the GLSL -> TGSI stage though.


I just implemented the simplest way of if-conversion for now, but the
first thing that I'm going to add to the (currently empty) peephole pass
is the optimization of SETcc, PRED_SETcc, etc, including the conversions
between float/int bools, to get rid of all unnecessary operations.

Unfortunately I won't be able to test with that system again until at
least Thursday, so it'll be a while before I can actually do anything
about it.

Well, I see the problem now, so there is no need for any special testing
until I'll implement something to fix it.

Anyway, you helped me to understand what's wrong, thanks.

Vadim

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to