Re: [Mesa-dev] Allowing the reading of outputs for some drivers

Christoph Bumiller Tue, 15 Nov 2011 07:22:57 -0800

On 11/15/2011 04:11 PM, Vadim Girlin wrote:
> On Tue, 2011-11-15 at 06:48 -0800, Jose Fonseca wrote:
>>
>> ----- Original Message -----
>>> On Tue, 2011-11-15 at 05:52 -0800, Jose Fonseca wrote:
>>>>
>>>> ----- Original Message -----
>>>>> On 11/14/2011 07:16 AM, Marek Olšák wrote:
>>>>>> On Mon, Nov 14, 2011 at 2:49 PM, Vadim
>>>>>> Girlin<vadimgir...@gmail.com>  wrote:
>>>>>>> On Mon, 2011-11-14 at 05:22 -0800, Jose Fonseca wrote:
>>>>>>>>
>>>>>>>> ----- Original Message -----
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I found some problem with glsl_to_tgsi: remove_output_reads
>>>>>>>>> function.
>>>>>>>>> It's replacing outputs with temps, producing incorrect
>>>>>>>>> results
>>>>>>>>> with
>>>>>>>>> relative addressing. You can see it e.g. with
>>>>>>>>> "vs-varying-array-mat2-col-rd.shader_test". Here is a dump:
>>>>>>>>>
>>>>>>>>>> VERT
>>>>>>>>>> DCL IN[0]
>>>>>>>>>> DCL OUT[0], POSITION
>>>>>>>>>> DCL OUT[1], GENERIC[12]
>>>>>>>>>> DCL OUT[2], GENERIC[13]
>>>>>>>>>> DCL OUT[3], GENERIC[14]
>>>>>>>>>> DCL OUT[4], GENERIC[15]
>>>>>>>>>> DCL OUT[5], GENERIC[16]
>>>>>>>>>> DCL OUT[6], GENERIC[17]
>>>>>>>>>> DCL OUT[7], GENERIC[18]
>>>>>>>>>> DCL CONST[0..5]
>>>>>>>>>> DCL TEMP[0..1]
>>>>>>>>>> DCL ADDR[0]
>>>>>>>>>> IMM FLT32 {    1.0000,     2.0000,     3.0000,     4.0000}
>>>>>>>>>> IMM FLT32 {    5.0000,     6.0000,     7.0000,     8.0000}
>>>>>>>>>> IMM FLT32 {    9.0000,    10.0000,    11.0000,    12.0000}
>>>>>>>>>> IMM FLT32 {    0.0000,     1.0000,     0.0000,     0.0000}
>>>>>>>>>>    0: MUL TEMP[0], CONST[2], IN[0].xxxx
>>>>>>>>>>    1: MAD TEMP[1], CONST[3], IN[0].yyyy, TEMP[0]
>>>>>>>>>>    2: MAD TEMP[1], CONST[4], IN[0].zzzz, TEMP[1]
>>>>>>>>>>    3: MAD OUT[0], CONST[5], IN[0].wwww, TEMP[1]
>>>>>>>>>>    4: MOV OUT[2], IMM[0].xyyy
>>>>>>>>>>    5: MOV OUT[3], IMM[0].zwww
>>>>>>>>>>    6: MOV TEMP[0], IMM[1].xyyy
>>>>>>>>>
>>>>>>>>> OUT[2-7] is a "varying mat2x2[3] m;", OUT[4] is replaced
>>>>>>>>> with
>>>>>>>>> the
>>>>>>>>> temp
>>>>>>>>> in the instruction 6.
>>>>>>>>>
>>>>>>>>>>    7: MOV OUT[5], IMM[1].zwww
>>>>>>>>>>    8: MOV OUT[6], IMM[2].xyyy
>>>>>>>>>>    9: MOV OUT[7], IMM[2].zwww
>>>>>>>>>>   10: ARL ADDR[0].x, CONST[1].xxxx
>>>>>>>>>>   11: SNE TEMP[1], TEMP[ADDR[0].x].xyyy, CONST[0].xyyy
>>>>>>>>>
>>>>>>>>> Instruction 11 contains the read with the relative
>>>>>>>>> addressing
>>>>>>>>> using
>>>>>>>>> this temp, which is incorrect.
>>>>>>>>>
>>>>>>>>> I'm not sure how to fix it,
>>>>>>>>
>>>>>>>> The way to fix this is to allocate a consecutive range of
>>>>>>>> temps
>>>>>>>> at start, when there are indirect writes to output registers
>>>>>>>> and
>>>>>>>> at least one read.
>>>>>>>>
>>>>>>>>> but AFAICS at least for r600g this step
>>>>>>>>> could be skipped completely - r600 can read outputs without
>>>>>>>>> any
>>>>>>>>> problem, they are located in the general-purpose registers.
>>>>>>>>> Removing
>>>>>>>>> calls to remove_output_reads and assert(src.File !=
>>>>>>>>> TGSI_FILE_OUTPUT)
>>>>>>>>> in the ureg_emit_src produces correct result and test passes
>>>>>>>>> on
>>>>>>>>> evergreen (total number of fixed tests is about 60).
>>>>>>>>>
>>>>>>>>> Probably it makes sense to make this step optional and ask
>>>>>>>>> the
>>>>>>>>> driver
>>>>>>>>> whether to use it, if I'm not missing something?
>>>>>>>>
>>>>>>>> The drawback of doing this is that TGSI will look even more
>>>>>>>> different between drivers. This means that when somebody
>>>>>>>> makes a
>>>>>>>> change to the state tracker, and tested with one driver, it
>>>>>>>> may
>>>>>>>> break other drivers. This also means that comparing a driver
>>>>>>>> to
>>>>>>>> llvmpipe/softpipe will be less meaningfull, as different
>>>>>>>> paths
>>>>>>>> will be taken in the state tracker.
>>>>>>>>
>>>>>>>
>>>>>>> I think if it's needed to compare the drivers, than it's
>>>>>>> possible
>>>>>>> to
>>>>>>> switch of the the cap for debugging. I see the drawbacks, but
>>>>>>> this
>>>>>>> is
>>>>>>> also about performance. Currently with r600g (and possibly
>>>>>>> with
>>>>>>> other
>>>>>>> drivers) we have to spend some time for unneccessary shader
>>>>>>> modification, to get less efficient shader code as a result.
>>>>>>
>>>>>> I am 100% sure nobody will turn on/off CAPs just to test
>>>>>> something.
>>>>>>
>>>>>> I gotta agree with José on this one.
>>>>>
>>>>> I guess I don't follow.  Different hardware can do different
>>>>> things,
>>>>> and
>>>>> the code for that hardware will look different.  What's the
>>>>> problem?
>>>>>  It
>>>>> seems silly to spend CPU time rearranging the code and then
>>>>> hoping
>>>>> the
>>>>> driver will spend more CPU time to put it back the way it was.
>>>>>  People
>>>>> using these drivers *do* care about CPU performance, after all.
>>>>> :)
>>>>
>>>> Developer time is important too. And having more code paths shared
>>>> with other drivers (even at the expense of a few extra CPU cycles
>>>> every time a shader is created) means that developers has more time
>>>> to
>>>> focus on features that can yield substantial improvements on true
>>>> hotspots (e.g., every time a pixel is rendered).
>>>>
>>>> This particular case may not be the best example. But there is a
>>>> trade
>>>> off: more specialization means more maintenance burden.
>>>>
>>>> And merely sharing the C code is not enough: if one is not sharing
>>>> the
>>>> same code paths, then's not really sharing the code. It's looks
>>>> like
>>>> but it's not: it may be using an unique path to its driver, which
>>>> means that other developers may never test it so it can get easily
>>>> broken, and vice versa.
>>>>
>>>> The gallium interface often makes such compromises.  And every now
>>>> and
>>>> then I see commits from Intel yanking some obscure optimization
>>>> code
>>>> path on i965 that's not worth trouble of maintaining, so I suppose
>>>> you
>>>> do to.
>>>>
>>> Again, I understand what you are talking about, but these patches are
>>> adding about 5 lines of code, I defininitely can't imagine any
>>> troubles
>>> of maintaining them. I'm not trying to pull into state tracker huge
>>> hardware-specific patch, e.g. implement 5-way vliw scheduling here,
>>> or
>>> some shader transformation because my hardware can't read something.
>>> I'm
>>> just trying to _avoid_ some hardware-specific code in the state
>>> tracker,
>>> which is created for another hardware and is just a trouble for for
>>> my
>>> hardware.
>>>
>>> If everybody wants to make state tracker less hardware-specific, then
>>> I
>>> don't understand why this hardware-specific transformation code is in
>>> the state tracker at all?
>>
>>> I'm not going to insist on my proposal, but if you're trying to
>>> expalin
>>> why these patches are evil, you probably need a better explanation.
>>
>> I'm pointing out the drawbacks, but I never said these patches are evil.
>>
>> I did not spend enough time to make my own mind about this patches, and 
>> whomever maintains r600g is free to decide either way as far as I'm 
>> concerned.  From the beginning I wrote:
>>
>>   "I'm not sure if for this particular issue a new cap is worth or not. Just 
>> pointing out that there are downsides of breaking orthogonality between 
>> state tracker and driver. It should not be done lightly."
>>
>> I feel compelled to reply whenever somebody says "there's no loss with 
>> adding this cap here" with "yes there is, and here's why...".
>>
>> But of course I would be both thumbs up if it was a patch that would fix the 
>> fundamental bug for all drivers.
>>
> 
> It seems you are mixing two _different_ issues - the first is incorrect
> handling of the relative addressing, and the second is unnecessary for
> some drivers code which is spending CPU time for nothing, not to mention
> additional bugs. My proposal was intended to fix the second, I'm not
> proposing to never fix the first.
>


So let's just have the patches, I'll even enable it for nv50+ as well at
some point to have one less worry about broken state tracker code.

And I've always been in favour of not destroying information at the
gallium barrier ...

> Sorry if I misunderstood something, english is not my native language. 
> 
> Vadim
> 
>> Jose
> 
> 
> 
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] Allowing the reading of outputs for some drivers

Reply via email to