On January 19, 2018 3:15:45 PM GMT+01:00, Tom de Vries <tom_devr...@mentor.com> wrote:
>On 01/18/2018 02:27 PM, Tom de Vries wrote:
>> On 01/18/2018 12:40 AM, Cesar Philippidis wrote:
>>> In PR83920, I encountered a nvptx bug where live predicate variables
>>> were clobbered before their value was broadcast.
>> 
>> Hi,
>> 
>> I've managed to reproduce the problem based on the description in the PR.
>
>> I think the way to address it is using a tmp .pred reg like so:
>> ...
>> {
>>    .reg .u32 %x;
>>    mov.u32 %x,%tid.x;
>>    setp.ne.u32 %rnotvzero,%x,0;
>> }
>> 
>> {
>>    .reg .pred %rcond2;
>>    setp.eq.u32 %rcond2, 1, 0; // workaround
>> 
>>    @%rnotvzero bra Lskip;
>>    ...
>>    setp.<op>.<type> %rcond,op1,op2; // could be here, could be earlier
>>    mov.b1 %rcond2, %rcond; // used pseudo opcode mov.b1 for convenience
>>   Lskip:
>>    selp.u32 %rcondu32,1,0,%rcond2;
>>    shfl.idx.b32 %rcondu32,%rcondu32,0,31;
>>    setp.ne.u32 %rcond,%rcondu32,0;
>> }
>> ...
>> 
>
>Hi,
>
>this is the fix that I plan to commit (similar to the scheme listed
>above, but modified to keep the selp.u32 using rcond, which is easier in
>code generation).
>
>Build and reg-tested on x86_64 with nvptx accelerator.
>
>Richard, this is an 8 regression for the nvptx target. OK for stage 4 or
>defer to stage1?

OK for stage 4.

Richard. 

>Thanks,
>- Tom
