Hi Carl,

on 2023/7/3 23:57, Carl Love wrote:
> Kewen:
> 
> On Fri, 2023-06-30 at 15:20 -0700, Carl Love wrote:
>> Segher never liked the above way of looking at the assembly.  He
>> prefers:
>>   gcc -S -g -mcpu=power8 -o vsx-vector-6-func-2lop.s vsx-vector-6-func-2lop.c
>>
>>   grep xxlor vsx-vector-6-func-2lop.s | wc
>>      34      68     516
>>
>> So, again, I get the same count of 34 on both makalu and genoa.  But
>> again, that doesn't agree with what make script/scan-assembler thinks
>> the counts should be.
>>
>> When I looked at the vsx-vector-6-func-2lop.s I see on BE:
>>
>>      ....
>>     lxvd2x 0,10,9
>>     xxlor 0,12,0
>>     xxlnor 0,0,0
>>      ...
>>
>> I was guessing that it was adjusting the data layout from the load. 
>> But looking again more carefully versus LE:
>>
>>     ....
>>     lxvd2x 0,31,9
>>     xxpermdi 0,0,0,2
>>     xxlor 0,12,0
>>     xxlnor 0,0,0
>>     xxpermdi 0,0,0,2
>>     ....
>>
>> the xxpermdi is probably what is really doing the data layout change.
>>
>> So, we have the issue that looking at the assembly gives different
>> instruction counts than what
>>
>>    dg-final { scan-assembler-times {\mxxlor\M} }
>>
>> comes up with???  Now I am really confused.  I don't know how
>> scan-assembler-times works but I will go see if I can find it and see
>> if I can figure out what the issue is.  I would expect that the
>> scan-assembler is working off the --save-temps files, which get deleted
>> as part of the run.  I would guess that scan-assembler does a grep to
>> find the instructions and then maybe uses wc to count them???  I will
>> go see if I can figure out how scan-assembler-times works.
> 
> OK, I figured out why I was getting 34 xxlor instructions instead of
> the 22 that the scan-assembler-times was getting.  The difference was
> that when I compiled the program I forgot to use -O2.  So with -O2 I get
> the same number of xxlor instructions as scan-assembler-times.  I get
> 34 if I do not specify optimization.

OK, thanks for looking into it.  When you run a test case with RUNTESTFLAGS,
you can add "-v" (repeat it for more detail) to RUNTESTFLAGS; the exact
compile commands then show up in the verbose output.  I usually reproduce
things this way, and I hope it helps avoid this kind of inconsistency.
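
For instance, here is a rough sketch of how I would reproduce a count by
hand.  The count_insn helper and the sample file are only illustrative (they
are not part of the patch or the testsuite), and the make invocation in the
comments assumes the usual gcc build directory layout.  The compile flags
should be copied from the verbose log rather than guessed.  grep -w only
approximates the \m...\M word anchors, and -o makes wc count matches
instead of lines.

```shell
#!/bin/sh
# Hypothetical helper: count whole-word occurrences of a mnemonic
# in an assembly file.
#   $1 = mnemonic, $2 = assembly file
count_insn() {
    grep -ow -- "$1" "$2" | wc -l | tr -d ' '
}

# Small stand-in for compiler output, copied from the LE snippet
# quoted earlier in this thread.
sample=$(mktemp)
cat > "$sample" <<'EOF'
	lxvd2x 0,31,9
	xxpermdi 0,0,0,2
	xxlor 0,12,0
	xxlnor 0,0,0
	xxpermdi 0,0,0,2
EOF

count_insn xxlor "$sample"      # xxlnor is not counted as xxlor
count_insn xxpermdi "$sample"
rm -f "$sample"

# With a real build tree (assumed invocation, run from the build dir):
#   make check-gcc RUNTESTFLAGS="-v -v powerpc.exp=vsx-vector-6-func-2lop.c"
# Then rerun the compile command found in the verbose output, adding
# -S -o vsx-vector-6-func-2lop.s, and count with:
#   count_insn xxlor vsx-vector-6-func-2lop.s
```

On the sample above this prints 1 and then 2; on the real test the number
depends on the options, which is exactly why they should come from the log.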

> 
> So, I think the scan-assembler-times are all correct.
> 
> As Peter says, counting xxlor is a bit problematic in general.  We
> could just drop counting xxlor or have the LE/BE count qualifier for
> the instructions.  Your call.

Yeah, I agree that counting xxlor in the checking code (from function main)
is fragile, but as you said we still want to check that the expected xxlor is
generated for the bif vec_or, so I'd prefer to split the existing case into a
compile part and a run part.  I'll reply with more details to your latest v3.

Thanks,
Kewen
