Kewen:

On Fri, 2023-06-30 at 15:20 -0700, Carl Love wrote:
> Segher never liked the above way of looking at the assembly.  He
> prefers:
>   gcc -S -g -mcpu=power8 -o vsx-vector-6-func-2lop.s vsx-vector-6-
> func-
> 2lop.c
> 
>   grep xxlor vsx-vector-6-func-2lop.s | wc
>      34      68     516
> 
> So, again, I get the same count of 34 on both makalu and genoa.  But
> again, that doesn't agree with what make script/scan-assembler thinks
> the counts should be.
> 
> When I looked at the vsx-vector-6-func-2lop.s I see on BE:
> 
>      ....
>     lxvd2x 0,10,9
>     xxlor 0,12,0
>     xxlnor 0,0,0
>      ...
> 
> I was guessing that it was adjusting the data layout from the load. 
> But looking again more carefully versus LE:
> 
>     ....
>     lxvd2x 0,31,9 
>    xxpermdi 0,0,0,2 
>    xxlor 0,12,0  
>    xxlnor 0,0,0  
>    xxpermdi 0,0,0,2     
>     ....
> 
> the xxpermdi is probably what is really doing the data layout change.
> 
> So, we have the issue that looking at the assembly gives different
> instruction counts then what 
> 
>    dg-final { scan-assembler-times {\mxxlor\M} }
> 
> comes up with???  Now I am really confused.  I don't know how the
> scan-
> assembler-times works but I will go see if I can find it and see if I
> can figure out what the issue is.  I would expect that the scan-
> assembler is working off the --save-temp files, which get deleted as
> part of the run.  I would guess that scan-assembler does a grep to
> find
> the instructions and then maybe uses wc to count them??? I will go
> see
> if I can figure out how scan-assembler-times works.

OK, I figured out why I was getting 34 xxlor instructions instead of
the 22 that the scan-assembler-times was getting.  The difference was
when I compiled the program I forgot to use -O2.  So with -O2 I get the
same number of xxlor instructins as scan-assembler-instructions.  I get
34 if I do not specify optimization.

So, I think the scan-assembler-times are all correct.

As Peter says, counting xxlor is a bit problematic in general.  We
could just drop counting xxlor or have the LE/BE count qualifier for
the instructions.  Your call.

                         Carl 

Reply via email to