Hi Carl,

on 2023/7/3 23:57, Carl Love wrote:
> Kewen:
>
> On Fri, 2023-06-30 at 15:20 -0700, Carl Love wrote:
>> Segher never liked the above way of looking at the assembly.  He
>> prefers:
>>    gcc -S -g -mcpu=power8 -o vsx-vector-6-func-2lop.s vsx-vector-6-func-2lop.c
>>
>>    grep xxlor vsx-vector-6-func-2lop.s | wc
>>         34      68     516
>>
>> So, again, I get the same count of 34 on both makalu and genoa.  But
>> again, that doesn't agree with what make script/scan-assembler thinks
>> the counts should be.
>>
>> When I looked at the vsx-vector-6-func-2lop.s I see on BE:
>>
>>    ....
>>    lxvd2x 0,10,9
>>    xxlor 0,12,0
>>    xxlnor 0,0,0
>>    ...
>>
>> I was guessing that it was adjusting the data layout from the load.
>> But looking again more carefully versus LE:
>>
>>    ....
>>    lxvd2x 0,31,9
>>    xxpermdi 0,0,0,2
>>    xxlor 0,12,0
>>    xxlnor 0,0,0
>>    xxpermdi 0,0,0,2
>>    ....
>>
>> the xxpermdi is probably what is really doing the data layout change.
>>
>> So, we have the issue that looking at the assembly gives different
>> instruction counts then what
>>
>>    dg-final { scan-assembler-times {\mxxlor\M} }
>>
>> comes up with???  Now I am really confused.  I don't know how the
>> scan-assembler-times works but I will go see if I can find it and see
>> if I can figure out what the issue is.  I would expect that the
>> scan-assembler is working off the --save-temp files, which get deleted
>> as part of the run.  I would guess that scan-assembler does a grep to
>> find the instructions and then maybe uses wc to count them???  I will
>> go see if I can figure out how scan-assembler-times works.
>
> OK, I figured out why I was getting 34 xxlor instructions instead of
> the 22 that the scan-assembler-times was getting.  The difference was
> when I compiled the program I forgot to use -O2.  So with -O2 I get the
> same number of xxlor instructins as scan-assembler-instructions.  I get
> 34 if I do not specify optimization.

OK, thanks for looking into it.  When you run a test case through
RUNTESTFLAGS, you can add "-v" (repeat it for more verbosity) to
RUNTESTFLAGS; the exact compile commands then show up in the output.
I usually reproduce things this way, and I hope it helps avoid this
kind of inconsistency.

>
> So, I think the scan-assembler-times are all correct.
>
> As Peter says, counting xxlor is a bit problematic in general.  We
> could just drop counting xxlor or have the LE/BE count qualifier for
> the instructions.  Your call.

Yeah, I agree that counting xxlor in the checking code (from function
main) is fragile, but as you said, we still want to check that the
expected xxlor is generated for the bif vec_or.  So I'd prefer to split
the existing case into a compile part and a run part.  I'll reply with
more details to your latest v3.

Thanks,
Kewen
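
P.S. On the counting itself: as far as I understand it,
scan-assembler-times applies the given regexp to the assembly that the
test run produces with the test's own options (hence the difference
without -O2) and counts the matches.  Below is a rough Python
approximation of that idea, not the DejaGnu implementation.  The helper
name count_insn is made up for illustration, Python's \b stands in for
Tcl's \m and \M word anchors, and the snippets are the BE and LE
fragments quoted above.

```python
import re


def count_insn(asm_text: str, mnemonic: str) -> int:
    """Count whole-word occurrences of a mnemonic in assembly text."""
    # Hypothetical helper: Tcl's \m and \M anchor at the start and end
    # of a word; Python's \b is used here as an approximation of both.
    pattern = re.compile(r"\b" + re.escape(mnemonic) + r"\b")
    return len(pattern.findall(asm_text))


# Fragments copied from the quoted BE and LE output.
BE_ASM = "\n".join([
    "lxvd2x 0,10,9",
    "xxlor 0,12,0",
    "xxlnor 0,0,0",
])

LE_ASM = "\n".join([
    "lxvd2x 0,31,9",
    "xxpermdi 0,0,0,2",
    "xxlor 0,12,0",
    "xxlnor 0,0,0",
    "xxpermdi 0,0,0,2",
])

if __name__ == "__main__":
    for name, text in (("BE", BE_ASM), ("LE", LE_ASM)):
        print(name, "xxlor:", count_insn(text, "xxlor"),
              "xxpermdi:", count_insn(text, "xxpermdi"))
```

The word anchors matter because a plain substring search would also
count longer mnemonics that merely start with the same letters, such as
xxlorc.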