https://llvm.org/bugs/show_bug.cgi?id=24850
Bug ID: 24850
Summary: LLVM built 445.gobmk is 17% slower than gcc on power8
Product: libraries
Version: trunk
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P
Component: Backend: PowerPC
Assignee: unassignedb...@nondot.org
Reporter: car...@google.com
CC: llvm-bugs@lists.llvm.org
Classification: Unclassified

The LLVM-built 445.gobmk is 17% slower than the GCC-built binary on POWER8:

    gcc   438s
    llvm  512s

For the input data trevord.tst, LLVM is 18% slower. The problem is in the function popgo. In the GCC-built binary it consumes 4.11% of the time; in the LLVM-built binary it consumes 13.98% of the time.

The related code snippet is in engine/board.c:

    struct change_stack_entry {
      int *address;
      int value;
    };

    static struct change_stack_entry *change_stack_pointer;

    #define POP_MOVE()\
      while ((--change_stack_pointer)->address)\
        *(change_stack_pointer->address) =\
          change_stack_pointer->value

The code sequence generated by LLVM is:

    68.05 : 1000a9f0: ld     r3,-22832(r29)        // A
     0.66 : 1000a9f4: addi   r4,r3,-16
     0.17 : 1000a9f8: std    r4,-22832(r29)        // B
     0.02 : 1000a9fc: ori    r2,r2,0
    14.30 : 1000aa00: ld     r4,-16(r3)
     0.00 : 1000aa04: cmpldi r4,0
     0.00 : 1000aa08: beq    1000aa18 <popgo+0xa8>
     0.53 : 1000aa0c: lwz    r3,-8(r3)
     0.11 : 1000aa10: stw    r3,0(r4)
     0.00 : 1000aa14: b      1000a9f0 <popgo+0x80>

Instruction A reads the variable change_stack_pointer, and instruction B writes it, so every loop iteration performs both a load and a store of the global.

The code sequence generated by GCC is:

    48.30 : 10010280: lwz    r8,24(r9)
     0.00 : 10010284: mr     r7,r9
     0.00 : 10010288: addi   r9,r9,-16
     0.63 : 1001028c: stw    r8,0(r10)
     0.00 : 10010290: ld     r10,16(r9)
     0.00 : 10010294: cmpdi  cr7,r10,0
     0.00 : 10010298: bne    cr7,10010280 <popgo+0x90>
    15.54 : 1001029c: nop

Note that GCC keeps the variable change_stack_pointer in register r9: it reads the variable at the start of the function and writes it back after the loop. Because change_stack_pointer is a static variable whose address is never assigned to another variable, it cannot be aliased by any other pointer, so this optimization is safe.
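
For illustration only, the transformation GCC performs can be written by hand at the source level. The sketch below is not taken from engine/board.c: the function names pop_move_original, pop_move_promoted and run_demo, and the small test stack, are hypothetical and exist only to show that the two loops leave memory in the same state. It assumes, as in the report, that change_stack_pointer is file-local and that its address is never taken.

```c
#include <assert.h>
#include <stddef.h>

struct change_stack_entry {
  int *address;
  int value;
};

/* File-local, address never taken (same shape as the variable in the report). */
static struct change_stack_entry *change_stack_pointer;

/* The loop as POP_MOVE() writes it: each iteration loads and stores the global. */
static void pop_move_original(void) {
  assert(change_stack_pointer != NULL);
  while ((--change_stack_pointer)->address)
    *(change_stack_pointer->address) = change_stack_pointer->value;
}

/* Hand-promoted variant (hypothetical): load the global once, keep it in a
 * local that can live in a register, and store it back once after the loop. */
static void pop_move_promoted(void) {
  struct change_stack_entry *sp = change_stack_pointer;
  assert(sp != NULL);
  while ((--sp)->address)
    *(sp->address) = sp->value;
  change_stack_pointer = sp;
}

/* Hypothetical harness: returns 1 if the chosen variant restored both values
 * and left change_stack_pointer on the sentinel entry. */
int run_demo(int use_promoted) {
  int a = 1, b = 2;
  struct change_stack_entry stack[4] = {
    { NULL, 0 },  /* sentinel: a null address terminates the loop */
    { &a, 10 },
    { &b, 20 },
    { NULL, 0 }   /* slot one past the top of the stack */
  };

  change_stack_pointer = &stack[3];
  if (use_promoted)
    pop_move_promoted();
  else
    pop_move_original();

  return a == 10 && b == 20 && change_stack_pointer == &stack[0];
}
```

Calling run_demo(0) and run_demo(1) exercises the original and the hand-promoted loop on the same data. The promoted form is only equivalent under the assumption stated above, namely that no store through an entry's address field can modify change_stack_pointer itself.
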
Even if I explicitly add -fstrict-aliasing to the LLVM command line, LLVM moves the read of change_stack_pointer out of the loop, but the loop still contains the write of change_stack_pointer.

The command line options are:

    -DSPEC_CPU -DNDEBUG -DHAVE_CONFIG_H -I. -I.. -I../include -I./include -fno-strict-aliasing -O2 -m64 -mvsx -mcpu=power8 -DSPEC_CPU_LP64