https://llvm.org/bugs/show_bug.cgi?id=25782
Bug ID: 25782
Summary: [ppc] bad code layout causes slower than gcc in 403.gcc
Product: libraries
Version: trunk
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P
Component: Backend: PowerPC
Assignee: unassignedb...@nondot.org
Reporter: car...@google.com
CC: llvm-bugs@lists.llvm.org
Classification: Unclassified

The LLVM-generated 403.gcc is 1.7% slower than the gcc-generated code on
power8; for the input data g23.i, llvm is 8.68% slower.

In the perf result, htab_traverse consumes 10.03% of the time in the
gcc-generated code, but 16.48% of the time in the llvm-generated code.

GCC generated the following code for the loop body of function htab_traverse:

   99.87 :        1032c050:       addi    r31,r31,8                        // HOT
    0.00 :        1032c054:       cmpld   cr7,r30,r31                      // HOT
    0.00 :        1032c058:       ble     cr7,1032c08c <htab_traverse+0x7c> // HOT
    0.01 :        1032c05c:       ld      r9,0(r31)                        // HOT
    0.03 :        1032c060:       cmpldi  cr7,r9,1                         // HOT
    0.00 :        1032c064:       ble     cr7,1032c050 <htab_traverse+0x40> // HOT
    0.06 :        1032c068:       mtctr   r29
    0.00 :        1032c06c:       mr      r3,r31
    0.00 :        1032c070:       std     r2,24(r1)
    0.00 :        1032c074:       mr      r4,r28
    0.00 :        1032c078:       mr      r12,r29
    0.00 :        1032c07c:       bctrl
    0.01 :        1032c080:       ld      r2,24(r1)
    0.00 :        1032c084:       cmpdi   cr7,r3,0
    0.00 :        1032c088:       bne     cr7,1032c050 <htab_traverse+0x40>

LLVM generated the following corresponding code:

   66.56 :        10306b20:       ldu     r3,8(r26)                        // HOT
    0.00 :        10306b24:       cmpldi  r3,2                             // HOT
    0.00 :        10306b28:       blt     10306b50 <htab_traverse+0x80>    // HOT
    0.03 :        10306b2c:       mtctr   r28
    0.00 :        10306b30:       mr      r3,r30
    0.00 :        10306b34:       mr      r4,r29
    0.00 :        10306b38:       mr      r12,r28
    0.00 :        10306b3c:       std     r2,24(r1)
    0.01 :        10306b40:       bctrl
    0.01 :        10306b44:       ld      r2,24(r1)
    0.00 :        10306b48:       cmplwi  r3,0
    0.00 :        10306b4c:       beq     10306b5c <htab_traverse+0x8c>
   33.38 :        10306b50:       addi    r30,r30,8                        // HOT
    0.00 :        10306b54:       cmpld   r30,r27                          // HOT
    0.00 :        10306b58:       blt     10306b20 <htab_traverse+0x50>    // HOT

So we can see that both compilers generate similar instructions, but with
different code layout.
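For reference, the loop that both listings implement can be sketched in C
roughly as follows. This is a reconstruction from the assembly above, not a
copy of the benchmark source: the names traverse_sketch, trav_fn, EMPTY_SLOT,
DELETED_SLOT and count_live are hypothetical, and the sentinel values 0 and 1
are inferred from the `cmpldi ...,1` / `cmpldi ...,2` tests in the listings.

```c
#include <stddef.h>

/* Hypothetical names. The values 0 and 1 are inferred from the
   unsigned compares against 1 and 2 in the two listings. */
#define EMPTY_SLOT   ((void *) 0)
#define DELETED_SLOT ((void *) 1)

typedef int (*trav_fn) (void **slot, void *info);

/* Visit every live slot; stop early when the callback returns 0.
   Assumes size > 0, as the do/while shape of the loop implies. */
static void
traverse_sketch (void **entries, size_t size, trav_fn callback, void *info)
{
  void **slot = entries;
  void **limit = entries + size;

  do
    {
      void *x = *slot;

      /* Hot path according to the profile: the slot is empty or
         deleted, so control skips the call and goes straight to the
         increment and bound check. */
      if (x != EMPTY_SLOT && x != DELETED_SLOT)
        if (!callback (slot, info))   /* cold: indirect call (bctrl) */
          break;
    }
  while (++slot < limit);
}

/* Example callback (hypothetical): counts live slots, never stops. */
static int
count_live (void **slot, void *info)
{
  (void) slot;
  ++*(int *) info;
  return 1;
}
```

In this shape the load, the sentinel test and the increment/bound check form
the hot blocks, while the call sequence is the cold block that a layout pass
would ideally keep out of the fall-through path.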
In gcc's code all hot BBs are placed together, but in llvm's code the hot BBs
are separated, so the hot path has to take a branch between them, which hurts
performance. So this is a code layout problem.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
_______________________________________________
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs