------- Comment #10 from mkuvyrkov at gcc dot gnu dot org 2010-06-08 15:24 -------

Steven, I'm shamelessly stealing this PR from you.

There are two sides to this missed optimization:

1. Calculation of the PIC address is not CSE'd; this is the same as PR42495
and will be fixed there.

2. The constant "400", which expands to two instructions, is not CSE'd.

After addressing the first issue, the second problem can be fixed by asking
hoist to CSE "complicated" constants and making sure that RA will
rematerialize them instead of spilling under high register pressure.

We can define a constant as "complicated" if it takes a PARALLEL --
(parallel [(set (reg1) (const)) (clobber reg2)]) -- to set it; this is a
common way of defining instructions that should be split later and require a
temporary register to hold an intermediate value.

I'm now testing a patch that makes the ARM backend expand constants into
parallels instead of sequences of two instructions and tweaks hoist to gcse
"complicated" const_int's.

The result is the following:

test:
        push    {r4, r5, r6, lr}
        ldr     r3, .L3
        ldr     r2, .L3+4
.LPIC0:
        add     r3, pc
        ldr     r5, [r3, r2]
        mov     r4, #200
        lsl     r4, r4, #1
        mov     r6, r0
        ldr     r0, [r5, r4]
        bl      func1
        ldr     r0, [r5, r4]
        mov     r1, r6
        bl      func2
        cmp     r0, #0
        beq     .L2
        bl      func
.L2:
        ldr     r0, [r5, r4]
        bl      func3
        @ sp needed for prologue
        pop     {r4, r5, r6, pc}
.L4:
        .align  2
.L3:
        .word   _GLOBAL_OFFSET_TABLE_-(.LPIC0+4)
        .word   glob(GOT)

--

mkuvyrkov at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|steven at gcc dot gnu dot   |mkuvyrkov at gcc dot gnu dot
                   |org                         |org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42574