RE: [RFC, ARM] cortex_a8_call cost

Ramana Radhakrishnan Tue, 03 Jan 2012 03:55:21 -0800

Hi Dmitry,

Sorry about the late reply - been on vacation and catching up today on email.

Here's why I'm asking. In the following example, dependence cost of 32
for cortex_a8_call causes insns 464 and 575 to be separated by 308 (in
spite of having the same priority), because 575 is not ready at tick 12, which
causes generation of separate IT-blocks for them on Thumb-2.

;;<---->  9-->   300 r0=call [`spec_putc']             :cortex_a8_issue_branch
;;<---->  9-->   306 r3=sl 0>>0x18^r8                  :cortex_a8_default
;;<----> 10-->   309 cc=cmp(r5,r8)                     :cortex_a8_default
;;<----> 11-->   307 r3=[r3*0x4+r9]                    :cortex_a8_load_store_1
;;<----> 12-->   464 (cc) r2=0x1                       :cortex_a8_default
;;<----> 13-->   308 sl=sl<<0x8^r3                     :cortex_a8_default
;;<----> 41-->   575 (cc) [sp+0x4]=r2                  :cortex_a8_load_store_1

Insn 575 has a true dependency on call insn 300 through r2, which is a
CALL_USED_REG; because 464 is conditional, it does not kill the call's
clobber of r2, so 575 retains its true dependency on 300.

Setting cortex_a8_call cost to 1 saves 186 bytes on SPEC2000 INT (but
I'm not sure whether it's only because of less IT-block splitting).
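
As a rough illustration of the arithmetic above (a sketch only, not how the
scheduler is implemented): assume a dependent insn becomes ready no earlier
than the producer's issue tick plus the dependence cost. The function name
and this simple additive model are assumptions made for illustration, and the
separate dependence of 575 on 464 is ignored.

```python
def earliest_ready_tick(producer_tick, dependence_cost):
    """Earliest tick at which a consumer can issue, under the assumed
    model: producer issue tick + dependence cost (hypothetical helper)."""
    return producer_tick + dependence_cost


# Call insn 300 issues at tick 9 in the dump above.
CALL_TICK = 9

# With the current cortex_a8_call cost of 32, insn 575 waits until tick 41.
print(earliest_ready_tick(CALL_TICK, 32))  # 41

# With the proposed cost of 1, the call alone would allow tick 10.
print(earliest_ready_tick(CALL_TICK, 1))   # 10
```

Under this model, a cost of 1 means the call is no longer what keeps 575
from being ready alongside 464 at tick 12.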

I doubt that the size savings you are seeing are just because of less IT-block splitting. If you measured the number of spills, my suspicion is that you'd be seeing fewer spills because of this change, and any change you see in IT-block splitting is a consequence of that.

For the A8 this should be OK - try a few benchmarks to be sure it doesn't spring any surprises performance-wise.


cheers
Ramana

---
Ramana Radhakrishnan
PDSW Tools
ARM Ltd.
