Re: Handling labels in delay-slot scheduling

Tom de Vries Thu, 18 Nov 2010 15:29:40 -0800

Hi Jeff,

However, that doesn't work for the second example:
...
    beq    $3,$0,$L14
    nop
$L7:
    andi    $2,$2,0xffff
    ...
    bne    $3,$0,$L7
    nop
$L14:
    andi    $2,$2,0xffff
    ...
...

What is different from the first example is that here the beq owns neither the fall-through thread ($L7) nor the target thread ($L14). Same for the bne. In the first example, the jump owns both threads.


We can think of this transformation:
...
    beq    $3,$0,$L14new
$L7:
    andi    $2,$2,0xffff
    ...
    bne    $3,$0,$L7
    nop
    andi    $2,$2,0xffff
$L14new:

Could you instead make it:

    beq    $3,$0,$L14a
    andi    $2,$2,0xffff
$L7:
    andi    $2,$2,0xffff
    ...
    bne    $3,$0,$L7
    nop
$L14:
    andi    $2,$2,0xffff
$L14a:
    ...

[ Copy the insn from the L14 target into the delay slot of the first branch. ]


That is indeed possible in this specific example, because executing
'andi $2,$2,0xffff' once more does not change the value of $2, but that does
not always work (e.g., not for addi $2,$2,1). This might be an OK
intermediate solution though, thanks for the idea.
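
To make the difference concrete, here is a minimal sketch in C. It only models the two instructions as functions on the value of $2; the helper names (andi_ffff, addi_1, same_when_repeated) are made up for illustration and have nothing to do with reorg.c. The addi model also ignores the overflow trap of the real instruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Models 'andi $2,$2,0xffff': keep only the low 16 bits of $2. */
static uint32_t andi_ffff(uint32_t r2) { return r2 & 0xffffu; }

/* Models 'addi $2,$2,1' (assumption: overflow trap ignored). */
static uint32_t addi_1(uint32_t r2) { return r2 + 1u; }

/* Hypothetical helper: true if executing the insn twice leaves $2
   with the same value as executing it once, for this input. */
static bool same_when_repeated(uint32_t (*insn)(uint32_t), uint32_t r2)
{
    return insn(insn(r2)) == insn(r2);
}

/* Small self-check of the claim made above. */
static void check_repeat_safety(void)
{
    /* Masking twice equals masking once, so the extra copy is harmless. */
    assert(same_when_repeated(andi_ffff, 0x12345678u));
    /* Incrementing twice differs from incrementing once. */
    assert(!same_when_repeated(addi_1, 41u));
}
```

So duplicating the insn into the delay slot is only safe when executing it a second time is a no-op, which is why this can only be an intermediate solution.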

Step #2

    beq    $3,$0,$L14a
    andi    $2,$2,0xffff
$L7:
    andi    $2,$2,0xffff
$L7a:
    ...
    bne    $3,$0,$L7a
    andi    $2,$2,0xffff
$L14:
    andi    $2,$2,0xffff
$L14a:
    ...
Same transformation, copying the insn from the L7 target into the delay slot of the second branch.
Then after reorg has completed (so you don't have to teach reorg about code labels in sequences), squish the redundant insns together and insert the code label into the SEQUENCE, resulting in
    beq    $3,$0,$L14a
$L7:
    andi    $2,$2,0xffff
$L7a:
    ...
    bne    $3,$0,$L7a
$L14:
    andi    $2,$2,0xffff
$L14a:
    ...
You'd still have to deal with fallout of code labels in sequences post-reorg, so maybe it's not that big of a win to delay having the code label appear in the sequence until after reorg.c has completed.


Right.

The other question I'd ask is what's the real penalty these days in not filling the slots? I know that on later out-of-order PA chips filling slots was barely worth the effort, I guess it's still profitable on the low-end embedded MIPS chips?

About the penalty, I don't really know. But since the optimization is both filling delay slots and removing duplicate code, it looks like a good idea to me.

Thanks,
- Tom
