Re: Question about PR 48814 and ivopts and post-increment

Jeff Law Tue, 01 Dec 2015 14:51:12 -0800

On 12/01/2015 02:11 PM, Steve Ellcey  wrote:

With the current top-of-tree we now generate:


        addiu   $4,$4,1
$L8:
        lbu     $3,-1($4)
        addiu   $5,$5,1
        beq     $3,$0,$L7
        lbu     $2,-1($5)  # This is a branch delay slot
        beq     $3,$2,$L8
        addiu   $4,$4,1    # This is a branch delay slot

        subu    $2,$3,$2   # Done only once now after exiting loop.

The main problem with the new loop is that the beq comparing $2 and $3
is right before the load of $2 so there can be a delay due to the time
that the load takes.  The ideal code would probably be:

I'd start by looking at the code prior to reorg/delay slot scheduling. You may be running into the well-known issue that reorg knows nothing about latency or scheduling and happily picks whatever insn can safely fill the delay slot. In doing so, reorg may muck up the schedule badly.

If that's the case, you might test disallowing operations with > 1 cycle latency in delay slots and see how that affects a wider range of benchmarks.
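To make the trade-off concrete, here is a rough toy model, not anything taken from GCC. It assumes a single-issue in-order pipeline where a load's result is usable two cycles after issue (LOAD_LATENCY is an assumption, not a figure for any particular MIPS core) and everything else takes one cycle. POSTED is one pass through the loop quoted above; NOP_IN_SLOT is a hypothetical ordering in which the second lbu is kept out of the delay slot and a nop fills the slot instead.

```python
from typing import NamedTuple, Optional, Sequence, Tuple

LOAD_LATENCY = 2  # assumed: loaded value usable 2 cycles after issue


class Insn(NamedTuple):
    text: str
    dest: Optional[str]       # register written, if any
    srcs: Tuple[str, ...]     # registers read
    latency: int = 1          # cycles until dest can be read


def simulate(insns: Sequence[Insn]) -> Tuple[int, int]:
    """Return (stall_cycles, total_cycles) for a straight-line trace."""
    ready = {}   # register -> earliest cycle its value can be read
    cycle = 0    # next free issue cycle
    stalls = 0
    for insn in insns:
        # Issue no earlier than the point where every source is available.
        issue = max([cycle] + [ready.get(r, 0) for r in insn.srcs])
        stalls += issue - cycle
        if insn.dest is not None:
            ready[insn.dest] = issue + insn.latency
        cycle = issue + 1
    return stalls, cycle


# One trip through the posted loop, in execution order.
POSTED = [
    Insn("lbu   $3,-1($4)", "$3", ("$4",), LOAD_LATENCY),
    Insn("addiu $5,$5,1", "$5", ("$5",)),
    Insn("beq   $3,$0,$L7", None, ("$3", "$0")),
    Insn("lbu   $2,-1($5)", "$2", ("$5",), LOAD_LATENCY),  # delay slot
    Insn("beq   $3,$2,$L8", None, ("$3", "$2")),
    Insn("addiu $4,$4,1", "$4", ("$4",)),                  # delay slot
]

# Hypothetical ordering: load moved ahead of the branch, nop in the slot.
NOP_IN_SLOT = [
    Insn("lbu   $3,-1($4)", "$3", ("$4",), LOAD_LATENCY),
    Insn("addiu $5,$5,1", "$5", ("$5",)),
    Insn("lbu   $2,-1($5)", "$2", ("$5",), LOAD_LATENCY),
    Insn("beq   $3,$0,$L7", None, ("$3", "$0")),
    Insn("nop", None, ()),                                 # delay slot
    Insn("beq   $3,$2,$L8", None, ("$3", "$2")),
    Insn("addiu $4,$4,1", "$4", ("$4",)),                  # delay slot
]

print("posted:     ", simulate(POSTED))
print("nop in slot:", simulate(NOP_IN_SLOT))
```

Under these assumptions the posted loop takes one stall on the second beq, and the nop variant takes none, but both come out to 7 cycles per trip because the nop costs an issue slot. Keeping the load out of the slot only wins if something useful can fill it, which is why it is worth measuring across more than one benchmark.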


Jeff
