Issue 90033
Summary [RISCV] Register copies in a loop should be eliminated
Labels new issue
Assignees
Reporter mgudim
For this code in the gcc benchmark:

```
typedef struct simple_bitmap_def
{
  unsigned char *popcount;
  unsigned int n_bits;
  unsigned int size;
  unsigned long elms[1];
} *sbitmap;
typedef const struct simple_bitmap_def *const_sbitmap;

typedef unsigned long *sbitmap_ptr;
typedef const unsigned long *const_sbitmap_ptr;
static unsigned long sbitmap_elt_popcount (unsigned long);

void
sbitmap_a_or_b (sbitmap dst, const_sbitmap a, const_sbitmap b)
{
  unsigned int i, n = dst->size;
  sbitmap_ptr dstp = dst->elms;
  const_sbitmap_ptr ap = a->elms;
  const_sbitmap_ptr bp = b->elms;
  unsigned char has_popcount = dst->popcount != ((void *) 0);

  for (i = 0; i < n; i++)
    {
      const unsigned long tmp = *ap++ | *bp++;
      *dstp++ = tmp;
    }
}
```

We get copies in the loop body:

```
  ld  a4, 0(a3)
  ld  a5, 0(a2) 
  addi  a1, a3, 8
  addi  a2, a2, 8
  or  a4, a4, a5
  addi  a3, a0, 8
  sd  a4, 0(a0)
  mv  a0, a3
  mv  a3, a1
  bne a1, a6, .LBB0_2
```

Copies are introduced by PHI Elimination. The code before PHI Elimination:

```
bb.2.for.body:
; predecessors: %bb.1, %bb.2
  successors: %bb.3(0x04000000), %bb.2(0x7c000000); %bb.3(3.12%), %bb.2(96.88%)

  %5:gpr = PHI %3:gpr, %bb.1, %10:gpr, %bb.2
  %6:gpr = PHI %1:gpr, %bb.1, %9:gpr, %bb.2
  %7:gpr = PHI %2:gpr, %bb.1, %8:gpr, %bb.2
  %8:gpr = ADDI %7:gpr, 8
  %16:gpr = LD killed %7:gpr, 0 :: (load (s64) from %ir.ap.014, !tbaa !15)
  %9:gpr = nuw ADDI %6:gpr, 8
  %17:gpr = LD killed %6:gpr, 0 :: (load (s64) from %ir.bp.015, !tbaa !15)
  %18:gpr = OR killed %17:gpr, killed %16:gpr
  %10:gpr = nuw ADDI %5:gpr, 8
  SD killed %18:gpr, killed %5:gpr, 0 :: (store (s64) into %ir.dstp.016, !tbaa !15)
  BNE %8:gpr, %4:gpr, %bb.2
  PseudoBR %bb.3
```

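Note that `SD killed %18:gpr, killed %5:gpr, 0 :: (store (s64) into %ir.dstp.016, !tbaa !15)` uses the induction variable `%5` after its updated value has been computed by `%10:gpr = nuw ADDI %5:gpr, 8`. Since `%5` is still live when `%10` is defined, the two cannot share a register, and a copy is needed on the back edge.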

However, it is legal to move the store before the add. The situation is similar for the other copies.
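
For illustration, with each pointer used before it is incremented, the loop body could look something like the following. This is a hand-written sketch, not compiler output; the register assignment is assumed to mirror the listing above (`a3` for `ap`, `a2` for `bp`, `a0` for `dstp`, `a6` for the loop bound).

```
.LBB0_2:
  ld  a4, 0(a3)
  ld  a5, 0(a2)
  addi  a3, a3, 8
  addi  a2, a2, 8
  or  a4, a4, a5
  sd  a4, 0(a0)
  addi  a0, a0, 8
  bne a3, a6, .LBB0_2
```

Both `mv` instructions disappear here, which would bring the loop body from ten instructions down to eight.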

Possible solutions:

(1) Have some scheduling pass before PHI elimination. Right after non-global `ISel` there is a scheduling step, where the target can choose a custom scheduler via `ST.getDAGScheduler`. None of the existing targets use this and, as I understand it, this code will be replaced by something else soon? Also, these schedulers are bottom-up, while I think this situation is best handled top-down.
Another possibility is to add a top-down scheduler somewhere before PHI Elimination?

(2) Do this reordering in some other existing non-scheduler pass? Maybe as a first step of PHI elimination? We would have to reproduce some of the scheduler's logic, though, which doesn't seem right.

(3) Something else?

What do you think?

CC:
@topperc @preames @asb @wangpc-pp 