Issue 173419
Summary LoopUnroll epilogue change causes regalloc degradation
Labels new issue
Assignees
Reporter jdenny-ornl
[@valerydmit previously reported](https://github.com/llvm/llvm-project/pull/156549#issuecomment-3475229746) that PR #156549 (landed as 6d44b9082e42b918a152098ec70ed409c4da8c79) causes performance degradation due to changes it induces in LLVM's register allocator (regalloc). Here, I have attached [f-before-after.zip](https://github.com/user-attachments/files/24318692/f-before-after.zip), which includes:

- `f.ll`: The reproducer that @valerydmit posted there.
- `f-opt-O3-before.ll`: The result of running `opt -O3 --preserve-ll-uselistorder -S f.ll` at b36e762cdb2e90e29f65c7abffc00541addfed3f, which is the parent of the above commit.
- `f-opt-O3-after.ll`: The result of running that again but at the above commit.

Here is the result of `diff -u f-opt-O3-before.ll f-opt-O3-after.ll`, except that, for readability, I have manually removed superficial differences (i.e., label renames, changes to predecessor comments, and the `@llvm.assume` declaration):

```
@@ -81,19 +81,21 @@
   %14 = sub i32 %13, %9
   %xtraiter = and i32 %14, 1
   %15 = icmp eq i32 %10, %9
-  br i1 %15, label %omp_collapsed.exit.loopexit.unr-lcssa, label %omp_collapsed.body.lr.ph.new
+  br i1 %15, label %omp_collapsed.body.epil, label %omp_collapsed.body.lr.ph.new
 
 omp_collapsed.body.lr.ph.new: ; preds = %omp_collapsed.body.lr.ph
   %unroll_iter = and i32 %14, -2
   br label %omp_collapsed.body
 
 omp_collapsed.exit.loopexit.unr-lcssa:
-  %omp_collapsed.iv3.unr = phi i32 [ 0, %omp_collapsed.body.lr.ph ], [ %omp_collapsed.next.1, %omp_collapsed.body ]
   %lcmp.mod.not = icmp eq i32 %xtraiter, 0
   br i1 %lcmp.mod.not, label %omp_collapsed.exit, label %omp_collapsed.body.epil
 
 omp_collapsed.body.epil:
-  %16 = add i32 %omp_collapsed.iv3.unr, %9
+  %omp_collapsed.iv3.epil.init = phi i32 [ 0, %omp_collapsed.body.lr.ph ], [ %omp_collapsed.next.1, %omp_collapsed.exit.loopexit.unr-lcssa ]
+  %lcmp.mod4 = icmp ne i32 %xtraiter, 0
+  call void @llvm.assume(i1 %lcmp.mod4)
+  %16 = add i32 %omp_collapsed.iv3.epil.init, %9
   %17 = urem i32 %16, %omp_loop.tripcount22
   %18 = udiv i32 %16, %omp_loop.tripcount22
   %19 = urem i32 %18, %omp_loop.tripcount11
```

Remarks from regalloc seem to support the original report:

```
$ llc -O3 -pass-remarks-missed=regalloc f-opt-O3-before.ll
remark: <unknown>:0:0: 5 reloads 5.000000e+01 total reloads cost 4 folded reloads 4.000000e+01 total folded reloads cost 19 virtual registers copies 1.900000e+02 total copies cost generated in loop
remark: <unknown>:0:0: 9 spills 5.750000e+00 total spills cost 13 reloads 5.437500e+01 total reloads cost 7 folded reloads 4.125000e+01 total folded reloads cost 34 virtual registers copies 2.014375e+02 total copies cost generated in function

$ llc -O3 -pass-remarks-missed=regalloc f-opt-O3-after.ll
remark: <unknown>:0:0: 10 reloads 1.000000e+02 total reloads cost 5 folded reloads 5.000000e+01 total folded reloads cost 4 virtual registers copies 4.000000e+01 total copies cost generated in loop
remark: <unknown>:0:0: 15 spills 6.062500e+00 total spills cost 21 reloads 1.034375e+02 total reloads cost 8 folded reloads 5.125000e+01 total folded reloads cost 15 virtual registers copies 4.925000e+01 total copies cost generated in function
```
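To make the before/after comparison easier to read, the function-level remarks above can be tabulated with a short script. This is only a sketch: the helper name `parse_remark` and the regular expression are my own, and the remark layout is assumed from the two lines quoted above rather than taken from any documented format.

```python
import re

# Function-level remark text copied from the llc output above.
BEFORE = (
    "9 spills 5.750000e+00 total spills cost "
    "13 reloads 5.437500e+01 total reloads cost "
    "7 folded reloads 4.125000e+01 total folded reloads cost "
    "34 virtual registers copies 2.014375e+02 total copies cost "
    "generated in function"
)
AFTER = (
    "15 spills 6.062500e+00 total spills cost "
    "21 reloads 1.034375e+02 total reloads cost "
    "8 folded reloads 5.125000e+01 total folded reloads cost "
    "15 virtual registers copies 4.925000e+01 total copies cost "
    "generated in function"
)

# Assumed layout per metric: "<count> <name> <cost> total <short name> cost".
PATTERN = re.compile(
    r"(\d+) (spills|reloads|folded reloads|virtual registers copies) "
    r"([0-9.]+e[+-]\d+) total [a-z ]+? cost"
)


def parse_remark(text):
    """Return {metric name: (count, cost)} for one remark line (hypothetical helper)."""
    return {
        name: (int(count), float(cost))
        for count, name, cost in PATTERN.findall(text)
    }


if __name__ == "__main__":
    before = parse_remark(BEFORE)
    after = parse_remark(AFTER)
    for name in before:
        b_count, b_cost = before[name]
        a_count, a_cost = after[name]
        print(
            f"{name}: count {b_count} -> {a_count} ({a_count - b_count:+d}), "
            f"cost {b_cost} -> {a_cost} ({a_cost - b_cost:+.4f})"
        )
```

Read this way, the function-level numbers show spills rising from 9 to 15 and reloads from 13 to 21 (reload cost 54.375 to 103.4375), while virtual register copies fall from 34 to 15.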

The remarks are the same regardless of which of the above two commits `llc` is built from.