Issue: 143511
Summary: Missed Optimization: Constant Propagation of Store Value Due to the Endless Loop
Labels: new issue
Assignees:
Reporter: GINN-Imp
The following reduced IR is derived from https://github.com/Kitware/CMake/blob/828fa0f1937c64ee207ff775405b9ebc2ae32b76/Utilities/cmzlib/inflate.c#L1375

Godbolt: https://godbolt.org/z/h57TroYTK
alive2 proof: https://alive2.llvm.org/ce/z/vMT5nb

opt -O3 misses the optimization shown below.
Running the IPSCCP pass on the -O3 output does apply it. This suggests that a pipeline change could enable the optimization, but modifying the pipeline may not be very practical.
Is there a better way to address this?
Given the complexity of this optimization, I suspect it is not a straightforward fix. For example, changing a single pass such as InstCombinePass may not be enough, but I'm not very familiar with the internals, so I might be wrong.
I would be very grateful for any reply.

```llvm
define i32 @cm_zlib_inflateSync(ptr %0, i32 %1) {
  %3 = load ptr, ptr %0, align 8
  %4 = and i32 %1, 1
  %5 = getelementptr inbounds i8, ptr %3, i64 88
  %6 = sub i32 0, %4
  store i32 %6, ptr %5, align 8
  br label %7

7:                                                ; preds = %11, %2
  %8 = getelementptr i8, ptr %3, i64 88
  %9 = load i32, ptr %8, align 8
  %10 = icmp ugt i32 %9, 1
  br i1 %10, label %11, label %13

11:                                               ; preds = %7
  %12 = getelementptr i8, ptr %3, i64 88
  store i32 -8, ptr %12, align 8
  br label %7

13:                                               ; preds = %7
  ret i32 0
}
```

opt -O3 output (missed: `store i32 %3, ptr %7, align 8` --> `store i32 0, ptr %7, align 8`):
```llvm
define noundef i32 @cm_zlib_inflateSync(ptr readonly captures(none) %0, i32 %1) local_unnamed_addr #0 {
.peel.begin:
  %2 = and i32 %1, 1
  %3 = sub nsw i32 0, %2
  %4 = icmp ugt i32 %3, 1
  br i1 %4, label %.peel.next, label %5

.peel.next:
  br label %.peel.next

5:
  %6 = load ptr, ptr %0, align 8
  %7 = getelementptr inbounds nuw i8, ptr %6, i64 88
  store i32 %3, ptr %7, align 8                         ; can be optimized to 'store i32 0, ptr %7, align 8'
  ret i32 0
}
```
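
For reference, here is a hand-written sketch of the result I would hope for. It is not actual opt output: I edited the -O3 result by hand, dropped the `#0` attribute group so the snippet stands alone, and added comments with the value reasoning. The argument is that `%3` can only be 0 or -1, and the -1 case always branches into the endless loop, so `%3` should be 0 whenever the store is reached.

```llvm
; Hand-edited sketch of the desired result (an assumption, not produced by opt).
define noundef i32 @cm_zlib_inflateSync(ptr readonly captures(none) %0, i32 %1) local_unnamed_addr {
.peel.begin:
  %2 = and i32 %1, 1                 ; %2 is either 0 or 1
  %3 = sub nsw i32 0, %2             ; so %3 is either 0 or -1 (0xFFFFFFFF)
  %4 = icmp ugt i32 %3, 1            ; unsigned compare: true only when %3 == -1
  br i1 %4, label %.peel.next, label %5

.peel.next:                          ; endless loop, never reaches the store
  br label %.peel.next

5:                                   ; reached only when %4 is false, i.e. %3 == 0
  %6 = load ptr, ptr %0, align 8
  %7 = getelementptr inbounds nuw i8, ptr %6, i64 88
  store i32 0, ptr %7, align 8       ; %3 replaced by the constant 0
  ret i32 0
}
```

The only change from the -O3 output above is the stored value. The branch condition and the loop are left as they are.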
_______________________________________________
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
