Issue 79696
Summary [SCCP][ConstantRange] Missed optimization due to approximation of `xor` operation on ConstantRange
Labels new issue
Assignees
Reporter XChy
    Alive2 proof: https://alive2.llvm.org/ce/z/7h2WSQ

### Description

```llvm
define i1 @src(i64 %a) {
entry:
  %cmp = icmp ugt i64 %a, 8192
  br i1 %cmp, label %then, label %else
then:
 %ctlz = call i64 @llvm.ctlz.i64(i64 %a, i1 true) ;[0, 50]
  %conv = xor i64 %ctlz, 63                        ;[13, 63]
  %cmp1 = icmp ult i64 %conv, 13
  ret i1 %cmp1
else:
  call void @dummy()
  ret i1 0
}
```

can be folded to:

```llvm
define i1 @tgt(i64 %a) {
entry:
  %cmp = icmp ugt i64 %a, 8192
  br i1 %cmp, label %then, label %else
then:
  ret i1 0
else:
  call void @dummy()
  ret i1 0
}
```

As `opt -debug --passes=ipsccp` outputs, the constant range of `%ctlz` is [0, 50], which is precise, but that of `%conv` is [0, 63] but not [13, 63], which is imprecise. After looking into `ConstantRange::binaryXor`, I found that we reuse KnownBit's xor operation here, but as this example shows, it's an over-approximation.

### Real-world motivation

This snippet of IR is derived from [jemalloc/src/large.c@large_palloc](https://github.com/jemalloc/jemalloc/blob/f96010b7fa8ce5f83802144bdebf2bb7a6679649/src/large.c#L21) (after inline and O3 pipeline).
The example above is a reduced version. If you're interested in the original suboptimal IR and optimal IR, see also:https://godbolt.org/z/511rf89EK

**Let me know if you can confirm that it's an optimization opportunity, thanks.**
_______________________________________________
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs

Reply via email to