Issue 125003
Summary Missed optimization: unreachable branch could be pruned after taking into account the possible values of a constexpr lookup table
Labels new issue
Assignees
Reporter inicula
    ## Context

Consider the following code:

```cpp
#include <array>
#include <cstddef>

constexpr size_t MAX_DATA_INDEX = 1024;

constexpr std::array<size_t, 256> GET_DATA_INDEX_V1 = {};

bool foo(unsigned *data, unsigned char first_idx)
{
    size_t second_idx = GET_DATA_INDEX_V1[first_idx];
    if (second_idx >= MAX_DATA_INDEX)
        return false;

    data[second_idx] = true;
 return true;
}

constexpr std::array<size_t, 256> GET_DATA_INDEX_V2 = {0, 1, 2, /* ... */};

bool bar(unsigned *data, unsigned char first_idx)
{
 size_t second_idx = GET_DATA_INDEX_V2[first_idx];
    if (second_idx >= MAX_DATA_INDEX)
        return false;

    data[second_idx] = true;
 return true;
}
```

And [the assembly](https://godbolt.org/z/joPc8vnx8) that gets generated with clang 19.1.0 (and trunk), with `-O3` enabled:

```
foo(unsigned int*, unsigned char):
        mov     dword ptr [rdi], 1
        mov     al, 1
        ret

bar(unsigned int*, unsigned char):
        mov     eax, esi
        lea     rcx, [rip + GET_DATA_INDEX_V2]
        mov     rax, qword ptr [rcx + 8*rax]
        cmp rax, 1023
        ja      .LBB1_2
        mov     dword ptr [rdi + 4*rax], 1
.LBB1_2:
        cmp     rax, 1024
        setb    al
 ret

GET_DATA_INDEX_V2:
        .quad   0
        .quad   1
 .quad   2
        .zero   2024
```

## Discussion

When the lookup table is filled with a single value (e.g. zero), like in `foo()`, clang is able to get rid of the `second_idx >= MAX_DATA_INDEX` branch, because it sees at compile time that the only possible value that `second_idx` can have is zero, which is less than `MAX_DATA_INDEX`.

But in `bar()`, where the lookup table is filled with other values besides 0, clang doesn't see that all possible values of `second_idx` satisfy `second_idx < MAX_DATA_INDEX`, even though this is _also_ derivable at compile time. So the `second_idx >= MAX_DATA_INDEX` branch is also redundant in `bar()`, and ideally that branch would be deleted.

How hard would it be to add an optimization to clang/LLVM such that the `second_idx >= MAX_DATA_INDEX` branch in `bar()` gets pruned?

You could imagine some code that has a bunch of transformation layers through which it passes an initial index (`first_idx -> second_idx -> third_idx -> ... -> last_idx`). And when moving from one layer to the next, it does bounds checking to verify that the current index is less than `MAX_DATA_INDEX`. In this case, it seems like this kind of optimization could get rid of a pretty significant amount of effectively dead code.
_______________________________________________
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs

Reply via email to