[llvm-bugs] [Bug 142042] Performance: LLVM 20 aggressively optimizes popcnt and results in worse performance

LLVM Bugs via llvm-bugs Thu, 29 May 2025 14:43:33 -0700

Issue	142042
Summary	Performance: LLVM 20 aggressively optimizes popcnt and results in worse performance
Labels	new issue
Assignees
Reporter	jabraham17

I am finding that the same microbenchmark is much slower when compiled with clang 20 than when compiled with clang 19.


[This link](https://godbolt.org/z/5Md1dG9YG) has the full benchmark and the assembly for clang 19 and 20.

The LLVM 20 code is much longer, apparently because LLVM 20 vectorizes the computation instead of using the `popcnt` instruction. For some reason the vectorized version is slower: the LLVM 20 build takes 0.15s, while the LLVM 19 build takes 0.05s.

Using the naive popcount loop in the C version does seem to get pattern-matched better, and results in LLVM 20 being just as fast, if not faster:

```
    uint64_t c = 0;
    while (n) {
        n &= (n - 1);
        c++;
    }
    return c;
```
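
For anyone who wants to reproduce the comparison without opening the link, here is a minimal sketch that wraps the loop above in a function and places it next to the compiler builtin. The names `popcount_loop`, `popcount_builtin`, and `popcount_sum` are hypothetical and are not taken from the linked benchmark; the summing loop is only an assumption about the kind of code where the vectorizer might replace scalar `popcnt` with a vector sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The clear-lowest-set-bit loop quoted above; it iterates once per set bit. */
uint64_t popcount_loop(uint64_t n) {
    uint64_t c = 0;
    while (n) {
        n &= (n - 1); /* clears the lowest set bit */
        c++;
    }
    return c;
}

/* GCC/Clang builtin, used here as a reference implementation. */
uint64_t popcount_builtin(uint64_t n) {
    return (uint64_t)__builtin_popcountll(n);
}

/* Hypothetical harness (not the reporter's benchmark): sums the popcount of
 * every element and checks that both implementations agree. */
uint64_t popcount_sum(const uint64_t *data, size_t len) {
    uint64_t total = 0;
    for (size_t i = 0; i < len; i++) {
        uint64_t bits = popcount_loop(data[i]);
        assert(bits == popcount_builtin(data[i]));
        total += bits;
    }
    return total;
}
```

Compiling a loop like `popcount_sum` at the same optimization level with clang 19 and clang 20 and diffing the assembly should show whether scalar `popcnt` or a vectorized sequence is emitted; note that `popcnt` is only available when the target features allow it (for example with `-mpopcnt` on x86-64).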

