https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94787

Wilco <wilco at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wilco at gcc dot gnu.org

--- Comment #3 from Wilco <wilco at gcc dot gnu.org> ---
(In reply to Gabriel Ravier from comment #1)
> Inversely, I'd also suggest doing the opposite. That is, if there is no
> hardware popcount instruction, `__builtin_popcount(v) == 1` should be
> optimized to `v && !(v & (v - 1))`

I actually posted a patch for this and for popcount(x) > 1, since the reverse
transformation is faster on all targets, even those that have a popcount
instruction (since it is typically more expensive). This is true on x86 as
well: (x-1) <u (x & -x) is never slower than using popcount.
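
To make the forms concrete, here is a minimal sketch in C of the tests being
discussed. Here `<u` is read as an unsigned less-than comparison. The function
names and the `count_mismatches` helper are made up for illustration and are
not taken from the patch; the code assumes GCC or Clang for
`__builtin_popcount`.

```c
#include <stdbool.h>

/* Reference form: exactly one bit set, via the GCC/Clang builtin. */
static bool one_bit_popcount(unsigned x) { return __builtin_popcount(x) == 1; }

/* Form quoted from comment #1: nonzero, and clearing the lowest bit gives 0. */
static bool one_bit_and(unsigned x) { return x && !(x & (x - 1)); }

/* Single unsigned compare: x & -x isolates the lowest set bit, and
   x - 1 is below it only when x is that bit alone. For x == 0 the
   left side wraps to UINT_MAX, so the test is false as required. */
static bool one_bit_cmp(unsigned x) { return (x - 1) < (x & (0u - x)); }

/* popcount(x) > 1: something is left after clearing the lowest set bit. */
static bool multi_bit(unsigned x) { return (x & (x - 1)) != 0; }

/* Hypothetical self-check: counts disagreements with the builtin over
   all 16-bit values and their bitwise complements. */
static unsigned count_mismatches(void)
{
    unsigned bad = 0;
    for (unsigned i = 0; i < 0x10000u; i++) {
        unsigned samples[2] = { i, ~i };
        for (int k = 0; k < 2; k++) {
            unsigned x = samples[k];
            bool ref = one_bit_popcount(x);
            if (one_bit_and(x) != ref) bad++;
            if (one_bit_cmp(x) != ref) bad++;
            if (multi_bit(x) != (__builtin_popcount(x) > 1)) bad++;
        }
    }
    return bad;
}
```

Whether the compare form actually wins on a given target depends on the
instruction costs there, so this only shows that the forms agree, not which
one is faster.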

So I suggest not having LLVM emit popcount for this even if there is a popcount
instruction, since that is non-optimal for pretty much every target.

See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90693
