https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87528

Alexander Monakov <amonakov at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu.org

--- Comment #2 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
x86 has a native popcount instruction only with -mpopcnt (implied by
-msse4.2); otherwise popcount(int) is first zero-extended to 64 bits and then
passed to __popcountdi2 (the 64-bit libgcc popcount).

If the original code computes popcount on narrow types, or on values with only
a few non-zero bits, the libgcc replacement can be expected to be slower.
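
As a rough illustration (a sketch, not the testcase from this report; the
function names are hypothetical), the kind of loop that popcount detection
targets, next to the builtin it is turned into, might look like this:

```c
#include <stdint.h>

/* Hypothetical example: clears one set bit per iteration, so the
   loop runs only as many times as there are set bits in x. */
int popcount_loop(uint32_t x)
{
    int n = 0;
    while (x) {
        x &= x - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Roughly what the loop becomes once it is recognized as popcount.
   Without a native popcount instruction, this builtin is lowered
   to a libgcc call, as described above. */
int popcount_builtin(uint32_t x)
{
    return __builtin_popcount(x);
}
```

For an input with one or two bits set, the loop version finishes in one or two
iterations, while the libgcc routine always processes a full 64-bit value.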

Even if popcount detection is an optimization for code size, for speed GCC
should probably avoid replacing a simple loop with a libgcc call (just as
final value replacement avoids replacing a loop with computations involving
modulus/division).
