https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117008

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|1                           |0
             Blocks|                            |53947
             Status|NEW                         |UNCONFIRMED

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Note that your `total+=values[index];` loop could be reduced to just
`total += values.count();`, which would be over 10x faster.
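
A minimal sketch of that reduction, assuming `values` is a `std::bitset` (the
mention of `count()` using popcount suggests this, but the benchmark source is
not quoted here). The width `kBits` and the function names are hypothetical,
chosen only for illustration:

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical width; the benchmark's actual bitset size is not shown here.
constexpr std::size_t kBits = 1024;

// Per-bit loop in the style of the benchmark: adds each bit to the total.
std::size_t sum_by_loop(const std::bitset<kBits>& values) {
    std::size_t total = 0;
    for (std::size_t index = 0; index < values.size(); ++index)
        total += values[index];
    return total;
}

// Suggested reduction: count() returns the number of set bits, which is
// the same sum the loop above computes one bit at a time.
std::size_t sum_by_count(const std::bitset<kBits>& values) {
    std::size_t total = 0;
    total += values.count();
    return total;
}

// Sanity check that both forms agree for a given input.
void check_same(const std::bitset<kBits>& values) {
    assert(sum_by_loop(values) == sum_by_count(values));
}
```

Both functions return the same total for any input; only the second avoids
visiting each bit individually.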


I am not sure this is a useful benchmark either, because count uses
popcount directly. Maybe GCC could detect the popcount here, but I am not sure.
LLVM does a slightly better job at vectorizing the loop but still messes it up.

Plus, once you add other code around values[index], the vectorizer will no
longer kick in, so the slowdown only affects this bad micro-benchmark.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
