https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117008

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2024-10-08
             Target|                            |x86_64-*-*

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Looks like the reduction loop is vectorized and that is causing the slowdown.

Semi-reduced testcase (headers not expanded):
```
#include <bitset>
void g(std::bitset<12800000> &);

int f()
{
  unsigned int total = 0;
  std::bitset<12800000> values;
  g(values);
  for (unsigned int index = 0; index != 12800000; ++index)
    total += values[index];
  return total;
}
```

On Linux, you need `-m32 -O2 -mavx2`. The `-m32` is needed because the bitset
uses long: on mingw that is 32 bits, while on Linux it is 64 bits, and the
64-bit case does not get vectorized.
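
To make the dependence on the width of long easier to see, here is a rough
sketch of what the loop does per bit. It is a simplified model and not the
actual libstdc++ code; `word_t`, `bits_per_word`, `test_bit` and `sum_bits`
are hypothetical names used only for illustration.

```cpp
#include <climits>
#include <cstddef>
#include <vector>

// Hypothetical simplified model of a word-based bitset (not libstdc++'s code).
// The word type is long-sized: 32 bits with mingw or -m32, 64 bits on
// x86_64 Linux.
using word_t = unsigned long;
constexpr std::size_t bits_per_word = sizeof(word_t) * CHAR_BIT;

// Read one bit: pick the word, shift the wanted bit down, mask it.
inline bool test_bit(const std::vector<word_t> &words, std::size_t index)
{
  return (words[index / bits_per_word] >> (index % bits_per_word)) & 1UL;
}

// Same shape as the reduction loop in f() above: add up every bit.
unsigned int sum_bits(const std::vector<word_t> &words, std::size_t nbits)
{
  unsigned int total = 0;
  for (std::size_t index = 0; index != nbits; ++index)
    total += test_bit(words, index);
  return total;
}
```

In this model the element type that the shifts and masks operate on changes
with the width of long, which is the difference between the two targets
described above.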
