https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117008

--- Comment #9 from Matt Bentley <mattreecebentley at gmail dot com> ---
(In reply to Jonathan Wakely from comment #7)

> I'm certain it won't be, because (apart from vector<bool>) they access
> memory locations that are at least one byte, not repeating very similar
> bitwise ops for every bit in a word.

Have tested 4 additional scenarios: random access, sequential iteration with an
if statement, the vector equivalent, and sequential iteration with an if
statement and a more complicated action.

Can confirm that the issue only occurs in the second of those, i.e.:
        {
                std::bitset<12800000> values;

                for (unsigned int counter = 0; counter != 500; ++counter)
                {
                        for (unsigned int index = 0; index != 12800000; ++index)
                        {
                                values.set(index, plf::rand() & 1);
                        }


                        for (unsigned int index = 0; index != 12800000; ++index)
                        {
                                if (values[index])
                                {
                                        total += values[index] * 2;
                                }
                        }
                }
        }


While this statement is closer to what you might see in real-world code (i.e.
if the bit is 1, do this), I'm assuming it gets optimized into a conditional
move or something similar, as the same 25% slowdown occurs.
Once you add in a more complicated action, i.e.:

                        for (unsigned int index = 0; index != 12800000; ++index)
                        {
                                if (values[index])
                                {
                                        total += values[index] * 2;
                                        total ^= values[index];
                                }
                        }


The problem goes away. So essentially Andrew's correct, unless the user's
chosen action can be trivially optimized.
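
For anyone wanting to compare the two loop bodies side by side, here is a
minimal standalone sketch. It is not the benchmark used above. Several things
are assumptions made for illustration: std::mt19937 stands in for plf::rand(),
the bitset is heap-allocated, only the read loops are timed (not the fill), and
the names sum_simple, sum_complex, time_ms and compare are hypothetical.
Moving the loops into functions may also change how they get optimized, so
treat any numbers as indicative only.

```cpp
#include <bitset>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <memory>
#include <random>

// Simple action: the loop body for which the slowdown was reported.
// Takes a non-const reference so values[index] goes through
// std::bitset::reference, as in the snippets above.
template <std::size_t N>
unsigned long long sum_simple(std::bitset<N>& values, unsigned long long total)
{
    for (std::size_t index = 0; index != N; ++index)
    {
        if (values[index])
        {
            total += values[index] * 2;
        }
    }
    return total;
}

// More complicated action: same loop with the extra xor.
template <std::size_t N>
unsigned long long sum_complex(std::bitset<N>& values, unsigned long long total)
{
    for (std::size_t index = 0; index != N; ++index)
    {
        if (values[index])
        {
            total += values[index] * 2;
            total ^= values[index];
        }
    }
    return total;
}

// Wall-clock time of one call to f, in milliseconds.
template <typename F>
double time_ms(F&& f)
{
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Refill the bitset with random bits 'repeats' times and accumulate the
// time spent in each read loop. Assumption: fixed seed, fill not timed.
void compare(unsigned int repeats)
{
    assert(repeats > 0);
    constexpr std::size_t bits = 12800000;
    auto values = std::make_unique<std::bitset<bits>>(); // ~1.6 MB, kept off the stack
    std::mt19937 rng(12345);
    unsigned long long total_simple = 0, total_complex = 0;
    double ms_simple = 0.0, ms_complex = 0.0;

    for (unsigned int counter = 0; counter != repeats; ++counter)
    {
        for (std::size_t index = 0; index != bits; ++index)
        {
            values->set(index, rng() & 1);
        }
        ms_simple += time_ms([&] { total_simple = sum_simple(*values, total_simple); });
        ms_complex += time_ms([&] { total_complex = sum_complex(*values, total_complex); });
    }

    // Print the totals as well so the loops cannot be discarded as dead code.
    std::printf("simple:  %.1f ms (total %llu)\n", ms_simple, total_simple);
    std::printf("complex: %.1f ms (total %llu)\n", ms_complex, total_complex);
}
```

Calling compare(500) from main() matches the repeat count used above; a
smaller value is enough for a quick check. Build it with each compiler version
and the same optimization flags to see whether the gap between the two loops
reproduces.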

Will generate the -v output for the second of the scenarios now.
