------- Comment #50 from potswa at mac dot com  2009-11-03 17:53 -------
The current RAI algo uses a temporary regardless of size or class. We could put
in a "&& sizeof(_ValueType) < __MAX_TEMP_SIZE" or something... but stack overflow
from a single temporary doesn't seem to have been a concern in the past.

I don't see how being register-size in particular is important. If we were
swapping the temporary every time, we would want it to fit in a reasonable
number of registers so the compiler could optimize out read-after-writes. But
the __tmp here is only written and read once. The larger it is, the more
acceleration.
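
To make the "written and read once" point concrete, here is a minimal sketch of a
rotate-left-by-one over random access iterators. It is not the libstdc++ source;
the name rotate_left_by_one is hypothetical, and plain std::move stands in for the
_GLIBCXX_MOVE macros.

```cpp
#include <algorithm>
#include <iterator>
#include <utility>

// Hypothetical helper, for illustration only: rotate [first, last) left by one.
template <typename RandomIt>
void rotate_left_by_one(RandomIt first, RandomIt last) {
    if (first == last) return;
    typedef typename std::iterator_traits<RandomIt>::value_type value_type;
    value_type tmp = std::move(*first);   // the temporary is written exactly once
    std::move(first + 1, last, first);    // shift every other element down one slot
    *(last - 1) = std::move(tmp);         // ...and read exactly once
}
```

The temporary never takes part in the inner shifting loop, so whether it fits in
registers shouldn't matter much here.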

Proposed performance is very good for small k > 1, compared to current. Using
memmove is faster still. It's not clear such rotate operations are
popular enough to warrant a framework for optimization, though.
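
For reference, a rough sketch of what a memmove-based rotate for small k could look
like. This is an assumption about the general shape, not the proposed patch: the
name rotate_left_memmove is hypothetical, it only applies to trivially copyable
types, and it uses a heap buffer where the discussion above is about a stack
temporary.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical helper, for illustration only: rotate data[0..n) left by k.
template <typename T>
void rotate_left_memmove(T* data, std::size_t n, std::size_t k) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memmove is only valid for trivially copyable types");
    if (n == 0) return;
    k %= n;
    if (k == 0) return;
    // Save the first k elements as raw bytes.
    std::vector<unsigned char> tmp(k * sizeof(T));
    std::memcpy(tmp.data(), data, k * sizeof(T));
    // Slide the remaining n - k elements to the front (the ranges overlap).
    std::memmove(data, data + k, (n - k) * sizeof(T));
    // Drop the saved elements into the tail.
    std::memcpy(data + (n - k), tmp.data(), k * sizeof(T));
}
```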

If we ensure it's a non-move type then I also favor reverting the
_GLIBCXX_MOVE[3]().


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41351
