https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93059

--- Comment #3 from fdlbxtqi <euloanty at live dot com> ---
https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/stl_algobase.h

I have found the problem.

1. libstdc++ does not use memmove when the source and destination are
different trivially copyable types. It only uses memmove when both are the
same type, which is clearly incorrect.

The performance loss is HUGE in my benchmarks: at least 50% of performance is
lost in some critical paths.

https://godbolt.org/z/ouNiHn

For vector, the performance loss is even worse. It generates 96 lines of
assembly for different types since it does not call memmove.
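
To illustrate point 1, here is a minimal sketch of the kind of dispatch being
asked for. It is not libstdc++ code. The names bytewise_compatible and
copy_range are hypothetical, and the trait is deliberately narrow: it only
accepts non-bool integer types of the same size, assuming two's complement
(guaranteed since C++20) so that the conversion keeps the same bytes. It needs
C++17 or later.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical trait (not part of libstdc++): true when converting From to To
// keeps the object representation unchanged, so a byte-wise copy gives the
// same result as element-wise assignment. Restricted here to non-bool integer
// types of equal size; assumes two's complement.
template <class From, class To>
inline constexpr bool bytewise_compatible =
    std::is_integral_v<From> && std::is_integral_v<To> &&
    !std::is_same_v<std::remove_cv_t<From>, bool> &&
    !std::is_same_v<std::remove_cv_t<To>, bool> &&
    sizeof(From) == sizeof(To);

// Hypothetical helper: copies [first, last) to out, using memmove when the
// two element types are byte-wise compatible and std::copy otherwise.
template <class From, class To>
To* copy_range(const From* first, const From* last, To* out) {
    if constexpr (bytewise_compatible<From, To>) {
        const std::size_t n = static_cast<std::size_t>(last - first);
        if (n != 0)  // avoid passing possibly null pointers to memmove
            std::memmove(out, first, n * sizeof(From));
        return out + n;
    } else {
        return std::copy(first, last, out);  // element-wise conversion
    }
}
```

With this sketch, copying char to unsigned char (or char8_t to char) goes
through memmove, while short to int still falls back to the element-wise loop.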

2. memmove should not be used for different types unless the source or
destination pointer is char*, char const*, or std::byte* and the types are
trivially copyable. It should call memcpy because of the strict-aliasing rule.
However, I do not know whether libstdc++ can detect the strict-aliasing
context in GCC. Some magic should be added here to make it faster.
