https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80373
            Bug ID: 80373
           Summary: non-optimal handling of copying a std::complex<double>
           Product: gcc
           Version: 7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: no...@turm-lahnstein.de
  Target Milestone: ---

When a std::complex<double> is copied from one memory location to another, four
movsd instructions are used. However, it is possible to use two movups
instructions instead, which are faster (at least on some hardware) and need less
memory (36 bytes for the movsd version, but only 16 for the movups version).

Consider the following example:

    #include <complex>
    void get(std::complex<double> *res){
        res[1]=res[0];
    }

It is compiled to:

    movsd   (%rdi), %xmm0
    movsd   %xmm0, 16(%rdi)
    movsd   8(%rdi), %xmm0
    movsd   %xmm0, 24(%rdi)
    ret

but could be:

    movups  (%rdi), %xmm0
    movups  %xmm0, 16(%rdi)
    ret

That is in fact what clang and icc17 do.
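For anyone who wants to reproduce the comparison, here is a minimal, self-contained sketch built around the function from the report. The names get_memcpy and copies_agree are hypothetical and not part of the report. The memcpy variant is only an assumed alternative way to express the same 16-byte copy. No claim is made here about which instructions a given compiler version emits for either function.

```cpp
#include <cassert>
#include <complex>
#include <cstring>

// The function from the report: copy element 0 over element 1.
void get(std::complex<double> *res) {
    res[1] = res[0];
}

// Hypothetical alternative (not from the report): express the same copy
// as one 16-byte block copy instead of a member-wise assignment.
void get_memcpy(std::complex<double> *res) {
    static_assert(sizeof(std::complex<double>) == 2 * sizeof(double),
                  "std::complex<double> is expected to be two packed doubles");
    std::memcpy(static_cast<void *>(&res[1]),
                static_cast<const void *>(&res[0]),
                sizeof(std::complex<double>));
}

// Returns true when both variants leave the array in the same state.
bool copies_agree() {
    std::complex<double> a[2] = {{1.5, -2.5}, {0.0, 0.0}};
    std::complex<double> b[2] = {{1.5, -2.5}, {0.0, 0.0}};
    get(a);
    get_memcpy(b);
    assert(a[0] == b[0]);  // the source element must be untouched
    return a[1] == b[1] && a[1] == std::complex<double>(1.5, -2.5);
}
```

Compiling this file with something like g++ -O2 -S and reading the assembly generated for get and get_memcpy should show whether the compiler at hand uses scalar or packed moves for each variant.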