https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115462
--- Comment #4 from Hu Lin <lin1.hu at intel dot com> ---
Created attachment 58470
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58470&action=edit
A short case

I tested the file with
1) -Ofast -flto -march=skylake-avx512 -mfpmath=sse -funroll-loops
2) -O2 -march=native (on an Icelake server)
Both generate redundant mov.