https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68654
--- Comment #6 from Igor Zamyatin <izamyatin at gmail dot com> ---
Created attachment 36961
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36961&action=edit
Dumps

Profilers show that core_state_transition and calc_func did indeed become slower after r228668. The first difference in the dumps seems to start in the expand pass. I attached the dumps; a quick look shows extra register copies in several places.

Options used: -m32 -Ofast -funroll-loops -flto -static -march=core-avx2