On Fri, Aug 30, 2019 at 2:08 AM Hongtao Liu <crazy...@gmail.com> wrote:
>
> On Fri, Aug 30, 2019 at 2:09 AM Uros Bizjak <ubiz...@gmail.com> wrote:
> >
> > 2019-08-28 Uroš Bizjak <ubiz...@gmail.com>
> >
> > * config/i386/i386.c (ix86_register_move_cost): Do not
> > limit the cost of moves to/from XMM register to minimum 8.
> >
> > Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
> >
> > Actually committed as r274994 with the wrong ChangeLog.
> >
> > Uros.
>
> There is 11% regression in 548.exchange_r of SPEC2017.
>
> Reason for the regression:
> For 548.exchange_r, a lot of movements between gpr and xmm are
> generated as expected,
> and it reduced clocksticks by 3%.

This is OK, and expected from the patch.

> But however maybe too many xmm registers are used,
> a frequency reduction issue is triggered(average frequency reduced by 13%).
> So totally it takes more time.

This is a secondary effect that is currently not modelled by the compiler.

However, I expected that the SSE <-> int move costs in x86-tune-costs.h
would have to be retuned. Up to now, both directions were limited to a
minimum of 8, so any value lower than 8 was ignored. The minimum was
set to work around a certain limitation in reload, and that workaround
is not needed anymore.

You can simply set the values of SSE <-> int moves to 8 (which is an
arbitrary value!) to restore the previous behaviour, but I think that
a more precise cost value should be determined, probably a different
one for each direction. But until register pressure effects are
modelled, any artificially higher value will represent a workaround
and not the true reg-reg move cost.

Uros.
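
To illustrate why tuned values below 8 had no effect before the patch, here is a minimal sketch in C. It is not the actual code of ix86_register_move_cost; the struct, the enum and the function names are all hypothetical, and the cost numbers used with it are made up. It only shows the difference between a cost that is floored at 8 and one that is returned as tuned.

```c
/* Hypothetical per-CPU tuning entry; the field names are illustrative
   and do not match the real cost tables.  */
struct sse_int_move_costs {
  int sse_to_int;  /* cost of an XMM -> GPR move */
  int int_to_sse;  /* cost of a GPR -> XMM move */
};

enum move_dir { SSE_TO_INT, INT_TO_SSE };

/* Pick the tuned cost for the requested direction.  */
int tuned_cost(const struct sse_int_move_costs *c, enum move_dir d) {
  return d == SSE_TO_INT ? c->sse_to_int : c->int_to_sse;
}

/* Behaviour before the patch (sketch): the result is never below 8,
   so any tuned value lower than 8 is silently ignored.  */
int clamped_move_cost(const struct sse_int_move_costs *c, enum move_dir d) {
  int t = tuned_cost(c, d);
  return t < 8 ? 8 : t;
}

/* Behaviour after the patch (sketch): the tuned value is used as is,
   which lets the two directions differ below 8.  */
int unclamped_move_cost(const struct sse_int_move_costs *c, enum move_dir d) {
  return tuned_cost(c, d);
}
```

With a made-up entry of { 6, 4 }, the clamped variant reports 8 for both directions, while the unclamped variant reports 6 and 4. Setting both fields to 8 makes the two variants agree, which corresponds to the "restore the previous behaviour" option above.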