On Wed, Apr 27, 2016 at 4:39 PM, Uros Bizjak <ubiz...@gmail.com> wrote:
> On Wed, Apr 27, 2016 at 4:26 PM, Ilya Enkovich <enkovich....@gmail.com> wrote:
>
>>>> >> > X86_TUNE_GENERAL_REGS_SSE_SPILL: Try to spill general regs to SSE
>>>> >> > regs instead of memory.
>>>> >> >
>>>> >> > I tried enabling the above tuning with -march=bdver4 -Ofast
>>>> >> > -mtune-ctrl=general_regs_sse_spill.
>>>> >> > I did not find any code differences.
>>>> >> >
>>>> >> > Looking at the below code to enable this tune, mmx ISA needs to be
>>>> >> > turned off.
>>>> >> >
>>>> >> > static reg_class_t
>>>> >> > ix86_spill_class (reg_class_t rclass, machine_mode mode) {
>>>> >> >   if (TARGET_SSE && TARGET_GENERAL_REGS_SSE_SPILL && ! TARGET_MMX
>>>> >> >       && (mode == SImode || (TARGET_64BIT && mode == DImode))
>>>> >> >       && rclass != NO_REGS && INTEGER_CLASS_P (rclass))
>>>> >> >     return ALL_SSE_REGS;
>>>> >> >   return NO_REGS;
>>>> >> > }
>>>> >> >
>>>> >> > All processor variants enable MMX by default and why we need to
>>>> >> > switch off mmx?
>>>> >>
>>>> >> That really looks weird to me. I ran SPEC2006 on Ofast + LTO with
>>>> >> and without -mno-mmx and -mno-mmx gives (Haswell machine):
>>>> >>
>>>> >> SPEC2006INT : +0.30%
>>>> >> SPEC2006FP  : +0.60%
>>>> >> SPEC2006ALL : +0.48%
>>>> >>
>>>> >> Which is quite surprising for disabling a hardware feature hardly
>>>> >> used anywhere now.
>>>> >
>>>> > As I said without mmx (-mno-mmx), the tune
>>>> > X86_TUNE_GENERAL_REGS_SSE_SPILL may be active now.
>>>> > Not sure if there are any other reason.
>>>>
>>>> Surely that should be the main reason I see performance gain.
>>>> So I want to ask the same question as you did: why does this important
>>>> performance feature requires disabled MMX. This restriction exists from
>>>> the very start of X86_TUNE_GENERAL_REGS_SSE_SPILL existence (at least in
>>>> trunk) and no comments on why we have this restriction.
>>>
>>> I was told by Uros, that using TARGET_MMX is to prevent intreg <-> MMX
>>> moves that clobber stack registers.
>>
>> ix86_spill_class is supposed to return a register class to be used
>> to store general purpose registers. It returns ALL_SSE_REGS which
>> doesn't intersect with MMX_REGS class. So I don't see why
>> intreg <-> MMX moves may appear. And if those moves appear we should
>> fix it, not disable the whole feature.
>>
>> @Uros, do you have a comment here?
>
> Looking at the implementation of ix86_spill_class, TARGET_MMX check
> really looks too restrictive. However, we need to check TARGET_SSE2
> and TARGET_INTERUNIT_MOVES instead, otherwise movq xmm <-> intreg
> pattern gets disabled.

I'm testing the following patch:

--cut here--
Index: i386.c
===================================================================
--- i386.c	(revision 235516)
+++ i386.c	(working copy)
@@ -53560,9 +53560,12 @@
 static reg_class_t
 ix86_spill_class (reg_class_t rclass, machine_mode mode)
 {
-  if (TARGET_SSE && TARGET_GENERAL_REGS_SSE_SPILL && ! TARGET_MMX
+  if (TARGET_GENERAL_REGS_SSE_SPILL
+      && TARGET_SSE2
+      && TARGET_INTER_UNIT_MOVES_TO_VEC
+      && TARGET_INTER_UNIT_MOVES_FROM_VEC
       && (mode == SImode || (TARGET_64BIT && mode == DImode))
-      && rclass != NO_REGS && INTEGER_CLASS_P (rclass))
+      && INTEGER_CLASS_P (rclass))
     return ALL_SSE_REGS;
   return NO_REGS;
 }
--cut here--

Uros.