On Fri, May 28, 2010 at 3:03 PM, Jeff Law <l...@redhat.com> wrote:
> On 05/28/10 10:38, H.J. Lu wrote:
>>
>> Hi,
>>
>> I want to generate vzeroupper when I know upper 128bits aren't used. I
>> can't find a way to mark an pattern which zeros upper 128bits. So I added
>>
>
> Presumably you can't use a zero_extract?
>
> [ ... ]

I tried something similar, but it doesn't work, since the register
allocator will try to allocate all 8/16 SSE registers to vzeroupper.

>
>> before IRA,
>>
>> (insn 2 4 3 2 x.i:9 (set (reg/v:V8SF 59 [ y ])
>>         (reg:V8SF 21 xmm0 [ y ])) 1036 {*avx_movv8sf_internal}
>>     (expr_list:REG_DEAD (reg:V8SF 21 xmm0 [ y ])
>>         (nil)))
>>
>> (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
>>
>> (insn 6 3 7 2 x.i:10 (unspec_volatile [
>>             (const_int 0 [0])
>>         ] 17) 1960 {avx_vzeroupper_nop} (nil))
>>
>> (call_insn 7 6 8 2 x.i:10 (call (mem:QI (symbol_ref:DI ("bar2") [flags 0x41]<function_decl 0x7ffa930ecd00 bar2>) [0 S1 A8])
>>         (const_int 0 [0])) 599 {*call_0} (nil)
>>     (nil))
>>
>>
>> after IRA,
>>
>> (insn 6 3 20 2 x.i:10 (unspec_volatile [
>>             (const_int 0 [0])
>>         ] 17) 1960 {avx_vzeroupper_nop} (nil))
>>
>> (insn 20 6 7 2 x.i:10 (set (mem/c:V8SF (reg/f:DI 7 sp) [3 S32 A256])
>>         (reg:V8SF 21 xmm0)) 1036 {*avx_movv8sf_internal} (nil))
>>
>> (call_insn 7 20 21 2 x.i:10 (call (mem:QI (symbol_ref:DI ("bar2") [flags 0x41]<function_decl 0x7ffa930ecd00 bar2>) [0 S1 A8])
>>         (const_int 0 [0])) 599 {*call_0} (nil)
>>     (nil))
>>
>> Since vzeroupper will change xmm0/ymm0, the value saved on stack is wrong.
>> Is that a way to tell IRA not to move an instruction?
>>
>
> You need to expose those operands so that the dataflow information is
> correct.
>

I couldn't find a way to do so. I am adding a pass to move
avx_vzeroupper_nop just before a jump.

--
H.J.
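
For readers following the dumps above, here is a toy model in plain C of why
the ordering after IRA loses data. It is only a sketch and does not use GCC
internals or real AVX instructions: the type ymm_t and the functions
toy_vzeroupper, spill_after_vzeroupper and spill_before_vzeroupper are
hypothetical names made up for illustration. It assumes a 256-bit register
can be modelled as eight float lanes, with vzeroupper clearing lanes 4..7.

```c
#include <string.h>

/* Toy model of one 256-bit AVX register: 8 single-precision lanes
   (hypothetical type, not a GCC or intrinsics type). */
typedef struct { float lane[8]; } ymm_t;

/* Models the effect of vzeroupper on a single register:
   the upper 128 bits (lanes 4..7) become zero. */
void toy_vzeroupper(ymm_t *r)
{
    for (int i = 4; i < 8; i++)
        r->lane[i] = 0.0f;
}

/* Order seen in the "after IRA" dump: vzeroupper (insn 6) runs first,
   then ymm0 is stored to the stack slot (insn 20). */
ymm_t spill_after_vzeroupper(ymm_t ymm0)
{
    ymm_t slot;
    toy_vzeroupper(&ymm0);
    memcpy(&slot, &ymm0, sizeof slot);   /* upper half already lost */
    return slot;
}

/* Order that keeps the value intact: store to the stack slot first,
   then run vzeroupper. */
ymm_t spill_before_vzeroupper(ymm_t ymm0)
{
    ymm_t slot;
    memcpy(&slot, &ymm0, sizeof slot);   /* full 256 bits saved */
    toy_vzeroupper(&ymm0);
    return slot;
}
```

In this model, spilling after toy_vzeroupper keeps lanes 0..3 but stores
zeros for lanes 4..7, which corresponds to the wrong value saved on the
stack in the second dump. Spilling first preserves all eight lanes.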