On Wed, Nov 21, 2018 at 9:26 AM Uros Bizjak <ubiz...@gmail.com> wrote:
> Before vzeroupper gets emitted before a function call, the compiler
> checks if there are live call-saved SSE registers at the insertion
> point. This functionality is intended to handle the Windows ABI, so we
> don't clear upper parts of the XMM registers that live across the
> call.
>
> However, the called function saves only the lower 128-bit part of the
> XMM register, so it seems that wider modes have to be saved and
> restored by the caller function anyway. If this is the case, we don't
> have to cancel vzeroupper insertion before the call.
>
> Attached patch removes this cancellation, since all other ABIs clobber
> all XMM registers.
>
> 2018-11-21  Uros Bizjak  <ubiz...@gmail.com>
>
>     * config/i386/i386.c (ix86_avx_emit_vzeroupper): Remove.
>     (ix86_emit_mode_set) <case AVX_U128>: Emit vzeroupper here.
>
> The patch is untested, since I have no Windows target here. Daniel,
> can you please review the above assumptions and test the patch on
> Windows target?

It is evident from the generated asm and the compiler source that only
the lower 128 bits of the XMM registers are saved.

Now committed to mainline SVN.

Uros.
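
For readers following the thread, the behaviour change can be sketched as a small standalone model. This is not the code from config/i386/i386.c: the `xmm_set` bitmask type, the two mask macros and both function names are hypothetical, chosen only for illustration. The one external fact assumed is that the Win64 calling convention treats XMM6 through XMM15 as callee-saved, while the SysV x86-64 ABI has no callee-saved XMM registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model type: bit i set means XMMi is in the set.
   This is NOT GCC's HARD_REG_SET, just a 16-bit-wide illustration. */
typedef uint32_t xmm_set;

/* Win64: XMM6..XMM15 are callee-saved (bits 6..15). */
#define WIN64_CALL_SAVED_XMM ((xmm_set)0xFFC0u)
/* SysV x86-64: no XMM register is callee-saved. */
#define SYSV_CALL_SAVED_XMM  ((xmm_set)0u)

/* Model of the behaviour described in the quoted message: the
   vzeroupper before a call is cancelled when any register that is
   live at the insertion point is also call-saved. */
static bool emits_vzeroupper_before_patch(xmm_set live, xmm_set call_saved)
{
    return (live & call_saved) == 0;
}

/* Model of the behaviour after the patch: once mode switching has
   requested a vzeroupper, it is emitted regardless of liveness,
   because the callee preserves only the low 128 bits anyway. */
static bool emits_vzeroupper_after_patch(xmm_set live, xmm_set call_saved)
{
    (void)live;
    (void)call_saved;
    return true;
}
```

In this model the two functions differ only when a call-saved register is live, which can happen only with the Win64 mask; with the SysV mask both always agree, which matches the remark that all other ABIs clobber all XMM registers.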