>>> This caused:
>>>
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91994
>
> Thanks for reducing & tracking down the underlying cause.
>
>> This change doesn't work with -mzeroupper.  When -mzeroupper is used,
>> upper bits of vector registers are clobbered upon callee return if any
>> MM/ZMM registers are used in callee.  Even if YMM7 isn't used, upper
>> bits of YMM7 can still be clobbered by vzeroupper when YMM1 is used.
>
> The problem here really is that the pattern is just:
>
>   (define_insn "avx_vzeroupper"
>     [(unspec_volatile [(const_int 0)] UNSPECV_VZEROUPPER)]
>     "TARGET_AVX"
>     "vzeroupper"
>     ...)
>
> and so its effect on the registers isn't modelled at all in rtl.
> Maybe one option would be to add a parallel:
>
>   (set (reg:V2DI N) (reg:V2DI N))
>
> for each register.  Or we could do something like I did for the SVE
> tlsdesc calls, although here that would mean using a call pattern for
> something that isn't really a call.  Or we could reinstate clobber_high
> and use that, but that's very much third out of three.
>
> I don't think we should add target hooks to get around this, since that's
> IMO papering over the issue.
>
> I'll try the parallel set thing first.

Please note that the vzeroupper insertion pass runs after register
allocation, so in effect the vzeroupper pattern is hidden from the
register allocator.

Uros.
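
To make the YMM1/YMM7 point above concrete, here is a minimal sketch in plain C. It is a toy model of the register file, not GCC code and not real AVX: the types and function names (ymm_t, regfile_t, model_vzeroupper, model_callee and the two check functions) are all hypothetical, and the only hardware behaviour assumed is that vzeroupper zeroes bits 128 and up of every YMM register.

```c
#include <stdint.h>
#include <string.h>

/* Toy model, not GCC code: 16 YMM registers of four 64-bit lanes each.
   Lanes 0-1 stand for the low 128 bits (the XMM part), lanes 2-3 for
   the upper 128 bits.  All names here are hypothetical.  */
#define NUM_YMM 16
#define LANES 4

typedef struct { uint64_t lane[LANES]; } ymm_t;
typedef struct { ymm_t ymm[NUM_YMM]; } regfile_t;

/* Model of vzeroupper: zero the upper 128 bits of every register,
   whether or not the surrounding code touched that register.  */
void model_vzeroupper(regfile_t *rf)
{
    for (int i = 0; i < NUM_YMM; i++) {
        rf->ymm[i].lane[2] = 0;
        rf->ymm[i].lane[3] = 0;
    }
}

/* Hypothetical callee that writes only YMM1 and then, as described
   for -mzeroupper, executes vzeroupper before returning.  */
void model_callee(regfile_t *rf)
{
    for (int l = 0; l < LANES; l++)
        rf->ymm[1].lane[l] = UINT64_C(0x1111111111111111);
    model_vzeroupper(rf);
}

/* Fill YMM7 in the "caller", run the callee, and report whether the
   upper half of YMM7 is still non-zero afterwards (1 = survived).  */
int ymm7_upper_survives_call(void)
{
    regfile_t rf;
    memset(&rf, 0, sizeof rf);
    for (int l = 0; l < LANES; l++)
        rf.ymm[7].lane[l] = UINT64_C(0xAAAAAAAAAAAAAAAA);
    model_callee(&rf);
    return rf.ymm[7].lane[2] != 0 || rf.ymm[7].lane[3] != 0;
}

/* Same experiment, but checking the low 128 bits of YMM7, which
   vzeroupper leaves alone (1 = survived).  */
int ymm7_lower_survives_call(void)
{
    regfile_t rf;
    memset(&rf, 0, sizeof rf);
    for (int l = 0; l < LANES; l++)
        rf.ymm[7].lane[l] = UINT64_C(0xAAAAAAAAAAAAAAAA);
    model_callee(&rf);
    return rf.ymm[7].lane[0] == UINT64_C(0xAAAAAAAAAAAAAAAA)
           && rf.ymm[7].lane[1] == UINT64_C(0xAAAAAAAAAAAAAAAA);
}
```

In this model the callee never names YMM7, yet its upper half is gone after the call while the low half is intact. That is the side effect the bare unspec_volatile pattern does not express in rtl.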