"H.J. Lu" <hjl.to...@gmail.com> writes:
> On Fri, Oct 4, 2019 at 11:03 AM H.J. Lu <hjl.to...@gmail.com> wrote:
>>
>> On Wed, Sep 11, 2019 at 12:14 PM Richard Sandiford
>> <richard.sandif...@arm.com> wrote:
>> >
>> > lra_reg has an actual_call_used_reg_set field that is only used during
>> > inheritance.  This in turn required a special lra_create_live_ranges
>> > pass for flag_ipa_ra to set up this field.  This patch instead makes
>> > the inheritance code do its own live register tracking, using the
>> > same ABI-mask-and-clobber-set pair as for IRA.
>> >
>> > Tracking ABIs simplifies (and cheapens) the logic in lra-lives.c and
>> > means we no longer need a separate path for -fipa-ra.  It also means
>> > we can remove TARGET_RETURN_CALL_WITH_MAX_CLOBBERS.
>> >
>> > The patch also strengthens the sanity check in lra_assigns so that
>> > we check that reg_renumber is consistent with the whole conflict set,
>> > not just the call-clobbered registers.
>> >
>> >
>> > 2019-09-11  Richard Sandiford  <richard.sandif...@arm.com>
>> >
>> > gcc/
>> >         * target.def (return_call_with_max_clobbers): Delete.
>> >         * doc/tm.texi.in (TARGET_RETURN_CALL_WITH_MAX_CLOBBERS): Delete.
>> >         * doc/tm.texi: Regenerate.
>> >         * config/aarch64/aarch64.c (aarch64_return_call_with_max_clobbers)
>> >         (TARGET_RETURN_CALL_WITH_MAX_CLOBBERS): Delete.
>> >         * lra-int.h (lra_reg::actual_call_used_reg_set): Delete.
>> >         (lra_reg::call_insn): Delete.
>> >         * lra.c: Include function-abi.h.
>> >         (initialize_lra_reg_info_element): Don't initialize the fields above.
>> >         (lra): Use crtl->abi to test whether the current function needs to
>> >         save a register in the prologue.  Remove special pre-inheritance
>> >         lra_create_live_ranges pass for flag_ipa_ra.
>> >         * lra-assigns.c: Include function-abi.h.
>> >         (find_hard_regno_for_1): Use crtl->abi to test whether the current
>> >         function needs to save a register in the prologue.
>> >         (lra_assign): Assert that registers aren't allocated to a
>> >         conflicting register, rather than checking only for overlaps
>> >         with call_used_or_fixed_regs.  Do this even for flag_ipa_ra,
>> >         and for registers that are not live across a call.
>> >         * lra-constraints.c (last_call_for_abi): New variable.
>> >         (full_and_partial_call_clobbers): Likewise.
>> >         (setup_next_usage_insn): Remove the register from
>> >         full_and_partial_call_clobbers.
>> >         (need_for_call_save_p): Use call_clobbered_in_region_p to test
>> >         whether the register needs a caller save.
>> >         (need_for_split_p): Use full_and_partial_reg_clobbers instead
>> >         of call_used_or_fixed_regs.
>> >         (inherit_in_ebb): Initialize and maintain last_call_for_abi and
>> >         full_and_partial_call_clobbers.
>> >         * lra-lives.c (check_pseudos_live_through_calls): Replace
>> >         last_call_used_reg_set and call_insn arguments with an abi argument.
>> >         Remove handling of lra_reg::call_insn.  Use function_abi::mode_clobbers
>> >         as the set of conflicting registers.
>> >         (calls_have_same_clobbers_p): Delete.
>> >         (process_bb_lives): Track the ABI of the last call instead of an
>> >         insn/HARD_REG_SET pair.  Update calls to
>> >         check_pseudos_live_through_calls.  Use eh_edge_abi to calculate
>> >         the set of registers that could be clobbered by an EH edge.
>> >         Include partially-clobbered as well as fully-clobbered registers.
>> >         (lra_create_live_ranges_1): Don't initialize lra_reg::call_insn.
>> >         * lra-remat.c: Include function-abi.h.
>> >         (call_used_regs_arr_len, call_used_regs_arr): Delete.
>> >         (set_bb_regs): Use call_insn_abi to get the set of call-clobbered
>> >         registers and bitmap_view to combine them into dead_regs.
>> >         (call_used_input_regno_present_p): Take a function_abi argument
>> >         and use it to test whether a register is call-clobbered.
>> >         (calculate_gen_cands): Use call_insn_abi to get the ABI of the
>> >         call insn target.  Update the call to call_used_input_regno_present_p.
>> >         (do_remat): Likewise.
>> >         (lra_remat): Remove the initialization of call_used_regs_arr_len
>> >         and call_used_regs_arr.
>>
>> This caused:
>>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91994

Thanks for reducing & tracking down the underlying cause.

> This change doesn't work with -mzeroupper.  When -mzeroupper is used,
> the upper bits of the vector registers are clobbered on callee return if
> any YMM/ZMM registers are used in the callee.  Even if YMM7 isn't used,
> the upper bits of YMM7 can still be clobbered by vzeroupper when YMM1 is
> used.

The problem here really is that the pattern is just:

(define_insn "avx_vzeroupper"
  [(unspec_volatile [(const_int 0)] UNSPECV_VZEROUPPER)]
  "TARGET_AVX"
  "vzeroupper"
  ...)

and so its effect on the registers isn't modelled at all in rtl.
Maybe one option would be to add a parallel:

  (set (reg:V2DI N) (reg:V2DI N))

for each register.  Or we could do something like I did for the SVE
tlsdesc calls, although here that would mean using a call pattern for
something that isn't really a call.  Or we could reinstate clobber_high
and use that, but that's very much third out of three.
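
For concreteness, a rough sketch of the parallel-set idea (just a sketch:
the register constants and the number of sets are illustrative, not a
worked-out patch):

(define_insn "avx_vzeroupper"
  [(unspec_volatile [(const_int 0)] UNSPECV_VZEROUPPER)
   ;; One self-set per vector register.  A V2DI set preserves the low
   ;; 128 bits and leaves the upper bits undefined, which is exactly
   ;; the effect of vzeroupper.
   (set (reg:V2DI XMM0_REG) (reg:V2DI XMM0_REG))
   (set (reg:V2DI XMM1_REG) (reg:V2DI XMM1_REG))
   ...]
  "TARGET_AVX"
  "vzeroupper"
  ...)

That would keep the clobbering of the upper halves visible to the register
allocator without needing a new target hook.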

I don't think we should add target hooks to get around this, since that's
IMO papering over the issue.

I'll try the parallel set thing first.

Richard
