FYI. I have committed the change to upstream as:
31933f4f788b6cd64cbb7ee42076997f6d0fe212 Qing > On Mar 31, 2022, at 8:10 AM, Richard Sandiford <richard.sandif...@arm.com> > wrote: > > Qing Zhao <qing.z...@oracle.com> writes: >> Hi, >> >> Per our discussion on: >> https://gcc.gnu.org/pipermail/gcc-patches/2022-March/592002.html >> >> I come up with the following patch to: >> >> 1. Update the documentation for TARGET_ZERO_CALL_USED_REGS hook; >> 2. Add an assertion in function.cc to make sure the actually zeroed_regs is >> a subset of all call used regs; >> (The reason I didn’t add a new parameter to TARGET_ZERO_CALL_USED_REGS is, >> I think adding the >> assertion in the common place function.cc is simpler to be implemented). > > Yeah, that's fair. I guess in theory, passing the parameter would allow > targets to choose between two versions of zeroing the register, one with > a temporary and one without. But that's a purely hypothetical situation > and we could always add a parameter later if that turns out to be useful. > > Perhaps more realistically, there might be other uses of the hook in > future that want to zero registers for different reasons, with their > own rules about which registers can be zeroed. In other words, the > hook is providing a general facility that happens to be useful for > -fzero-call-used-regs. But again, we can deal with that if it ever > happens. > > So I agree this is the right call, especially for stage 4. > >> 3. This new assertion identified a bug in i386 implementation. Fix this bug >> in i386. >> >> This patch is bootstrapped on both x86 and aarch64, no regression. >> >> Okay for commit? > > OK for the non-x86 bits. > > Thanks, > Richard > >> thanks. >> >> Qing >> >> =================================== >> From 2e5bc1b25a707c6a17afbf03da2a8bec5b03454d Mon Sep 17 00:00:00 2001 >> From: Qing Zhao <qing.z...@oracle.com> >> Date: Fri, 18 Mar 2022 20:49:56 +0000 >> Subject: [PATCH] Add an assertion: the zeroed_hardregs set is a subset of all >> call used regs. >> >> We should make sure that the hard register set that is actually cleared by >> the target hook zero_call_used_regs should be a subset of all call used >> registers. >> >> At the same time, update documentation for the target hook >> TARGET_ZERO_CALL_USED_REGS. >> >> This new assertion identified a bug in the i386 implemenation, which >> incorrectly set the zeroed_hardregs for stack registers. Fixed this bug >> in i386 implementation. >> >> gcc/ChangeLog: >> >> 2022-03-21 Qing Zhao <qing.z...@oracle.com> >> >> * config/i386/i386.cc (zero_all_st_registers): Return the value of >> num_of_st. >> (ix86_zero_call_used_regs): Update zeroed_hardregs set according to >> the return value of zero_all_st_registers. >> * doc/tm.texi: Update the documentation of TARGET_ZERO_CALL_USED_REGS. >> * function.cc (gen_call_used_regs_seq): Add an assertion. >> * target.def: Update the documentation of TARGET_ZERO_CALL_USED_REGS. >> --- >> gcc/config/i386/i386.cc | 27 ++++++++++++++++++--------- >> gcc/doc/tm.texi | 7 +++++++ >> gcc/function.cc | 22 ++++++++++++++++++---- >> gcc/target.def | 7 +++++++ >> 4 files changed, 50 insertions(+), 13 deletions(-) >> >> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc >> index 5a561966eb44..d84047a4bc1b 100644 >> --- a/gcc/config/i386/i386.cc >> +++ b/gcc/config/i386/i386.cc >> @@ -3753,16 +3753,17 @@ zero_all_vector_registers (HARD_REG_SET >> need_zeroed_hardregs) >> needs to be cleared, the whole stack should be cleared. However, >> x87 stack registers that hold the return value should be excluded. >> x87 returns in the top (two for complex values) register, so >> - num_of_st should be 7/6 when x87 returns, otherwise it will be 8. */ >> + num_of_st should be 7/6 when x87 returns, otherwise it will be 8. >> + return the value of num_of_st. */ >> >> >> -static bool >> +static int >> zero_all_st_registers (HARD_REG_SET need_zeroed_hardregs) >> { >> >> /* If the FPU is disabled, no need to zero all st registers. */ >> if (! (TARGET_80387 || TARGET_FLOAT_RETURNS_IN_80387)) >> - return false; >> + return 0; >> >> unsigned int num_of_st = 0; >> for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++) >> @@ -3774,7 +3775,7 @@ zero_all_st_registers (HARD_REG_SET >> need_zeroed_hardregs) >> } >> >> if (num_of_st == 0) >> - return false; >> + return 0; >> >> bool return_with_x87 = false; >> return_with_x87 = (crtl->return_rtx >> @@ -3802,7 +3803,7 @@ zero_all_st_registers (HARD_REG_SET >> need_zeroed_hardregs) >> insn = emit_insn (gen_rtx_SET (st_reg, st_reg)); >> add_reg_note (insn, REG_DEAD, st_reg); >> } >> - return true; >> + return num_of_st; >> } >> >> >> @@ -3851,7 +3852,7 @@ ix86_zero_call_used_regs (HARD_REG_SET >> need_zeroed_hardregs) >> { >> HARD_REG_SET zeroed_hardregs; >> bool all_sse_zeroed = false; >> - bool all_st_zeroed = false; >> + int all_st_zeroed_num = 0; >> bool all_mm_zeroed = false; >> >> CLEAR_HARD_REG_SET (zeroed_hardregs); >> @@ -3881,9 +3882,17 @@ ix86_zero_call_used_regs (HARD_REG_SET >> need_zeroed_hardregs) >> if (!exit_with_mmx_mode) >> /* x87 exit mode, we should zero all st registers together. */ >> { >> - all_st_zeroed = zero_all_st_registers (need_zeroed_hardregs); >> - if (all_st_zeroed) >> - SET_HARD_REG_BIT (zeroed_hardregs, FIRST_STACK_REG); >> + all_st_zeroed_num = zero_all_st_registers (need_zeroed_hardregs); >> + >> + if (all_st_zeroed_num > 0) >> + for (unsigned int regno = FIRST_STACK_REG; regno <= LAST_STACK_REG; >> regno++) >> + /* x87 stack registers that hold the return value should be excluded. >> + x87 returns in the top (two for complex values) register. */ >> + if (all_st_zeroed_num == 8 >> + || !((all_st_zeroed_num >= 6 && regno == REGNO (crtl->return_rtx)) >> + || (all_st_zeroed_num == 6 >> + && (regno == (REGNO (crtl->return_rtx) + 1))))) >> + SET_HARD_REG_BIT (zeroed_hardregs, regno); >> } >> else >> /* MMX exit mode, check whether we can zero all mm registers. */ >> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi >> index 2f92d37da8c0..c5006afc00d2 100644 >> --- a/gcc/doc/tm.texi >> +++ b/gcc/doc/tm.texi >> @@ -12330,6 +12330,13 @@ This target hook emits instructions to zero the >> subset of @var{selected_regs} >> that could conceivably contain values that are useful to an attacker. >> Return the set of registers that were actually cleared. >> >> +For most targets, the returned set of registers is a subset of >> +@var{selected_regs}, however, for some of the targets (for example MIPS), >> +clearing some registers that are in the @var{selected_regs} requires >> +clearing other call used registers that are not in the @var{selected_regs}, >> +under such situation, the returned set of registers must be a subset of all >> +call used registers. >> + >> The default implementation uses normal move instructions to zero >> all the registers in @var{selected_regs}. Define this hook if the >> target has more efficient ways of zeroing certain registers, >> diff --git a/gcc/function.cc b/gcc/function.cc >> index d5ed51a6a663..ad0096a43eff 100644 >> --- a/gcc/function.cc >> +++ b/gcc/function.cc >> @@ -5892,7 +5892,9 @@ gen_call_used_regs_seq (rtx_insn *ret, unsigned int >> zero_regs_type) >> df_simulate_one_insn_backwards (bb, ret, live_out); >> >> HARD_REG_SET selected_hardregs; >> + HARD_REG_SET all_call_used_regs; >> CLEAR_HARD_REG_SET (selected_hardregs); >> + CLEAR_HARD_REG_SET (all_call_used_regs); >> for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++) >> { >> if (!crtl->abi->clobbers_full_reg_p (regno)) >> @@ -5901,6 +5903,13 @@ gen_call_used_regs_seq (rtx_insn *ret, unsigned int >> zero_regs_type) >> continue; >> if (REGNO_REG_SET_P (live_out, regno)) >> continue; >> +#ifdef LEAF_REG_REMAP >> + if (crtl->uses_only_leaf_regs && LEAF_REG_REMAP (regno) < 0) >> + continue; >> +#endif >> + /* This is a call used register that is dead at return. */ >> + SET_HARD_REG_BIT (all_call_used_regs, regno); >> + >> if (only_gpr >> && !TEST_HARD_REG_BIT (reg_class_contents[GENERAL_REGS], regno)) >> continue; >> @@ -5908,10 +5917,6 @@ gen_call_used_regs_seq (rtx_insn *ret, unsigned int >> zero_regs_type) >> continue; >> if (only_arg && !FUNCTION_ARG_REGNO_P (regno)) >> continue; >> -#ifdef LEAF_REG_REMAP >> - if (crtl->uses_only_leaf_regs && LEAF_REG_REMAP (regno) < 0) >> - continue; >> -#endif >> >> /* Now this is a register that we might want to zero. */ >> SET_HARD_REG_BIT (selected_hardregs, regno); >> @@ -5925,6 +5930,15 @@ gen_call_used_regs_seq (rtx_insn *ret, unsigned int >> zero_regs_type) >> HARD_REG_SET zeroed_hardregs; >> start_sequence (); >> zeroed_hardregs = targetm.calls.zero_call_used_regs (selected_hardregs); >> + >> + /* For most targets, the returned set of registers is a subset of >> + selected_hardregs, however, for some of the targets (for example MIPS), >> + clearing some registers that are in selected_hardregs requires clearing >> + other call used registers that are not in the selected_hardregs, under >> + such situation, the returned set of registers must be a subset of >> + all call used registers. */ >> + gcc_assert (hard_reg_set_subset_p (zeroed_hardregs, all_call_used_regs)); >> + >> rtx_insn *seq = get_insns (); >> end_sequence (); >> if (seq) >> diff --git a/gcc/target.def b/gcc/target.def >> index 72c2e1ef756c..d85adf36a391 100644 >> --- a/gcc/target.def >> +++ b/gcc/target.def >> @@ -5122,6 +5122,13 @@ DEFHOOK >> that could conceivably contain values that are useful to an attacker.\n\ >> Return the set of registers that were actually cleared.\n\ >> \n\ >> +For most targets, the returned set of registers is a subset of\n\ >> +@var{selected_regs}, however, for some of the targets (for example MIPS),\n\ >> +clearing some registers that are in the @var{selected_regs} requires\n\ >> +clearing other call used registers that are not in the >> @var{selected_regs},\n\ >> +under such situation, the returned set of registers must be a subset of >> all\n\ >> +call used registers.\n\ >> +\n\ >> The default implementation uses normal move instructions to zero\n\ >> all the registers in @var{selected_regs}. Define this hook if the\n\ >> target has more efficient ways of zeroing certain registers,\n\