Re: [PATCH][middle-end][i386][Version 4] Add -fzero-call-used-regs=[skip|used-gpr-arg|used-arg|all-arg|used-gpr|all-gpr|used|all]

Richard Sandiford via Gcc-patches Mon, 26 Oct 2020 11:48:21 -0700

Qing Zhao <qing.z...@oracle.com> writes:
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index c9f7299..3a884e1 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -3992,6 +3992,49 @@ performing a link with relocatable output (i.e.@: 
> @code{ld -r}) on them.
>  A declaration to which @code{weakref} is attached and that is associated
>  with a named @code{target} must be @code{static}.
>  
> +@item zero_call_used_regs ("@var{choice}")
> +@cindex @code{zero_call_used_regs} function attribute
> +
> +The @code{zero_call_used_regs} attribute causes the compiler to zero
> +a subset of all call-used registers at function return according to
> +@var{choice}.
> +This is used to increase the program security by either mitigating
> +Return-Oriented Programming (ROP) or preventing information leak
> +through registers.
> +
> +A "call-used" register is a register that is clobbered by function calls,
> +as a result, the caller has to save and restore it before or after a
> +function call.  It is also called as "call-clobbered", "caller-saved", or
> +"volatile".


texinfo quoting is to use ``…'' rather than "…".  So maybe:

-------------------------------------------------------------------
A ``call-used'' register is a register whose contents can be changed by
a function call; therefore, a caller cannot assume that the register has
the same contents on return from the function as it had before calling
the function.  Such registers are also called ``call-clobbered'',
``caller-saved'', or ``volatile''.
-------------------------------------------------------------------

> +In order to satisfy users with different security needs and control the
> +run-time overhead at the same time,  GCC provides a flexible way to choose

nit: should only be one space after the comma

> +the subset of the call-used registers to be zeroed.

Maybe add “The three basic values of @var{choice} are:”

> +
> +@samp{skip} doesn't zero any call-used registers.
> +@samp{used} zeros call-used registers which are used in the function.  A 
> "used"

Maybe s/zeros/only zeros/?

s/which/that/

> +register is one whose content has been set or referenced in the function.
> +@samp{all} zeros all call-used registers.

I think this would be better formatted using a @table.

> +In addition to the above three basic choices, the register set can be further
> +limited by adding "-gpr" (i.e., general purpose register), "-arg" (i.e.,
> +argument register), or both as following:

How about:

-------------------------------------------------------------------
In addition to these three basic choices, it is possible to modify
@samp{used} or @samp{all} as follows:

@itemize @bullet
@item
Adding @samp{-gpr} restricts the zeroing to general-purpose registers.

@item
Adding @samp{-arg} restricts the zeroing to registers that are used
to pass parameters.  When applied to @samp{all}, this includes all
parameter registers defined by the platform's calling convention,
regardless of whether the function uses those parameter registers.
@end @itemize

The modifiers can be used individually or together.  If they are used
together, they must appear in the order above.

The full list of @var{choice}s is therefore:
-------------------------------------------------------------------

with the list repeating @var{skip}, @var{used} and @var{all}.

(untested)

> +@samp{used-gpr-arg} zeros used call-used general purpose registers that
> +pass parameters.
> +@samp{used-arg} zeros used call-used registers that pass parameters.
> +@samp{all-gpr-arg} zeros all call-used general purpose registers that pass
> +parameters.
> +@samp{all-arg} zeros all call-used registers that pass parameters.
> +@samp{used-gpr} zeros call-used general purpose registers which are used in 
> the
> +function.
> +@samp{all-gpr} zeros all call-used general purpose registers.

I think this too should be a @table.

> +
> +Among this list, "used-gpr-arg", "used-arg", "all-gpr-arg", and "all-arg" are
> +mainly used for ROP mitigation.

Should be quoted using @samp rather than ".

> +@item -fzero-call-used-regs=@var{choice}
> +@opindex fzero-call-used-regs
> +Zero call-used registers at function return to increase the program
> +security by either mitigating Return-Oriented Programming (ROP) or
> +preventing information leak through registers.

After this, we should probably say something like:

-------------------------------------------------------------------
The possible values of @var{choice} are the same as for the
@samp{zero_call_used_regs} attribute (@pxref{…}).  The default
is @samp{skip}.
-------------------------------------------------------------------

(with the xref filled in)

> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index 97437e8..3b75c46 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -12053,6 +12053,18 @@ argument list due to stack realignment.  Return 
> @code{NULL} if no DRAP
>  is needed.
>  @end deftypefn
>  
> +@deftypefn {Target Hook} HARD_REG_SET TARGET_ZERO_CALL_USED_REGS 
> (HARD_REG_SET @var{selected_regs})
> +This target hook emits instructions to zero subset of @var{selected_regs}

…to zero the subset…
(probably my mistake, sorry)

> diff --git a/gcc/flag-types.h b/gcc/flag-types.h
> index 852ea76..0f7e503 100644
> --- a/gcc/flag-types.h
> +++ b/gcc/flag-types.h
> @@ -285,6 +285,15 @@ enum sanitize_code {
>                                 | SANITIZE_BOUNDS_STRICT
>  };
>  
> +enum  zero_call_used_regs_code {
> +  UNSET = 0,
> +  SKIP = 1UL << 0,
> +  ONLY_USED = 1UL << 1,
> +  ONLY_GPR = 1UL << 2,
> +  ONLY_ARG = 1UL << 3,
> +  ALL = 1UL << 4
> +};

I'd suggested these names on the assumption that we'd be using
a C++ enum class, so that the enum would be referenced as
name::ALL, name::SKIP, etc.  But I guess using a C++ enum class
doesn't work well with bitfields after all.

These names are too generic without the name:: scoping though.
Perhaps we should put them in a namespace:

  namespace zero_regs_flags {
    const unsigned int UNSET = 0;
    …etc…
  }

(call-used probably doesn't need to be part of the flag names,
since the concept is more general than that and call-usedness
is really a filter that's being applied on top.  Although I guess
the same is true of “zero”. ;-))

I don't think we should have ALL as a separate flag: ALL is the absence
of ONLY_*.  Maybe we should have an ENABLED flag that all non-skip
combinations use?

If it makes things easier, I think it would be good to have e.g.:

  unsigned int USED_GPR = ENABLED | ONLY_USED | ONLY_GPR;

inside the namespace, to reduce the verbosity in the option table.

> +  /* If gpr_only is true, only zero call-used-registers that are
> +     general-purpose registers; if used_only is true, only zero
> +     call-used-registers that are used in the current function.  */
> +
> +  gpr_only = crtl->zero_call_used_regs & ONLY_GPR;
> +  used_only = crtl->zero_call_used_regs & ONLY_USED;
> +  arg_only = crtl->zero_call_used_regs & ONLY_ARG;
> +
> +  /* For each of the hard registers, check to see whether we should zero it 
> if:

s/check to see whether //

> +     1. it is a call-used-registers;

s/call-used-registers/call-used register/

> + and 2. it is not a fixed-registers;

s/fixed-registers/fixed register/

> + and 3. it is not live at the return of the routine;
> + and 4. it is general registor if gpr_only is true;
> + and 5. it is used in the routine if used_only is true;
> + and 6. it is a register that passes parameter if arg_only is true;
> +   */

Under GCC formatting, the “and” lines need to be indented under “For each”.
Maybe indent the “1.” line a bit more if you think it looks nicer with the
numbers lined up (it probably does).

Similarly, the last bit of text should end with “.  */”, rather than
with the “;\n  */” above.

(Sorry that the rules are so picky about this.)

> +  /* First, prepare the data flow information.  */
> +  basic_block bb = BLOCK_FOR_INSN (ret);
> +  bitmap live_out;
> +  live_out = BITMAP_ALLOC (NULL);

Should just use auto_bitmap here, which will also handle the freeing.

> +  bitmap_copy (live_out, df_get_live_out (bb));
> +  df_simulate_initialize_backwards (bb, live_out);
> +  df_simulate_one_insn_backwards (bb, ret, live_out);
> +
> +  HARD_REG_SET need_zeroed_hardregs;
> +  CLEAR_HARD_REG_SET (need_zeroed_hardregs);

Maybe s/need_zeroed/selected/?  Similarly to the target hook comment
in the previous review, I think “need” makes it sound like the target
has no freedom to decline.

> +  for (unsigned int regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
> +    {
> +      if (!crtl->abi->clobbers_full_reg_p (regno))
> +     continue;
> +      if (fixed_regs[regno])
> +     continue;
> +      if (REGNO_REG_SET_P (live_out, regno))
> +     continue;
> +      if (gpr_only
> +       && !TEST_HARD_REG_BIT (reg_class_contents[GENERAL_REGS], regno))
> +     continue;
> +      if (used_only && !df_regs_ever_live_p (regno))
> +     continue;
> +      if (arg_only && !FUNCTION_ARG_REGNO_P (regno))
> +     continue;
> +
> +      /* Now this is a register that we might want to zero.  */
> +      SET_HARD_REG_BIT (need_zeroed_hardregs, regno);
> +    }
> +
> +  BITMAP_FREE (live_out);
> +
> +  if (hard_reg_set_empty_p (need_zeroed_hardregs))
> +    return;
> +
> +  /* Now we get a hard register set that need to be zeroed, pass it to
> +     target to generate zeroing sequence.  */

/* Now that we have a hard register set that needs to be zeroed, pass it
   to the target to generate the zeroing sequence.  */

> +  HARD_REG_SET zeroed_hardregs;
> +  start_sequence ();
> +  zeroed_hardregs = targetm.calls.zero_call_used_regs (need_zeroed_hardregs);
> +  rtx_insn *seq = get_insns ();
> +  end_sequence ();
> +  if (seq)
> +    {
> +      /* Emit the memory blockage and register clobber asm volatile before
> +      the whole sequence.  */
> +      start_sequence ();
> +      expand_asm_reg_clobber_mem_blockage (zeroed_hardregs);
> +      rtx_insn *seq_barrier = get_insns ();
> +      end_sequence ();
> +
> +      emit_insn_before (seq_barrier, ret);
> +      emit_insn_before (seq, ret);
> +
> +      /* Update the data flow information.  */
> +      crtl->must_be_zero_on_return |= zeroed_hardregs;
> +      df_set_bb_dirty (EXIT_BLOCK_PTR_FOR_FN (cfun));
> +    }
> +}
> +
> +
>  /* Return a sequence to be used as the epilogue for the current function,
>     or NULL.  */
>  
> @@ -6486,7 +6584,120 @@ make_pass_thread_prologue_and_epilogue (gcc::context 
> *ctxt)
>  {
>    return new pass_thread_prologue_and_epilogue (ctxt);
>  }
> -
>
> +
> +static unsigned int
> +rest_of_zero_call_used_regs (void)

This needs a function comment.  Maybe:

/* Iterate over the function's return instructions and insert any
   register zeroing required by the -fzero-call-used-regs command-line
   option or the "zero_call_used_regs" function attribute.  */

Also, we might as well make it:

pass_zero_call_used_regs::execute

rather than a separate function.  The “rest_of_…” stuff is mostly legacy.

> +{
> +  edge_iterator ei;
> +  edge e;
> +  rtx_insn *insn;
> +
> +  /* This pass needs data flow information.  */
> +  df_analyze ();
> +
> +  /* Search all the "return"s in the routine, and insert instruction 
> sequence to
> +     zero the call used registers.  */
> +  FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR_FOR_FN (cfun)->preds)
> +    {
> +      insn = BB_END (e->src);

Modern style would be to declare insn here rather than above.

> +      if (JUMP_P (insn) && ANY_RETURN_P (JUMP_LABEL (insn)))
> +     gen_call_used_regs_seq (insn);
> +    }
> +
> +  return 0;
> +}
> +
> +namespace {
> +
> +const pass_data pass_data_zero_call_used_regs =
> +{
> +  RTL_PASS, /* type */
> +  "zero_call_used_regs", /* name */
> +  OPTGROUP_NONE, /* optinfo_flags */
> +  TV_NONE, /* tv_id */
> +  0, /* properties_required */
> +  0, /* properties_provided */
> +  0, /* properties_destroyed */
> +  0, /* todo_flags_start */
> +  0, /* todo_flags_finish */
> +};
> +
> +class pass_zero_call_used_regs: public rtl_opt_pass
> +{
> +public:
> +  pass_zero_call_used_regs (gcc::context *ctxt)
> +    : rtl_opt_pass (pass_data_zero_call_used_regs, ctxt)
> +  {}
> +
> +  /* opt_pass methods: */
> +  virtual bool gate (function *);
> +
> +  virtual unsigned int execute (function *)
> +    {
> +      return rest_of_zero_call_used_regs ();
> +    }
> +
> +}; // class pass_zero_call_used_regs
> +
> +bool
> +pass_zero_call_used_regs::gate (function *fun)
> +{
> +  unsigned int zero_regs_type = UNSET;
> +  unsigned int attr_zero_regs_type = UNSET;
> +
> +  tree attr_zero_regs
> +     = lookup_attribute ("zero_call_used_regs",
> +                         DECL_ATTRIBUTES (fun->decl));
> +
> +  /* Get the type of zero_call_used_regs from function attribute.  */
> +  if (attr_zero_regs)
> +    {
> +      bool found = false;
> +      unsigned int i;
> +
> +      /* The TREE_VALUE of an attribute is a TREE_LIST whose TREE_VALUE
> +      is the attribute argument's value.  */
> +      attr_zero_regs = TREE_VALUE (attr_zero_regs);
> +      gcc_assert (TREE_CODE (attr_zero_regs) == TREE_LIST);
> +      attr_zero_regs = TREE_VALUE (attr_zero_regs);
> +      gcc_assert (TREE_CODE (attr_zero_regs) == STRING_CST);
> +
> +      for (i = 0; zero_call_used_regs_opts[i].name != NULL; ++i)
> +     if (strcmp (TREE_STRING_POINTER (attr_zero_regs),
> +                  zero_call_used_regs_opts[i].name) == 0)
> +       {
> +         attr_zero_regs_type |= zero_call_used_regs_opts[i].flag;

Think = is less surprising than |= here.

> +         found = true;

All valid values are nonzero, so we don't need a separate boolean.

> +         break;
> +       }
> +
> +      if (!found)
> +     warning_at (DECL_SOURCE_LOCATION (fun->decl), 0,
> +                 "unrecognized zero_call_used_regs attribute: %qs",
> +                 TREE_STRING_POINTER (attr_zero_regs));

I think we should warn when handling the attribute in c-attribs.c
(as before, IIRC), and make it silent here.

> +    }
> +
> +  if (flag_zero_call_used_regs)
> +    if (!attr_zero_regs)
> +      zero_regs_type = flag_zero_call_used_regs;
> +    else
> +      zero_regs_type = attr_zero_regs_type;
> +  else
> +    zero_regs_type = attr_zero_regs_type;

Seems easier to make the attribute code set zero_regs_type directly,
then have:

  if (!zero_regs_type)
    zero_regs_type = flag_zero_call_used_regs;

> +
> +  crtl->zero_call_used_regs = zero_regs_type;
> +
> +  /* No need to zero call-used-regs when no user request is present.  */
> +  return zero_regs_type > SKIP;

Think testing for skip using & SKIP or ==/!= SKIP is more obvious.

This is too much for a gate function, which should be a simple
side-effect-free function that tests whether the pass should run.
Perhaps we should just make the pass unconditional and do the above
in ::execute.  The pass is very cheap, so gating probably isn't
worthwhile.

> +}
> +
> +} // anon namespace
> +
> +rtl_opt_pass *
> +make_pass_zero_call_used_regs (gcc::context *ctxt)
> +{
> +  return new pass_zero_call_used_regs (ctxt);
> +}
>  
>  /* If CONSTRAINT is a matching constraint, then return its number.
>     Otherwise, return -1.  */
> diff --git a/gcc/optabs.c b/gcc/optabs.c
> index 8ad7f4b..bd64af0 100644
> --- a/gcc/optabs.c
> +++ b/gcc/optabs.c
> @@ -6484,6 +6484,48 @@ expand_memory_blockage (void)
>      expand_asm_memory_blockage ();
>  }
>  
> +/* Generate asm volatile("" : : : "memory") as a memory blockage, at the
> +   same time clobbering the register set specified by REGS.  */
> +
> +void
> +expand_asm_reg_clobber_mem_blockage (HARD_REG_SET regs)
> +{
> +  rtx asm_op, clob_mem;
> +
> +  unsigned int num_of_regs = 0;
> +  for (unsigned int i = 0; i < FIRST_PSEUDO_REGISTER; i++)
> +    if (TEST_HARD_REG_BIT (regs, i))
> +      num_of_regs++;
> +
> +  asm_op = gen_rtx_ASM_OPERANDS (VOIDmode, "", "", 0,
> +                              rtvec_alloc (0), rtvec_alloc (0),
> +                              rtvec_alloc (0), UNKNOWN_LOCATION);
> +  MEM_VOLATILE_P (asm_op) = 1;
> +
> +  rtvec v = rtvec_alloc (num_of_regs + 2);
> +
> +  clob_mem = gen_rtx_SCRATCH (VOIDmode);
> +  clob_mem = gen_rtx_MEM (BLKmode, clob_mem);
> +  clob_mem = gen_rtx_CLOBBER (VOIDmode, clob_mem);
> +
> +  RTVEC_ELT (v,0) = asm_op;
> +  RTVEC_ELT (v,1) = clob_mem;

nit: should be a space before the comma, here and below.

> +
> +  if (num_of_regs > 0)
> +    {
> +      unsigned int j = 2;
> +      for (unsigned int i = 0; i < FIRST_PSEUDO_REGISTER; i++)
> +     if (TEST_HARD_REG_BIT (regs, i))
> +       {
> +         RTVEC_ELT (v,j) = gen_rtx_CLOBBER (VOIDmode, regno_reg_rtx[i]);
> +         j++;
> +       }
> +      gcc_assert (j == (num_of_regs + 2));
> +    }
> +
> +  emit_insn (gen_rtx_PARALLEL (VOIDmode, v));
> +}
> +
>  /* This routine will either emit the mem_thread_fence pattern or issue a 
>     sync_synchronize to generate a fence for memory model MEMMODEL.  */
>  
> diff --git a/gcc/optabs.h b/gcc/optabs.h
> index 0b14700..bfa10c8 100644
> --- a/gcc/optabs.h
> +++ b/gcc/optabs.h
> @@ -345,6 +345,8 @@ rtx expand_atomic_store (rtx, rtx, enum memmodel, bool);
>  rtx expand_atomic_fetch_op (rtx, rtx, rtx, enum rtx_code, enum memmodel, 
>                             bool);
>  
> +extern void expand_asm_reg_clobber_mem_blockage (HARD_REG_SET);
> +
>  extern bool insn_operand_matches (enum insn_code icode, unsigned int opno,
>                                 rtx operand);
>  extern bool valid_multiword_target_p (rtx);
> diff --git a/gcc/opts.c b/gcc/opts.c
> index 3bda59a..f95a1f0 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -1776,6 +1776,24 @@ const struct sanitizer_opts_s 
> coverage_sanitizer_opts[] =
>    { NULL, 0U, 0UL, false }
>  };
>  
> +/* -fzero-call-used-regs= suboptions.  */
> +const struct zero_call_used_regs_opts_s zero_call_used_regs_opts[] =
> +{
> +#define ZERO_CALL_USED_REGS_OPT(name, flags) \
> +    { #name, flags }
> +  ZERO_CALL_USED_REGS_OPT (skip, SKIP),
> +  ZERO_CALL_USED_REGS_OPT (used-gpr-arg, (ONLY_USED | ONLY_GPR | ONLY_ARG)),
> +  ZERO_CALL_USED_REGS_OPT (used-arg, (ONLY_USED | ONLY_ARG)),
> +  ZERO_CALL_USED_REGS_OPT (all-gpr-arg, (ONLY_GPR | ONLY_ARG)),
> +  ZERO_CALL_USED_REGS_OPT (all-arg, ONLY_ARG),
> +  ZERO_CALL_USED_REGS_OPT (used-gpr, (ONLY_USED | ONLY_GPR)),
> +  ZERO_CALL_USED_REGS_OPT (all-gpr, ONLY_GPR),
> +  ZERO_CALL_USED_REGS_OPT (used, ONLY_USED),
> +  ZERO_CALL_USED_REGS_OPT (all, ALL),
> +#undef ZERO_CALL_USED_REGS_OPT
> +  {NULL, 0U}
> +};
> +
>  /* A struct for describing a run of chars within a string.  */
>  
>  class string_fragment
> @@ -1970,6 +1988,30 @@ parse_no_sanitize_attribute (char *value)
>    return flags;
>  }
>  
> +/* Parse -fzero-call-used-regs suboptions from ARG, return the FLAGS.  */
> +
> +unsigned int
> +parse_zero_call_used_regs_options (const char *arg)
> +{
> +  bool found = false;
> +  unsigned int flags = 0;
> +  unsigned int i;
> +
> +  /* Check to see if the string matches a sub-option name.  */
> +  for (i = 0; zero_call_used_regs_opts[i].name != NULL; ++i)
> +    if (strcmp (arg, zero_call_used_regs_opts[i].name) == 0)
> +      {
> +     flags |= zero_call_used_regs_opts[i].flag;
> +     found = true;

Same comments as above.

> +     break;
> +      }
> +
> +  if (!found)
> +    error ("unrecognized argument to %<-fzero-call-used-regs=%>: %qs", arg);

Think we should use %qs for the option name too, to reduce the number
of translation strings.

> diff --git a/gcc/recog.c b/gcc/recog.c
> index ce83b7f..e231b5d 100644
> --- a/gcc/recog.c
> +++ b/gcc/recog.c
> @@ -923,6 +923,22 @@ validate_simplify_insn (rtx_insn *insn)
>    return ((num_changes_pending () > 0) && (apply_change_group () > 0));
>  }
>  
>
> +
> +/* Check whether INSN matches a specific alternative of an .md pattern.  */
> +bool
> +valid_insn_p (rtx_insn *insn)

Very minor nit, but it's unusual to have three blank lines before
the comment and none afterwards.  The codebase isn't very consistent
about this, but local style seems mostly to be one blank line before
the comment and one afterwards.

> diff --git a/gcc/testsuite/c-c++-common/zero-scratch-regs-1.c 
> b/gcc/testsuite/c-c++-common/zero-scratch-regs-1.c
> new file mode 100644
> index 0000000..f44add9
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/zero-scratch-regs-1.c
> @@ -0,0 +1,15 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2 -fzero-call-used-regs=all" } */
> +
> +volatile int result = 0;
> +int 
> +__attribute__((noinline))

“noipa” is stronger.  Same for all the tests.

The i386 tests are Uros's domain, but I think it would be good to have
generic tests for all the variants.  E.g.:

(1) one test per -fzero-call-used-regs option (including skip)
(2) one test that tries all valid attribute values (including skip),
    compiled without -fzero-call-used-regs
(3) one test that #includes (2) but is compiled with an arbitrarily-chosen
    -fzero-call-used-regs (say =all).
(4) one test that tries invalid uses of the attribute, e.g.:
    - one use of the attribute on a variable
    - one use of the attribute on a function, but with an obviously-wrong
      value
    - one use of the attribute on a function, but with -gpr and -arg the
      wrong way around

(Sorry for not getting to the tests last time.)

Thanks,
Richard

Re: [PATCH][middle-end][i386][Version 4] Add -fzero-call-used-regs=[skip|used-gpr-arg|used-arg|all-arg|used-gpr|all-gpr|used|all]

Reply via email to