On Wed, 1 Dec 2021, Alexandre Oliva wrote:

> 
> This is now used by -fharden-conditional-branches and
> -fharden-compares to detach the values used in redundant hardening
> compares from their original locations.  It eventually still expands
> to an asm statement, as before, but the temporary is only introduced
> at expand time, preventing gimple optimizations from diverging input
> and output, which may prevent their unification in asm match operands.
> 
> Additional logic to check asm operands matches, and to keep them
> matchable, which may improve some cases of "+m" and "+X" in asm that
> are currently rejected because the single operand, split into input
> and output, ends up diverging in ways that can't be merged back by
> reload.
> 
> While investigating regressions hit by an earlier iteration of this
> patch, I came across pr93027.c, and noticed that, with optimization
> (unlike the test), reload overwrites one of the inputs with another.
> This preexisted this patch, and I have not attempted to fix it.
> 
> 
> The changes above paved the way to fix PR103149: a vectorized loop
> ends up using vector booleans on aarch64, and the hardening pass
> attempts to "copy-and-forget" them with an asm statement (thus
> ASMNESIA) for the redundant hardening compare.
> 
> The problem there is that vector booleans, on the affected platform,
> can't go in GENERAL_REGS, but the asm statement we used to emit could
> only use those registers or memory.
> 
> The work-around introduced in the newly-introduced ASMNESIA expander
> is to check whether the mode is suitable for GENERAL_REGS, and force
> the asm operands to use MEM otherwise.  This is not ideal, but
> functionality to map a mode to a constraint string that selects a
> suitable register class doesn't seem readily available.  Maybe in a
> future patch...
> 
> ASMNESIA currently works only with gimple registers of types with
> scalar modes.  I have not attempted to make it more general than that,
> though I envision future uses that might end up requiring it.  I'll
> cross that bridge if/when I get to it.
> 
> 
> Regstrapped on x86_64-linux-gnu.  Also bootstrapped along with a patch
> that enables both compare hardening passes.  Built crosses to aarch64-
> and riscv64-, and verified that the mentioned aarch64 PR is fixed.  Ok
> to install?

While adding ASMNESIA looks OK at this point, I'm not sure about the
asm handling stuff.  You mention 'reload' above, do you mean LRA?

Thanks,
Richard.

> 
> (Manual inspection of the asm output for riscv PR103302 suggests that
> the reported problem may be gone with the given testcase, but I have not
> built enough of the target environment to enable me to run that.)
> 
> 
> 
> for  gcc/ChangeLog
> 
>       PR middle-end/103149
>       * cfgexpand.c (expand_asm_stmt): Pass original input
>       constraint, and constraint array, to asm_operand_ok.
>       * gimple-hander-conditionals.cc (detach_value): Call
>       ASMNESIA internal function.
>       * internal-fn.c (expand_ASMNESIA): New.
>       * internal-fn.def (ASMNESIA): New.
>       * recog.c (asm_operand_ok): Turn into wrapper of...
>       (asm_operand_ok_match): ... renamed.  Skip non-constraint
>       characters faster.  Don't require register_operand for matched
>       operands.  Combine results with...
>       (check_match): New.
>       (check_asm_operands): Enable match checking.
> 
> for  gcc/testsuite/ChangeLog
> 
>       PR middle-end/103149
>       * gcc.target/aarch64/pr103149.c: New.
>       * gcc.target/i386/pr85030.c: Also accept impossible constraint
>       error.
> ---
>  gcc/cfgexpand.c                             |    6 +
>  gcc/gimple-harden-conditionals.cc           |   17 ----
>  gcc/internal-fn.c                           |   81 +++++++++++++++++++
>  gcc/internal-fn.def                         |    4 +
>  gcc/recog.c                                 |  114 
> +++++++++++++++++++++++----
>  gcc/testsuite/gcc.target/aarch64/pr103149.c |   13 +++
>  gcc/testsuite/gcc.target/i386/pr85030.c     |    2 
>  7 files changed, 200 insertions(+), 37 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/pr103149.c
> 
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index fb84d469f1e77..f0272ba1ac9ba 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -3374,10 +3374,10 @@ expand_asm_stmt (gasm *stmt)
>        tree val = input_tvec[i];
>        tree type = TREE_TYPE (val);
>        bool allows_reg, allows_mem, ok;
> -      const char *constraint;
> +      const char *constraint, *orig_constraint;
>        rtx op;
>  
> -      constraint = constraints[i + noutputs];
> +      orig_constraint = constraint = constraints[i + noutputs];
>        ok = parse_input_constraint (&constraint, i, ninputs, noutputs, 0,
>                                  constraints.address (),
>                                  &allows_mem, &allows_reg);
> @@ -3397,7 +3397,7 @@ expand_asm_stmt (gasm *stmt)
>        else if (MEM_P (op))
>       op = validize_mem (op);
>  
> -      if (asm_operand_ok (op, constraint, NULL) <= 0)
> +      if (asm_operand_ok (op, orig_constraint, constraints.address ()) <= 0)
>       {
>         if (allows_reg && TYPE_MODE (type) != BLKmode)
>           op = force_reg (TYPE_MODE (type), op);
> diff --git a/gcc/gimple-harden-conditionals.cc 
> b/gcc/gimple-harden-conditionals.cc
> index cfa2361d65be0..3768b2e23bdaf 100644
> --- a/gcc/gimple-harden-conditionals.cc
> +++ b/gcc/gimple-harden-conditionals.cc
> @@ -132,21 +132,8 @@ detach_value (location_t loc, gimple_stmt_iterator 
> *gsip, tree val)
>    tree ret = make_ssa_name (TREE_TYPE (val));
>    SET_SSA_NAME_VAR_OR_IDENTIFIER (ret, SSA_NAME_IDENTIFIER (val));
>  
> -  /* Output asm ("" : "=g" (ret) : "0" (val));  */
> -  vec<tree, va_gc> *inputs = NULL;
> -  vec<tree, va_gc> *outputs = NULL;
> -  vec_safe_push (outputs,
> -              build_tree_list
> -              (build_tree_list
> -               (NULL_TREE, build_string (2, "=g")),
> -               ret));
> -  vec_safe_push (inputs,
> -              build_tree_list
> -              (build_tree_list
> -               (NULL_TREE, build_string (1, "0")),
> -               val));
> -  gasm *detach = gimple_build_asm_vec ("", inputs, outputs,
> -                                    NULL, NULL);
> +  gcall *detach = gimple_build_call_internal (IFN_ASMNESIA, 1, val);
> +  gimple_call_set_lhs (detach, ret);
>    gimple_set_location (detach, loc);
>    gsi_insert_before (gsip, detach, GSI_SAME_STMT);
>  
> diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
> index 6ac3460d538be..a4a2ca91d9f93 100644
> --- a/gcc/internal-fn.c
> +++ b/gcc/internal-fn.c
> @@ -3234,6 +3234,87 @@ expand_LAUNDER (internal_fn, gcall *call)
>    expand_assignment (lhs, gimple_call_arg (call, 0), false);
>  }
>  
> +/* Expand ASMNESIA to assignment and asm that makes the value unknown.  */
> +
> +static void
> +expand_ASMNESIA (internal_fn, gcall *call)
> +{
> +  tree to = gimple_call_lhs (call);
> +
> +  if (!to)
> +    return;
> +
> +  location_t locus = gimple_location (call);
> +
> +  tree from = gimple_call_arg (call, 0);
> +
> +  rtx to_rtx = expand_expr (to, NULL_RTX, VOIDmode, EXPAND_WRITE);
> +
> +  rtx from_rtx = expand_expr (from, NULL_RTX, VOIDmode, EXPAND_NORMAL);
> +
> +  enum machine_mode mode = GET_MODE (to_rtx);
> +
> +  gcc_checking_assert (mode != BLKmode && mode != VOIDmode
> +                    && (GET_MODE (from_rtx) == mode
> +                        || GET_MODE (from_rtx) == VOIDmode));
> +
> +  rtvec argvec = rtvec_alloc (1);
> +  rtvec constraintvec = rtvec_alloc (1);
> +  rtvec labelvec = rtvec_alloc (0);
> +
> +  bool need_memory = true;
> +  for (int i = 0; i < FIRST_PSEUDO_REGISTER; i++)
> +    if (TEST_HARD_REG_BIT (reg_class_contents[GENERAL_REGS], i)
> +     && targetm.hard_regno_mode_ok (i, mode))
> +      {
> +     need_memory = false;
> +     break;
> +      }
> +
> +  rtx regtemp = copy_to_mode_reg (mode, from_rtx);
> +  rtx temp = regtemp;
> +  const char *constraint = "=g";
> +
> +  if (need_memory)
> +    {
> +      constraint = "=m";
> +      temp = assign_stack_temp (mode, GET_MODE_SIZE (mode));
> +      emit_move_insn (temp, regtemp);
> +    }
> +
> +  /* This is roughly equivalent to asm ("" : "+g" (temp));
> +
> +     We use +m instead of +g if we NEED_MEMORY, i.e., the mode is not 
> suitable
> +     for GENERAL_REGS.  ??? Ideally, we'd choose a suitable register class, 
> if
> +     there is one, but how do we get a constraint type that maps to a it?  
> +X is
> +     at the same time too lax, because arbitrary RTL and x87 FP regs are not
> +     desirable, and too strict, because it doesn't guide any register class.
> +     ??? Maybe the errors reg-stack would flag on e.g. libgcc/_multc3 could 
> be
> +     conditioned on the operand's being referenced?
> +
> +     Unlike expansion of gimple asm stmts, it doesn't go through
> +     targetm.md_asm_adjust (we don't wish any clobbers).
> +
> +     Furthermore, we ensure input and output start out with the same pseudo 
> or
> +     MEM, which gimple doesn't.  This is particularly important for MEMs, 
> since
> +     reloads can't merge disparate ones, and it would be important for X 
> because
> +     it guides NO_REGS.  */
> +  rtx body = gen_rtx_ASM_OPERANDS (mode,
> +                                ggc_strdup (""),
> +                                constraint, 0,
> +                                argvec, constraintvec, labelvec,
> +                                locus);
> +  ASM_OPERANDS_INPUT (body, 0) = temp;
> +  ASM_OPERANDS_INPUT_CONSTRAINT_EXP (body, 0)
> +    = gen_rtx_ASM_INPUT_loc (mode, "0", locus);
> +  emit_insn (gen_rtx_SET (temp, body));
> +
> +  if (need_memory)
> +    emit_move_insn (regtemp, temp);
> +
> +  emit_move_insn (to_rtx, regtemp);
> +}
> +
>  /* Expand {MASK_,}SCATTER_STORE{S,U} call CALL using optab OPTAB.  */
>  
>  static void
> diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> index acb0dbda5567b..f10b220192196 100644
> --- a/gcc/internal-fn.def
> +++ b/gcc/internal-fn.def
> @@ -410,6 +410,10 @@ DEF_INTERNAL_FN (FALLTHROUGH, ECF_LEAF | ECF_NOTHROW, 
> NULL)
>  /* To implement __builtin_launder.  */
>  DEF_INTERNAL_FN (LAUNDER, ECF_LEAF | ECF_NOTHROW | ECF_NOVOPS, NULL)
>  
> +/* Copy a value while preventing optimizations based on knowledge
> +   about the input operand.  */
> +DEF_INTERNAL_FN (ASMNESIA, ECF_LEAF | ECF_NOTHROW, NULL)
> +
>  /* Divmod function.  */
>  DEF_INTERNAL_FN (DIVMOD, ECF_CONST | ECF_LEAF, NULL)
>  
> diff --git a/gcc/recog.c b/gcc/recog.c
> index 5a42c45361d5c..2f402daad55a0 100644
> --- a/gcc/recog.c
> +++ b/gcc/recog.c
> @@ -53,6 +53,9 @@ along with GCC; see the file COPYING3.  If not see
>  static void validate_replace_rtx_1 (rtx *, rtx, rtx, rtx_insn *, bool);
>  static void validate_replace_src_1 (rtx *, void *);
>  static rtx_insn *split_insn (rtx_insn *);
> +static int asm_operand_ok_match (rtx op, const char *constraint,
> +                              const char **constraints,
> +                              rtx *operands, int match_index);
>  
>  struct target_recog default_target_recog;
>  #if SWITCHABLE_TARGET
> @@ -170,7 +173,7 @@ check_asm_operands (rtx x)
>        const char *c = constraints[i];
>        if (c[0] == '%')
>       c++;
> -      if (! asm_operand_ok (operands[i], c, constraints))
> +      if (! asm_operand_ok_match (operands[i], c, constraints, operands, -1))
>       return 0;
>      }
>  
> @@ -2167,6 +2170,67 @@ get_referenced_operands (const char *string, bool 
> *used,
>  
>  int
>  asm_operand_ok (rtx op, const char *constraint, const char **constraints)
> +{
> +  return asm_operand_ok_match (op, constraint, constraints, NULL, -1);
> +}
> +
> +/* Combine a previous cummulative result PREVRES with RES.  If MATCH_INDEX is
> +   nonnegative, check that OP is compatible with the matched operand, using
> +   OPERANDS if not NULL, under the CN-implied requirements.
> + */
> +
> +static int
> +check_match (int prevres, rtx op, enum constraint_num cn, int res,
> +          rtx *operands, int match_index)
> +{
> +  if (prevres > 0 || !res)
> +    return prevres;
> +
> +  if (match_index < 0)
> +    return res;
> +
> +  bool allows_reg = false, allows_mem = false;
> +  enum reg_class rclass = NO_REGS;
> +
> +  if (cn == CONSTRAINT__LIMIT)
> +    {
> +      allows_reg = allows_mem = true;
> +      rclass = GENERAL_REGS;
> +    }
> +  else
> +    {
> +      insn_extra_constraint_allows_reg_mem (cn, &allows_reg, &allows_mem);
> +      if (allows_reg)
> +     rclass = reg_class_for_constraint (cn);
> +    }
> +
> +  rtx mop = operands ? operands[match_index] : NULL_RTX;
> +
> +  /* Avoid divergence between input and output when reload wouldn't be able 
> to
> +     fix it up.  Keep matching MEMs identical if we can't have REGs, but 
> allow
> +     reloadable MEMs otherwise.  Avoid immediates and rvalue expressions, 
> since
> +     they can't match outputs.  */
> +  if (op != mop
> +      && ((mop && allows_mem && MEM_P (op) && MEM_P (mop)
> +        && (!allows_reg || rclass == NO_REGS || GET_MODE (mop) == BLKmode)
> +        && !rtx_equal_p (op, mop))
> +       || (mop && allows_reg
> +           && (rclass == NO_REGS || GET_MODE (mop) == BLKmode)
> +           && (!allows_mem || !MEM_P (op) || !MEM_P (mop)))
> +       || !nonimmediate_operand (op, VOIDmode)))
> +    return prevres;
> +
> +  return res;
> +}
> +
> +/* Check if an asm_operand matches its constraints.
> +   If MATCH_INDEX is nonnegative, also check for potential compatibility
> +   with the matched operand, using OPERANDS if non-NULL.
> +   Return > 0 if ok, = 0 if bad, < 0 if inconclusive.  */
> +
> +static int
> +asm_operand_ok_match (rtx op, const char *constraint, const char 
> **constraints,
> +                   rtx *operands, int match_index)
>  {
>    int result = 0;
>    bool incdec_ok = false;
> @@ -2177,7 +2241,8 @@ asm_operand_ok (rtx op, const char *constraint, const 
> char **constraints)
>    /* Empty constraint string is the same as "X,...,X", i.e. X for as
>       many alternatives as required to match the other operands.  */
>    if (*constraint == '\0')
> -    result = 1;
> +    result = check_match (result, op, CONSTRAINT_X,
> +                       1, operands, match_index);
>  
>    while (*constraint)
>      {
> @@ -2186,7 +2251,8 @@ asm_operand_ok (rtx op, const char *constraint, const 
> char **constraints)
>        int len;
>        switch (c)
>       {
> -     case ',':
> +     case '&': case '?': case '!': case '#': case '*':
> +     case '=': case '+': case ',':
>         constraint++;
>         continue;
>  
> @@ -2204,7 +2270,8 @@ asm_operand_ok (rtx op, const char *constraint, const 
> char **constraints)
>  
>             match = strtoul (constraint, &end, 10);
>             if (!result)
> -             result = asm_operand_ok (op, constraints[match], NULL);
> +             result = asm_operand_ok_match (op, constraints[match], NULL,
> +                                            operands, match);
>             constraint = (const char *) end;
>           }
>         else
> @@ -2229,12 +2296,14 @@ asm_operand_ok (rtx op, const char *constraint, const 
> char **constraints)
>            offsettable address exists.  */
>       case 'o': /* offsettable */
>         if (offsettable_nonstrict_memref_p (op))
> -         result = 1;
> +         result = check_match (result, op, CONSTRAINT_o, 1,
> +                               operands, match_index);
>         break;
>  
>       case 'g':
>         if (general_operand (op, VOIDmode))
> -         result = 1;
> +         result = check_match (result, op, CONSTRAINT__LIMIT, 1,
> +                               operands, match_index);
>         break;
>  
>       case '<':
> @@ -2253,18 +2322,21 @@ asm_operand_ok (rtx op, const char *constraint, const 
> char **constraints)
>         switch (get_constraint_type (cn))
>           {
>           case CT_REGISTER:
> -           if (!result
> -               && reg_class_for_constraint (cn) != NO_REGS
> -               && GET_MODE (op) != BLKmode
> -               && register_operand (op, VOIDmode))
> -             result = 1;
> +           /* Don't demand a register for a matched operand, see pr93027.  */
> +           result = check_match (result, op, cn,
> +                                 reg_class_for_constraint (cn) != NO_REGS
> +                                 && GET_MODE (op) != BLKmode
> +                                 && (match_index >= 0
> +                                     || register_operand (op, VOIDmode)),
> +                                 operands, match_index);
>             break;
>  
>           case CT_CONST_INT:
> -           if (!result
> -               && CONST_INT_P (op)
> -               && insn_const_int_ok_for_constraint (INTVAL (op), cn))
> -             result = 1;
> +           result = check_match (result, op, cn,
> +                                 CONST_INT_P (op)
> +                                 && (insn_const_int_ok_for_constraint
> +                                     (INTVAL (op), cn)),
> +                                 operands, match_index);
>             break;
>  
>           case CT_MEMORY:
> @@ -2275,16 +2347,22 @@ asm_operand_ok (rtx op, const char *constraint, const 
> char **constraints)
>             /* Every memory operand can be reloaded to fit.  */
>             if (!mem)
>               mem = extract_mem_from_operand (op);
> -           result = result || memory_operand (mem, VOIDmode);
> +           result = check_match (result, op, cn,
> +                                 memory_operand (mem, VOIDmode),
> +                                 operands, match_index);
>             break;
>  
>           case CT_ADDRESS:
>             /* Every address operand can be reloaded to fit.  */
> -           result = result || address_operand (op, VOIDmode);
> +           result = check_match (result, op, cn,
> +                                 address_operand (op, VOIDmode),
> +                                 operands, match_index);
>             break;
>  
>           case CT_FIXED_FORM:
> -           result = result || constraint_satisfied_p (op, cn);
> +           result = check_match (result, op, cn,
> +                                 constraint_satisfied_p (op, cn),
> +                                 operands, match_index);
>             break;
>           }
>         break;
> diff --git a/gcc/testsuite/gcc.target/aarch64/pr103149.c 
> b/gcc/testsuite/gcc.target/aarch64/pr103149.c
> new file mode 100644
> index 0000000000000..e7283b4f569c7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/pr103149.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=armv8-a+sve -O2 -fharden-conditional-branches 
> -fno-tree-scev-cprop" } */
> +
> +/* -fharden-conditional-branches relies on ASMNESIA, that used to require
> +   GENERAL_REGS even for vectorized booleans, which can't go on
> +   GENERAL_REGS.  */
> +
> +void
> +foo (int *p)
> +{
> +  while (*p < 1)
> +    ++*p;
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/pr85030.c 
> b/gcc/testsuite/gcc.target/i386/pr85030.c
> index ff41df6bb643c..f24e690b8c11d 100644
> --- a/gcc/testsuite/gcc.target/i386/pr85030.c
> +++ b/gcc/testsuite/gcc.target/i386/pr85030.c
> @@ -6,5 +6,5 @@ void
>  foo ()
>  {
>    struct S a;
> -  asm volatile ("" : "=rm" (a) : "0" (1)); /* { dg-error "inconsistent 
> operand constraints in an 'asm'" } */
> +  asm volatile ("" : "=rm" (a) : "0" (1)); /* { dg-error "impossible 
> constraint in 'asm'|inconsistent operand constraints in an 'asm'" } */
>  }
> 
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)

Reply via email to