On Wed, 1 Dec 2021, Alexandre Oliva wrote: > > This is now used by -fharden-conditional-branches and > -fharden-compares to detach the values used in redundant hardening > compares from their original locations. It eventually still expands > to an asm statement, as before, but the temporary is only introduced > at expand time, preventing gimple optimizations from diverging input > and output, which may prevent their unification in asm match operands. > > Additional logic to check asm operands matches, and to keep them > matchable, which may improve some cases of "+m" and "+X" in asm that > are currently rejected because the single operand, split into input > and output, ends up diverging in ways that can't be merged back by > reload. > > While investigating regressions hit by an earlier iteration of this > patch, I came across pr93027.c, and noticed that, with optimization > (unlike the test), reload overwrites one of the inputs with another. > This preexisted this patch, and I have not attempted to fix it. > > > The changes above paved the way to fix PR103149: a vectorized loop > ends up using vector booleans on aarch64, and the hardening pass > attempts to "copy-and-forget" them with an asm statement (thus > ASMNESIA) for the redundant hardening compare. > > The problem there is that vector booleans, on the affected platform, > can't go in GENERAL_REGS, but the asm statement we used to emit could > only use those registers or memory. > > The work-around introduced in the newly-introduced ASMNESIA expander > is to check whether the mode is suitable for GENERAL_REGS, and force > the asm operands to use MEM otherwise. This is not ideal, but > functionality to map a mode to a constraint string that selects a > suitable register class doesn't seem readily available. Maybe in a > future patch... > > ASMNESIA currently works only with gimple registers of types with > scalar modes. I have not attempted to make it more general than that, > though I envision future uses that might end up requiring it. I'll > cross that bridge if/when I get to it. > > > Regstrapped on x86_64-linux-gnu. Also bootstrapped along with a patch > that enables both compare hardening passes. Built crosses to aarch64- > and riscv64-, and verified that the mentioned aarch64 PR is fixed. Ok > to install?
While adding ASMNESIA looks OK at this point, I'm not sure about the asm handling stuff. You mention 'reload' above, do you mean LRA? Thanks, Richard. > > (Manual inspection of the asm output for riscv PR103302 suggests that > the reported problem may be gone with the given testcase, but I have not > built enough of the target environment to enable me to run that.) > > > > for gcc/ChangeLog > > PR middle-end/103149 > * cfgexpand.c (expand_asm_stmt): Pass original input > constraint, and constraint array, to asm_operand_ok. > * gimple-hander-conditionals.cc (detach_value): Call > ASMNESIA internal function. > * internal-fn.c (expand_ASMNESIA): New. > * internal-fn.def (ASMNESIA): New. > * recog.c (asm_operand_ok): Turn into wrapper of... > (asm_operand_ok_match): ... renamed. Skip non-constraint > characters faster. Don't require register_operand for matched > operands. Combine results with... > (check_match): New. > (check_asm_operands): Enable match checking. > > for gcc/testsuite/ChangeLog > > PR middle-end/103149 > * gcc.target/aarch64/pr103149.c: New. > * gcc.target/i386/pr85030.c: Also accept impossible constraint > error. > --- > gcc/cfgexpand.c | 6 + > gcc/gimple-harden-conditionals.cc | 17 ---- > gcc/internal-fn.c | 81 +++++++++++++++++++ > gcc/internal-fn.def | 4 + > gcc/recog.c | 114 > +++++++++++++++++++++++---- > gcc/testsuite/gcc.target/aarch64/pr103149.c | 13 +++ > gcc/testsuite/gcc.target/i386/pr85030.c | 2 > 7 files changed, 200 insertions(+), 37 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/aarch64/pr103149.c > > diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c > index fb84d469f1e77..f0272ba1ac9ba 100644 > --- a/gcc/cfgexpand.c > +++ b/gcc/cfgexpand.c > @@ -3374,10 +3374,10 @@ expand_asm_stmt (gasm *stmt) > tree val = input_tvec[i]; > tree type = TREE_TYPE (val); > bool allows_reg, allows_mem, ok; > - const char *constraint; > + const char *constraint, *orig_constraint; > rtx op; > > - constraint = constraints[i + noutputs]; > + orig_constraint = constraint = constraints[i + noutputs]; > ok = parse_input_constraint (&constraint, i, ninputs, noutputs, 0, > constraints.address (), > &allows_mem, &allows_reg); > @@ -3397,7 +3397,7 @@ expand_asm_stmt (gasm *stmt) > else if (MEM_P (op)) > op = validize_mem (op); > > - if (asm_operand_ok (op, constraint, NULL) <= 0) > + if (asm_operand_ok (op, orig_constraint, constraints.address ()) <= 0) > { > if (allows_reg && TYPE_MODE (type) != BLKmode) > op = force_reg (TYPE_MODE (type), op); > diff --git a/gcc/gimple-harden-conditionals.cc > b/gcc/gimple-harden-conditionals.cc > index cfa2361d65be0..3768b2e23bdaf 100644 > --- a/gcc/gimple-harden-conditionals.cc > +++ b/gcc/gimple-harden-conditionals.cc > @@ -132,21 +132,8 @@ detach_value (location_t loc, gimple_stmt_iterator > *gsip, tree val) > tree ret = make_ssa_name (TREE_TYPE (val)); > SET_SSA_NAME_VAR_OR_IDENTIFIER (ret, SSA_NAME_IDENTIFIER (val)); > > - /* Output asm ("" : "=g" (ret) : "0" (val)); */ > - vec<tree, va_gc> *inputs = NULL; > - vec<tree, va_gc> *outputs = NULL; > - vec_safe_push (outputs, > - build_tree_list > - (build_tree_list > - (NULL_TREE, build_string (2, "=g")), > - ret)); > - vec_safe_push (inputs, > - build_tree_list > - (build_tree_list > - (NULL_TREE, build_string (1, "0")), > - val)); > - gasm *detach = gimple_build_asm_vec ("", inputs, outputs, > - NULL, NULL); > + gcall *detach = gimple_build_call_internal (IFN_ASMNESIA, 1, val); > + gimple_call_set_lhs (detach, ret); > gimple_set_location (detach, loc); > gsi_insert_before (gsip, detach, GSI_SAME_STMT); > > diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c > index 6ac3460d538be..a4a2ca91d9f93 100644 > --- a/gcc/internal-fn.c > +++ b/gcc/internal-fn.c > @@ -3234,6 +3234,87 @@ expand_LAUNDER (internal_fn, gcall *call) > expand_assignment (lhs, gimple_call_arg (call, 0), false); > } > > +/* Expand ASMNESIA to assignment and asm that makes the value unknown. */ > + > +static void > +expand_ASMNESIA (internal_fn, gcall *call) > +{ > + tree to = gimple_call_lhs (call); > + > + if (!to) > + return; > + > + location_t locus = gimple_location (call); > + > + tree from = gimple_call_arg (call, 0); > + > + rtx to_rtx = expand_expr (to, NULL_RTX, VOIDmode, EXPAND_WRITE); > + > + rtx from_rtx = expand_expr (from, NULL_RTX, VOIDmode, EXPAND_NORMAL); > + > + enum machine_mode mode = GET_MODE (to_rtx); > + > + gcc_checking_assert (mode != BLKmode && mode != VOIDmode > + && (GET_MODE (from_rtx) == mode > + || GET_MODE (from_rtx) == VOIDmode)); > + > + rtvec argvec = rtvec_alloc (1); > + rtvec constraintvec = rtvec_alloc (1); > + rtvec labelvec = rtvec_alloc (0); > + > + bool need_memory = true; > + for (int i = 0; i < FIRST_PSEUDO_REGISTER; i++) > + if (TEST_HARD_REG_BIT (reg_class_contents[GENERAL_REGS], i) > + && targetm.hard_regno_mode_ok (i, mode)) > + { > + need_memory = false; > + break; > + } > + > + rtx regtemp = copy_to_mode_reg (mode, from_rtx); > + rtx temp = regtemp; > + const char *constraint = "=g"; > + > + if (need_memory) > + { > + constraint = "=m"; > + temp = assign_stack_temp (mode, GET_MODE_SIZE (mode)); > + emit_move_insn (temp, regtemp); > + } > + > + /* This is roughly equivalent to asm ("" : "+g" (temp)); > + > + We use +m instead of +g if we NEED_MEMORY, i.e., the mode is not > suitable > + for GENERAL_REGS. ??? Ideally, we'd choose a suitable register class, > if > + there is one, but how do we get a constraint type that maps to a it? > +X is > + at the same time too lax, because arbitrary RTL and x87 FP regs are not > + desirable, and too strict, because it doesn't guide any register class. > + ??? Maybe the errors reg-stack would flag on e.g. libgcc/_multc3 could > be > + conditioned on the operand's being referenced? > + > + Unlike expansion of gimple asm stmts, it doesn't go through > + targetm.md_asm_adjust (we don't wish any clobbers). > + > + Furthermore, we ensure input and output start out with the same pseudo > or > + MEM, which gimple doesn't. This is particularly important for MEMs, > since > + reloads can't merge disparate ones, and it would be important for X > because > + it guides NO_REGS. */ > + rtx body = gen_rtx_ASM_OPERANDS (mode, > + ggc_strdup (""), > + constraint, 0, > + argvec, constraintvec, labelvec, > + locus); > + ASM_OPERANDS_INPUT (body, 0) = temp; > + ASM_OPERANDS_INPUT_CONSTRAINT_EXP (body, 0) > + = gen_rtx_ASM_INPUT_loc (mode, "0", locus); > + emit_insn (gen_rtx_SET (temp, body)); > + > + if (need_memory) > + emit_move_insn (regtemp, temp); > + > + emit_move_insn (to_rtx, regtemp); > +} > + > /* Expand {MASK_,}SCATTER_STORE{S,U} call CALL using optab OPTAB. */ > > static void > diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def > index acb0dbda5567b..f10b220192196 100644 > --- a/gcc/internal-fn.def > +++ b/gcc/internal-fn.def > @@ -410,6 +410,10 @@ DEF_INTERNAL_FN (FALLTHROUGH, ECF_LEAF | ECF_NOTHROW, > NULL) > /* To implement __builtin_launder. */ > DEF_INTERNAL_FN (LAUNDER, ECF_LEAF | ECF_NOTHROW | ECF_NOVOPS, NULL) > > +/* Copy a value while preventing optimizations based on knowledge > + about the input operand. */ > +DEF_INTERNAL_FN (ASMNESIA, ECF_LEAF | ECF_NOTHROW, NULL) > + > /* Divmod function. */ > DEF_INTERNAL_FN (DIVMOD, ECF_CONST | ECF_LEAF, NULL) > > diff --git a/gcc/recog.c b/gcc/recog.c > index 5a42c45361d5c..2f402daad55a0 100644 > --- a/gcc/recog.c > +++ b/gcc/recog.c > @@ -53,6 +53,9 @@ along with GCC; see the file COPYING3. If not see > static void validate_replace_rtx_1 (rtx *, rtx, rtx, rtx_insn *, bool); > static void validate_replace_src_1 (rtx *, void *); > static rtx_insn *split_insn (rtx_insn *); > +static int asm_operand_ok_match (rtx op, const char *constraint, > + const char **constraints, > + rtx *operands, int match_index); > > struct target_recog default_target_recog; > #if SWITCHABLE_TARGET > @@ -170,7 +173,7 @@ check_asm_operands (rtx x) > const char *c = constraints[i]; > if (c[0] == '%') > c++; > - if (! asm_operand_ok (operands[i], c, constraints)) > + if (! asm_operand_ok_match (operands[i], c, constraints, operands, -1)) > return 0; > } > > @@ -2167,6 +2170,67 @@ get_referenced_operands (const char *string, bool > *used, > > int > asm_operand_ok (rtx op, const char *constraint, const char **constraints) > +{ > + return asm_operand_ok_match (op, constraint, constraints, NULL, -1); > +} > + > +/* Combine a previous cummulative result PREVRES with RES. If MATCH_INDEX is > + nonnegative, check that OP is compatible with the matched operand, using > + OPERANDS if not NULL, under the CN-implied requirements. > + */ > + > +static int > +check_match (int prevres, rtx op, enum constraint_num cn, int res, > + rtx *operands, int match_index) > +{ > + if (prevres > 0 || !res) > + return prevres; > + > + if (match_index < 0) > + return res; > + > + bool allows_reg = false, allows_mem = false; > + enum reg_class rclass = NO_REGS; > + > + if (cn == CONSTRAINT__LIMIT) > + { > + allows_reg = allows_mem = true; > + rclass = GENERAL_REGS; > + } > + else > + { > + insn_extra_constraint_allows_reg_mem (cn, &allows_reg, &allows_mem); > + if (allows_reg) > + rclass = reg_class_for_constraint (cn); > + } > + > + rtx mop = operands ? operands[match_index] : NULL_RTX; > + > + /* Avoid divergence between input and output when reload wouldn't be able > to > + fix it up. Keep matching MEMs identical if we can't have REGs, but > allow > + reloadable MEMs otherwise. Avoid immediates and rvalue expressions, > since > + they can't match outputs. */ > + if (op != mop > + && ((mop && allows_mem && MEM_P (op) && MEM_P (mop) > + && (!allows_reg || rclass == NO_REGS || GET_MODE (mop) == BLKmode) > + && !rtx_equal_p (op, mop)) > + || (mop && allows_reg > + && (rclass == NO_REGS || GET_MODE (mop) == BLKmode) > + && (!allows_mem || !MEM_P (op) || !MEM_P (mop))) > + || !nonimmediate_operand (op, VOIDmode))) > + return prevres; > + > + return res; > +} > + > +/* Check if an asm_operand matches its constraints. > + If MATCH_INDEX is nonnegative, also check for potential compatibility > + with the matched operand, using OPERANDS if non-NULL. > + Return > 0 if ok, = 0 if bad, < 0 if inconclusive. */ > + > +static int > +asm_operand_ok_match (rtx op, const char *constraint, const char > **constraints, > + rtx *operands, int match_index) > { > int result = 0; > bool incdec_ok = false; > @@ -2177,7 +2241,8 @@ asm_operand_ok (rtx op, const char *constraint, const > char **constraints) > /* Empty constraint string is the same as "X,...,X", i.e. X for as > many alternatives as required to match the other operands. */ > if (*constraint == '\0') > - result = 1; > + result = check_match (result, op, CONSTRAINT_X, > + 1, operands, match_index); > > while (*constraint) > { > @@ -2186,7 +2251,8 @@ asm_operand_ok (rtx op, const char *constraint, const > char **constraints) > int len; > switch (c) > { > - case ',': > + case '&': case '?': case '!': case '#': case '*': > + case '=': case '+': case ',': > constraint++; > continue; > > @@ -2204,7 +2270,8 @@ asm_operand_ok (rtx op, const char *constraint, const > char **constraints) > > match = strtoul (constraint, &end, 10); > if (!result) > - result = asm_operand_ok (op, constraints[match], NULL); > + result = asm_operand_ok_match (op, constraints[match], NULL, > + operands, match); > constraint = (const char *) end; > } > else > @@ -2229,12 +2296,14 @@ asm_operand_ok (rtx op, const char *constraint, const > char **constraints) > offsettable address exists. */ > case 'o': /* offsettable */ > if (offsettable_nonstrict_memref_p (op)) > - result = 1; > + result = check_match (result, op, CONSTRAINT_o, 1, > + operands, match_index); > break; > > case 'g': > if (general_operand (op, VOIDmode)) > - result = 1; > + result = check_match (result, op, CONSTRAINT__LIMIT, 1, > + operands, match_index); > break; > > case '<': > @@ -2253,18 +2322,21 @@ asm_operand_ok (rtx op, const char *constraint, const > char **constraints) > switch (get_constraint_type (cn)) > { > case CT_REGISTER: > - if (!result > - && reg_class_for_constraint (cn) != NO_REGS > - && GET_MODE (op) != BLKmode > - && register_operand (op, VOIDmode)) > - result = 1; > + /* Don't demand a register for a matched operand, see pr93027. */ > + result = check_match (result, op, cn, > + reg_class_for_constraint (cn) != NO_REGS > + && GET_MODE (op) != BLKmode > + && (match_index >= 0 > + || register_operand (op, VOIDmode)), > + operands, match_index); > break; > > case CT_CONST_INT: > - if (!result > - && CONST_INT_P (op) > - && insn_const_int_ok_for_constraint (INTVAL (op), cn)) > - result = 1; > + result = check_match (result, op, cn, > + CONST_INT_P (op) > + && (insn_const_int_ok_for_constraint > + (INTVAL (op), cn)), > + operands, match_index); > break; > > case CT_MEMORY: > @@ -2275,16 +2347,22 @@ asm_operand_ok (rtx op, const char *constraint, const > char **constraints) > /* Every memory operand can be reloaded to fit. */ > if (!mem) > mem = extract_mem_from_operand (op); > - result = result || memory_operand (mem, VOIDmode); > + result = check_match (result, op, cn, > + memory_operand (mem, VOIDmode), > + operands, match_index); > break; > > case CT_ADDRESS: > /* Every address operand can be reloaded to fit. */ > - result = result || address_operand (op, VOIDmode); > + result = check_match (result, op, cn, > + address_operand (op, VOIDmode), > + operands, match_index); > break; > > case CT_FIXED_FORM: > - result = result || constraint_satisfied_p (op, cn); > + result = check_match (result, op, cn, > + constraint_satisfied_p (op, cn), > + operands, match_index); > break; > } > break; > diff --git a/gcc/testsuite/gcc.target/aarch64/pr103149.c > b/gcc/testsuite/gcc.target/aarch64/pr103149.c > new file mode 100644 > index 0000000000000..e7283b4f569c7 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/aarch64/pr103149.c > @@ -0,0 +1,13 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=armv8-a+sve -O2 -fharden-conditional-branches > -fno-tree-scev-cprop" } */ > + > +/* -fharden-conditional-branches relies on ASMNESIA, that used to require > + GENERAL_REGS even for vectorized booleans, which can't go on > + GENERAL_REGS. */ > + > +void > +foo (int *p) > +{ > + while (*p < 1) > + ++*p; > +} > diff --git a/gcc/testsuite/gcc.target/i386/pr85030.c > b/gcc/testsuite/gcc.target/i386/pr85030.c > index ff41df6bb643c..f24e690b8c11d 100644 > --- a/gcc/testsuite/gcc.target/i386/pr85030.c > +++ b/gcc/testsuite/gcc.target/i386/pr85030.c > @@ -6,5 +6,5 @@ void > foo () > { > struct S a; > - asm volatile ("" : "=rm" (a) : "0" (1)); /* { dg-error "inconsistent > operand constraints in an 'asm'" } */ > + asm volatile ("" : "=rm" (a) : "0" (1)); /* { dg-error "impossible > constraint in 'asm'|inconsistent operand constraints in an 'asm'" } */ > } > > > -- Richard Biener <rguent...@suse.de> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)