On Tue, 14 Jul 2020, Jakub Jelinek wrote: > On Tue, Jul 14, 2020 at 02:33:41PM +0200, Richard Biener wrote: > > > As mentioned in the PR, we generate a useless movzbl insn before lock > > > cmpxchg. > > > The problem is that the builtin for the char/short cases has the arguments > > > promoted to int and combine gives up, because the instructions have > > > MEM_VOLATILE_P arguments and recog in that case doesn't recognize anything > > > when volatile_ok is false, and nothing afterwards optimizes the > > > (reg:SI a) = (zero_extend:SI (reg:QI a)) > > > ... (subreg:QI (reg:SI a) 0) ... > > > > So the above isn't fixable? Because it would probably be the more > > generic fix, right? > > I'm afraid it is not, CCing Segher on that. The question is how to > differentiate between "combine didn't do anything dangerous to this > MEM_VOLATILE_P and just kept it in the pattern as is" vs. > "combine changed it in a dangerous way unsuitable for volatile accesses". > Even if we wanted to make just the case where i3 is the only insn > that originally contains MEM_VOLATILE_P and checked that the MEM stayed as > is, there is the difficulty that we change the insns in place, so remembering > how it looked before is hard. And then before trying to recognize it we'd > carefully need to check that nothing else is volatile before enabling > temporarily volatile_ok. > > > > + if (TREE_CODE (exp) == SSA_NAME > > > + && TYPE_MODE (TREE_TYPE (exp)) != mode) > > > + { > > > + /* Undo argument promotion if possible, as combine might not > > > + be able to do it later due to MEM_VOLATILE_P uses in the > > > + patterns. */ > > > + gimple *g = get_gimple_for_ssa_name (exp); > > > + if (g && gimple_assign_cast_p (g)) > > > + { > > > + tree rhs = gimple_assign_rhs1 (g); > > > + if (TYPE_MODE (TREE_TYPE (rhs)) == mode) > > > > Is this really enough to check? What if the cast was truncating? > > The gimple_assign_cast_p predicate also allows for FIX_TRUNC_EXPR > > and VIEW_CONVERT_EXPR where for the latter the rhs is the > > VIEW_CONVERT_EXPR itself. > > I don't know how could those happen. > mode is the result of get_builtin_sync_mode or > int_mode_for_size (BITS_PER_UNIT * size, 0).require () (the former is > int_mode_for_size (...).require ()) and thus an integral mode. > The checks verify that the SSA_NAME doesn't have that mode, but rhs1 > of the statement does have the right mode. > As mode is integral, I think that rules out FIX_TRUNC_EXPR > which would need real mode on the argument, and for VCE, if VCE is > in the operand itself, then it ought to have the same mode as the lhs. > I believe the only possibility where exp doesn't have already the desired > mode is the char/short to int promotion, otherwise all the builtins are > prototyped and thus one shouldn't have some unrelated mode. > > But if you want some extra checks just to be sure, I can add them, whether > it means checking only for CONVERT_EXPR_CODE_P instead of all 4 rhs codes, > and/or checking that both types are integral scalar and the conversion is an > extension (by comparing type precisions).
I see the function is only used from atomics expansion but still I would feel more comfortable with checking for just CONVERT_EXPR_CODE_P and both integral types plus the precision bits. You never know how these functions end up re-used ... OK with those changes. Thanks, Richard.