On Tue, 14 Jul 2020, Jakub Jelinek wrote:

> On Tue, Jul 14, 2020 at 02:33:41PM +0200, Richard Biener wrote:
> > > As mentioned in the PR, we generate a useless movzbl insn before lock 
> > > cmpxchg.
> > > The problem is that the builtin for the char/short cases has the arguments
> > > promoted to int and combine gives up, because the instructions have
> > > MEM_VOLATILE_P arguments and recog in that case doesn't recognize anything
> > > when volatile_ok is false, and nothing afterwards optimizes the
> > > (reg:SI a) = (zero_extend:SI (reg:QI a))
> > > ... (subreg:QI (reg:SI a) 0) ...
> > 
> > So the above isn't fixable?  Because it would probably be the more
> > generic fix, right?
> 
> I'm afraid it is not, CCing Segher on that.  The question is how to
> differentiate between "combine didn't do anything dangerous to this
> MEM_VOLATILE_P and just kept it in the pattern as is" vs.
> "combine changed it in a dangerous way unsuitable for volatile accesses".
> Even if we wanted to make just the case where i3 is the only insn
> that originally contains MEM_VOLATILE_P and checked that the MEM stayed as
> is, there is the difficulty that we change the insns in place, so remembering
> how it looked before is hard.  And then before trying to recognize it we'd
> carefully need to check that nothing else is volatile before enabling
> temporarily volatile_ok.
> 
> > > +  if (TREE_CODE (exp) == SSA_NAME
> > > +      && TYPE_MODE (TREE_TYPE (exp)) != mode)
> > > +    {
> > > +      /* Undo argument promotion if possible, as combine might not
> > > +  be able to do it later due to MEM_VOLATILE_P uses in the
> > > +  patterns.  */
> > > +      gimple *g = get_gimple_for_ssa_name (exp);
> > > +      if (g && gimple_assign_cast_p (g))
> > > + {
> > > +   tree rhs = gimple_assign_rhs1 (g);
> > > +   if (TYPE_MODE (TREE_TYPE (rhs)) == mode)
> > 
> > Is this really enough to check?  What if the cast was truncating?
> > The gimple_assign_cast_p predicate also allows for FIX_TRUNC_EXPR
> > and VIEW_CONVERT_EXPR where for the latter the rhs is the 
> > VIEW_CONVERT_EXPR itself.
> 
> I don't know how could those happen.
> mode is the result of get_builtin_sync_mode or
> int_mode_for_size (BITS_PER_UNIT * size, 0).require () (the former is
> int_mode_for_size (...).require ()) and thus an integral mode.
> The checks verify that the SSA_NAME doesn't have that mode, but rhs1
> of the statement does have the right mode.
> As mode is integral, I think that rules out FIX_TRUNC_EXPR
> which would need real mode on the argument, and for VCE, if VCE is
> in the operand itself, then it ought to have the same mode as the lhs.
> I believe the only possibility where exp doesn't have already the desired
> mode is the char/short to int promotion, otherwise all the builtins are
> prototyped and thus one shouldn't have some unrelated mode.
> 
> But if you want some extra checks just to be sure, I can add them, whether
> it means checking only for CONVERT_EXPR_CODE_P instead of all 4 rhs codes,
> and/or checking that both types are integral scalar and the conversion is an
> extension (by comparing type precisions).

I see the function is only used from atomics expansion but still
I would feel more comfortable with checking for just
CONVERT_EXPR_CODE_P and both integral types plus the precision bits.
You never know how these functions end up re-used ...

OK with those changes.

Thanks,
Richard.

Reply via email to