Re: [PATCH] avoid issuing -Wrestrict from folder (PR 93519)

Richard Biener Thu, 06 Feb 2020 03:58:20 -0800

On Thu, Feb 6, 2020 at 12:14 PM Richard Biener
<richard.guent...@gmail.com> wrote:
>
> On Thu, Feb 6, 2020 at 11:33 AM Richard Biener
> <richard.guent...@gmail.com> wrote:
> >
> > On Thu, Feb 6, 2020 at 11:06 AM Richard Biener
> > <richard.guent...@gmail.com> wrote:
> > >
> > > On Wed, Feb 5, 2020 at 4:55 PM Martin Sebor <mse...@gmail.com> wrote:
> > > >
> > > > On 2/5/20 1:19 AM, Richard Biener wrote:
> > > > > On Tue, Feb 4, 2020 at 11:02 PM Martin Sebor <mse...@gmail.com> wrote:
> > > > >>
> > > > >> On 2/4/20 2:31 PM, Jeff Law wrote:
> > > > >>> On Tue, 2020-02-04 at 13:08 -0700, Martin Sebor wrote:
> > > > >>>> On 2/4/20 12:15 PM, Richard Biener wrote:
> > > > >>>>> On February 4, 2020 5:30:42 PM GMT+01:00, Jeff Law 
> > > > >>>>> <l...@redhat.com> wrote:
> > > > >>>>>> On Tue, 2020-02-04 at 10:34 +0100, Richard Biener wrote:
> > > > >>>>>>> On Tue, Feb 4, 2020 at 1:44 AM Martin Sebor <mse...@gmail.com> 
> > > > >>>>>>> wrote:
> > > > >>>>>>>> PR 93519 reports a false positive -Wrestrict issued for an 
> > > > >>>>>>>> inlined
> > > > >>>>>> call
> > > > >>>>>>>> to strcpy that carefully guards against self-copying.  This is
> > > > >>>>>> caused
> > > > >>>>>>>> by the caller's arguments substituted into the call during 
> > > > >>>>>>>> inlining
> > > > >>>>>> and
> > > > >>>>>>>> before dead code elimination.
> > > > >>>>>>>>
> > > > >>>>>>>> The attached patch avoids this by removing -Wrestrict from the
> > > > >>>>>> folder
> > > > >>>>>>>> and deferring folding perfectly overlapping (and so undefined)
> > > > >>>>>> calls
> > > > >>>>>>>> to strcpy (and mempcpy, but not memcpy) until much later.  
> > > > >>>>>>>> Calls to
> > > > >>>>>>>> perfectly overlapping calls to memcpy are still folded early.
> > > > >>>>>>>
> > > > >>>>>>> Why do we bother to warn at all for this case?  Just DWIM here.
> > > > >>>>>> Warnings like
> > > > >>>>>>> this can be emitted from the analyzer?
> > > > >>>>>> They potentially can, but the analyzer is and will almost always
> > > > >>>>>> certainly be considerably slower.  I would not expect it to be 
> > > > >>>>>> used
> > > > >>>>>> nearly as much as the core compiler.
> > > > >>>>>>
> > > > >>>>>> WHether or not a particular warning makes sense in the core 
> > > > >>>>>> compiler or
> > > > >>>>>> analyzer would seem to me to depend on whether or not we can 
> > > > >>>>>> reasonably
> > > > >>>>>> issue warnings without interprocedural analysis.  double-free
> > > > >>>>>> realistically requires interprocedural analysis to be effective. 
> > > > >>>>>>  I'm
> > > > >>>>>> not sure Wrestrict really does.
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>>> That is, I suggest to simply remove the bogus warning code from
> > > > >>>>>> folding
> > > > >>>>>>> (and _not_ fail the folding).
> > > > >>>>>> I haven't looked at the patch, but if we can get the warning out 
> > > > >>>>>> of the
> > > > >>>>>> folder that's certainly preferable.  And we could investigate 
> > > > >>>>>> deferring
> > > > >>>>>> self-copy removal.
> > > > >>>>>
> > > > >>>>> I think the issue is as usual, warning for code we'll later 
> > > > >>>>> remove as dead. Warning at folding is almost always premature.
> > > > >>>>
> > > > >>>> In this instance the code is reachable (or isn't obviously 
> > > > >>>> unreachable).
> > > > >>>> GCC doesn't remove it, but provides benign (and reasonable) 
> > > > >>>> semantics
> > > > >>>> for it(*).  To me, that's one aspect of quality.  Letting the user 
> > > > >>>> know
> > > > >>>> that the code is buggy is another.  I view that as at least as 
> > > > >>>> important
> > > > >>>> as folding the ill-effects away because it makes it possible to fix
> > > > >>>> the problem so the code works correctly even with compilers that 
> > > > >>>> don't
> > > > >>>> provide these benign semantics.
> > > > >>> If you look at the guts of what happens at the point where we issue 
> > > > >>> the
> > > > >>> warning from within gimple_fold_builtin_strcpy we have:
> > > > >>>
> > > > >>>> DCH_to_char (char * in, char * out, int collid)
> > > > >>>> {
> > > > >>>>     int type;
> > > > >>>>     char * D.2148;
> > > > >>>>     char * dest;
> > > > >>>>     char * num;
> > > > >>>>     long unsigned int _4;
> > > > >>>>     char * _5;
> > > > >>>>
> > > > >>>> ;;   basic block 2, loop depth 0
> > > > >>>> ;;    pred:       ENTRY
> > > > >>>> ;;    succ:       4
> > > > >>>>
> > > > >>>> ;;   basic block 4, loop depth 0
> > > > >>>> ;;    pred:       2
> > > > >>>> ;;    succ:       5
> > > > >>>>
> > > > >>>> ;;   basic block 5, loop depth 0
> > > > >>>> ;;    pred:       4
> > > > >>>> ;;    succ:       6
> > > > >>>>
> > > > >>>> ;;   basic block 6, loop depth 0
> > > > >>>> ;;    pred:       5
> > > > >>>>     if (0 != 0)
> > > > >>>>       goto <bb 7>; [53.47%]
> > > > >>>>     else
> > > > >>>>       goto <bb 8>; [46.53%]
> > > > >>>> ;;    succ:       7
> > > > >>>> ;;                8
> > > > >>>>
> > > > >>>> ;;   basic block 7, loop depth 0
> > > > >>>> ;;    pred:       6
> > > > >>>>     strcpy (out_1(D), out_1(D));
> > > > >>>> ;;    succ:       8
> > > > >>>>
> > > > >>>> ;;   basic block 8, loop depth 0
> > > > >>>> ;;    pred:       6
> > > > >>>> ;;                7
> > > > >>>>     _4 = __builtin_strlen (out_1(D));
> > > > >>>>     _5 = out_1(D) + _4;
> > > > >>>>     __builtin_memcpy (_5, "foo", 4);
> > > > >>>> ;;    succ:       3
> > > > >>>>
> > > > >>>> ;;   basic block 3, loop depth 0
> > > > >>>> ;;    pred:       8
> > > > >>>>     return;
> > > > >>>> ;;    succ:       EXIT
> > > > >>>>
> > > > >>>> }
> > > > >>>>
> > > > >>>
> > > > >>> Which shows the code is obviously unreachable in the case we're 
> > > > >>> warning
> > > > >>> about.  You can't see this in the dumps because it's exposed by
> > > > >>> inlining, then cleaned up before writing the dump file.
> > > > >>
> > > > >> In the specific case of the bug the code is of course eliminated
> > > > >> because it's guarded by the if (s != d).  I was referring to
> > > > >> the general (unguarded) case of:
> > > > >>
> > > > >>     char *s = "", *p;
> > > > >>
> > > > >>     int main (void)
> > > > >>     {
> > > > >>       p = strcpy (s, s);
> > > > >>       puts (p);
> > > > >>     }
> > > > >>
> > > > >> where GCC folds the assignment 'p = strcpy(s, s);' to effectively
> > > > >> p = s;  That's perfectly reasonable but it could equally as well
> > > > >> leave the call alone, as it does when s is null, for instance.
> > > > >>
> > > > >> I think folding it away is not only reasonable but preferable to
> > > > >> making the invalid call, but it's done only rarely.  Most of
> > > > >> the time GCC does emit the undefined access (it does that with
> > > > >> calls to library functions as well as with direct stores and
> > > > >> reads).  (I am hoping we can change that in the future so that
> > > > >> these kinds of problems are handled with some consistency.)
> > > > >>
> > > > >>>
> > > > >>> ISTM this would be a case we could handle with the __builtin_warning
> > > > >>> stuff.
> > > > >>>
> > > > >>> I think the question is do we want to do anything about it this 
> > > > >>> cycle?
> > > > >>>
> > > > >>>
> > > > >>> If so, I think Martin's approach is quite reasonable.  It disables
> > > > >>> folding away the self-copies from gimple-fold and moves the warning
> > > > >>> into the expander.  So if there's such a call in the IL at expansion
> > > > >>> time we get a warning (-O0).
> > > > >>>
> > > > >>> I'd hazard a guess that the diagnostic was added to the strlen pass 
> > > > >>> to
> > > > >>> capture the missed warning when we're optimizing and the self-copy 
> > > > >>> has
> > > > >>> survived until that point. There's a couple issues that raises 
> > > > >>> though.
> > > > >>>
> > > > >>> First, it's insufficient.  DSE (for example) can do self-copy 
> > > > >>> removal,
> > > > >>> so it needs the same handling.  There may be others places too.
> > > > >>>
> > > > >>> Second, if the code becomes unreachable after strlen, then we've got
> > > > >>> new false positive issues.
> > > > >>>
> > > > >>> It's the classic problems we have with all middle end based 
> > > > >>> warnings.
> > > > >>>
> > > > >>> But I could live with those if we can show that using 
> > > > >>> __builtin_warning
> > > > >>> to handle this stuff in gcc-11 works...  ISTM we emit the
> > > > >>> __builtin_warning call before any self-copy like that, whenever we
> > > > >>> happen to spot them.  They'll naturally get removed if the path 
> > > > >>> becomes
> > > > >>> unreachable.  We'd warn during expansion for calls to
> > > > >>> __builtin_warning.  We could even optionally warn when removing a 
> > > > >>> call
> > > > >>> to __bulitin_warning.
> > > > >>>
> > > > >>> Thoughts?
> > > > >>
> > > > >> The patch has pretty much the same effect as emitting 
> > > > >> __builtin_warning
> > > > >> from the folder would.  It defers the folding until much later, and 
> > > > >> if
> > > > >> the code isn't eliminated, it issues a warning and folds the call 
> > > > >> away.
> > > > >
> > > > > But it affects subsequent optimizations - the call is more expensive
> > > > > in any size heuristics, it posses an (alias-set zero) memory write
> > > > > barrier (unless you start to optimize no-op copies in the alias 
> > > > > oracle),
> > > > > it is a _call_ - passes like the vectorizer are not happy about a 
> > > > > call.
> > > > > It prevents SRA of the accessed object, ...
> > > >
> > > > This is a strcpy call copying over itself.  It's undefined code,
> > > > and so hardly anything that's common or so performance sensitive
> > > > to make a noticeable difference.
> > > >
> > > > > So no, leaving in the call is _not_ equivalent to sticking in a
> > > > > __builtin_warning() call (or however we actually implement it,
> > > > > I'd prefer a stmt in the "debug" category so it's simply ignored
> > > > > or elided by most passes by means of existing code).
> > > > >
> > > > > That said, I'd prefer to not do anything about this bug.  Iff then
> > > > > in the inliner try doing a CFG cleanup before folding stmts
> > > > > (it's doing delayed folding anyway).  But not for GCC 10.
> > > > > One could also mark stmts with no-warning before the inliner
> > > > > folds them (and then mark back) to avoid those classes of
> > > > > folding warnings.
> > > >
> > > > I think this would very unfortunate for GCC 10.  The user's code
> > > > is clearly correct -- they take pains to avoid the overlapping
> > > > copy by guarding against it just before it -- and GCC simply emits
> > > > an invalid warning because of how it does inlining.  All that will
> > > > be accomplished by not fixing it is we will release a worse quality
> > > > compiler than we otherwise can, unnecessarily eroding our users'
> > > > confidence in the value of GCC's diagnostic.
> > > >
> > > > I'll look into your suggestions for the inliner in stage 1 but
> > > > please reconsider for GCC 10.
> > >
> > > One possibility is the attached but that adds an extra CFG cleanup
> > > (also untested but on the testcase).  Another possibility would be
> > > to re-do fold_marked_statements in terms of cleanup_control_flow_pre (),
> > > recognizing unreachable regions on the way and simply avoid folding
> > > stmts in blocks that will be then removed by the pending CFG cleanup.
> > >
> > > I'll see whether I can cook that up quickly.
> >
> > Like this.  Indenting not fixed and untested apart from on the testcase.
> > Disadvantage is we're walking _all_ blocks (at least skipping stmts).
> > Technically the iteration scheme should probably change to push
> > edges rather than edge iterators so we can avoid the repeated 
> > find_taken_edge
> > call.
> >
> > Anyway, just a prototype.
> >
> > Note SSA propagators / VN avoid to fold stmts that are obviously
> > unreachable so it somewhat makes sense to beat the inliner to do
> > the same.
>
> I'm testing this variant but I'm quite sure the cgraph verifier will trip on
> us not updating cgraph edges for "unreachable" but changed (for example
> indirect -> direct call) calls.  Somewhat moot of course but it appears
> we want to keep it happy at all cost ...(?)  We even do "special"
>
>   delete_unreachable_blocks_update_callgraph (id.dst_node, false);
>
>   if (flag_checking)
>     id.dst_node->verify ();
>
> rather than relying on CFG cleanup we'll do afterwards anyway.
> Maybe that's for a reason?  Honza?  Might be for the early inliner?
> I see (see first posted variant) that regular delete_basic_block
> doesn't bother to remove cgraph edges.


It seems to work fine on

void foo() {}

static inline void bar (int i, int j, void (*p)())
{
  if (i != j)
    p();
}

void baz(int i)
{
  bar (i, i, foo);
}

where I see early inlining inlining bar, changing the call to foo()
(and not noticing it will be unreachable - for simpler if (0) cases
the inliner replaces the call to foo with a call to __builtin_unreachable!),
not folding it and passing verification.  Guess the cgraph edge updating
code in copy_bb deals with this so we should be safe not folding
stmts.

That said, testing still running.

Richard.

> Anyhow, will report back if the simple PRE walk skipping unreachable
> regions "works" in practice  (bootstrap/regtest).
>
> Richard.
>
> > Richard.
> >
> > > > Martin

Re: [PATCH] avoid issuing -Wrestrict from folder (PR 93519)

Reply via email to