On Thu, Feb 6, 2020 at 12:14 PM Richard Biener <richard.guent...@gmail.com> wrote: > > On Thu, Feb 6, 2020 at 11:33 AM Richard Biener > <richard.guent...@gmail.com> wrote: > > > > On Thu, Feb 6, 2020 at 11:06 AM Richard Biener > > <richard.guent...@gmail.com> wrote: > > > > > > On Wed, Feb 5, 2020 at 4:55 PM Martin Sebor <mse...@gmail.com> wrote: > > > > > > > > On 2/5/20 1:19 AM, Richard Biener wrote: > > > > > On Tue, Feb 4, 2020 at 11:02 PM Martin Sebor <mse...@gmail.com> wrote: > > > > >> > > > > >> On 2/4/20 2:31 PM, Jeff Law wrote: > > > > >>> On Tue, 2020-02-04 at 13:08 -0700, Martin Sebor wrote: > > > > >>>> On 2/4/20 12:15 PM, Richard Biener wrote: > > > > >>>>> On February 4, 2020 5:30:42 PM GMT+01:00, Jeff Law > > > > >>>>> <l...@redhat.com> wrote: > > > > >>>>>> On Tue, 2020-02-04 at 10:34 +0100, Richard Biener wrote: > > > > >>>>>>> On Tue, Feb 4, 2020 at 1:44 AM Martin Sebor <mse...@gmail.com> > > > > >>>>>>> wrote: > > > > >>>>>>>> PR 93519 reports a false positive -Wrestrict issued for an > > > > >>>>>>>> inlined > > > > >>>>>> call > > > > >>>>>>>> to strcpy that carefully guards against self-copying. This is > > > > >>>>>> caused > > > > >>>>>>>> by the caller's arguments substituted into the call during > > > > >>>>>>>> inlining > > > > >>>>>> and > > > > >>>>>>>> before dead code elimination. > > > > >>>>>>>> > > > > >>>>>>>> The attached patch avoids this by removing -Wrestrict from the > > > > >>>>>> folder > > > > >>>>>>>> and deferring folding perfectly overlapping (and so undefined) > > > > >>>>>> calls > > > > >>>>>>>> to strcpy (and mempcpy, but not memcpy) until much later. > > > > >>>>>>>> Calls to > > > > >>>>>>>> perfectly overlapping calls to memcpy are still folded early. > > > > >>>>>>> > > > > >>>>>>> Why do we bother to warn at all for this case? Just DWIM here. > > > > >>>>>> Warnings like > > > > >>>>>>> this can be emitted from the analyzer? > > > > >>>>>> They potentially can, but the analyzer is and will almost always > > > > >>>>>> certainly be considerably slower. I would not expect it to be > > > > >>>>>> used > > > > >>>>>> nearly as much as the core compiler. > > > > >>>>>> > > > > >>>>>> WHether or not a particular warning makes sense in the core > > > > >>>>>> compiler or > > > > >>>>>> analyzer would seem to me to depend on whether or not we can > > > > >>>>>> reasonably > > > > >>>>>> issue warnings without interprocedural analysis. double-free > > > > >>>>>> realistically requires interprocedural analysis to be effective. > > > > >>>>>> I'm > > > > >>>>>> not sure Wrestrict really does. > > > > >>>>>> > > > > >>>>>> > > > > >>>>>>> That is, I suggest to simply remove the bogus warning code from > > > > >>>>>> folding > > > > >>>>>>> (and _not_ fail the folding). > > > > >>>>>> I haven't looked at the patch, but if we can get the warning out > > > > >>>>>> of the > > > > >>>>>> folder that's certainly preferable. And we could investigate > > > > >>>>>> deferring > > > > >>>>>> self-copy removal. > > > > >>>>> > > > > >>>>> I think the issue is as usual, warning for code we'll later > > > > >>>>> remove as dead. Warning at folding is almost always premature. > > > > >>>> > > > > >>>> In this instance the code is reachable (or isn't obviously > > > > >>>> unreachable). > > > > >>>> GCC doesn't remove it, but provides benign (and reasonable) > > > > >>>> semantics > > > > >>>> for it(*). To me, that's one aspect of quality. Letting the user > > > > >>>> know > > > > >>>> that the code is buggy is another. I view that as at least as > > > > >>>> important > > > > >>>> as folding the ill-effects away because it makes it possible to fix > > > > >>>> the problem so the code works correctly even with compilers that > > > > >>>> don't > > > > >>>> provide these benign semantics. > > > > >>> If you look at the guts of what happens at the point where we issue > > > > >>> the > > > > >>> warning from within gimple_fold_builtin_strcpy we have: > > > > >>> > > > > >>>> DCH_to_char (char * in, char * out, int collid) > > > > >>>> { > > > > >>>> int type; > > > > >>>> char * D.2148; > > > > >>>> char * dest; > > > > >>>> char * num; > > > > >>>> long unsigned int _4; > > > > >>>> char * _5; > > > > >>>> > > > > >>>> ;; basic block 2, loop depth 0 > > > > >>>> ;; pred: ENTRY > > > > >>>> ;; succ: 4 > > > > >>>> > > > > >>>> ;; basic block 4, loop depth 0 > > > > >>>> ;; pred: 2 > > > > >>>> ;; succ: 5 > > > > >>>> > > > > >>>> ;; basic block 5, loop depth 0 > > > > >>>> ;; pred: 4 > > > > >>>> ;; succ: 6 > > > > >>>> > > > > >>>> ;; basic block 6, loop depth 0 > > > > >>>> ;; pred: 5 > > > > >>>> if (0 != 0) > > > > >>>> goto <bb 7>; [53.47%] > > > > >>>> else > > > > >>>> goto <bb 8>; [46.53%] > > > > >>>> ;; succ: 7 > > > > >>>> ;; 8 > > > > >>>> > > > > >>>> ;; basic block 7, loop depth 0 > > > > >>>> ;; pred: 6 > > > > >>>> strcpy (out_1(D), out_1(D)); > > > > >>>> ;; succ: 8 > > > > >>>> > > > > >>>> ;; basic block 8, loop depth 0 > > > > >>>> ;; pred: 6 > > > > >>>> ;; 7 > > > > >>>> _4 = __builtin_strlen (out_1(D)); > > > > >>>> _5 = out_1(D) + _4; > > > > >>>> __builtin_memcpy (_5, "foo", 4); > > > > >>>> ;; succ: 3 > > > > >>>> > > > > >>>> ;; basic block 3, loop depth 0 > > > > >>>> ;; pred: 8 > > > > >>>> return; > > > > >>>> ;; succ: EXIT > > > > >>>> > > > > >>>> } > > > > >>>> > > > > >>> > > > > >>> Which shows the code is obviously unreachable in the case we're > > > > >>> warning > > > > >>> about. You can't see this in the dumps because it's exposed by > > > > >>> inlining, then cleaned up before writing the dump file. > > > > >> > > > > >> In the specific case of the bug the code is of course eliminated > > > > >> because it's guarded by the if (s != d). I was referring to > > > > >> the general (unguarded) case of: > > > > >> > > > > >> char *s = "", *p; > > > > >> > > > > >> int main (void) > > > > >> { > > > > >> p = strcpy (s, s); > > > > >> puts (p); > > > > >> } > > > > >> > > > > >> where GCC folds the assignment 'p = strcpy(s, s);' to effectively > > > > >> p = s; That's perfectly reasonable but it could equally as well > > > > >> leave the call alone, as it does when s is null, for instance. > > > > >> > > > > >> I think folding it away is not only reasonable but preferable to > > > > >> making the invalid call, but it's done only rarely. Most of > > > > >> the time GCC does emit the undefined access (it does that with > > > > >> calls to library functions as well as with direct stores and > > > > >> reads). (I am hoping we can change that in the future so that > > > > >> these kinds of problems are handled with some consistency.) > > > > >> > > > > >>> > > > > >>> ISTM this would be a case we could handle with the __builtin_warning > > > > >>> stuff. > > > > >>> > > > > >>> I think the question is do we want to do anything about it this > > > > >>> cycle? > > > > >>> > > > > >>> > > > > >>> If so, I think Martin's approach is quite reasonable. It disables > > > > >>> folding away the self-copies from gimple-fold and moves the warning > > > > >>> into the expander. So if there's such a call in the IL at expansion > > > > >>> time we get a warning (-O0). > > > > >>> > > > > >>> I'd hazard a guess that the diagnostic was added to the strlen pass > > > > >>> to > > > > >>> capture the missed warning when we're optimizing and the self-copy > > > > >>> has > > > > >>> survived until that point. There's a couple issues that raises > > > > >>> though. > > > > >>> > > > > >>> First, it's insufficient. DSE (for example) can do self-copy > > > > >>> removal, > > > > >>> so it needs the same handling. There may be others places too. > > > > >>> > > > > >>> Second, if the code becomes unreachable after strlen, then we've got > > > > >>> new false positive issues. > > > > >>> > > > > >>> It's the classic problems we have with all middle end based > > > > >>> warnings. > > > > >>> > > > > >>> But I could live with those if we can show that using > > > > >>> __builtin_warning > > > > >>> to handle this stuff in gcc-11 works... ISTM we emit the > > > > >>> __builtin_warning call before any self-copy like that, whenever we > > > > >>> happen to spot them. They'll naturally get removed if the path > > > > >>> becomes > > > > >>> unreachable. We'd warn during expansion for calls to > > > > >>> __builtin_warning. We could even optionally warn when removing a > > > > >>> call > > > > >>> to __bulitin_warning. > > > > >>> > > > > >>> Thoughts? > > > > >> > > > > >> The patch has pretty much the same effect as emitting > > > > >> __builtin_warning > > > > >> from the folder would. It defers the folding until much later, and > > > > >> if > > > > >> the code isn't eliminated, it issues a warning and folds the call > > > > >> away. > > > > > > > > > > But it affects subsequent optimizations - the call is more expensive > > > > > in any size heuristics, it posses an (alias-set zero) memory write > > > > > barrier (unless you start to optimize no-op copies in the alias > > > > > oracle), > > > > > it is a _call_ - passes like the vectorizer are not happy about a > > > > > call. > > > > > It prevents SRA of the accessed object, ... > > > > > > > > This is a strcpy call copying over itself. It's undefined code, > > > > and so hardly anything that's common or so performance sensitive > > > > to make a noticeable difference. > > > > > > > > > So no, leaving in the call is _not_ equivalent to sticking in a > > > > > __builtin_warning() call (or however we actually implement it, > > > > > I'd prefer a stmt in the "debug" category so it's simply ignored > > > > > or elided by most passes by means of existing code). > > > > > > > > > > That said, I'd prefer to not do anything about this bug. Iff then > > > > > in the inliner try doing a CFG cleanup before folding stmts > > > > > (it's doing delayed folding anyway). But not for GCC 10. > > > > > One could also mark stmts with no-warning before the inliner > > > > > folds them (and then mark back) to avoid those classes of > > > > > folding warnings. > > > > > > > > I think this would very unfortunate for GCC 10. The user's code > > > > is clearly correct -- they take pains to avoid the overlapping > > > > copy by guarding against it just before it -- and GCC simply emits > > > > an invalid warning because of how it does inlining. All that will > > > > be accomplished by not fixing it is we will release a worse quality > > > > compiler than we otherwise can, unnecessarily eroding our users' > > > > confidence in the value of GCC's diagnostic. > > > > > > > > I'll look into your suggestions for the inliner in stage 1 but > > > > please reconsider for GCC 10. > > > > > > One possibility is the attached but that adds an extra CFG cleanup > > > (also untested but on the testcase). Another possibility would be > > > to re-do fold_marked_statements in terms of cleanup_control_flow_pre (), > > > recognizing unreachable regions on the way and simply avoid folding > > > stmts in blocks that will be then removed by the pending CFG cleanup. > > > > > > I'll see whether I can cook that up quickly. > > > > Like this. Indenting not fixed and untested apart from on the testcase. > > Disadvantage is we're walking _all_ blocks (at least skipping stmts). > > Technically the iteration scheme should probably change to push > > edges rather than edge iterators so we can avoid the repeated > > find_taken_edge > > call. > > > > Anyway, just a prototype. > > > > Note SSA propagators / VN avoid to fold stmts that are obviously > > unreachable so it somewhat makes sense to beat the inliner to do > > the same. > > I'm testing this variant but I'm quite sure the cgraph verifier will trip on > us not updating cgraph edges for "unreachable" but changed (for example > indirect -> direct call) calls. Somewhat moot of course but it appears > we want to keep it happy at all cost ...(?) We even do "special" > > delete_unreachable_blocks_update_callgraph (id.dst_node, false); > > if (flag_checking) > id.dst_node->verify (); > > rather than relying on CFG cleanup we'll do afterwards anyway. > Maybe that's for a reason? Honza? Might be for the early inliner? > I see (see first posted variant) that regular delete_basic_block > doesn't bother to remove cgraph edges.
It seems to work fine on void foo() {} static inline void bar (int i, int j, void (*p)()) { if (i != j) p(); } void baz(int i) { bar (i, i, foo); } where I see early inlining inlining bar, changing the call to foo() (and not noticing it will be unreachable - for simpler if (0) cases the inliner replaces the call to foo with a call to __builtin_unreachable!), not folding it and passing verification. Guess the cgraph edge updating code in copy_bb deals with this so we should be safe not folding stmts. That said, testing still running. Richard. > Anyhow, will report back if the simple PRE walk skipping unreachable > regions "works" in practice (bootstrap/regtest). > > Richard. > > > Richard. > > > > > > Martin