On Tue, Nov 12, 2019 at 9:15 AM Richard Biener <richard.guent...@gmail.com> wrote: > > On Tue, Nov 12, 2019 at 6:10 AM Jeff Law <l...@redhat.com> wrote: > > > > On 11/6/19 3:34 PM, Martin Sebor wrote: > > > On 11/6/19 2:06 PM, Martin Sebor wrote: > > >> On 11/6/19 1:39 PM, Jeff Law wrote: > > >>> On 11/6/19 1:27 PM, Martin Sebor wrote: > > >>>> On 11/6/19 11:55 AM, Jeff Law wrote: > > >>>>> On 11/6/19 11:00 AM, Martin Sebor wrote: > > >>>>>> The -Wstringop-overflow warnings for single-byte and multi-byte > > >>>>>> stores mention the amount of data being stored and the amount of > > >>>>>> space remaining in the destination, such as: > > >>>>>> > > >>>>>> warning: writing 4 bytes into a region of size 0 > > >>>>>> [-Wstringop-overflow=] > > >>>>>> > > >>>>>> 123 | *p = 0; > > >>>>>> | ~~~^~~ > > >>>>>> note: destination object declared here > > >>>>>> 45 | char b[N]; > > >>>>>> | ^ > > >>>>>> > > >>>>>> A warning like this can take some time to analyze. First, the size > > >>>>>> of the destination isn't mentioned and may not be easy to tell from > > >>>>>> the sources. In the note above, when N's value is the result of > > >>>>>> some non-trivial computation, chasing it down may be a small project > > >>>>>> in and of itself. Second, it's also not clear why the region size > > >>>>>> is zero. It could be because the offset is exactly N, or because > > >>>>>> it's negative, or because it's in some range greater than N. > > >>>>>> > > >>>>>> Mentioning both the size of the destination object and the offset > > >>>>>> makes the existing messages clearer, are will become essential when > > >>>>>> GCC starts diagnosing overflow into allocated buffers (as my > > >>>>>> follow-on patch does). > > >>>>>> > > >>>>>> The attached patch enhances -Wstringop-overflow to do this by > > >>>>>> letting compute_objsize return the offset to its caller, doing > > >>>>>> something similar in get_stridx, and adding a new function to > > >>>>>> the strlen pass to issue this enhanced warning (eventually, I'd > > >>>>>> like the function to replace the -Wstringop-overflow handler in > > >>>>>> builtins.c). With the change, the note above might read something > > >>>>>> like: > > >>>>>> > > >>>>>> note: at offset 11 to object ‘b’ with size 8 declared here > > >>>>>> 45 | char b[N]; > > >>>>>> | ^ > > >>>>>> > > >>>>>> Tested on x86_64-linux. > > >>>>>> > > >>>>>> Martin > > >>>>>> > > >>>>>> gcc-store-offset.diff > > >>>>>> > > >>>>>> gcc/ChangeLog: > > >>>>>> > > >>>>>> * builtins.c (compute_objsize): Add an argument and set it to > > >>>>>> offset > > >>>>>> into destination. > > >>>>>> * builtins.h (compute_objsize): Add an argument. > > >>>>>> * tree-object-size.c (addr_object_size): Add an argument and > > >>>>>> set it > > >>>>>> to offset into destination. > > >>>>>> (compute_builtin_object_size): Same. > > >>>>>> * tree-object-size.h (compute_builtin_object_size): Add an > > >>>>>> argument. > > >>>>>> * tree-ssa-strlen.c (get_addr_stridx): Add an argument and > > >>>>>> set it > > >>>>>> to offset into destination. > > >>>>>> (maybe_warn_overflow): New function. > > >>>>>> (handle_store): Call maybe_warn_overflow to issue warnings. > > >>>>>> > > >>>>>> gcc/testsuite/ChangeLog: > > >>>>>> > > >>>>>> * c-c++-common/Wstringop-overflow-2.c: Adjust text of expected > > >>>>>> messages. > > >>>>>> * g++.dg/warn/Wstringop-overflow-3.C: Same. > > >>>>>> * gcc.dg/Wstringop-overflow-17.c: Same. > > >>>>>> > > >>>>> > > >>>>>> Index: gcc/tree-ssa-strlen.c > > >>>>>> =================================================================== > > >>>>>> --- gcc/tree-ssa-strlen.c (revision 277886) > > >>>>>> +++ gcc/tree-ssa-strlen.c (working copy) > > >>>>>> @@ -189,6 +189,52 @@ struct laststmt_struct > > >>>>>> static int get_stridx_plus_constant (strinfo *, unsigned > > >>>>>> HOST_WIDE_INT, tree); > > >>>>>> static void handle_builtin_stxncpy (built_in_function, > > >>>>>> gimple_stmt_iterator *); > > >>>>>> +/* Sets MINMAX to either the constant value or the range VAL > > >>>>>> is in > > >>>>>> + and returns true on success. */ > > >>>>>> + > > >>>>>> +static bool > > >>>>>> +get_range (tree val, wide_int minmax[2], const vr_values *rvals = > > >>>>>> NULL) > > >>>>>> +{ > > >>>>>> + if (tree_fits_uhwi_p (val)) > > >>>>>> + { > > >>>>>> + minmax[0] = minmax[1] = wi::to_wide (val); > > >>>>>> + return true; > > >>>>>> + } > > >>>>>> + > > >>>>>> + if (TREE_CODE (val) != SSA_NAME) > > >>>>>> + return false; > > >>>>>> + > > >>>>>> + if (rvals) > > >>>>>> + { > > >>>>>> + gimple *def = SSA_NAME_DEF_STMT (val); > > >>>>>> + if (gimple_assign_single_p (def) > > >>>>>> + && gimple_assign_rhs_code (def) == INTEGER_CST) > > >>>>>> + { > > >>>>>> + /* get_value_range returns [0, N] for constant > > >>>>>> assignments. */ > > >>>>>> + val = gimple_assign_rhs1 (def); > > >>>>>> + minmax[0] = minmax[1] = wi::to_wide (val); > > >>>>>> + return true; > > >>>>>> + } > > >>>>> Umm, something seems really off with this hunk. If the SSA_NAME is > > >>>>> set > > >>>>> via a simple constant assignment, then the range ought to be a > > >>>>> singleton > > >>>>> ie [CONST,CONST]. Is there are particular test were this is not > > >>>>> true? > > >>>>> > > >>>>> The only way offhand I could see this happening is if originally > > >>>>> the RHS > > >>>>> wasn't a constant, but due to optimizations it either simplified > > >>>>> into a > > >>>>> constant or a constant was propagated into an SSA_NAME appearing on > > >>>>> the > > >>>>> RHS. This would have to happen between the last range analysis and > > >>>>> the > > >>>>> point where you're making this query. > > >>>> > > >>>> Yes, I think that's right. Here's an example where it happens: > > >>>> > > >>>> void f (void) > > >>>> { > > >>>> char s[] = "1234"; > > >>>> unsigned n = strlen (s); > > >>>> char vla[n]; // or malloc (n) > > >>>> vla[n] = 0; // n = [4, 4] > > >>>> ... > > >>>> } > > >>>> > > >>>> The strlen call is folded to 4 but that's not propagated to > > >>>> the access until sometime after the strlen pass is done. > > >>> Hmm. Are we calling set_range_info in that case? That goes behind the > > >>> back of pass instance of vr_values. If so, that might argue we want to > > >>> be setting it in vr_values too. > > >> > > >> No, set_range_info is only called for ranges. In this case, > > >> handle_builtin_strlen replaces the strlen() call with 4: > > >> > > >> s = "1234"; > > >> _1 = __builtin_strlen (&s); > > >> n_2 = (unsigned int) _1; > > >> a.1_8 = __builtin_alloca_with_align (_1, 8); > > >> (*a.1_8)[n_2] = 0; > > >> > > >> When the access is made, the __builtin_alloca_with_align call > > >> is found as the destination and the _1 SSA_NAME is used to > > >> get its size. We get back the range [4, 4]. > > > > > > By the way, I glossed over one detail. The above doesn't work > > > exactly as is because the allocation size is the SSA_NAME _1 > > > (with the range [4, 4]) but the index is the SSA_NAME n_2 (with > > > the range [0, 4]; the range is [0, 4] because it was set based > > > on the size of the argument to the strlen() call well before > > > the strlen pass even ran). > > Which would tend to argue that we should forward propagate the constant > > to the uses of _1. That should expose that the RHS of the assignment to > > n_2 is a constant as well. > > > > > > > > > > To make it work across assignments we need to propagate the strlen > > > results down the CFG somehow. I'm hoping the on-demand VRP will > > > do this automagically. > > It would, but it's probably more heavyweight than we need. We just need > > to forward propagate the discovered constant to the use points and pick > > up any secondary opportunities that arise. > > Yes. And the usual way of doing this is to keep a constant-and-copy > lattice (and for copies you'd need to track availability) and before > optimizing > a stmt substitute its operands with the lattice contents. > > forwprop has a scheme that can be followed doing a RPO walk, strlen > does a DOM walk, there you can follow what DOM/PRE elimination do > (for tracking copy availability - if you just track constants you can > elide that).
I guess we could enhance domwalk with lattice tracking utilities as well (in a derived class). > Richard. > > > jeff > >