On Fri, Jan 20, 2017 at 5:37 PM, Martin Sebor <mse...@gmail.com> wrote: > On 01/20/2017 01:17 AM, Richard Biener wrote: >> >> On Thu, Jan 19, 2017 at 5:53 PM, Martin Sebor <mse...@gmail.com> wrote: >>> >>> On 01/19/2017 05:45 AM, Richard Biener wrote: >>>> >>>> >>>> On Thu, Jan 19, 2017 at 1:17 PM, Aldy Hernandez <al...@redhat.com> >>>> wrote: >>>>> >>>>> >>>>> In the attached testcase, we have a clearly bounded case of alloca >>>>> which >>>>> is >>>>> being incorrectly reported: >>>>> >>>>> void g (int *p, int *q) >>>>> { >>>>> size_t n = (size_t)(p - q); >>>>> >>>>> if (n < 10) >>>>> f (__builtin_alloca (n)); >>>>> } >>>>> >>>>> The problem is that VRP gives us an anti-range for `n' which may be out >>>>> of >>>>> range: >>>>> >>>>> # RANGE ~[2305843009213693952, 16140901064495857663] >>>>> n_9 = (long unsigned int) _4; >>>>> >>>>> We do a less than stellar job with casts and VR_ANTI_RANGE's, mostly >>>>> because >>>>> we're trying various heuristics to make up for the fact that we have >>>>> crappy >>>>> range info from VRP. More specifically, we're basically punting on an >>>>> VR_ANTI_RANGE and ignoring that the casted result (n_9) has a bound >>>>> later >>>>> on. >>>>> >>>>> Luckily, we already have code to check simple ranges coming into the >>>>> alloca >>>>> by looking into all the basic blocks feeding it. The attached patch >>>>> delays >>>>> the final decision on anti ranges until we have examined the basic >>>>> blocks >>>>> and determined for that we are definitely out of range. >>>>> >>>>> I expect all this to disappear with Andrew's upcoming range info >>>>> overhaul. >>>>> >>>>> OK for trunk? >>>> >>>> >>>> >>>> I _really_ wonder why all the range consuming warnings are not emitted >>>> from VRP itself (like we do for -Warray-bounds). There we'd still see >>>> a range for the argument derived from the if () rather than needing to >>>> do our own mini-VRP from the needessly "incomplete" range-info on >>>> SSA vars. >>> >>> >>> >>> Can you explain why the range info is only available in VRP and >>> not outside, via the get_range_info() API? It sounds as though >>> the API shouldn't be relied on (it was virtually unused before >>> GCC 7). >> >> >> It's very simple. Look at the testcase from above > > > Thanks for the detailed answer. A few more questions below. > >> >> void g (int *p, int *q) >> { >> size_t n = (size_t)(p - q); >> >> if (n < 10) >> f (__builtin_alloca (n)); >> } >> >> The IL outside of VRP is >> >> <bb 2> [100.00%]: >> p.0_1 = (long int) p_7(D); >> q.1_2 = (long int) q_8(D); >> _3 = p.0_1 - q.1_2; >> _4 = _3 /[ex] 4; >> n_9 = (size_t) _4; >> if (n_9 <= 9) >> goto <bb 3>; [36.64%] >> else >> goto <bb 4>; [63.36%] >> >> <bb 3> [36.64%]: >> _5 = __builtin_alloca (n_9); >> f (_5); >> >> so there is no SSA name we can tack a range to covering the n_9 <= 9 >> condition that dominates __builtin_alloca. > > > This may be a naive question but why is it not possible to create > such an SSA name?
You'd insert a copy (like VRP does with ASSERTs) but those will be propagated out quickly again. > Having the get_range_info() API depend on this subtlety makes its > clients quite unreliable. It may be acceptable for optimization > but when it's visible to users in the form of false positives or > negatives it's a source of bug reports. Yes, but that's the fault of the warning machinery not the range-info API which was designed for optimization. >> Inside VRP we have >> >> <bb 2> [100.00%]: >> p.0_1 = (long int) p_7(D); >> q.1_2 = (long int) q_8(D); >> _3 = p.0_1 - q.1_2; >> _4 = _3 /[ex] 4; >> n_9 = (size_t) _4; >> if (n_9 <= 9) >> goto <bb 3>; [36.64%] >> else >> goto <bb 4>; [63.36%] >> >> <bb 3> [36.64%]: >> n_13 = ASSERT_EXPR <n_9, n_9 <= 9>; >> _5 = __builtin_alloca (n_13); >> f (_5); >> >> and viola - now the alloca call uses n_13 which is defined at a point >> dominated by if (n_9 <= 9) and thus it has an improved range: >> >> n_13: [0, 9] EQUIVALENCES: { n_9 } (1 elements) > > > Yes, I see that. But when I change size_t to unsigned int (in LP64) > like so: > > void g (int *p, int *q) > { > unsigned n = (unsigned)(p - q); > > if (n < 10) > f (__builtin_alloca (n)); > } > > -Walloca-larger-than=100 still complains: > > warning: argument to ‘alloca’ may be too large > note: limit is 100 bytes, but argument may be as large as 4294967295 > > and the VRP dump shows > > _5: [0, 4294967295] > _14: [0, 4294967295] > ... > _3 = p.0_1 - q.1_2; > _4 = _3 /[ex] 4; > n_10 = (unsigned int) _4; > if (n_10 <= 9) > goto <bb 3>; [36.64%] > else > goto <bb 4>; [63.36%] > > <bb 3> [36.64%]: > _14 = _4 & 4294967295; > _5 = (long unsigned int) _14; > _6 = __builtin_alloca (_5); > > so it seems that even in VRP itself the range information isn't > quite good enough to avoid false positives. (Perhaps that's one > the missed cases you mention below?) Well, you see that there's no obvious way to use the n_10 <= 9 conditional here. VRP would need to register a [0, 9] range for the lower half of n_10 and then figure after seeing > _14 = _4 & 4294967295; that _14 is now [0, 9]. That's a neat trick VRP cannot handle at the moment (there's no way the lattice can encode a range for a sub-set of all bits of a SSA name). Your source is bogus in the way that (unsigned)(p - q) might truncate the pointer difference. Richard. > > >> When in EVRP you get similar behavior (well, there are some missed >> cases I have patches queued for for GCC 8), but instead of modifying >> the IL EVRP just temporarily sets the above range when it processes >> BBs dominated by the condition. >> >> So while for VRP you can put the warning code after propagation for >> EVRP you'd have to do warning processing during propagation itself >> (and EVRP doesn't iterate). >> >>> To answer your question, the gimple-ssa-sprintf pass that depends >>> heavily on ranges would also benefit from having access to the data >>> computed in the strlen pass that's not available outside it. >>> >>> The -Wstringop-overflow and -Walloc-size-larger-than warnings depend >>> on both VRP and tree-object-size. >>> >>> I have been thinking about how to integrate these passes in GCC 8 >>> to improve the overall quality. (By integrating I don't necessarily >>> mean merging the code but rather having them share data or be able >>> to make calls into one another.) >>> >>> I'd be very interested in your thoughts on this. >> >> >> Well, my thought is that we should not have N SSA propagation and >> M "DOM" propagation passes but one of each kind. My thought is >> also that object-size and strlen are of either kind. >> >> So ideally DOM and EVRP would merge and CCP, copyprop and VRP >> would merge. It's probably not possible (or wise) to merge their lattices >> (maybe to some extent). >> >> Those two passes could be used by a static analysis phase prior to >> any optimization to emit warnings (just computing the lattice, not doing >> any IL modification). > > > Thanks. I'll ponder that. > > Martin