On 11/20/2017 03:13 AM, Richard Biener wrote: > On Sat, Nov 18, 2017 at 10:31 AM, Jeff Law <l...@redhat.com> wrote: >> And a WIP. I can justify continuing work on this during stage3 for >> pr78496. But I thought it was worth giving folks a looksie. >> >> The goal here is to make tree-vrp's threading obsolete and remove it and >> at the same time pick up all the missed jump threads in pr78496. >> >> Essentially this patch embeds the evrp analysis into DOM and adds some >> simplification code to use the results of the evrp analysis within jump >> threading. >> >> This bootstraps, but doesn't quite pass regression testing. There's a >> few tests that need adjustment. There's also two failing tests which I >> think are a manifestation of a latent bug in the EVRP bits I've been >> worried about since I started looking at the code. >> >> It does find *all* the missing threads in pr78496. I'm still evaluating >> the impact of dropping tree-vrp.c's jump threading, but it looks promising. >> >> There's two patches I'm not posting at this time. First is the range >> analyzer embedded in the sprintf warning pass to avoid a false positive >> reported to gcc-bugs a while back. I'm pretty sure it tickles the same >> latent bug I mentioned above with the range analyzer embedded in DOM. >> It also needs minor fixes to deal with being called when the optimizer >> is not on. Given the false positive posted to gcc-bugs and the tiny >> size of the patch I can justify wrapping that up early in stage3. >> >> The second patch I'm not posting rips jump threading out of tree-vrp.c >> It's just too rough in its current state. I'm sure there's a bug that >> says GCC has gotten slower than I can use to justify pushing on this >> early in stage3 as well. >> >> I'm calling it a night from my virtual office in Hawaii. > > So DOM now does EVRP in parallel but only uses the result for > threading, not for other simplification. That somehow feels like > a waste of cycles? Isn't this the opportunity to "merge" DOM and EVRP > to make one (configurable) DOM-walker optimization pass? Yes. I hadn't really figured out how I want to wire in the simplifications to the main part of DOM. But yes, that's in the plan after removal of tree-vrp's jump threading. There's all kinds of things that I think simplify or get stronger in DOM with the availability of quality context specific range data. I don't really consider that stage3 material though.
Though I did ultimately end up wiring in one simplification after posting that patch to address a missed optimization at -O1 which eventually led to a false positive warning. -- One of the things I'm still concerned about with the EVRP code is that it sets the global range information attached to SSA_NAMEs. It is careful to only set it at an object's DEF site, so it *may* be OK, though I'm still not 100% convinced. Anyway... In the case I'm looking at on the trunk the range information set during early VRP is coarse enough that the range for the argument passed to new is [0,0xffffffffffffffff]. In my local tree where we run the EVRP analysis engine during DOM (or the sprintf warnings) we end up with a much tighter range simply because we're running it after other optimizations have cleaned things up. THe range we get is: [0xfffffffe00000000,0xfffffffffffffff] The maximum size for a call to new is 0x7fffffffffffffff. In the warning code which checks for overflow in calls to allocators we compare the low bound of the object to the maximum size allowed. So on the trunk we compare 0 to 0x7fffffffffffffff. Since the low bound is smaller than the max size we don't warn. In my code where we have a nice tight range for the argument to new we're comparing 0xfffffffe00000000 to 0x7fffffffffffffff. Of course since the argument to new has low bound larger than the maximum allowed we get a warning. As it turns out the code is unexecutable. So even though I'm not entirely sure I know how I want to wire in exploiting the evrp analysis in DOM in general, I did add a chunk of code to allow DOM to use the range info to prove the result of a conditional was a compile-time constant. There's a similar situation that arises in the libgomp Fortran testsuite where a block of unreachable code passes a huge size to memset that we get a diagnostic for. The same code fixed this problem as well. > I applaud the first and foremost goal of ripping threading out of VRP That's certainly where I'm going. > (and to be honest I'd rather get rid of the SSA propagator VRP > implementation completely at some point...). I think that's probably a gcc-9 goal. I doubt the additional precision we get from propagating through the lattice is all that important in practice and that it probably can go away. The evrp style analysis gets most of the precision and should be notably cheaper. Jeff