Where to start?

  This is the culmination of numerous years work on the ranger for generating accurate on-demand ranges in GCC.
There are 2 patch sets as a part of this submission

1)  The ranger &  Enhancements to EVRP
3)  Other passes converted to Ranger

There is still some missing functionality in the ranger which is required for it to be a full replacement of either EVRP or VRP, but its complete enough to offer some enhanced functionality which we think is worthwhile.

It is still purely range based, which means it does not currently handle equivalences, nor other relations.  I have the relation oracle prototyped, and may still be able to get at least some of it in before the end of stage 1.

I will go into more details with the "hybrid" EVRP mode in the intro to that patchset, but basically you can run EVRP in one of 3 modes:
  - classic EVRP mode,
  - Ranger only (RVRP) or
  - hybrid mode where both engines are used.  This allows us to perform extra optimizations we could not perform before, all while continuing to exercise both engines fully, as well as articulating in dump files what the differences are.

performance:

Performance has been the primary holdup, and it still not as good as it was.  The original irange class was wide_int based and presumed quick and efficient manipulate of ranges. Back then, we were comparable to EVRP in speed.  When we merged value_range with irange last year  to produce int_range, we changed the internal representation to be tree based.   This has slowed down the mult-range code noticeably as now all the various sub-range pairs are allocated as trees/hash table looked up each time.  That plus various other tweaks have slowed us down.  We have plans for addressing at least some of this discrepancy.

Regardless, the net effect is that our performance runs have shown ranger now takes about 70% longer than EVRP does when compiling now.  Since the new hybrid mode uses both engines, we see a 170% increase in time required to execute EVRP.  EVRP starts off as a very fast pass, and In a compile of 242 GCC source files, it amounts to about a 3 second increase out of 280, or about 1% overall.

This is offset by the speed up we get by converting the -wrestrict pass to use the range.  That pass speeds up to 15%  of its original time. Yes, its an 85% reduction because it only calculates the few ranges it requires.

The end result of all patches being applied is we find overall compile time to typically be slightly faster with these changes than before, plus we are solving more cases within EVRP.  This data is backed up by a callgrind analysis of a compile where less insutrctions end up being executed with ths path.



Regardless, our goal with this submission is to enable other passes to begin using the rangers functionality now, as well as getting it regularly exercised while we move toward a complete VRP replacement.  The more we replace, the faster things will get :-)

The idea is that we run in hybrid mode for the rest of stage1 which fully exercises both engines, and see where we end up early in the next stage.  We can assess based on functionality vs speed whether we want EVRP to run in classic mode, hybrid mode, or rvrp only mode for the next release.

During the remainder of stage 1, we will work on a combination of performance and functionality, and then once stage 1 closes, we'll focus primarily on performance.  We have a laundry list of things to attack and we hope to get most of the lost time back.

I will provide the ranger and the hybrid EVRP patches, and Aldy will follow on with the different optimization conversions.

These patches bootstrap, pass all regressions, and additionally have gone thru an entire build of fedora in the past week, in hybrid mode.  At least on our branch :-)

More details in each of the patchsets.

I'll let these sit here for a few days, and barring issues, will re-post  the latest versions early next week along with any requires testcase tweaks for hybrid mode,  and check them in. (we're still tracking down a fedora build failure, and i have a couple of testcases to look at with the latest merge from trunk)

Andrew












Reply via email to