Hi,

On Mon, 16 Nov 2015, Jeff Law wrote:

> OK, if you want to keep them, then have a consistent way to turn them
> on/off for future debugging.  if0/if1 doesn't provide much of a clue to
> someone else what to turn on/off if they need to debug this stuff.
>
> > > I don't see any negative tests -- ie tests that should not be split
> > > due to boundary conditions.  Do you have any from development?
> >
> > Good point, I had some but only ones where I was able to extend the
> > splitters to cover them.  I'll think of some that really shouldn't be
> > split.
> If you've got them, certainly add them.  Though I realize they may get
> lost over time.

Actually, thinking a bit more about this, all the candidates I have are
merely restrictions of the current implementation that could be lifted in
the future (e.g. unequal step sizes), so I've added no additional ones.

> But in that case, the immediate dominator of pre2 & join is still the
> initial if statement.  So I think we're OK.  That was the conclusion I
> was starting to come to yesterday, having the ascii art makes it pretty
> clear.  I'm just not good at conceptualizing a CFG.  I have to see it
> explicitly and then everything seems so clear and simple.

So, this second version should reflect the review.  I've moved everything
to a new file, split the long function into several logically separate
ones, and even included ascii art in the comments :)  The testcase got a
comment about what to #define for debugging.  I've enabled the pass at -O3,
or alternatively if profile-use is on, similar to -funswitch-loops.  I've
also added a proper -fsplit-loops option.

There are two functional changes in v2: a bugfix to not try splitting a
non-iterating loop (irritatingly such a loop returns true from
number_of_iterations_exit, but with an ERROR_MARK comparator), and a
limitation to avoid combinatorial explosion in artificial testcases: once
we have done a splitting, we don't do any in that loop's parents (we may
still do splitting in siblings or children of siblings).
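For anyone reading along who wants a concrete picture of the transformation,
here is a minimal hand-written sketch in C.  It is not taken from the patch
or its testcase, and the function names sum_unsplit and sum_split are
hypothetical.  It only shows the general idea (a loop whose inner condition
holds for one part of the iteration space and fails for the rest, next to a
hand-split equivalent) and makes no claim about the code the pass actually
generates.

```c
/* Illustration only: hypothetical functions, not from the patch.  */

/* Before: "i < m" is true for a prefix of the iteration space and
   false for the remainder, so it is re-tested on every iteration.  */
long
sum_unsplit (const int *a, int n, int m)
{
  long s = 0;
  for (int i = 0; i < n; i++)
    {
      if (i < m)
        s += a[i];
      else
        s -= a[i];
    }
  return s;
}

/* After (split by hand): two loops, neither containing the condition.
   The split point is min (n, m), clamped to 0 so that a negative m
   sends every iteration to the second loop.  */
long
sum_split (const int *a, int n, int m)
{
  long s = 0;
  int i = 0;
  int mid = m < n ? m : n;
  if (mid < 0)
    mid = 0;
  for (; i < mid; i++)   /* iterations where i < m holds */
    s += a[i];
  for (; i < n; i++)     /* iterations where i < m fails */
    s -= a[i];
  return s;
}
```

Both functions return the same value for any n and m; the second simply
never evaluates the condition inside a loop body.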
I've also done some measurements: first, bootstrap time is unaffected, and
regstrapping succeeds without regressions when I activate the pass by
default.

Then SPECcpu2006: build times are unaffected, everything builds and works
with -fsplit-loops as well, and performance is mostly unaffected.  Base is
-Ofast -funroll-loops -fpeel-loops, peak adds -fsplit-loops.

                                   Estimated                        Estimated
                 Base     Base       Base         Peak     Peak       Peak
Benchmarks       Ref.   Run Time     Ratio        Ref.   Run Time     Ratio
-------------- ------  ---------  ---------    ------  ---------  ---------
400.perlbench    9770        325       30.1 *    9770        323       30.3 *
401.bzip2        9650        382       25.2 *    9650        382       25.3 *
403.gcc          8050        242       33.3 *    8050        241       33.4 *
429.mcf          9120        311       29.3 *    9120        311       29.3 *
445.gobmk       10490        392       26.8 *   10490        391       26.8 *
456.hmmer        9330        345       27.0 *    9330        342       27.3 *
458.sjeng       12100        422       28.7 *   12100        420       28.8 *
462.libquantum  20720        308       67.3 *   20720        308       67.3 *
464.h264ref     22130        423       52.3 *   22130        423       52.3 *
471.omnetpp      6250        273       22.9 *    6250        273       22.9 *
473.astar        7020        311       22.6 *    7020        311       22.6 *
483.xalancbmk    6900        191       36.2 *    6900        190       36.2 *
Est. SPECint_base2006                  31.7
Est. SPECint2006                                                       31.7

                                   Estimated                        Estimated
                 Base     Base       Base         Peak     Peak       Peak
Benchmarks       Ref.   Run Time     Ratio        Ref.   Run Time     Ratio
-------------- ------  ---------  ---------    ------  ---------  ---------
410.bwaves      13590        235       57.7 *   13590        235       57.8 *
416.gamess                               NR                              NR
433.milc         9180        347       26.5 *    9180        345       26.6 *
434.zeusmp       9100        269       33.9 *    9100        268       33.9 *
435.gromacs      7140        260       27.4 *    7140        262       27.3 *
436.cactusADM   11950        237       50.5 *   11950        240       49.9 *
437.leslie3d     9400        228       41.3 *    9400        228       41.2 *
444.namd         8020        312       25.7 *    8020        311       25.7 *
447.dealII      11440        254       45.0 *   11440        254       45.0 *
450.soplex       8340        201       41.4 *    8340        202       41.4 *
453.povray                               NR                              NR
454.calculix     8250        282       29.2 *    8250        283       29.2 *
459.GemsFDTD    10610        310       34.3 *   10610        309       34.3 *
465.tonto        9840        683       14.4 *    9840        684       14.4 *
470.lbm         13740        224       61.2 *   13740        224       61.3 *
481.wrf         11170        291       38.4 *   11170        291       38.4 *
482.sphinx3     19490        377       51.7 *   19490        377       51.6 *
Est. SPECfp_base2006                   36.3
Est. SPECfp2006                                                        36.3

The 1% improvements and degradations are all inside the normal result
variations on this machine (I have the feeling that the hmmer improvement
is stable, and will recheck this).  Not all of the above had loops split
at all, only:

  SPECint: 400.perlbench, 403.gcc, 445.gobmk, 456.hmmer, 462.libquantum,
           464.h264ref, 471.omnetpp
  SPECfp:  435.gromacs, 436.cactusADM, 447.dealII, 454.calculix

So, okay for trunk?


Ciao,
Michael.