Here are SPEC CPU 2000 results with plain trunk and the two alias-oracle patches. Base results are plain -O3 -ffast-math; peak results add --param max-fields-for-field-sensitive=0, which effectively disables the creation of SFTs.
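To make concrete what is at stake when SFTs are disabled, here is a minimal sketch of the kind of field-sensitive disambiguation involved. It is a hypothetical illustration (the names pair, global and foo are made up and it is not copied from the testsuite), in the spirit of the scan-tree-dump "return 1;" checks in the testsuite results: the store through p can only touch field y, so the load of field x could in principle be folded to a constant.

```c
/* Hypothetical illustration; not one of the gcc.dg/tree-ssa testcases.  */
struct pair
{
  int x;
  int y;
};

struct pair global;

int
foo (void)
{
  int *p = &global.y;   /* p can only point to field y */

  global.x = 1;
  *p = 2;               /* a field-sensitive analysis can tell that this
                           store does not clobber global.x */
  return global.x;      /* so this load could be folded to "return 1;" */
}
```

Compiling something like this with -O2 -fdump-tree-optimized and looking at the dump would show whether the fold happened; whether it does for a given compiler revision and parameter setting is an assumption to verify, not something this sketch guarantees.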

Unpatched (three runs):

                                 Estimated                     Estimated
                Base      Base      Base      Peak      Peak      Peak
Benchmarks    Ref Time  Run Time   Ratio    Ref Time  Run Time   Ratio
========================================================================
164.gzip          1400       100    1400 *      1400      99.9    1402 *
175.vpr           1400      80.3    1742 *      1400      80.5    1738 *
176.gcc           1100      48.1    2288 *      1100      46.8    2353 *
181.mcf           1800       131    1371 *      1800       131    1370 *
186.crafty        1000      38.0    2635 *      1000      36.6    2732 *
197.parser        1800       134    1348 *      1800       133    1353 *
252.eon                                  X                             X
253.perlbmk       1800      70.8    2541 *      1800      70.4    2557 *
254.gap           1100      57.3    1921 *      1100      57.1    1925 *
255.vortex                               X                             X
256.bzip2         1500      79.3    1892 *      1500      79.9    1877 *
300.twolf         3000       114    2635 *      3000       114    2633 *
Est. SPECint_base2000               1914
Est. SPECint2000                                                  1927

                                 Estimated                     Estimated
                Base      Base      Base      Peak      Peak      Peak
Benchmarks    Ref Time  Run Time   Ratio    Ref Time  Run Time   Ratio
========================================================================
168.wupwise       1600      79.9    2002 *      1600      80.0    2000 *
171.swim          3100       155    1999 *      3100       155    1999 *
172.mgrid         1800      98.6    1825 *      1800      98.4    1829 *
173.applu         2100       178    1178 *      2100       178    1181 *
177.mesa          1400      57.8    2421 *      1400      58.1    2411 *
178.galgel        2900      69.0    4204 *      2900      69.0    4203 *
179.art           2600      34.7    7482 *      2600      34.1    7617 *
183.equake        1300      74.1    1755 *      1300      74.0    1757 *
187.facerec       1900      75.3    2523 *      1900      75.3    2522 *
188.ammp          2200       119    1845 *      2200       119    1843 *
189.lucas         2000       119    1688 *      2000       118    1697 *
191.fma3d         2100       132    1590 *      2100       131    1598 *
200.sixtrack      1100       120     919 *      1100       120     918 *
301.apsi          2600       171    1518 *      2600       172    1509 *
Est. SPECfp_base2000                2029
Est. SPECfp2000                                                   2032

Patched (three runs):

                                 Estimated                     Estimated
                Base      Base      Base      Peak      Peak      Peak
Benchmarks    Ref Time  Run Time   Ratio    Ref Time  Run Time   Ratio
========================================================================
164.gzip          1400       100    1400 *      1400      99.9    1401 *
175.vpr           1400      80.0    1751 *      1400      80.1    1749 *
176.gcc           1100      47.4    2319 *      1100      46.8    2352 *
181.mcf           1800       133    1358 *      1800       133    1349 *
186.crafty        1000      37.6    2656 *      1000      36.8    2718 *
197.parser        1800       133    1350 *      1800       133    1349 *
252.eon                                  X                             X
253.perlbmk       1800      70.4    2557 *      1800      70.0    2573 *
254.gap           1100      57.3    1918 *      1100      57.4    1918 *
255.vortex                               X                             X
256.bzip2         1500      79.9    1877 *      1500      80.6    1862 *
300.twolf         3000       114    2641 *      3000       114    2638 *
Est. SPECint_base2000               1918
Est. SPECint2000                                                  1923

                                 Estimated                     Estimated
                Base      Base      Base      Peak      Peak      Peak
Benchmarks    Ref Time  Run Time   Ratio    Ref Time  Run Time   Ratio
========================================================================
168.wupwise       1600      80.2    1995 *      1600      80.1    1998 *
171.swim          3100       156    1993 *      3100       155    1994 *
172.mgrid         1800      98.7    1824 *      1800      98.6    1826 *
173.applu         2100       178    1178 *      2100       178    1178 *
177.mesa          1400      57.8    2422 *      1400      57.9    2417 *
178.galgel        2900      69.3    4188 *      2900      69.2    4191 *
179.art           2600      36.8    7063 *      2600      33.5    7762 *
183.equake        1300      74.0    1756 *      1300      74.1    1754 *
187.facerec       1900      76.0    2500 *      1900      74.0    2569 *
188.ammp          2200       119    1846 *      2200       119    1845 *
189.lucas         2000       117    1706 *      2000       117    1703 *
191.fma3d         2100       130    1612 *      2100       129    1633 *
200.sixtrack      1100       120     920 *      1100       119     921 *
301.apsi          2600       173    1505 *      2600       174    1498 *
Est. SPECfp_base2000                2020
Est. SPECfp2000                                                   2039

You can see that in both cases the runs without SFTs are significantly better(!), which hints that we do a poor job with partitioning and/or that partitioning triggers earlier with SFTs enabled. The oracle patches are able to slightly improve the results in the non-SFT case, but overall the difference between patched and unpatched is smaller than the difference that results from disabling SFTs.

If you compare test results with SFTs disabled, unpatched vs.
patched, you can see that the oracle patches can retain optimizations that were previously only possible with SFTs (uninteresting parts snipped; the full testsuite for all default languages was run; -m32 results are shown only where they differ from the -m64 results):

Unpatched, SFTs disabled:

                === g++ tests ===

Running target unix/
FAIL: g++.dg/torture/pr34850.C  -O0  (test for warnings, line 14)
FAIL: g++.dg/torture/pr34850.C  -O1  (test for warnings, line 14)
FAIL: g++.dg/torture/pr34850.C  -O2  (test for warnings, line 14)
FAIL: g++.dg/torture/pr34850.C  -O3 -fomit-frame-pointer  (test for warnings, line 14)
FAIL: g++.dg/torture/pr34850.C  -O3 -g  (test for warnings, line 14)
FAIL: g++.dg/torture/pr34850.C  -Os  (test for warnings, line 14)

                === g++ Summary for unix/ ===

# of expected passes            17440
# of unexpected failures        6
# of expected failures          82
# of unsupported tests          119

                === gcc tests ===

Running target unix/
FAIL: gcc.dg/tree-ssa/alias-10.c scan-tree-dump optimized "return 3;"
FAIL: gcc.dg/tree-ssa/alias-15.c scan-tree-dump salias "SFT.5 created for var m offset 128"
FAIL: gcc.dg/tree-ssa/alias-15.c scan-tree-dump-times salias "VUSE <SFT.5_" 2
FAIL: gcc.dg/tree-ssa/alias-3.c scan-tree-dump optimized "return 1;"
FAIL: gcc.dg/tree-ssa/alias-4.c scan-tree-dump optimized "return 1;"
FAIL: gcc.dg/tree-ssa/alias-5.c scan-tree-dump optimized "return 1;"
FAIL: gcc.dg/tree-ssa/ldist-4.c scan-tree-dump-times ldist "distributed: split to 2 loops" 0
FAIL: gcc.dg/tree-ssa/loadpre8.c scan-tree-dump-times pre "Eliminated: 1" 1
FAIL: gcc.dg/tree-ssa/pr26421.c scan-tree-dump-times salias "VDEF" 4
FAIL: gcc.dg/tree-ssa/salias-1.c scan-tree-dump-times salias "structure field tag SFT" 2
FAIL: gcc.dg/tree-ssa/structopt-1.c scan-tree-dump-times lim "Executing store motion of global.y" 1
FAIL: gcc.dg/tree-ssa/structopt-2.c scan-tree-dump-times optimized "a.e" 0
FAIL: gcc.dg/tree-ssa/structopt-2.c scan-tree-dump-times optimized "a.f" 0
FAIL: gcc.dg/tree-ssa/structopt-2.c scan-tree-dump-times optimized "a.g" 0
FAIL: gcc.dg/tree-ssa/structopt-3.c scan-tree-dump-times optimized "return 11" 1

                === gcc Summary ===

# of expected passes            97489
# of unexpected failures        41
# of expected failures          335
# of untested testcases         70
# of unsupported tests          839
/space/rguenther/obj/gcc/xgcc  version 4.4.0 20080304 (experimental) (GCC)

Patched results:

                === g++ tests ===

Running target unix/
FAIL: g++.dg/tree-ssa/pr34355.C (test for excess errors)

                === g++ Summary for unix/ ===

# of expected passes            17445
# of unexpected failures        1
# of expected failures          82
# of unsupported tests          119

                === gcc tests ===

Running target unix/
FAIL: gcc.dg/autopar/parallelization-1.c (internal compiler error)
FAIL: gcc.dg/autopar/parallelization-1.c (test for excess errors)
FAIL: gcc.dg/autopar/parallelization-1.c scan-tree-dump-times final_cleanup "loopfn" 5
FAIL: gcc.dg/tree-ssa/alias-15.c scan-tree-dump salias "SFT.5 created for var m offset 128"
FAIL: gcc.dg/tree-ssa/alias-15.c scan-tree-dump-times salias "VUSE <SFT.5_" 2
FAIL: gcc.dg/tree-ssa/ldist-4.c scan-tree-dump-times ldist "distributed: split to 2 loops" 0
FAIL: gcc.dg/tree-ssa/loop-32.c scan-tree-dump-times lim "Executing store motion of" 7
FAIL: gcc.dg/tree-ssa/pr26421.c scan-tree-dump-times salias "VDEF" 4
FAIL: gcc.dg/tree-ssa/salias-1.c scan-tree-dump-times salias "structure field tag SFT" 2

                === gcc Summary for unix/ ===

# of expected passes            48691
# of unexpected failures        15
# of expected failures          166
# of untested testcases         35
# of unsupported tests          478

Running target unix//-m32
FAIL: gcc.dg/autopar/parallelization-1.c (internal compiler error)
FAIL: gcc.dg/autopar/parallelization-1.c (test for excess errors)
FAIL: gcc.dg/autopar/parallelization-1.c scan-tree-dump-times final_cleanup "loopfn" 5
FAIL: gcc.dg/tree-ssa/alias-15.c scan-tree-dump salias "SFT.5 created for var m offset 128"
FAIL: gcc.dg/tree-ssa/alias-15.c scan-tree-dump-times salias "VUSE <SFT.5_" 2
FAIL: gcc.dg/tree-ssa/pr26421.c scan-tree-dump-times salias "VDEF" 4
FAIL: gcc.dg/tree-ssa/salias-1.c scan-tree-dump-times salias "structure field tag SFT" 2

                === gcc Summary for unix//-m32 ===

# of expected passes            48839
# of unexpected failures        13
# of expected failures          167
# of untested testcases         35
# of unsupported tests          361

Some of the failures with SFTs disabled occur only because the testcases scan the dumps for SFTs, which are obviously not there. Those tests need to be disabled or adjusted to test the optimization outcome instead.

Thus, with the above results, I propose we disable generating SFTs by default on the mainline (--param max-fields-for-field-sensitive=100 is still available for comparison). I will prepare a patch to adjust the false negative testcases above to check for the optimization outcome as well.

Richard.