Author: Florian Hahn Date: 2020-08-13T13:10:14+01:00 New Revision: 9e46fcc34c01386e5142e1630ad02e1284c49c67
URL: https://github.com/llvm/llvm-project/commit/9e46fcc34c01386e5142e1630ad02e1284c49c67 DIFF: https://github.com/llvm/llvm-project/commit/9e46fcc34c01386e5142e1630ad02e1284c49c67.diff LOG: [DSE,MSSA] Cache accesses with/without reachable read-clobbers. Summary: Currently we repeatedly check the same uses for read clobbers in some cases. We can avoid unnecessary checks by keeping track of the memory accesses we already found read clobbers for. To do so, we just add memory access causing read-clobbers to a set. Note that marking all visited accesses as read-clobbers would be to pessimistic, as that might include accesses not on any path to the actual read clobber. If we do not find any read-clobbers, we can add all visited instructions to another set and use that to skip the same accesses in the next call. I did not yet measure compile-time, but below is the impact on the number of iterations in getDomMemoryDef: Metric: dse.NumDomMemDefChecks Program base patch diff test-suite...000/183.equake/183.equake.test 132580.00 26961.00 -79.7% test-suite...T95/147.vortex/147.vortex.test 881946.00 297521.00 -66.3% test-suite...000/255.vortex/255.vortex.test 882090.00 297594.00 -66.3% test-suite...T2006/445.gobmk/445.gobmk.test 700940.00 247624.00 -64.7% test-suite...ications/JM/ldecod/ldecod.test 990956.00 357584.00 -63.9% test-suite...C/CFP2000/179.art/179.art.test 23014.00 8364.00 -63.7% test-suite...marks/SciMark2-C/scimark2.test 20939.00 8230.00 -60.7% test-suite.../CINT2006/403.gcc/403.gcc.test 2412386.00 951674.00 -60.6% test-suite...006/447.dealII/447.dealII.test 1850445.00 796042.00 -57.0% test-suite...006/453.povray/453.povray.test 1735262.00 753271.00 -56.6% test-suite...ProxyApps-C++/CLAMR/CLAMR.test 393888.00 172514.00 -56.2% test-suite...ProxyApps-C++/HPCCG/HPCCG.test 42350.00 18931.00 -55.3% test-suite...000/186.crafty/186.crafty.test 371608.00 167669.00 -54.9% test-suite...langs-C/unix-tbl/unix-tbl.test 10263.00 4763.00 -53.6% test-suite.../CINT2000/176.gcc/176.gcc.test 1641688.00 763696.00 -53.5% test-suite...ications/JM/lencod/lencod.test 1459213.00 679907.00 -53.4% test-suite...FreeBench/distray/distray.test 10477.00 5113.00 -51.2% test-suite.../Trimaran/enc-md5/enc-md5.test 2651.00 1295.00 -51.2% test-suite...s/Rodinia/hotspot/hotspot.test 4031.00 1989.00 -50.7% test-suite...T2006/401.bzip2/401.bzip2.test 171479.00 85496.00 -50.1% test-suite...lowfish/security-blowfish.test 6217.00 3143.00 -49.4% test-suite...rks/FreeBench/mason/mason.test 1386.00 712.00 -48.6% test-suite...yApps-C++/PENNANT/PENNANT.test 135316.00 71201.00 -47.4% test-suite...ks/McCat/04-bisect/bisect.test 3353.00 1801.00 -46.3% test-suite...6/464.h264ref/464.h264ref.test 1500143.00 810226.00 -46.0% test-suite...marks/7zip/7zip-benchmark.test 1278779.00 711387.00 -44.4% test-suite...lications/viterbi/viterbi.test 11564.00 6497.00 -43.8% test-suite...plications/d/make_dparser.test 54338.00 31158.00 -42.7% test-suite...ternal/HMMER/hmmcalibrate.test 75317.00 43258.00 -42.6% test-suite...lications/ClamAV/clamscan.test 447833.00 258126.00 -42.4% test-suite...lications/SIBsim4/SIBsim4.test 32896.00 19381.00 -41.1% test-suite...6/482.sphinx3/482.sphinx3.test 62177.00 37137.00 -40.3% test-suite...nia/pathfinder/pathfinder.test 1322.00 795.00 -39.9% test-suite...math/automotive-basicmath.test 146.00 88.00 -39.7% test-suite...ngs-C/simulator/simulator.test 18187.00 10982.00 -39.6% test-suite...s/ASC_Sequoia/IRSmk/IRSmk.test 3040.00 1839.00 -39.5% test-suite...T2006/473.astar/473.astar.test 37795.00 22910.00 -39.4% test-suite...CFP2006/444.namd/444.namd.test 97627.00 59229.00 -39.3% test-suite...-typeset/consumer-typeset.test 802759.00 488393.00 -39.2% test-suite...T2006/456.hmmer/456.hmmer.test 96748.00 58957.00 -39.1% test-suite...quoia/CrystalMk/CrystalMk.test 9157.00 5669.00 -38.1% test-suite.../Benchmarks/Olden/mst/mst.test 1054.00 653.00 -38.0% test-suite :: External/Nurbs/nurbs.test 14763.00 9177.00 -37.8% test-suite...CFP2000/188.ammp/188.ammp.test 84072.00 53009.00 -36.9% test-suite...chmarks/MallocBench/gs/gs.test 98221.00 61993.00 -36.9% test-suite...rks/FreeBench/pifft/pifft.test 15560.00 9888.00 -36.5% test-suite...ks/McCat/01-qbsort/qbsort.test 497.00 323.00 -35.0% test-suite...INT2000/164.gzip/164.gzip.test 30328.00 19767.00 -34.8% test-suite...lications/obsequi/Obsequi.test 44239.00 29195.00 -34.0% test-suite.../CINT2000/252.eon/252.eon.test 340333.00 224748.00 -34.0% test-suite.../Prolangs-C++/trees/trees.test 2573.00 1715.00 -33.3% test-suite...TimberWolfMC/timberwolfmc.test 190315.00 126890.00 -33.3% test-suite...nch/fourinarow/fourinarow.test 1487.00 994.00 -33.2% test-suite...SPEC/CINT95/099.go/099.go.test 256287.00 173249.00 -32.4% test-suite...s-C/Pathfinder/PathFinder.test 8366.00 5708.00 -31.8% test-suite...5/124.m88ksim/124.m88ksim.test 62116.00 42425.00 -31.7% test-suite...arks/McCat/17-bintr/bintr.test 154.00 107.00 -30.5% test-suite...langs-C/football/football.test 17864.00 12416.00 -30.5% test-suite...lications/minisat/minisat.test 9504.00 6676.00 -29.8% test-suite...rks/tramp3d-v4/tramp3d-v4.test 1275472.00 896318.00 -29.7% test-suite...s/FreeBench/neural/neural.test 1534.00 1087.00 -29.1% test-suite...arks/mafft/pairlocalalign.test 259748.00 185421.00 -28.6% test-suite...comm-adpcm/telecomm-adpcm.test 53.00 38.00 -28.3% test-suite...adpcm/rawdaudio/rawdaudio.test 53.00 38.00 -28.3% test-suite.../Benchmarks/Olden/tsp/tsp.test 2522.00 1834.00 -27.3% test-suite...:: External/Povray/povray.test 650268.00 476041.00 -26.8% In some cases this also increase the number of eliminated stores, because we can explore further. Note that there is a small regression which I should track down. Metric: dse.NumFastStores Program base patch diff test-suite...T2006/445.gobmk/445.gobmk.test 82.00 123.00 50.0% test-suite...C/CFP2000/179.art/179.art.test 6.00 7.00 16.7% test-suite...math/automotive-basicmath.test 7.00 8.00 14.3% test-suite...ngs-C/assembler/assembler.test 8.00 9.00 12.5% test-suite...ks/Prolangs-C/gnugo/gnugo.test 9.00 10.00 11.1% test-suite...INT95/132.ijpeg/132.ijpeg.test 18.00 20.00 11.1% test-suite...langs-C/football/football.test 10.00 11.00 10.0% test-suite...ce/Benchmarks/Olden/bh/bh.test 13.00 14.00 7.7% test-suite...ications/JM/ldecod/ldecod.test 382.00 402.00 5.2% test-suite...000/183.equake/183.equake.test 40.00 38.00 -5.0% test-suite...6/482.sphinx3/482.sphinx3.test 22.00 23.00 4.5% test-suite...T95/147.vortex/147.vortex.test 215.00 224.00 4.2% test-suite...000/255.vortex/255.vortex.test 217.00 226.00 4.1% test-suite...SPEC/CINT95/099.go/099.go.test 63.00 65.00 3.2% test-suite.../Benchmarks/nbench/nbench.test 76.00 78.00 2.6% test-suite...lications/sqlite3/sqlite3.test 153.00 157.00 2.6% test-suite...INT2000/164.gzip/164.gzip.test 39.00 40.00 2.6% test-suite...ications/JM/lencod/lencod.test 840.00 854.00 1.7% test-suite...marks/7zip/7zip-benchmark.test 1211.00 1231.00 1.7% test-suite...6/464.h264ref/464.h264ref.test 730.00 741.00 1.5% test-suite...006/453.povray/453.povray.test 1417.00 1437.00 1.4% test-suite...lications/ClamAV/clamscan.test 230.00 233.00 1.3% test-suite.../Applications/SPASS/SPASS.test 156.00 158.00 1.3% test-suite...0.perlbench/400.perlbench.test 861.00 871.00 1.2% test-suite.../CINT2000/176.gcc/176.gcc.test 879.00 889.00 1.1% test-suite...nsumer-lame/consumer-lame.test 100.00 101.00 1.0% test-suite...:: External/Povray/povray.test 1220.00 1231.00 0.9% test-suite...ocBench/espresso/espresso.test 115.00 116.00 0.9% test-suite...chmarks/MallocBench/gs/gs.test 116.00 117.00 0.9% test-suite...5/124.m88ksim/124.m88ksim.test 116.00 117.00 0.9% test-suite...CI_Purple/SMG2000/smg2000.test 158.00 159.00 0.6% test-suite...000/186.crafty/186.crafty.test 158.00 159.00 0.6% test-suite...0/253.perlbmk/253.perlbmk.test 500.00 503.00 0.6% test-suite.../CINT2006/403.gcc/403.gcc.test 1178.00 1185.00 0.6% test-suite...CFP2000/188.ammp/188.ammp.test 181.00 182.00 0.6% test-suite.../CINT2000/252.eon/252.eon.test 2672.00 2685.00 0.5% test-suite...006/447.dealII/447.dealII.test 2117.00 2127.00 0.5% test-suite...-typeset/consumer-typeset.test 1047.00 1051.00 0.4% test-suite...rks/tramp3d-v4/tramp3d-v4.test 814.00 816.00 0.2% test-suite...3.xalancbmk/483.xalancbmk.test 1265.00 1267.00 0.2% test-suite.../Prolangs-C++/vcirc/vcirc.test 11.00 11.00 0.0% Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea Subscribers: Prazek, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D75025 Added: Modified: llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp Removed: ################################################################################ diff --git a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp index f5f17620e571..e426facc71f0 100644 --- a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp +++ b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp @@ -88,6 +88,7 @@ STATISTIC(NumNoopStores, "Number of noop stores deleted"); STATISTIC(NumCFGChecks, "Number of stores modified"); STATISTIC(NumCFGTries, "Number of stores modified"); STATISTIC(NumCFGSuccess, "Number of stores modified"); +STATISTIC(NumDomMemDefChecks, "Number of iterations in getDomMemoryDef"); DEBUG_COUNTER(MemorySSACounter, "dse-memoryssa", "Controls which MemoryDefs are eliminated."); @@ -1513,6 +1514,18 @@ struct DSEState { /// basic block. DenseMap<BasicBlock *, InstOverlapIntervalsTy> IOLs; + struct CheckCache { + SmallPtrSet<MemoryAccess *, 16> KnownNoReads; + SmallPtrSet<MemoryAccess *, 16> KnownReads; + + bool isKnownNoRead(MemoryAccess *A) const { + return KnownNoReads.find(A) != KnownNoReads.end(); + } + bool isKnownRead(MemoryAccess *A) const { + return KnownReads.find(A) != KnownReads.end(); + } + }; + DSEState(Function &F, AliasAnalysis &AA, MemorySSA &MSSA, DominatorTree &DT, PostDominatorTree &PDT, const TargetLibraryInfo &TLI) : F(F), AA(AA), MSSA(MSSA), DT(DT), PDT(PDT), TLI(TLI) {} @@ -1743,7 +1756,8 @@ struct DSEState { Optional<MemoryAccess *> getDomMemoryDef(MemoryDef *KillingDef, MemoryAccess *Current, MemoryLocation DefLoc, bool DefVisibleToCallerBeforeRet, - bool DefVisibleToCallerAfterRet, int &ScanLimit) const { + bool DefVisibleToCallerAfterRet, CheckCache &Cache, + int &ScanLimit) const { MemoryAccess *DomAccess; bool StepAgain; LLVM_DEBUG(dbgs() << " trying to get dominating access for " << *Current @@ -1798,16 +1812,32 @@ struct DSEState { }; PushMemUses(DomAccess); + // Optimistically collect all accesses we for reads. If we do not find any + // read clobbers, add them to the cache. + SmallPtrSet<MemoryAccess *, 16> KnownNoReads; // Check if DomDef may be read. for (unsigned I = 0; I < WorkList.size(); I++) { MemoryAccess *UseAccess = WorkList[I]; - - LLVM_DEBUG(dbgs() << " " << *UseAccess); + NumDomMemDefChecks++; + LLVM_DEBUG(dbgs() << " Checking use " << *UseAccess); if (--ScanLimit == 0) { LLVM_DEBUG(dbgs() << "\n ... hit scan limit\n"); return None; } + // Check if we already visited this access. + if (Cache.isKnownNoRead(UseAccess)) { + LLVM_DEBUG(dbgs() << " ... skip, discovered that " << *UseAccess + << " is safe earlier.\n"); + continue; + } + if (Cache.isKnownRead(UseAccess)) { + LLVM_DEBUG(dbgs() << " ... bail out, discovered that " << *UseAccess + << " is has a read-clobber earlier.\n"); + return None; + } + KnownNoReads.insert(UseAccess); + if (isa<MemoryPhi>(UseAccess)) { LLVM_DEBUG(dbgs() << "\n ... adding PHI uses\n"); PushMemUses(UseAccess); @@ -1831,7 +1861,9 @@ struct DSEState { // Uses which may read the original MemoryDef mean we cannot eliminate the // original MD. Stop walk. if (isReadClobber(DefLoc, UseInst)) { - LLVM_DEBUG(dbgs() << " ... found read clobber\n"); + LLVM_DEBUG(dbgs() << " ... found read clobber\n"); + Cache.KnownReads.insert(UseAccess); + Cache.KnownReads.insert(Current); return None; } @@ -1944,6 +1976,7 @@ struct DSEState { return None; } + Cache.KnownNoReads.insert(KnownNoReads.begin(), KnownNoReads.end()); // No aliasing MemoryUses of DomAccess found, DomAccess is potentially dead. return {DomAccess}; } @@ -2159,6 +2192,7 @@ bool eliminateDeadStoresMemorySSA(Function &F, AliasAnalysis &AA, SetVector<MemoryAccess *> ToCheck; ToCheck.insert(KillingDef->getDefiningAccess()); + DSEState::CheckCache Cache; // Check if MemoryAccesses in the worklist are killed by KillingDef. for (unsigned I = 0; I < ToCheck.size(); I++) { Current = ToCheck[I]; @@ -2167,7 +2201,7 @@ bool eliminateDeadStoresMemorySSA(Function &F, AliasAnalysis &AA, Optional<MemoryAccess *> Next = State.getDomMemoryDef( KillingDef, Current, SILoc, DefVisibleToCallerBeforeRet, - DefVisibleToCallerAfterRet, ScanLimit); + DefVisibleToCallerAfterRet, Cache, ScanLimit); if (!Next) { LLVM_DEBUG(dbgs() << " finished walk\n"); _______________________________________________ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits