================ @@ -2856,8 +2873,29 @@ void ExprEngine::processBranch( // conflicts with the widen-loop analysis option (which is off by // default). If we intend to support and stabilize the loop widening, // we must ensure that it 'plays nicely' with this logic. - if (!SkipTrueBranch || AMgr.options.ShouldWidenLoops) + if (!SkipTrueBranch || AMgr.options.ShouldWidenLoops) { Builder.generateNode(StTrue, true, PredN); + } else if (AMgr.options.LegacyInliningPrevention) { + // FIXME: There is an ancient and very arbitrary heuristic in + // `ExprEngine::processCFGBlockEntrance` which prevents all further + // inlining of a function if it finds an execution path within that + // function which reaches the `MaxBlockVisitOnPath` limit (a/k/a + // `analyzer-max-loop`, by default four iterations in a loop). Adding + // this "don't assume third iteration" logic significantly increased + // the analysis runtime on some inputs because less functions were + // arbitrarily excluded from being inlined, so more entrypoints used + // up their full allocated budget. As a hacky compensation for this, + // here we apply the "should not inline" mark in cases when the loop + // could potentially reach the `MaxBlockVisitOnPath` limit without the + // "don't assume third iteration" logic. This slightly overcompensates + // (activates if the third iteration can be entered, and will not + // recognize cases where the fourth iteration would't be completed), but + // should be good enough for practical purposes. + if (const LocationContext *LC = getInlinedLocationContext(Pred, G)) { + Engine.FunctionSummaries->markShouldNotInline( + LC->getStackFrame()->getDecl()); + } + } ---------------- NagyDonat wrote:
I put this behind a flag (which restores the original behavior by default) because it is not just "wasted" runtime, but offers a tradeoff between runtime and the depth/accuracy of the analysis. There are several other options like `max-nodes` or the high-level `mode` setting that offer analogous tradeoffs. However, I will refine this judgement when I check the result produced with and without this patch -- and if the difference is negligible, I'm more open to applying this chunk unconditionally. On a long term we will probably need a heuristic to "avoid inlining _large_ functions", but I strongly suspect that this loop-based hacky condition is not the "right" (natural, straightforward, robust etc.) way of formalizing this idea. I feel that the current bounds of the analysis scope are almost "accidental" (or at least _I_ don't know about any measurements that justify e.g. the default `max-nodes` budget value or the various "don't inline if ..." conditions) and I hope that somewhen I could spend some time on understanding and improving this area. Right now the outline of my plan roughly looks like: - (1) Publish this workaround ASAP to restore the "old normal" runtimes and minimize the disruption. - (2) Investigate the (runtime) costs and benefits of various inlining or exploration strategies, and publish alternative heuristics that express "_large_ functions" in different, more natural terms (e.g. block count, total # of nodes created etc. instead of "did it reach four iterations in a loop"). - (3) Adjust the default config: disable this `legacy-inlining-prevention` by default and instead enable other heuristics which _on average_ guarantee similar total runtime, but perhaps provide better coverage / more useful results / better robustness, ... - Even then the users would be able to return to this legacy config if their project is an outlier where this particular heuristic was especially aggressive and they liked it that way. https://github.com/llvm/llvm-project/pull/136720 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits