On 11/11/21 08:15, Richard Biener wrote:
> So I'd try to do no functional change first, improving the costing and setting up the transform to simply pick up the stmts to "fold" as discovered during analysis (as I hinted you possibly can use gimple_uid to mark the stmts that simplify, IIRC gimple_uid is preserved during copying. gimple_uid would also scale better than gimple_plf in case we do the analysis for all candidates at once).

Thinking about the analysis: am I correct that we want to properly calculate the loop size for the true and false edges of a potential gcond before actually unswitching? We can do that by finding the first gcond candidate, evaluating (symbolic + irange approach) all other gconds in the loop body, and using BB_REACHABLE discovery, similarly to what we do now at lines 378-446. Then tree_num_loop_insns can be adjusted to count only these reachable blocks. Having that, we can calculate the # of insns that will live in the true/false loops. Then we can call tree_unswitch_loop and do the gcond folding as we do in the versioned loops.

Is it a step in the right direction? Having that, we can then extend it to gswitch statements.

Cheers,
Martin
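
P.S. To make the size estimate concrete, here is a rough, self-contained toy model of the reachability idea. It is only a sketch and does not use GCC's data structures: Block, reachable_insns and make_example_loop are hypothetical names, the per-block instruction counts are made up, and blocks that share a cond_id stand in for gconds that the symbolic + irange evaluation would prove equal to the candidate. The real code would walk basic_block/edge and mark BB_REACHABLE instead.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Toy stand-in for a basic block of the loop body (NOT GCC's basic_block).
struct Block {
  int num_insns;   // assumed per-block instruction estimate
  int cond_id;     // id of the predicate ending the block, -1 if none
  int true_succ;   // successor on true (or the only successor), -1 if none
  int false_succ;  // successor on false, -1 if none
};

// Sum the instruction estimates of all blocks reachable from the header
// (block 0) when every condition with id `cond` is assumed to be `value`.
int reachable_insns(const std::vector<Block>& body, int cond, bool value) {
  assert(cond >= 0 && !body.empty());
  std::vector<bool> reachable(body.size(), false);
  std::queue<int> work;
  reachable[0] = true;
  work.push(0);
  int total = 0;
  while (!work.empty()) {
    const Block& b = body[static_cast<std::size_t>(work.front())];
    work.pop();
    total += b.num_insns;
    // A block ending in the candidate predicate keeps only one out-edge;
    // every other block keeps all of its out-edges.
    const bool folded = (b.cond_id == cond);
    const int succs[2] = {(!folded || value) ? b.true_succ : -1,
                          (!folded || !value) ? b.false_succ : -1};
    for (int s : succs) {
      if (s >= 0 && !reachable[static_cast<std::size_t>(s)]) {
        reachable[static_cast<std::size_t>(s)] = true;
        work.push(s);
      }
    }
  }
  return total;
}

// Made-up loop body: predicate 7 is tested twice (blocks 1 and 4).
std::vector<Block> make_example_loop() {
  return {
      {2, -1, 1, -1},   // 0: header
      {1, 7, 2, 3},     // 1: if (cond 7)
      {10, -1, 4, -1},  // 2: true arm
      {3, -1, 4, -1},   // 3: false arm
      {1, 7, 5, 6},     // 4: if (cond 7) again
      {4, -1, 6, -1},   // 5: second true arm
      {1, -1, 0, -1},   // 6: latch, back edge to the header
  };
}
```

For this made-up body the full loop is 22 insns, while reachable_insns gives 19 for the true version and 8 for the false version, which are the two numbers the costing would compare against the size limit before calling tree_unswitch_loop.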