On Thu, Mar 14, 2024 at 05:16:59PM +0100, Jan Hubicka wrote: > Sorry, this was bit of a misunderstanding: I tought you still considered > the original patch to be full fix, while I tought I should look into it > more and dig out more issues. This is bit of can of worms. Overall I > think the plan is: > > This stage4 > 1) fix VR divergences by either comparing or droping them > 2) fix jump function differences by comparing them > (I just constructed a testcase showing that jump functions can > diverge for other reasons than just VR, so this is deeper problem, > too) > 3) try to construct aditional wrong code testcases (finite_p > divergences, trapping) > Next stage1 > 4) implement VR and PTR info streaming > 5) make ICF to compare VRs and punt otherwise > 6) implement optimize_size feature to ICF that will not punt on > diverging TBAA or value ranges and do merging instead. > We need some extra infrastructure for that, being able to keep the > maps between basic blocks and SSA names from comparsion stage to > merging stage > 7) measure how effective ICF becomes in optimize_size mode and implement > heuristics on how much metadata merging we want to do with -O2/-O3 as > well. > Based on older data Martin collected for his thesis, ICF used to be > must more effective on libreoffice then it is today, so hopefully we > can recover 10-15% binary size improvmeents here. > 8) General ICF TLC. There are many half finished things for a while in > this pass (such as merging with different BB or stmt orders).
Agreed. > I am attaching the compare patch which also fixes the original wrong > code. If you preffer your version, just go ahead to commit it. Seems both patches are the same size (at least when looking at number of added lines). I think I prefer my patch because it will make the LTO and non-LTO cases more similar which IMHO helps maintainance of the release branches. So, e.g. for all the other wrong-code issues like https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907#c54 or https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907#c58 one can actually trigger them with both non-LTO and LTO, rather than only with LTO and for non-LTO not trigger because there are SSA_NAME_RANGE_INFO differences that prevent ICF. If we start streaming SSA_NAME_RANGE_INFO in GCC 15, that changes things and then we can decide what to do for differences, 5) or 6) or combinations thereof (e.g. consider not just if the ranges are different, but how much too or if one is a subset or superset of the other to decide between punting and unioning for non-Os). > Otherwise > I will add your testcase for this patch and commit this one. > Statistically we almost never merge functions with different value > ranges (three in testsuite, 0 during bootstrap, 1 during LTO bootstrap > and probably few in LLVM build - there are 15 cases reported, but some > are false positives caused by hash function conflicts). It is mostly the fnsplit functions, sure, because we don't really drop the __builtin_unreachable checks before IPA and so the if (cond) __builtin_unreachable (); style assertions or [[assume (cond)]]; still result in ICF failures. Jakub