https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68173
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
There is simply a _lot_ of CSE happening for these expressions. -O0 doesn't do any of that. -Og does and it's faster as -O1 as well.

You can get worse than -O0 with -Og -fno-tree-fre for example. That takes 28s vs. the 7.5s at -O0.

So it's really GIMPLE level CSE that fixes things up here (no you can't do -O0 -ftree-fre).

I always wondered how much "optimization" people would accept at -O0 but instead of pursuing that I created -Og as "optimize unless debugging experience will be affected" (which may not be perfect in its implementation still).

That said, we _do_ have to do something about our DF infrastructure (compressed bitmaps maybe?). This testcase is special compared to others in that it has a single big basic-block and thus it taxes the local DF problem implementation compared to those that blow up due to very many (also big) BBs that usually show the global DF problem is even worse.
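
As a rough illustration of why CSE on a single big basic block can shrink the work left for later passes, here is a toy local value-numbering sketch in Python. It is not GCC's tree-fre pass or its data structures. The tuple format, the function name `local_cse`, and the example block are all made up for illustration, and it assumes each destination is assigned exactly once (SSA-like), so no invalidation is needed.

```python
# Toy local value numbering over one basic block (NOT GCC's tree-fre pass).
# Each statement is (dest, op, arg1, arg2); all names here are hypothetical.
# Assumption: every dest is assigned exactly once, so nothing is invalidated.

def local_cse(block):
    """Return (kept_statements, alias) after removing redundant
    computations within a single straight-line block."""
    available = {}  # (op, arg1, arg2) -> name already holding that value
    alias = {}      # eliminated dest -> surviving name
    kept = []
    for dest, op, a, b in block:
        # Rewrite operands through earlier eliminations first.
        a = alias.get(a, a)
        b = alias.get(b, b)
        key = (op, a, b)
        if key in available:
            # Same computation seen before: reuse the earlier result.
            alias[dest] = available[key]
        else:
            available[key] = dest
            kept.append((dest, op, a, b))
    return kept, alias


block = [
    ("t1", "mul", "x", "y"),
    ("t2", "mul", "x", "y"),   # redundant with t1
    ("t3", "add", "t1", "z"),
    ("t4", "add", "t2", "z"),  # redundant with t3 once t2 -> t1
    ("t5", "sub", "t3", "t4"),
]
kept, alias = local_cse(block)
print(len(block), "->", len(kept), "statements")
for stmt in kept:
    print(stmt)
```

In this toy block two of the five statements disappear. Fewer statements and temporaries presumably mean smaller per-block sets for whatever runs afterwards, which would be consistent with the -Og vs. -Og -fno-tree-fre timings above, though the sketch itself measures nothing.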