Segher Boessenkool <seg...@kernel.crashing.org> writes: > On Wed, Aug 19, 2020 at 01:10:36PM +0100, Richard Sandiford wrote: >> Segher Boessenkool <seg...@kernel.crashing.org> writes: >> > When the compgotos pass copies the tail of blocks ending in an indirect >> > jump, there is a micro-optimization to not copy the last one, since the >> > original block will then just be deleted. This does not work properly >> > if cleanup_cfg does not merge all pairs of blocks we expect it to. >> > >> > >> > v2: This also deletes the other use of single_pred_p, which has the same >> > problem in principle, I just never have triggered it so far. >> >> Could you go into more details? I thought the reason you wanted to >> remove the test in the loop was that each successful iteration of the >> loop removes one incoming edge, meaning that if all previous iterations >> succeed, the final iteration is bound to see a single predecessor. >> And there's no guarantee in that case that the jump in the final >> block will be removed. I.e. the loop is unnecessarily leaving >> cfgcleanup to finish off the last step for it. So I agree that the >> loop part looks good. (I assume that was v1, but I missed it, sorry.) > > I didn't Cc: you, I should have :-) > > Yes, that is part of the problem. The case where it was noticed however > was some release of GCC (some 9 release iirc) that did something wrong > with cfg_cleanup apparenetly, so not everything was threaded optimally, > making duplicate_computed_gotos run into jumps to jumps and stopping. > Since it is such an important pass (for performance, for things with > bigger jump tables or with threaded FSMs etc.), it seemed a good idea to > make it more robust instead of slightly faster. > >> But it looks like the point of the initial test is to avoid “duplicating” >> the block if there would still only be one copy of the code. That test >> still seems reasonable since the case it rejects should be handled by >> other cfg cleanups. If for some reason it isn't, there doesn't seem >> to be any reason to apply the max_size limit for a single predecessor, >> since no code duplication will take place. So ISTM that removing the >> first condition is really a separate thing that should be its own patch. > > I have never seen the second case misfiring in practice, only the first > one!
Shucks, I guessed the wrong way round :-) I'd argue that the first check isn't a micro-optimisation though. It's testing whether the duplication really is a duplication (rather than a move). The second test does look like a micro-optimisation. > The original patch is at > https://gcc.gnu.org/pipermail/gcc-patches/2020-August/551560.html > btw. I'll prepare a testcase for that (it doesn't trigger with recent > GCC versions though :-/ ) Hmm, OK. If the first check is the important one, then I think it'd be better to make the max_size stuff conditional on !single_pred_p rather than drop the test entirely. Thanks, Richard