https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93565
--- Comment #3 from Wilco <wilco at gcc dot gnu.org> ---
(In reply to Segher Boessenkool from comment #2)
> Of course it first tried to do
>
> Failed to match this instruction:
> (parallel [
>         (set (reg:DI 101 [ _9 ])
>             (ctz:DI (reg/v:DI 98 [ x ])))
>         (set (reg:DI 100)
>             (ctz:DI (reg/v:DI 98 [ x ])))
>     ])
>
> so we could try to do that as just the ctz and then a register move,
> and hope that move can be optimised away.  But this is more expensive
> if it can *not* be optimised (higher latency).  Hrm.

Yes if a sign/zero-extend is proven to be redundant, it should be replaced with
a move - it's unlikely it could not be removed either by Combine or during
register allocation.

It seems to me this could happen with any instruction pair where it decides to
forward substitute, but keep the original instruction. If the costs are
identical, it's better to replace the 2nd instruction with a move. Would it
already do this if say we counted moves as somewhat lower cost than ALU
instructions?
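
To make the cost question concrete, here is a minimal sketch of the comparison being discussed. It is a toy model only: the names `toy_costs`, `toy_seq_cost` and `toy_prefer_move` are hypothetical and are not part of GCC, and the numbers are not real target costs. It compares the duplicated sequence (ctz; ctz) against the alternative (ctz; move), either accepting ties or requiring a strict improvement.

```c
/* Hypothetical toy cost model. This is NOT GCC's combine code and does
   not use GCC's cost hooks; all names and numbers are illustrative.  */
#include <stdbool.h>

enum toy_insn_kind { TOY_ALU, TOY_MOVE };

struct toy_costs {
    int alu;   /* cost assigned to an ALU instruction such as ctz */
    int move;  /* cost assigned to a register-to-register move */
};

/* Cost of a single instruction under the given cost table.  */
static int toy_insn_cost(const struct toy_costs *c, enum toy_insn_kind k)
{
    return k == TOY_ALU ? c->alu : c->move;
}

/* Total cost of a sequence of n instructions.  */
static int toy_seq_cost(const struct toy_costs *c,
                        const enum toy_insn_kind *seq, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += toy_insn_cost(c, seq[i]);
    return total;
}

/* Return true if "ctz; move" should replace "ctz; ctz".
   With accept_ties set, equal cost is enough to pick the move;
   otherwise the move sequence must be strictly cheaper.  */
static bool toy_prefer_move(const struct toy_costs *c, bool accept_ties)
{
    const enum toy_insn_kind duplicated[] = { TOY_ALU, TOY_ALU };
    const enum toy_insn_kind with_move[]  = { TOY_ALU, TOY_MOVE };
    int dup = toy_seq_cost(c, duplicated, 2);
    int mov = toy_seq_cost(c, with_move, 2);
    return accept_ties ? mov <= dup : mov < dup;
}
```

In this model, with equal ALU and move costs, the move variant is picked only when ties are accepted. Once a move is costed even slightly below an ALU instruction, it is picked under the strict comparison as well, which is the effect the question above is asking about. Whether combine behaves this way depends on its actual cost checks, which this sketch does not reproduce.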