https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78103
--- Comment #15 from Segher Boessenkool <segher at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #14)
> (In reply to Segher Boessenkool from comment #13)
> > (In reply to Jakub Jelinek from comment #10)
> > > Unfortunately, it doesn't work for the #c0 testcase, after the combiner
> > > splitter kicks in, the combiner doesn't even try that 4 insn combination.
> >
> > It does for me?
>
> But only in the unpatched gcc, no?

Yes, of course.

> For #c0 findLastSet I actually need to combine 5 original instructions,

[...]

That is not something we want to ever implement: 4 insns already is too
expensive unless we try only the simplest, and/or only very specific
combinations.

> and
> what I was hoping for is to first combine first 3 instructions into 2,
> 9, 10 -> 12 to get rid of the useless sign-extension,

You should be able to combine only 10 and 12 even, to a SImode xor followed
by the sign extension (may not work out wrt costs, but it isn't even tried).
Or, why is r86 DImode anyway?

> the value is known to
> be 0..63, so zero extension is fine, into 10 (bsr) and 12 (xor with zero
> extend), which is what the #c9 patch does.
> And then I was hoping 10, 12, 13 -> 14 would be attempted to be combined
> because 13 is mov of a constant. But that doesn't happen because the 9, 10
> -> 12 combination with the #c9 patch throws away the 12 -> 10 LOG_LINKS and
> doesn't add a new one, even when 10 is a setter of a fresh new pseudo and 12
> is the only use of that pseudo.

This is only safe if it *is* a new pseudo, and even then, you need to
prevent getting stuck somehow.

insn 10 is the most problematic thing here btw, having the same pseudo
as input and as output (it is not the unique setter either).  This happens
in expand already, probably a machine pattern that forgets to create new
registers where it should?
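
For readers following the "0..63, so zero extension is fine" point in the
quoted text, here is a minimal C sketch.  It is not the #c0 testcase; the
helper names (bsr64, clz_via_bsr, extensions_agree, self_check) are made up
for illustration, and it assumes GCC's __builtin_clzll with a nonzero
argument.  It only shows that xor with 63 maps a bit index in 0..63 back
into 0..63, so sign- and zero-extending the SImode result give the same
DImode value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: index of the highest set bit, i.e. the value a
   bsr instruction would produce.  x must be nonzero. */
static inline int bsr64(uint64_t x)
{
    return 63 - __builtin_clzll(x);
}

/* Leading-zero count rebuilt as bsr ^ 63.  For any v in 0..63,
   v ^ 63 == 63 - v, because 63 is all ones in the low six bits. */
static inline int clz_via_bsr(uint64_t x)
{
    return bsr64(x) ^ 63;
}

/* Returns 1 when sign extension and zero extension of the 32-bit
   result to 64 bits agree, which holds whenever the value is >= 0. */
static inline int extensions_agree(uint64_t x)
{
    int v = clz_via_bsr(x);
    int64_t sext = (int64_t)v;                   /* sign extension */
    uint64_t zext = (uint64_t)(unsigned int)v;   /* zero extension */
    return (uint64_t)sext == zext;
}

/* Checks both properties for every single-bit input. */
static inline void self_check(void)
{
    for (int bit = 0; bit < 64; bit++) {
        uint64_t x = (uint64_t)1 << bit;
        assert(clz_via_bsr(x) == __builtin_clzll(x));
        assert(extensions_agree(x));
    }
}
```

This says nothing about which combination the combiner should try; it only
backs the claim that the sign extension is redundant for this value range.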