On Thu, Aug 6, 2020 at 10:40 AM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
>
> Hi Uros,
>
> Many thanks for the review and feedback.  Here's the final version as committed,
> with both the test cases requested by Richard Biener and your suggestion/request
> to use ix86_expand_clear.  Tested again on x86_64-pc-linux-gnu.
>
> Thank you again for the fantastic ix86_expand_clear pointer, which cleared up one
> of two technical questions I had, and allowed this peephole2 to now also apply to
> QImode and HImode MOV0s, where my original version was limited to SImode and DImode.
>
> My two questions were (i) why a QImode set of 0 with a flags clobber isn't a recognized
> instruction?  I'd assume that on some architectures "xorb dl,dl" might be an appropriate
> sequence to use.  This is mostly answered by the use of ix86_expand_clear, which
> intelligently selects the correct form, but the lack of a *movqi_xor was previously odd.

The XOR transformation is used mostly for code size, where we have:

   0:   b0 00                   mov    $0x0,%al
   2:   30 c0                   xor    %al,%al
   4:   bb 00 00 00 00          mov    $0x0,%ebx
   9:   31 db                   xor    %ebx,%ebx

So, as can be seen from the above example, there is no benefit for
QImode, whereas 3 bytes are saved for SImode.

> (ii) My other question, was that despite my best efforts I couldn't seem to convince GCC
> to generate/use a *movsi_or to load the constant -1.  It was just a curiosity, but this
> would affect/benefit the smaxm1 and sminm1 examples in the new i386/minmax-10.c

This transformation is enabled only for -Os or under the following tuning:

DEF_TUNE (X86_TUNE_MOVE_M1_VIA_OR, "move_m1_via_or", m_PENT | m_LAKEMONT)

However, clearing the register with xor reg,reg also prevents a partial
register stall, whereas or -1, reg does not.

Uros.
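
The size comparison in the listing above can be checked with a small C sketch. The byte arrays are copied from the disassembly in this message. The helper names (bytes_saved, qimode_saving, simode_saving) are hypothetical and exist only for illustration; they are not part of GCC.

```c
#include <stddef.h>

/* Encodings copied from the disassembly in the message above. */
static const unsigned char mov_al[]  = { 0xb0, 0x00 };                   /* mov $0x0,%al  */
static const unsigned char xor_al[]  = { 0x30, 0xc0 };                   /* xor %al,%al   */
static const unsigned char mov_ebx[] = { 0xbb, 0x00, 0x00, 0x00, 0x00 }; /* mov $0x0,%ebx */
static const unsigned char xor_ebx[] = { 0x31, 0xdb };                   /* xor %ebx,%ebx */

/* Hypothetical helper: bytes saved by emitting the xor form
   instead of the mov form of a register clear. */
static int bytes_saved(size_t mov_len, size_t xor_len)
{
    return (int)mov_len - (int)xor_len;
}

/* QImode: both encodings are 2 bytes long, so nothing is saved. */
static int qimode_saving(void)
{
    return bytes_saved(sizeof mov_al, sizeof xor_al);
}

/* SImode: the mov carries a 4-byte immediate, the xor does not. */
static int simode_saving(void)
{
    return bytes_saved(sizeof mov_ebx, sizeof xor_ebx);
}
```

Calling qimode_saving() and simode_saving() from a test program gives the two figures quoted above. This only counts bytes; it says nothing about flags clobbers or partial register stalls.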