On Thu, Aug 6, 2020 at 10:40 AM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
>
> Hi Uros,
>
> Many thanks for the review and feedback.  Here's the final version as committed,
> with both the test cases requested by Richard Biener and your suggestion/request
> to use ix86_expand_clear.  Tested again on x86_64-pc-linux-gnu.
>
> Thank you again for the fantastic ix86_expand_clear pointer, which cleared up one
> of two technical questions I had, and allowed this peephole2 to now also apply to
> QImode and HImode MOV0s, where my original version was limited to SImode and
> DImode.
>
> My two questions were (i) why a QImode set of 0 with a flags clobber isn't a recognized
> instruction?  I'd assume that on some architectures "xorb dl,dl" might be an appropriate
> sequence to use.  This is mostly answered by the use of ix86_expand_clear, which
> intelligently selects the correct form, but the lack of a *movqi_xor was previously odd.

The XOR transformation is used mostly for code size. Compare the encodings:

   0:   b0 00                   mov    $0x0,%al
   2:   30 c0                   xor    %al,%al
   4:   bb 00 00 00 00          mov    $0x0,%ebx
   9:   31 db                   xor    %ebx,%ebx

So, as can be seen from the above example, there is no size benefit for
QImode (both forms are 2 bytes), whereas 3 bytes are saved for SImode
(5 bytes for mov vs. 2 bytes for xor).
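
As a rough illustration only (my own sketch, not part of the patch), the
byte counts from the disassembly above can be tabulated in C. The array
and function names are hypothetical; only the encodings themselves come
from the listing.

```c
#include <stddef.h>

/* Encodings copied from the disassembly above. */
const unsigned char mov_al_0[]    = { 0xb0, 0x00 };                   /* mov $0x0,%al   */
const unsigned char xor_al_al[]   = { 0x30, 0xc0 };                   /* xor %al,%al    */
const unsigned char mov_ebx_0[]   = { 0xbb, 0x00, 0x00, 0x00, 0x00 }; /* mov $0x0,%ebx  */
const unsigned char xor_ebx_ebx[] = { 0x31, 0xdb };                   /* xor %ebx,%ebx  */

/* Hypothetical helper: bytes saved by emitting xor instead of mov. */
static int bytes_saved(size_t mov_len, size_t xor_len)
{
    return (int)mov_len - (int)xor_len;
}

/* QImode: mov and xor have the same length, so nothing is saved. */
int qimode_saving(void)
{
    return bytes_saved(sizeof mov_al_0, sizeof xor_al_al);
}

/* SImode: the 32-bit immediate makes mov longer than xor. */
int simode_saving(void)
{
    return bytes_saved(sizeof mov_ebx_0, sizeof xor_ebx_ebx);
}
```

Here qimode_saving() gives 0 and simode_saving() gives 3, matching the
numbers quoted above.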

> (ii) My other question, was that despite my best efforts I couldn't seem to convince GCC
> to generate/use a *movsi_or to load the constant -1.  It was just a curiosity, but this
> would affect/benefit the smaxm1 and sminm1 examples in the new i386/minmax-10.c

This transformation is enabled only for -Os, or when tuning for one of
the targets listed in:

DEF_TUNE (X86_TUNE_MOVE_M1_VIA_OR, "move_m1_via_or", m_PENT | m_LAKEMONT)

However, clearing the register with "xor reg,reg" also prevents a partial
register stall, whereas "or $-1,reg" does not.
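
A small C model of the two idioms, offered only as a sketch under my own
assumptions (the function names are hypothetical, and a compiler will
simply fold these to constants rather than emit the instructions). Both
results are independent of the old register value, but the or form still
reads the register as an input. The byte encodings for the -1 case are my
own reading of the instruction encoding, not something taken from the
patch or its testsuite.

```c
#include <stddef.h>
#include <stdint.h>

/* Models "xor reg,reg": the result is 0 whatever reg held before. */
uint32_t clear_via_xor(uint32_t reg)
{
    return reg ^ reg;
}

/* Models "or $-1,reg": the result is all ones whatever reg held,
   but the operation still takes the old value of reg as an input. */
uint32_t m1_via_or(uint32_t reg)
{
    return reg | UINT32_MAX;
}

/* Assumed encodings (not from the thread). */
const unsigned char mov_ebx_m1[] = { 0xbb, 0xff, 0xff, 0xff, 0xff }; /* mov $-1,%ebx */
const unsigned char or_ebx_m1[]  = { 0x83, 0xcb, 0xff };             /* or  $-1,%ebx */

/* Bytes saved by the or form, which is why it is attractive for -Os. */
int m1_or_saving(void)
{
    return (int)sizeof mov_ebx_m1 - (int)sizeof or_ebx_m1;
}
```

So the or form is a pure size optimization, which fits with it being
limited to -Os and the tuning flag above.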

Uros.
