On Wed, Aug 10, 2011 at 12:29 PM, Paulo J. Matos <pa...@matos-sorge.com> wrote:
> Hi,
>
> I am having a size optimisation issue with GCC-4.6.1.
> The problem boils down to the fact that I have no idea on the best way to
> hint to GCC that a given insn would make more sense someplace else.
>
> The C code is simple:
> int16_t mask(uint32_t a)
> {
>    return (x & a) == a;
> }
>
> int16_t is QImode and uint32_t is HImode.
> After combine the insn chain (which is unmodified all the way to ira) is (in
> simplified form):
> regQI 27 <- regQI AH [a]
> regQI 28 <- regQI AL [a+1]
> regQI AL <- andQI(regQI 28, memQI(symbolrefQI(x) + 1))
> regQI AH <- andQI(regQI 27, memQI(symbolrefQI(x)))
> regQI 30 <- regQI AL
> regQI 29 <- regQI AH
> regQI 24 <- 1
> if regQI 29 != regQI 27
>   goto labelref 20
> if regQI 30 != regQI 28
>   goto labelref 20
> goto labelref 22
> labelref 20
> regQI 24 <- 0
> labelref 22
> regQI AL <- regQI 24
>
> The problem resides in `regQI 24 <- 1' being before the jumps.
> Since regQI 24 is going to AL, IRA decides to allocate regQI 24 to AL, which
> creates loads of conflicts and reloads. If that same insn were moved to
> after the jumps and before the `goto labelref 22' then all would be fine,
> because by then regs 27, 28, 29, 30 are dead.
>
> It's obviously hard to point to a solution but I was wondering if there's a
> way to hint to GCC that moving an insn might help the code size. Or if I
> should look into why an existing pass is not already doing that.

On x86 we expand the code to (((xl & al) ^ al) | ((xh & ah) ^ ah)) == 0
which is then if-converted.  Modified testcase:

long long x;
_Bool __attribute__((regparm(2))) mask (long long a)
{
  return (x & a) == a;
}

on i?86 gets you

mask:
.LFB0:
        .cfi_startproc
        pushl   %ebx
        .cfi_def_cfa_offset 8
        .cfi_offset 3, -8
        movl    %eax, %ebx
        andl    x, %ebx
        movl    %edx, %ecx
        andl    x+4, %ecx
        xorl    %ebx, %eax
        xorl    %ecx, %edx
        orl     %edx, %eax
        sete    %al
        popl    %ebx
        .cfi_restore 3
        .cfi_def_cfa_offset 4
        ret

so I wonder if you should investigate why the xor variant doesn't trigger
for you?  On i?86 if-conversion probably solves your specific issue,
but I guess the initial expansion is where you could improve placement
of the 1 (after all, the 0 is after the jumps).
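
For reference, the word-wise xor expansion mentioned above can be written
out in C roughly as follows. This is only a sketch to show that the two
forms are equivalent, not what the expander literally emits: the names
mask_direct, mask_split and check_agree are made up for illustration, and
unsigned 64-bit types are assumed instead of the long long in the testcase
so that splitting into halves is well defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the global `x` of the testcase (assumed unsigned here). */
uint64_t x;

/* Direct form: true when every bit set in a is also set in x. */
bool mask_direct(uint64_t a)
{
    return (x & a) == a;
}

/* Word-wise form: (xl & al) ^ al is zero exactly when al's bits are all
   present in xl; OR-ing both halves gives a single compare against zero,
   so no jumps are needed. */
bool mask_split(uint64_t a)
{
    uint32_t xl = (uint32_t)x, xh = (uint32_t)(x >> 32);
    uint32_t al = (uint32_t)a, ah = (uint32_t)(a >> 32);

    return (((xl & al) ^ al) | ((xh & ah) ^ ah)) == 0;
}

/* Hypothetical helper: assert that both forms agree for one (x, a) pair. */
void check_agree(uint64_t xv, uint64_t a)
{
    x = xv;
    assert(mask_direct(a) == mask_split(a));
}
```

Because the split form ends in one comparison against zero, the result can
be produced with a single set-on-condition insn, which is why the question
of where to place the 1 does not arise in the i?86 output above.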

Richard.

> Cheers,
>
> --
> PMatos
>
>
