On Wed, Aug 10, 2011 at 12:29 PM, Paulo J. Matos <pa...@matos-sorge.com> wrote:
> Hi,
>
> I am having a size optimisation issue with GCC-4.6.1.
> The problem boils down to the fact that I have no idea on the best way to
> hint to GCC that a given insn would make more sense someplace else.
>
> The C code is simple:
> int16_t mask(uint32_t a)
> {
>   return (x & a) == a;
> }
>
> int16_t is QImode and uint32_t is HImode.
> After combine the insn chain (which is unmodified all the way to ira) is (in
> simplified form):
> regQI 27 <- regQI AH [a]
> regQI 28 <- regQI AL [a+1]
> regQI AL <- andQI(regQI 28, memQI(symbolrefQI(x) + 1))
> regQI AH <- andQI(regQI 27, memQI(symbolrefQI(x))
> regQI 30 <- regQI AL
> regQI 29 <- regQI AH
> regQI 24 <- 1
> if regQI 29 != regQI 27
>   goto labelref 20
> if regQI 30 != regQI 28
>   goto labelref 20
> goto labelref 22
> labelref 20
> regQI 24 <- 0
> labelref 22
> regQI AL <- regQI 24
>
> The problem resides in `regQI 24 <- 1' being before the jumps.
> Since regQI 24 is going to AL, IRA decides to allocate regQI 24 to AL, which
> creates loads of conflicts and reloads. If that same insn would be moved to
> after the jumps and before the `goto labelref 22' then all would be fine
> cause by then regs 27, 28, 29, 30 are dead.
>
> It's obviously hard to point to a solution but I was wondering if there's a
> way to hint to GCC that moving an insn might help the code issue. Or if I
> should look into a why an existing pass is not already doing that.
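For orientation, the quoted chain can be pictured at C level roughly as below. This is only a sketch, not compiler output: the names mask_early and mask_late are made up for illustration, the 16-bit halves and the choice of which half lives in AH are assumptions about the target, and a C compiler is free to reorder these statements, so it shows the two insn orderings rather than a way to force either one.

```c
#include <assert.h>
#include <stdint.h>

/* Global operand, standing in for `x` in the quoted testcase. */
uint32_t x;

/* Ordering from the quoted chain: the constant 1 is loaded into the
   result register before the two compares, so it stays live across them. */
int16_t mask_early(uint32_t a)
{
    uint16_t ah = (uint16_t)(a >> 16);      /* reg 27 (assumed: high half) */
    uint16_t al = (uint16_t)a;              /* reg 28 */
    uint16_t rl = al & (uint16_t)x;         /* reg 30 */
    uint16_t rh = ah & (uint16_t)(x >> 16); /* reg 29 */
    int16_t r = 1;                          /* reg 24, set before the jumps */

    if (rh != ah) goto zero;
    if (rl != al) goto zero;
    goto done;
zero:
    r = 0;
done:
    return r;
}

/* Ordering the poster would prefer: the 1 is loaded only after both
   compares, when the four temporaries are no longer needed. */
int16_t mask_late(uint32_t a)
{
    uint16_t ah = (uint16_t)(a >> 16);
    uint16_t al = (uint16_t)a;
    uint16_t rl = al & (uint16_t)x;
    uint16_t rh = ah & (uint16_t)(x >> 16);
    int16_t r;

    if (rh != ah) goto zero;
    if (rl != al) goto zero;
    r = 1;                                  /* set after the jumps */
    goto done;
zero:
    r = 0;
done:
    return r;
}
```

Both functions compute the same value; the only difference is where the constant is materialised relative to the compares, which is the placement question raised above.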
On x86 we expand the code to (((xl & al) ^ al) | ((xh & ah) ^ ah)) == 0
which is then if-converted.

Modified testcase:

long long x;
_Bool __attribute__((regparm(2))) mask (long long a)
{
  return (x & a) == a;
}

on i?86 gets you

mask:
.LFB0:
        .cfi_startproc
        pushl   %ebx
        .cfi_def_cfa_offset 8
        .cfi_offset 3, -8
        movl    %eax, %ebx
        andl    x, %ebx
        movl    %edx, %ecx
        andl    x+4, %ecx
        xorl    %ebx, %eax
        xorl    %ecx, %edx
        orl     %edx, %eax
        sete    %al
        popl    %ebx
        .cfi_restore 3
        .cfi_def_cfa_offset 4
        ret

so I wonder if you should investigate why the xor variant doesn't
trigger for you?  On i?86 if-conversion probably solves your specific
issue, but I guess the initial expansion is where you could improve
placement of the 1 (after all, the 0 is after the jumps).

Richard.

> Cheers,
>
> --
> PMatos
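To make the xor variant concrete, here is a small C sketch that writes out the word-wise form and spot-checks it against the plain comparison. It is an illustration under assumptions, not what GCC emits internally: the function names are hypothetical, and uint64_t split into two 32-bit halves stands in for the long long operands of the i?86 testcase.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Global operand, standing in for `x` in the testcase. */
uint64_t x;

/* Reference: the comparison as written in the source. */
bool mask_ref(uint64_t a)
{
    return (x & a) == a;
}

/* Word-wise xor/or form: (xw & aw) ^ aw is zero exactly when
   (xw & aw) == aw, so OR-ing both halves and testing for zero gives
   the same answer without any branch. */
bool mask_xor(uint64_t a)
{
    uint32_t xl = (uint32_t)x, xh = (uint32_t)(x >> 32);
    uint32_t al = (uint32_t)a, ah = (uint32_t)(a >> 32);

    return (((xl & al) ^ al) | ((xh & ah) ^ ah)) == 0;
}

/* Spot-check both forms on a handful of sample values (not exhaustive). */
void check_xor_equivalence(void)
{
    static const uint64_t samples[] = {
        0x0ULL, 0x1ULL, 0xFFFFFFFFULL, 0x100000000ULL,
        0xF0F0F0F0F0F0F0F0ULL, 0x00F0000000F00000ULL,
        0x8000000000000001ULL, 0xFFFFFFFFFFFFFFFFULL,
    };
    const size_t n = sizeof samples / sizeof samples[0];

    for (size_t i = 0; i < n; i++) {
        x = samples[i];
        for (size_t j = 0; j < n; j++)
            assert(mask_xor(samples[j]) == mask_ref(samples[j]));
    }
}
```

Calling check_xor_equivalence() from a test program exercises every pairing of the sample values. The branch-free form maps directly onto the and/xor/or/sete sequence in the listing above.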