Hi, I have this problem in a private backend, but it is reproducible with x86 GCC as well, so I suspect a core GCC problem. Let's consider a simple example:
unsigned int buffer[10];

__attribute__((noinline))
void myFunc(unsigned int a, unsigned int b, unsigned int c)
{
    unsigned int tmp;
    if (a & 0x2) {
        tmp = 0x3221;
        tmp |= (a & 0xF) << 24;
        tmp |= (a & 0x3) << 2;
    } else {
        tmp = 0x83621;
        tmp |= (a & 0xF) << 24;
        tmp |= (a & 0x3) << 2;
    }
    buffer[0] = tmp;
}

Compiling it to assembly with -Os yields:

        movl    %edi, %eax
        andl    $15, %eax
        sall    $24, %eax
        testb   $2, %dil
        je      .L2
        andl    $3, %edi
        sall    $2, %edi
        orl     %edi, %eax
        orl     $12833, %eax
        jmp     .L3
.L2:
        andl    $3, %edi
        sall    $2, %edi
        orl     %edi, %eax
        orl     $538145, %eax
.L3:
        movl    %eax, buffer(%rip)
        ret

Both branches share a large common code fragment:

        andl    $3, %edi
        sall    $2, %edi
        orl     %edi, %eax

This could potentially be hoisted above the branch, reducing code size, but it isn't. The same thing happens at -O2.

Things are even worse in my backend: my target has a conditional OR instruction, so in this case the whole function could be linearized with a good performance impact. The only reason it isn't is that GCC fails to hoist the common part out of the if-else before it tries conditional execution.

Can someone advise me how to tune the target machine options to signal that hoisting common parts out of conditional branches is profitable? Or is the only way to write a custom pass?

---
With best regards,
Konstantin
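P.S. For clarity, the transformation I am hoping the compiler would perform is equivalent to rewriting the source by hand as below (a sketch; the name myFunc_hoisted is mine, not part of the original code):

```c
unsigned int buffer[10];

/* Hoisted version: the branch only selects the initial constant,
   and the common shift/OR sequence runs unconditionally afterwards.
   On a target with a conditional OR, the remaining select could be
   if-converted, leaving fully linear code. */
__attribute__((noinline))
void myFunc_hoisted(unsigned int a, unsigned int b, unsigned int c)
{
    unsigned int tmp = (a & 0x2) ? 0x3221 : 0x83621;
    tmp |= (a & 0xF) << 24;
    tmp |= (a & 0x3) << 2;
    buffer[0] = tmp;
}
```

This computes exactly the same value as the original myFunc for every input, but the duplicated tail is emitted only once.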