On Tue, Jan 14, 2014 at 3:21 PM, Kyrill Tkachov <kyrylo.tkac...@arm.com> wrote: > Moving to gcc, I accidentally sent it to gcc-patches previously... > > > On 14/01/14 14:09, Richard Biener wrote: >> >> On Tue, Jan 14, 2014 at 3:03 PM, Kyrill Tkachov <kyrylo.tkac...@arm.com> >> wrote: >>> >>> Hi all, >>> >>> I'm looking into PR 54168 where we end up generating an unnecessary >>> extend >>> operation on arm. >>> >>> Given code: >>> >>> struct b2Body { >>> unsigned short flags; >>> int type; >>> }; >>> >>> static _Bool IsAwake(struct b2Body *b) >>> { >>> return (b->flags & 2) == 2; >>> } >>> >>> >>> int foo(struct b2Body *bA, struct b2Body *bB) >>> { >>> int typeA = bA->type; >>> int typeB = bB->type; >>> _Bool activeA = IsAwake(bA) && typeA != 0; >>> _Bool activeB = IsAwake(bB) && typeB != 0; >>> >>> if (!activeA && !activeB) >>> { >>> return 1; >>> } >>> >>> return 0; >>> } >>> >>> Compiled for arm-none-eabi with -O3 -march=armv7-a >>> >>> The inlined generated code for IsAwake contains the fragment: >>> >>> ldrh r0, [r1] >>> and r0, r0, #2 >>> uxth r0, r0 >>> cmp r0, #0 >>> >>> which contains a redundant extend, which also confuses combine and >>> prevents >>> the whole thing from being optimised into an ldrh ; ands sequence. >>> >>> Looking at the tree dumps I notice that after the forwprop pass the types >>> of >>> the operands in the _7 = _3 & 2; statement turn into short unsigned where >>> before that pass they were just ints: >>> >>> IsAwake (struct b2Body * b) >>> { >>> short unsigned int _3; >>> int _4; >>> _Bool _6; >>> short unsigned int _7; >>> >>> <bb 2>: >>> _3 = b_2(D)->flags; >>> _4 = (int) _3; >>> _7 = _3 & 2; >>> _6 = _7 != 0; >>> return _6; >>> >>> } >>> >>> >>> I believe the C standard expects the operation to be performed in int >>> mode. >>> Now, since this is a bitwise and operation with a known constant 2, the >>> operation can be safely performed in unsigned short mode. However on >>> word-based machines like arm this would introduce unnecessary extend >>> operations down the line, as I believe is the case here. >> >> Though the variant using shorts is clearly shorter (in number of stmts) >> and thus easier to optimize. Am I correct that the issue in the end >> is that _7 != 0 requires to extend _7? & 2 is trivially performed without >> any extension, no? > > > If I look at the dump before forwprop, the number of statements is exactly > the same, so it's not any shorter in that sense.
Well, it is - _4 = (int) _3; is dead, thus a zero-extension instruction is removed. > <Dump from before forwprop> > > IsAwake (struct b2Body * b) > { > short unsigned int _3; > int _4; > int _5; > _Bool _6; > > > <bb 2>: > _3 = b_2(D)->flags; > _4 = (int) _3; > _5 = _4 & 2; > _6 = _5 != 0; > return _6; > > } > > Using shorts is not cheaper on an architecture like arm which is word-based. > Just the fact that we're introducing shorts already implies we're going to > have to extend somewhere. But given the bit-and with '2' the extension can be removed, no? Richard. > Kyrill > > >> >> Richard. >> >>> Anyone have any insight on how to resolve this one? >>> >>> Thanks, >>> Kyrill >>> > >