On Tue, 4 Mar 2008, Jakub Jelinek wrote:

> On Tue, Mar 04, 2008 at 11:15:00AM -0500, Diego Novillo wrote:
> > > fold currently optimizes a.b.c == 0 to BIT_FIELD_REF <a, 8, big-num> & 1
> > > for bit-field FIELD_DECLs c.  IMHO this is bad because it pessimizes
> > > TBAA (it needs to use a's alias set, not the underlying integral
> > > type's alias set) and it "breaks" type correctness as arbitrary
> > > structure types appear as operand zero.
> >
> > Agreed.  Unless this was done to fix some target-specific problem, I
> > think it should disappear.
>
> Perhaps not in early GIMPLE passes, but we certainly want to lower
> bit-field accesses to BIT_FIELD_REFs or something similar before
> expansion; otherwise the expander and the RTL optimization passes
> aren't able to optimize anything but the most trivial cases.  GCC
> generates terrible code for bitfields ATM, try say:
>
> struct S
> {
>   unsigned int a : 3;
>   unsigned int b : 3;
>   unsigned int c : 3;
>   unsigned int d : 3;
>   unsigned int e : 3;
>   unsigned int f : 3;
>   unsigned int g : 3;
>   unsigned int h : 11;
> } a, b, c;
>
> void foo (void)
> {
>   a.a = b.a | c.a;
>   a.b = b.b | c.b;
>   a.c = b.c | c.c;
>   a.d = b.d | c.d;
>   a.e = b.e | c.e;
>   a.f = b.f | c.f;
>   a.g = b.g | c.g;
>   a.h = b.h | c.h;
> }
>
> which could be optimized into
>
>   BIT_FIELD_REF <a, 32, 0> = BIT_FIELD_REF <b, 32, 0> | BIT_FIELD_REF <c, 32, 0>;
>
> i.e. something like 3 or 4 instructions, yet we generate 51.
> Operating on adjacent bit fields is fairly common.  A similar (and
> perhaps far more common in the wild) case is e.g.
>
> void bar (void)
> {
>   a.a = 1;
>   a.b = 2;
>   a.c = 3;
>   a.d = 4;
>   a.e = 5;
>   a.f = 6;
>   a.g = 7;
>   a.h = 8;
> }
>
> - on x86_64 that is 24 instructions on the trunk, while 1 is enough.
> RTL is too late to try to optimize this; I've tried that once.
> Given the combiner's limitation of only trying to combine 3 instructions
> at once, we'd need to combine far more than that here.  So this is
> something that needs to be optimized at the tree level, either by
> having a separate pass that takes care of it, or by lowering it early
> enough into something that the optimizers will handle.
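To make the point concrete, the hand-optimized equivalents could look
roughly like the following C, assuming struct S above packs into a
single 32-bit word and that fields are allocated from bit 0 upward (as
GCC does on little-endian targets); foo_combined and bar_combined are
illustrative names, not from the thread:

  #include <string.h>

  /* One 32-bit load of each of b and c, one OR, one store: bitwise OR
     never carries across field boundaries, so combining the whole words
     is equivalent to combining each field separately.  */
  void foo_combined (void)
  {
    unsigned int wb, wc, wa;
    memcpy (&wb, &b, sizeof wb);
    memcpy (&wc, &c, sizeof wc);
    wa = wb | wc;
    memcpy (&a, &wa, sizeof wa);
  }

  /* bar reduces to a single constant store; the shift counts assume the
     bit-0-upward field layout mentioned above.  */
  void bar_combined (void)
  {
    unsigned int w = 1u << 0 | 2u << 3 | 3u << 6 | 4u << 9
                     | 5u << 12 | 6u << 15 | 7u << 18 | 8u << 21;
    memcpy (&a, &w, sizeof w);
  }

memcpy is used for the whole-word accesses only to sidestep
strict-aliasing concerns in source-level C; a compiler performing this
transformation internally has no such constraint.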
Sure.  With 4.3, SRA tries to do this.  With the MEM_REF lowering I
have, we optimize the above to

foo ()
{
  unsigned int MEML.2;
  unsigned int MEML.1;
  unsigned int MEML.0;

<bb 2>:
  MEML.0 = MEM <unsigned int {0}, &b>;
  MEML.1 = MEM <unsigned int {0}, &c>;
  MEML.2 = MEM <unsigned int {0}, &a>;
  (load all three words once)
  MEM <unsigned int {0}, &a> =
    BIT_FIELD_EXPR <BIT_FIELD_EXPR <BIT_FIELD_EXPR <BIT_FIELD_EXPR <
    BIT_FIELD_EXPR <BIT_FIELD_EXPR <BIT_FIELD_EXPR <BIT_FIELD_EXPR <MEML.2,
      (<unnamed-unsigned:3>) ((unsigned char) BIT_FIELD_REF <MEML.1, 3, 0>
        | (unsigned char) BIT_FIELD_REF <MEML.0, 3, 0>), 3, 0>,
      (<unnamed-unsigned:3>) ((unsigned char) BIT_FIELD_REF <MEML.1, 3, 3>
        | (unsigned char) BIT_FIELD_REF <MEML.0, 3, 3>), 3, 3>,
      (<unnamed-unsigned:3>) ((unsigned char) BIT_FIELD_REF <MEML.1, 3, 6>
        | (unsigned char) BIT_FIELD_REF <MEML.0, 3, 6>), 3, 6>,
      (<unnamed-unsigned:3>) ((unsigned char) BIT_FIELD_REF <MEML.1, 3, 9>
        | (unsigned char) BIT_FIELD_REF <MEML.0, 3, 9>), 3, 9>,
      (<unnamed-unsigned:3>) ((unsigned char) BIT_FIELD_REF <MEML.1, 3, 12>
        | (unsigned char) BIT_FIELD_REF <MEML.0, 3, 12>), 3, 12>,
      (<unnamed-unsigned:3>) ((unsigned char) BIT_FIELD_REF <MEML.1, 3, 15>
        | (unsigned char) BIT_FIELD_REF <MEML.0, 3, 15>), 3, 15>,
      (<unnamed-unsigned:3>) ((unsigned char) BIT_FIELD_REF <MEML.1, 3, 18>
        | (unsigned char) BIT_FIELD_REF <MEML.0, 3, 18>), 3, 18>,
      (<unnamed-unsigned:11>) ((short unsigned int) BIT_FIELD_REF <MEML.1, 11, 21>
        | (short unsigned int) BIT_FIELD_REF <MEML.0, 11, 21>), 11, 21>;
  return;
}

TER makes a mess out of the expression, and obviously we miss some
expression combining here (I only have trivial constant folding
implemented for BIT_FIELD_EXPR right now).

Richard.
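For reference, the two tree codes in the dump can be modeled in C
roughly as follows, assuming the little-endian bit numbering used above
and field widths below 32; these helpers are illustrative stand-ins for
the semantics, not GCC internals:

  #include <stdint.h>

  /* BIT_FIELD_REF <w, size, pos>: extract SIZE bits of W starting at
     bit POS.  */
  static inline uint32_t
  bit_field_ref (uint32_t w, unsigned size, unsigned pos)
  {
    return (w >> pos) & ((1u << size) - 1);
  }

  /* BIT_FIELD_EXPR <w, v, size, pos>: return W with the SIZE bits at
     POS replaced by the low SIZE bits of V.  */
  static inline uint32_t
  bit_field_expr (uint32_t w, uint32_t v, unsigned size, unsigned pos)
  {
    uint32_t mask = ((1u << size) - 1) << pos;
    return (w & ~mask) | ((v << pos) & mask);
  }

With these, the innermost step of the store above corresponds to
bit_field_expr (MEML.2, bit_field_ref (MEML.1, 3, 0)
| bit_field_ref (MEML.0, 3, 0), 3, 0), and the remaining seven fields
chain outward in the same way.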