On Sun, 9 Mar 2008, Alexandre Oliva wrote: > On Mar 9, 2008, Richard Guenther <[EMAIL PROTECTED]> wrote: > > > On Sun, 9 Mar 2008, Alexandre Oliva wrote: > >> Do you have any plans to recover the performance loss for machines > >> that have bit-field instructions that operate directly in memory? > >> Especially for writes, I don't see how this is going to work. > > > Do you have an example target in mind? > > AM33/2.0 and H8SX come to mind, although it's been a while since I > dealt with the memory bit-field operations of these two ports to have > the details handy.
Ok, I would expect it a size benefit at most. > >> > Right. By means of fixing the BIT_FIELD_REF_UNSIGNED case it is now > >> > as specified above. > >> > >> This doesn't make the change that went in > > > ? > > ... desirable. To recap, what went in is a restriction to constant args 2 and 3 and a result TYPE_PRECISION (in case of intergal type) or MODE_PRECISION (in other cases) matching to the bit-field size in argument 2. The latter change made BIT_FIELD_REF_UNSIGNED redundant and (finally) allows for constant folding of BIT_FIELD_REF in fold_ternary. > >> > The least conversion is removed. You can look at the MEM_REF branch > >> > and see that the load from memory is done with a MEM_REF expression > >> > from the register result the bits are extracted with BIT_FIELD_REF. > >> > >> Extracting (or replacing) bits from registers of integral type with > >> BIT_FIELD_REF is not very useful. You're better off expanding the > >> BIT_FIELD_REF into mask and shift operations to get better > >> optimization. > > > You only generate lots of IL this way. > > You make for much better optimization, at least with current > handling of BIT_FIELD_REF. > > OTOH, the much constrained version of BIT_FIELD_REF you propose will > indeed only generate lots of IL for the most common case of > BIT_FIELD_REF. It generates much less IL than what the SRA lowering currently does. > >> It is for memory access that BIT_FIELD_REF makes the most sense, > >> because you can't create temporaries or duplicates for memory refs as > >> easily, especially when memory is volatile. > > > Volatile bit-field ref stores won't work on most targets anyway, > > so this is a non-issue. > > What do you mean, won't work? You get a read followed by a write, > even if for a wider unit of the object, which AFAIK is legitimate > implementation-defined behavior for volatile bit-fields. Oh, ok. But this is no different in lowered form, or maybe I don't understand your concern. (volatile is specified as doing as many accesses as written, which is where (IMHO) read + write for a bit-field store is a mismatch, thus the "doesn't work") > > Why do you think you cannot create temporaries for memory refs? You > > did the same with SRA. > > You can. It's just not as easy. You see the amount of IL for > replacing stuff. And then, SRA does not really create temporaries for > MEMs that remain as the primary copy of an object, which is the case > I'm really talking about. The temporaries created by SRA mostly > replace the corresponding MEMs. > > > Besides, in SRA, for example, we'll often create a temporary that > holds multiple bit-fields that didn't make sense to scalarize on their > own. This all works fine with the lowering MEM_REF does. With less IL than SRA creates (because of BIT_FIELD_REF). > If I'd worked harder on it, they wouldn't even have to be > contiguous, as long as they were in the same word. BIT_FIELD_REF > couldn't represent this, though. Represent yes, obviously you'd still need something like SRA to to the re-ordering / packing. > >> This new definition of BIT_FIELD_REF you came up with appears to be > >> redundant and useless to me. All the interesting uses are ruled out, > >> and only uninteresting (and harmful) uses remain. Might as well just > >> get rid of it, and later on realize it is needed for the interesting > >> cases and re-introduce it with the previous semantics. > > > See above, it is mainly to ease combining with BIT_FIELD_EXPR. > > BIT_FIELD_EXPR appears to be necessary only for SSA gimple registers. > For addressable objects, a BIT_FIELD_REF lhs can be much simpler, and > it may even generate better code without as much effort. BIT_FIELD_REF doesn't play well with our scalar optimizers. You can't work on registers, as partial writes break SSA form, so you need to keep a temporary in memory which is bad. BIT_FIELD_EXPR makes the new value created available, so you can keep SSA form and work on registers. So, BIT_FIELD_REF <reg_1, 3, 4> = foo_2; (bogus of course) would instead be written as reg_2 = BIT_FIELD_EXPR <reg_1, foo_2, 3, 4>; Thus, BIT_FIELD_EXPR is necessary to move bitfields from memory to registers (unless you want to expose all the shifting and masking directly). Richard.