================
@@ -439,82 +444,194 @@ CGRecordLowering::accumulateBitFields(RecordDecl::field_iterator Field,
       Members.push_back(MemberInfo(bitsToCharUnits(StartBitOffset),
                                    MemberInfo::Field, nullptr, *Field));
     }
-    return;
+    return Field;
   }
 
-  // Check if OffsetInRecord (the size in bits of the current run) is better
-  // as a single field run. When OffsetInRecord has legal integer width, and
-  // its bitfield offset is naturally aligned, it is better to make the
-  // bitfield a separate storage component so as it can be accessed directly
-  // with lower cost.
-  auto IsBetterAsSingleFieldRun = [&](uint64_t OffsetInRecord,
-                                      uint64_t StartBitOffset) {
-    if (!Types.getCodeGenOpts().FineGrainedBitfieldAccesses)
-      return false;
-    if (OffsetInRecord < 8 || !llvm::isPowerOf2_64(OffsetInRecord) ||
-        !DataLayout.fitsInLegalInteger(OffsetInRecord))
-      return false;
-    // Make sure StartBitOffset is naturally aligned if it is treated as an
-    // IType integer.
-    if (StartBitOffset %
-            Context.toBits(getAlignment(getIntNType(OffsetInRecord))) !=
-        0)
-      return false;
-    return true;
-  };
+  // The SysV ABI can overlap bitfield storage units with both other bitfield
+  // storage units /and/ other non-bitfield data members. Accessing a sequence
+  // of bitfields mustn't interfere with adjacent non-bitfields -- they're
+  // permitted to be accessed in separate threads for instance.
+
+  // We split runs of bit-fields into a sequence of "access units". When we emit
+  // a load or store of a bit-field, we'll load/store the entire containing
+  // access unit. As mentioned, the standard requires that these loads and
+  // stores must not interfere with accesses to other memory locations, and it
+  // defines the bit-field's memory location as the current run of
+  // non-zero-width bit-fields. So an access unit must never overlap with
+  // non-bit-field storage or cross a zero-width bit-field. Otherwise, we're
+  // free to draw the lines as we see fit.
+
+  // Drawing these lines well can be complicated. LLVM generally can't modify a
+  // program to access memory that it didn't before, so using very narrow access
+  // units can prevent the compiler from using optimal access patterns. For
+  // example, suppose a run of bit-fields occupies four bytes in a struct. If we
+  // split that into four 1-byte access units, then a sequence of assignments
+  // that doesn't touch all four bytes may have to be emitted with multiple
+  // 8-bit stores instead of a single 32-bit store. On the other hand, if we use
+  // very wide access units, we may find ourselves emitting accesses to
+  // bit-fields we didn't really need to touch, just because LLVM was unable to
+  // clean up after us.
+
+  // It is desirable to have access units be aligned powers of 2 no larger than
+  // a register. (On non-strict alignment ISAs, the alignment requirement can be
+  // dropped.) A three byte access unit will be accessed using 2-byte and 1-byte
+  // accesses and bit manipulation. If no bitfield straddles across the two
+  // separate accesses, it is better to have separate 2-byte and 1-byte access
+  // units, as then LLVM will not generate unnecessary memory accesses, or bit
+  // manipulation. Similarly, on a strict-alignment architecture, it is better
+  // to keep access-units naturally aligned, to avoid similar bit
+  // manipulation synthesizing larger unaligned accesses.
+
+  // We do this in two phases, processing a sequential run of bitfield
+  // declarations.
+
+  // a) Bitfields that share parts of a single byte are, of necessity, placed in
+  // the same access unit. That unit will encompass a consecutive
+  // run where adjacent bitfields share parts of a byte. (The first bitfield of
+  // such an access unit will start at the beginning of a byte.)
+
+  // b) Accumulate adjacent access units when the combined unit is naturally
+  // sized, no larger than a register, and on a strict alignment ISA,
+  // aligned. Note that this requires lookahead to one or more subsequent access
+  // units. For instance, consider a 2-byte access-unit followed by 2 1-byte
+  // units. We can merge that into a 4-byte access-unit, but we would not want
+  // to merge a 2-byte followed by a single 1-byte (and no available tail
+  // padding).
+
+  // This accumulation is prevented when:
+  // *) it would cross a zero-width bitfield (ABI-dependent), or
+  // *) one of the candidate access units contains a volatile bitfield, or
+  // *) fine-grained bitfield access option is in effect.
+
+  CharUnits RegSize =
+      bitsToCharUnits(Context.getTargetInfo().getRegisterWidth());
+  unsigned CharBits = Context.getCharWidth();
+
+  RecordDecl::field_iterator Begin = FieldEnd;
+  CharUnits StartOffset;
+  uint64_t BitSize;
+  CharUnits BestEndOffset;
+  RecordDecl::field_iterator BestEnd = Begin;
+  bool Volatile;
----------------
rjmccall wrote:

Could you add comments to these explaining the invariants / significance?

- `Begin` is the start of the span we're currently building an access unit from.
- `StartOffset` is its offset in chars (probably ought to be called `BeginOffset` just to reinforce that we're talking about the same thing).
- `BestEnd` is the end of the largest acceptable access unit we've found since `Begin`, and it might still be `Begin`.
- `BestEndOffset` is the offset of the end of that access unit in chars.
- `Volatile` is whether the current extension (the fields we've added *since* `BestEnd`) contains a volatile bit-field.
- `BitSize` is the width of the span so far, starting from `Begin`. I'd probably call this `BitSizeSinceBegin`; it's unambiguous out here, but when you get into the merge logic below, you start computing other sizes, and it gets messy. Having a clearer name will make the reasoning about what needs to get updated much clearer.

https://github.com/llvm/llvm-project/pull/65742

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
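
For illustration, the invariants listed in the review comment could be recorded as comments on the state itself. This is only a sketch under assumptions, not the PR's code: `AccessUnitState` and `checkInvariants` are hypothetical names, plain indices stand in for `RecordDecl::field_iterator`, plain integers stand in for `CharUnits`, and the offset ordering checked at the end is an assumption rather than something the review states.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the locals in accumulateBitFields. Field
// positions are plain indices and offsets are plain char counts.
struct AccessUnitState {
  // Start of the span we are currently building an access unit from.
  std::size_t Begin = 0;
  // Offset of Begin, in chars (the suggested rename of StartOffset).
  std::uint64_t BeginOffset = 0;
  // Width in bits of the span so far, measured from Begin
  // (the suggested rename of BitSize).
  std::uint64_t BitSizeSinceBegin = 0;
  // End of the largest acceptable access unit found since Begin.
  // May still be equal to Begin.
  std::size_t BestEnd = 0;
  // Offset of the end of that access unit, in chars.
  std::uint64_t BestEndOffset = 0;
  // Whether the current extension (the fields added since BestEnd)
  // contains a volatile bit-field.
  bool Volatile = false;
};

// Checks the ordering relationships implied by the descriptions above.
// The offset comparison is an assumption made for this sketch.
inline bool checkInvariants(const AccessUnitState &S) {
  return S.BestEnd >= S.Begin && S.BestEndOffset >= S.BeginOffset;
}
```

Naming the fields relative to `Begin` and `BestEnd` makes it easier to see which of them must be reset when a new span starts and which only change when a better end is found.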