On Sep 27, 2006, at 7:04 PM, Sandra Loosemore wrote:
I've been having a heck of a time figuring out how to translate the
offsets for struct fields from the DWARF encoding back to GCC's
internal encoding for the LTO project. I've got a handle on the
DWARF encoding and how to do the necessary big/little endian
conversions, but for the GCC side, there doesn't seem to be any
documentation about the relevant macros in the manual, and the
comments in tree.h don't seem to reflect what is actually going on
in the representation.
For example, DECL_FIELD_OFFSET is supposed to be "the field
position, counting in bytes, of the byte containing the bit closest
to the beginning of the structure", while DECL_FIELD_BIT_OFFSET is
supposed to be "the offset, in bits, of the first bit of the field
from DECL_FIELD_OFFSET". So I'm quite puzzled why, for fields that
are not bit fields and that are aligned on byte boundaries, the C
front end is generating a DECL_FIELD_OFFSET that points to some
byte that doesn't contain any part of the field, and a non-zero
DECL_FIELD_BIT_OFFSET instead. If I make the LTO front end do what
the comments in tree.h describe, then dwarf2out.c produces
incorrect offsets that don't match those from the original C file.
I see in stor-layout.c that there are routines to "perform
computations that convert between the offset/bitpos forms and byte
and bit offsets", but what exactly are these forms and which values
are the ones that I should actually be storing inside the
FIELD_DECL object? Is it possible to compute the DECL_OFFSET_ALIGN
value somehow, given that it's not encoded in the DWARF
representation? Trying to reverse-engineer dwarf2out.c isn't
turning out to be very productive.... :-P
I had to look at this recently and I wound up looking at it this
way. The total bit offset is represented as
(8 * byte offset) + (bit offset);
there are multiple ways to split it that produce the same result, and
GCC's choice of which one it uses is, as you say, somewhat arbitrary.
However, all the different ways seem to work equivalently for codegen
purposes. I'm not familiar with the DWARF representation, but the
same information must be there, so perhaps the DWARF code could
impose a canonical form if necessary.
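
To make the equivalence concrete, here is a minimal sketch in plain C.
It is not GCC code: struct field_pos, total_bits and canonicalize are
hypothetical names standing in for the (DECL_FIELD_OFFSET,
DECL_FIELD_BIT_OFFSET) pair, and it assumes 8-bit bytes and an
alignment (playing the role of DECL_OFFSET_ALIGN) that is a whole
number of bytes.

```c
#include <assert.h>

#define BITS_PER_UNIT 8UL   /* assumption: 8-bit bytes */

/* Hypothetical stand-in for the (byte offset, bit offset) pair
   stored on a field; not a real GCC type.  */
struct field_pos {
    unsigned long byte_offset;
    unsigned long bit_offset;
};

/* Total position of the field in bits from the start of the struct.  */
static unsigned long total_bits(struct field_pos p)
{
    return p.byte_offset * BITS_PER_UNIT + p.bit_offset;
}

/* Re-split the same total position so that the byte part is a
   multiple of align_bits / BITS_PER_UNIT and the bit part is less
   than align_bits.  Different align_bits values give different
   splits of the same total.  */
static struct field_pos canonicalize(struct field_pos p,
                                     unsigned long align_bits)
{
    assert(align_bits != 0 && align_bits % BITS_PER_UNIT == 0);

    unsigned long total = total_bits(p);
    struct field_pos out;
    out.bit_offset = total % align_bits;
    out.byte_offset = (total - out.bit_offset) / BITS_PER_UNIT;
    return out;
}
```

For a byte-aligned field 4 bytes into the struct, both {4, 0} and
{0, 32} describe the same total position of 32 bits. Canonicalizing
with an alignment of 8 bits gives the first split, and with an
alignment of 64 bits gives the second, which would be one way to end
up with a non-zero bit offset for a field that is not a bit field.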