On Thu, 13 Mar 2008, chromatic via RT wrote:

> On Thursday 13 March 2008 09:14:07 Andy Dougherty wrote:
>
> > On Thu, 13 Mar 2008, Nicholas Clark via RT wrote:
>
> > > Specifically, I am suspecting that if
> > >
> > > offsetof(struct parrot_string_t, bufused) == sizeof(Buffer)
> > >
> > > matters, then something is either looking at or copying (sub)structures
> > > that happen to have padding, and in turn that padding happens to end up
> > > with bit patterns that have meaning in some other, larger (containing?)
> > > structure.
>
> > Yes. That's exactly my suspicion. Strings are stored in "bufferlike"
> > pools, and many of the manipulations in src/headers.c involve
> > sizeof(Buffer), even though there is no actual "Buffer" inside a string
> > anymore. To be fair, though, there's a *lot* more going on in parrot's
> > memory management that I just don't understand, and I have been unable
> > to pinpoint a specific assignment that is in error.
>
> Originally I was going to ask "Why would there be padding at the end of a
> Buffer?" but now I realize that the real question is "Is there padding in
> parrot_string_t between flags and strstart?"

Actually, you should ask both questions.

There really can be padding at the end of Buffer. The start of every
Buffer needs to be suitably aligned. If you have an array of Buffers,
each one will occupy sizeof(Buffer), so sizeof(Buffer) has to be a
multiple of the alignment of its strictest member. The only way to
accomplish that is to have padding on the end of Buffer.

Also, no, there is no padding in parrot_string_t between flags and
strstart. (At least on SPARC.)

The net result of these two answers is that

    offsetof(struct parrot_string_t, strstart) != sizeof(Buffer)

(assuming the current order of elements inside include/parrot/pobj.h, in
which strstart comes right after flags.)

Consider the following program (I've changed the parrot-specific types to
their generic equivalents just so it's easier to compile). The "Nested"
structure is what parrot used to have. It now has the "Flat" structure.

    #include <stdio.h>
    #include <stddef.h>

    typedef union UnionVal {
        struct _b {            /* One Buffer structure */
            void * _bufstart;
            size_t _buflen;
        } _b;
        struct _ptrs {         /* or two pointers, both are defines */
            void * _struct_val;
            double * _pmc_val;
        } _ptrs;
        struct _i {
            int _int_val;      /* or 2 intvals */
            int _int_val2;
        } _i;
        double _num_val;       /* or one float */
        void * _string_val;    /* or a pointer to a string */
    } UnionVal;

    /* Parrot Object - base class for all others */
    typedef struct Buffer {
        UnionVal u;
        unsigned int flags;
    } Buffer;

    typedef struct Nested {
        Buffer o;
        int foo;
    } Nested_t;

    typedef struct Flat {
        UnionVal cache;
        unsigned int flags;
        int foo;
    } Flat_t;

    int main(void)
    {
        /* sizeof and offsetof yield size_t, so print with %zu (C99) */
        printf("sizeof UnionVal = %zu\n", sizeof(UnionVal));
        printf("sizeof flags = %zu\n", sizeof(unsigned int));
        printf("sizeof Buffer = %zu\n", sizeof(Buffer));
        printf("offsetof(Nested_t, foo) = %zu\n", offsetof(Nested_t, foo));
        printf("offsetof(Flat_t, foo) = %zu\n", offsetof(Flat_t, foo));
        return 0;
    }

On Linux/x86, the output of this is

    sizeof UnionVal = 8
    sizeof flags = 4
    sizeof Buffer = 12
    offsetof(Nested_t, foo) = 12
    offsetof(Flat_t, foo) = 12

On SPARC, the output of this is

    sizeof UnionVal = 8
    sizeof flags = 4
    sizeof Buffer = 16
    offsetof(Nested_t, foo) = 16
    offsetof(Flat_t, foo) = 12

> It looks like the UnionVal is two pointers long, so if we rearranged things
> such that flags comes first, would the Buffer structure get padded so that
> anything after that in memory starts at the appropriate alignment for a
> pointer?

There is no "Buffer" structure anymore inside a string. However, if you
switched the "Flat" structure so that the flags came first and UnionVal
came second, then the compiler might stick padding inside the string
structure so that UnionVal is aligned. This extra padding would indeed
change the SPARC output to

    sizeof flags = 4
    sizeof UnionVal = 8
    sizeof Buffer = 16
    offsetof(Nested_t, foo) = 16
    offsetof(Flat_t, foo) = 16

so that once again 'foo' would be at the same position in either the
Nested or the Flat versions. (Of course all code that assumes that
PObj_bufstart() points to the beginning of any "bufferlike" object would
have to be changed, but that's a separate issue.)

The catch is that this throws away the space savings obtained by getting
rid of the nested structure without regaining any of the benefits of the
Nested structure.

More generally, you can rely on the compiler to ensure that elements
within a structure are suitably aligned for their declared uses. Inside
parrot_string_t, strstart is already suitably aligned for a pointer. Any
padding required is automatically supplied by the compiler.

Where it matters is when you try to use the memory allocated for one
structure in the place of another structure. Then you have to be sure
the structures agree on things. (And in a virtual machine, you naturally
end up doing stuff like that.)

So, to return to my original point:

> > Strings are stored in "bufferlike"
> > pools, and many of the manipulations in src/headers.c involve
> > sizeof(Buffer), even though there is no actual "Buffer" inside a string

I don't know if those calculations are still correct, now that strings
are not "bufferlike".

-- Andy Dougherty [EMAIL PROTECTED]