On Mon, Jan 27, 2014 at 09:43:49PM +0000, Nicholas Clark wrote:
> I don't think that there are that many places where this happens (but I
> haven't rigged the build to count them yet), so I don't think that it's
> worth adding to the C structures to remember how many slots are used, and
> updating the serialisation to serialise this. Particularly as the new
> variable length integers store 0 as 1 byte, it's likely that the overhead of
> serialising the count will be more than the saving from not serialising a
> few zeros.

Exactly 2 places. Just in compiling the setting, and just leaving 2 slots
free. The revised "count" code is this:

diff --git a/src/6model/reprs/P6opaque.c b/src/6model/reprs/P6opaque.c
index 4edbcf1..24d4392 100644
--- a/src/6model/reprs/P6opaque.c
+++ b/src/6model/reprs/P6opaque.c
@@ -893,11 +893,19 @@ static void serialize_repr_data(MVMThreadContext *tc, MVMSTable *st, MVMSerializ
     writer->write_varint(tc, writer, repr_data->unbox_str_slot);
     if (repr_data->unbox_slots) {
+        unsigned int count = 0;
         writer->write_varint(tc, writer, 1);
         for (i = 0; i < repr_data->num_attributes; i++) {
+            if (repr_data->unbox_slots[i].repr_id == 0)
+                ++count;
             writer->write_varint(tc, writer, repr_data->unbox_slots[i].repr_id);
             writer->write_varint(tc, writer, repr_data->unbox_slots[i].slot);
         }
+        if (1) {
+            int fd = open("/tmp/p6o", O_WRONLY|O_APPEND|O_CREAT, 0600);
+            write(fd, &count, sizeof(count));
+            close(fd);
+        }
     }
     else {
         writer->write_varint(tc, writer, 0);

and counting all the places where that code is reached:

$ od -i /tmp/p6o
0000000           0           0           0           0
0000020           0           0           0           2
0000040           0           0           2           0
0000060

Exactly two aren't fully used.

And I'm clearly not reading the code carefully enough, as the length is
already (effectively) serialised [that writer->write_varint(tc, writer, 1);]
so it would be possible to store the used length instead.

But for saving 4 bytes on disk, and (I think) 16 in memory, is it worth it?

Nicholas Clark
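
For what it's worth, a rough sketch of what "store the used length instead" might look like, and where the 4 bytes come from. This is not the MoarVM code: UnboxSlot, used_length and bytes_saved are made-up names for illustration, and it assumes that the free slots are trailing entries with repr_id 0, and that each one is currently written as two 1-byte varints (a repr_id of 0 and a slot of 0).

```c
#include <stddef.h>

/* Hypothetical stand-in for the entries of repr_data->unbox_slots. */
typedef struct {
    unsigned int repr_id;  /* 0 is assumed to mean "slot unused" */
    unsigned int slot;
} UnboxSlot;

/* Number of leading entries worth serialising: one past the last entry
 * whose repr_id is non-zero. Returns 0 if every entry is unused. */
static size_t used_length(const UnboxSlot *slots, size_t num) {
    size_t used = num;
    while (used > 0 && slots[used - 1].repr_id == 0)
        used--;
    return used;
}

/* Bytes saved by not writing the trailing unused entries, assuming each
 * skipped entry would have cost two varints of 1 byte each. */
static size_t bytes_saved(const UnboxSlot *slots, size_t num) {
    return (num - used_length(slots, num)) * 2;
}
```

With 4 attributes of which the last 2 are free, used_length gives 2 and bytes_saved gives 4. One wrinkle with this sketch: a used length of 0 would be indistinguishable from the existing "no unbox_slots" marker of 0, so a real change would presumably need to write the length plus one, or something similar.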