Fun stuff. I, too, am not entirely happy with the Cap'n Proto wire format in retrospect. Here's what I'd do if compatibility were not an issue:
1) Eliminate the concept of segments from the encoding. Segments need only exist at write time, to allow for progressive allocation of additional message space. But, tracking this could be handled entirely in the MessageBuilder implementation. All the segments could be concatenated at write time, and the receiver could then treat the message as one large segment. Inter-segment pointers no longer need to be far pointers; as long as the distance is under 4GB, a regular relative pointer is fine. (Builder implementations would recognize cross-segment pointers by the fact that they fail the bounds check.)

2) All pointers -- including far pointers, if they exist -- should be relative to the pointer location, never absolute. (Currently, far pointers contain an absolute segment index.) Making all pointers relative means that it's always possible to embed an existing message into a larger message without doing a tree copy, which turns out to be a pretty nifty thing to be able to do.

3) Recognize that there's no fundamental need to distinguish on the wire whether a pointer points to a struct or a list. All we really need to know is the object location, size, and which bits are data vs. pointers. Once we recognize this, the pointer format can instead focus on optimizing the common case, with fallbacks for less-common cases.

The "common" pointer encoding could be:

1 bit: 1 to indicate this encoding.
31 bits: offset
16 bits: element count
8 bits: data words per element (with special reserved values for 0-bit, 1-bit, 8-bit, 16-bit, 32-bit)
8 bits: pointers per element

This encoding would cover the vast majority of use cases -- including struct lists without the need for a tag. Note that for a simple struct (not a list), the element count would be 1.

We then add a fallback encoding used when any of these fields is not large enough. When the first bit is 0, this indicates an "uncommon pointer", which could be any of:

- Null pointer (all-zero).
- Capability reference.
- Tag pointer: Encodes a 61-bit word offset pointing to a tagged object. A tagged object starts with a tag word that encodes a 32-bit element count, 16-bit data words per element, 16-bit pointers per element.
- Trampoline pointer: Like today's "double-far" pointer: points to a two-word object which contains tag information and a further pointer to the actual object content. Here we can have 2x16-bit segment sizes, 61-bit offset, and 35-bit element count, which would become the new upper limit on list sizes (compared to today's 29-bit limit).
- Other pointer types to be defined later.

-Kenton

On Mon, Feb 26, 2018 at 5:47 PM, <[email protected]> wrote:

> On Tue, 2018-02-27 at 02:35 +0100, [email protected] wrote:
> > Let me now describe the format that results from these observations,
> > without fixing the numerical constants. The pointer formats are:
>
> Forgot to include the far pointer format; it is, of course,
>
> +--+-+--------------------------+---------------+---------------+
> |Ty|M|        Pad offset        |  Pad segment  |  Obj segment  |
> +--+-+--------------------------+---------------+---------------+
>
> Ty          ( 2 bits) = "far pointer"
> M           ( 1 bit ) = more bit
> Pad offset  (29 bits) = offset of pad, in words
> Pad segment (16 bits) = segment of pad
> Obj segment (16 bits) = segment of object
>
> The destination object is referred to by the pointer located at (Pad
> segment):(Pad offset) (the "landing pad"), with the offset inside the
> pointer being interpreted relative to (Obj segment).
>
> Alex
>
> --
> You received this message because you are subscribed to the Google Groups
> "Cap'n Proto" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> Visit this group at https://groups.google.com/group/capnproto.
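
For illustration, the "common" pointer encoding from point 3 could be packed and unpacked roughly as in the Python sketch below. The proposal fixes only the field widths (1 + 31 + 16 + 8 + 8 bits), so several things here are assumptions made for the example: the field order within the word (flag in the lowest bit, then offset, element count, data words, pointers), treating the offset as a signed value, the little-endian byte order, and the function names themselves. The reserved data-size values for sub-word elements are not modeled.

```python
import struct

OFFSET_BITS = 31
OFFSET_MASK = (1 << OFFSET_BITS) - 1
# Assumption: the offset is signed (two's complement) so it can point backwards.
OFFSET_MIN = -(1 << (OFFSET_BITS - 1))
OFFSET_MAX = (1 << (OFFSET_BITS - 1)) - 1


def pack_common_pointer(offset, count, data_words, pointers):
    """Pack the four fields into one 8-byte word.

    Returns None when any field is too large for the common encoding,
    meaning the caller would have to fall back to an "uncommon" pointer.
    """
    fits = (
        OFFSET_MIN <= offset <= OFFSET_MAX
        and 0 <= count < (1 << 16)
        and 0 <= data_words < (1 << 8)
        and 0 <= pointers < (1 << 8)
    )
    if not fits:
        return None
    word = 1                              # bit 0: 1 marks the common encoding
    word |= (offset & OFFSET_MASK) << 1   # bits 1..31: relative offset
    word |= count << 32                   # bits 32..47: element count
    word |= data_words << 48              # bits 48..55: data words per element
    word |= pointers << 56                # bits 56..63: pointers per element
    return struct.pack("<Q", word)


def unpack_common_pointer(raw):
    """Decode an 8-byte word; returns None if it is not a common pointer."""
    (word,) = struct.unpack("<Q", raw)
    if word & 1 == 0:
        return None  # first bit 0: null or some other "uncommon" pointer
    offset = (word >> 1) & OFFSET_MASK
    if offset >= 1 << (OFFSET_BITS - 1):
        offset -= 1 << OFFSET_BITS        # sign-extend the 31-bit offset
    return {
        "offset": offset,
        "count": (word >> 32) & 0xFFFF,
        "data_words": (word >> 48) & 0xFF,
        "pointers": (word >> 56) & 0xFF,
    }
```

Under these assumptions a plain struct is just the element-count-1 case, e.g. `pack_common_pointer(-3, 1, 2, 1)`, and a list with more than 65535 elements makes `pack_common_pointer` return `None`, which is the point at which a tag or trampoline pointer would take over.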
