Fun stuff.

I, too, am not entirely happy with the Cap'n Proto wire format in
retrospect. Here's what I'd do if compatibility were not an issue:

1) Eliminate the concept of segments from the encoding. Segments need only
exist at write time, to allow for progressive allocation of additional
message space. But, tracking this could be handled entirely in the
MessageBuilder implementation. All the segments could be concatenated at
write time, and the receiver could then treat the message as one large
segment. Inter-segment pointers no longer need to be far pointers; as long
as the distance is under 4GB, a regular relative pointer is fine. (Builder
implementations would recognize cross-segment pointers by the fact that
they fail the bounds check.)
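
A minimal sketch of the bookkeeping this implies, in Python. The class and
method names are hypothetical, and it assumes offsets are counted in words
from the pointer's own position; it is not how any existing MessageBuilder
works.

```python
class FlatLayout:
    """Maps (segment, word offset) pairs onto one concatenated message."""

    def __init__(self, segment_sizes):
        # starts[i] is the flat word index at which segment i begins
        # once all segments are written back to back.
        self.sizes = list(segment_sizes)
        self.starts = []
        total = 0
        for size in self.sizes:
            self.starts.append(total)
            total += size
        self.total = total

    def flat(self, segment, offset):
        """Flat word index of `offset` within `segment`."""
        if not 0 <= offset < self.sizes[segment]:
            raise IndexError("offset outside segment")
        return self.starts[segment] + offset

    def relative_offset(self, ptr_at, target_at):
        """Word distance from the pointer to its target in the flat message."""
        return self.flat(*target_at) - self.flat(*ptr_at)

    def is_cross_segment(self, ptr_at, rel):
        """True if following `rel` from `ptr_at` leaves the pointer's own
        segment, i.e. the builder-side bounds check fails."""
        segment, offset = ptr_at
        landing = offset + rel
        return not 0 <= landing < self.sizes[segment]


layout = FlatLayout([4, 8, 2])
rel = layout.relative_offset((0, 1), (1, 3))  # pointer in seg 0, target in seg 1
print(rel, layout.is_cross_segment((0, 1), rel))
```

The receiver never sees the segment table; only the builder needs it, to
translate a failed bounds check back into "this target lives in another
segment".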

2) All pointers -- including far pointers, if they exist -- should be
relative to the pointer location, never absolute. (Currently, far pointers
contain an absolute segment index.) Making all pointers relative means that
it's always possible to embed an existing message into a larger message
without doing a tree copy, which turns out to be a pretty nifty thing to be
able to do.
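
To illustrate why relative pointers make embedding cheap, here is a toy
model in Python. Words are modeled as (kind, value) tuples rather than real
64-bit words, and the helper names are made up for this example.

```python
def follow_relative(words, at):
    """Resolve a pointer whose value is an offset from its own index."""
    kind, rel = words[at]
    assert kind == "rel"
    return words[at + rel]


def follow_absolute(words, at):
    """Resolve a pointer whose value is an index from the message start."""
    kind, index = words[at]
    assert kind == "abs"
    return words[index]


# Two tiny messages: word 0 points at word 2 in each.
inner_rel = [("rel", 2), ("data", 111), ("data", 222)]
inner_abs = [("abs", 2), ("data", 111), ("data", 222)]

# Embed each one by plain concatenation, with no pointer rewriting.
prefix = [("data", 0), ("data", 1)]
base = len(prefix)  # index at which the embedded message now starts
outer_rel = prefix + inner_rel
outer_abs = prefix + inner_abs

# The relative pointer still lands on the same content...
assert follow_relative(outer_rel, base) == ("data", 222)
# ...while the absolute one now resolves to the wrong word.
assert follow_absolute(outer_abs, base) != ("data", 222)
```

With absolute pointers the embedder would have to walk the tree and rewrite
every pointer; with relative ones a byte copy is enough.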

3) Recognize that there's no fundamental need to distinguish on the wire
whether a pointer points to a struct or a list. All we really need to know
is the object location, size, and which bits are data vs. pointers. Once we
recognize this, the pointer format can instead focus on optimizing the
common case, with fallbacks for less-common cases. The "common" pointer
encoding could be:

1 bit: 1 to indicate this encoding.
31 bits: offset
16 bits: element count
8 bits: data words per element (with special reserved values for 0-bit,
1-bit, 8-bit, 16-bit, 32-bit)
8 bits: pointers per element
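
As a rough illustration, the layout above can be packed into one 64-bit word
like this. The field order (flag in the lowest bit, then offset, count, data
words, pointers) and the signed two's-complement offset are assumptions for
this sketch, and the reserved sub-word size values are not modeled.

```python
def pack_common(offset, count, data_words, ptr_words):
    """Pack the four fields into a 64-bit 'common' pointer word."""
    if not -(1 << 30) <= offset < (1 << 30):
        raise ValueError("offset does not fit in 31 bits")
    if not 0 <= count < (1 << 16):
        raise ValueError("element count does not fit in 16 bits")
    if not 0 <= data_words < (1 << 8):
        raise ValueError("data words do not fit in 8 bits")
    if not 0 <= ptr_words < (1 << 8):
        raise ValueError("pointer count does not fit in 8 bits")
    word = 1                              # bit 0: marks the common encoding
    word |= (offset & 0x7FFFFFFF) << 1    # bits 1-31: signed offset
    word |= count << 32                   # bits 32-47: element count
    word |= data_words << 48              # bits 48-55: data words per element
    word |= ptr_words << 56               # bits 56-63: pointers per element
    return word


def unpack_common(word):
    """Inverse of pack_common; returns (offset, count, data_words, ptr_words)."""
    if word & 1 != 1:
        raise ValueError("not a common pointer")
    offset = (word >> 1) & 0x7FFFFFFF
    if offset & (1 << 30):                # sign-extend the 31-bit field
        offset -= 1 << 31
    count = (word >> 32) & 0xFFFF
    data_words = (word >> 48) & 0xFF
    ptr_words = (word >> 56) & 0xFF
    return offset, count, data_words, ptr_words


# A single struct with 2 data words and 3 pointers, located 5 words back.
word = pack_common(-5, 1, 2, 3)
assert unpack_common(word) == (-5, 1, 2, 3)
```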

This encoding would cover the vast majority of use cases -- including
struct lists without the need for a tag. Note that for a simple struct (not
a list), the element count would be 1. We then add a fallback encoding used
when any of these fields is not large enough. When the first bit is 0, this
indicates an "uncommon pointer", which could be any of:

- Null pointer (all-zero).
- Capability reference.
- Tag pointer: Encodes a 61-bit word offset pointing to a tagged object. A
tagged object starts with a tag word that encodes a 32-bit element count, a
16-bit count of data words per element, and a 16-bit count of pointers per
element.
- Trampoline pointer: Like today's "double-far" pointer: points to a
two-word object which contains tag information and a further pointer to the
actual object content. Here we can have 2x16-bit segment sizes, 61-bit
offset, and 35-bit element count, which would become the new upper limit on
list sizes (compared to today's 29-bit limit).
- Other pointer types to be defined later.
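
A sketch of how a writer might pick between these encodings, plus a tag
word packer. The function names and the field order inside the tag word are
hypothetical, offsets are assumed to be signed, and the two 16-bit
trampoline fields are assumed to be per-element data-word and pointer
counts, as in the tag word. How the uncommon pointer kinds are told apart
on the wire is left open above, so it is not modeled here.

```python
def pack_tag_word(count, data_words, ptr_words):
    """Tag word: 32-bit element count, 16-bit data words, 16-bit pointers."""
    if not 0 <= count < (1 << 32):
        raise ValueError("element count does not fit in 32 bits")
    if not 0 <= data_words < (1 << 16) or not 0 <= ptr_words < (1 << 16):
        raise ValueError("per-element size does not fit in 16 bits")
    return count | (data_words << 32) | (ptr_words << 48)


def unpack_tag_word(word):
    """Inverse of pack_tag_word; returns (count, data_words, ptr_words)."""
    return word & 0xFFFFFFFF, (word >> 32) & 0xFFFF, (word >> 48) & 0xFFFF


def choose_encoding(offset, count, data_words, ptr_words):
    """Pick the cheapest encoding whose fields are wide enough."""
    sizes_fit_16 = 0 <= data_words < (1 << 16) and 0 <= ptr_words < (1 << 16)
    offset_fits_61 = -(1 << 60) <= offset < (1 << 60)
    if (-(1 << 30) <= offset < (1 << 30) and 0 <= count < (1 << 16)
            and 0 <= data_words < (1 << 8) and 0 <= ptr_words < (1 << 8)):
        return "common"
    if offset_fits_61 and sizes_fit_16 and 0 <= count < (1 << 32):
        return "tag"
    if offset_fits_61 and sizes_fit_16 and 0 <= count < (1 << 35):
        return "trampoline"
    raise ValueError("object cannot be described by any encoding")


print(choose_encoding(10, 1, 2, 1))        # small struct
print(choose_encoding(10, 100000, 2, 1))   # list too long for 16-bit count
print(choose_encoding(10, 1 << 33, 1, 0))  # list too long for a tag word
```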

-Kenton

On Mon, Feb 26, 2018 at 5:47 PM, <[email protected]> wrote:

> On Tue, 2018-02-27 at 02:35 +0100, [email protected] wrote:
> > Let me now describe the format that results from these observations,
> > without fixing the numerical constants.  The pointer formats are:
>
> Forgot to include the far pointer format; it is, of course,
>
>   +--+-+--------------------------+---------------+---------------+
>   |Ty|M|        Pad offset        |  Pad segment  |  Obj segment  |
>   +--+-+--------------------------+---------------+---------------+
>
>   Ty          ( 2 bits) = "far pointer"
>   M           ( 1 bit ) = more bit
>   Pad offset  (29 bits) = offset of pad, in words
>   Pad segment (16 bits) = segment of pad
>   Obj segment (16 bits) = segment of object
>
> The destination object is referred to by the pointer located at (Pad
> segment):(Pad offset) (the "landing pad"), with the offset inside the
> pointer being interpreted relative to (Obj segment).
>
> Alex
>
> --
> You received this message because you are subscribed to the Google Groups
> "Cap'n Proto" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> Visit this group at https://groups.google.com/group/capnproto.
>
