+1 for varints here-- it would save quite a bit of space. They are pretty quick to implement as well.
I think it makes sense for values to be byte arrays. Users might want to attach arbitrary payloads; they shouldn't be forced to serialize everything to Java strings. best, Colin On Thu, Feb 16, 2017, at 09:52, Jason Gustafson wrote: > Hey Michael, > > Hmm, I guess the point of representing it as bytes is to allow the broker > to pass it through opaquely? Is the cost of parsing them a concern, or > are > we simply trying to ensure that the broker stays agnostic to the format? > > On varints, I think adding support for them makes less sense for an > isolated use case, but as part of a more holistic change (such as what we > have proposed in KIP-98), I think they are justifiable. If we add them, > then the need to use attributes becomes quite a bit weaker, right? The > other thing I find slightly odd is the fact that null headers has no > actual > semantic meaning for the message (unlike null keys and values). It is > just > a space optimization. It seems a bit better to always use size 0 to > indicate having no headers. > > Overall, the main point is ensuring that the message schema remains > consistent, either within the larger protocol, or at a minimum within the > message itself. > > -Jason > > On Thu, Feb 16, 2017 at 6:39 AM, Michael Pearce <michael.pea...@ig.com> > wrote: > > > Hi Jason, > > > > On point 1) in the message protocol the headers are simply a byte array, > > as like the key or value, this is to clearly demarcate the header in the > > core message. Then the header byte array in the core message is an array of > > key, value pairs. This is what it is denoting. > > > > Then this would be I guess in the given notation: > > > > Headers => [KeyLength, Key, ValueLength, Value] > > KeyLength => int32 <-----------------NEW size of the byte[] of the > > serialised key value > > Key => bytes <---------------------- NEW serialised string (UTF8) > > bytes of the header key > > ValueLength => int32 <-------------- NEW size of the byte[] of the > > serialised header value > > Value => bytes <-------------------- NEW serialised form of the header > > value > > > > The key length and value length is matching the way the protocol is > > defined in the core message currently. > > > > > > > > > > On point 2) > > Var sized ints, this was discussed much earlier on, in fact I had > > suggested it myself (with Hadoop references), the complexity of this > > compared to having a simpler protocol was argued and agreed it wasn’t worth > > the complexity as all other clients in other languages would need to ensure > > theyre using the right var size algorithm, as there is a few. > > > > On point 3) > > We did the attributes, optional approach as originally there was marked > > concern that headers would cause a message size overhead for others, who > > don’t want them. As such this is the clean solution to achieve that. If > > that no longer holds, and we don’t care that we add 4bytes overhead, then > > im happy to remove. > > > > I’m personally in favour of keeping the message as small as possible so > > people don’t get shocks in perf and throughputs dues to message size, > > unless they actively use the feature, as such I do prefer the attribute bit > > wise feature flag approach myself. > > > > > > > > > > On 16/02/2017, 05:40, "Jason Gustafson" <ja...@confluent.io> wrote: > > > > We have proposed a few significant changes to the message format in > > KIP-98 > > which now seems likely to pass (perhaps with some iterations on > > implementation details). It would be good to try and coordinate the > > changes > > in both of the proposals to make sure they are consistent and > > compatible. > > > > I think using the attributes to indicate null headers is a reasonable > > approach. We have proposed to do the same thing for the message key and > > value. That said, I sympathize with Jay's argument. Having multiple > > ways to > > specify a null value increases the overall complexity of the protocol. > > You > > can see this just from the fact that you need the extra verbiage in the > > protocol specification in this KIP and in KIP-98 to describe the > > dependence > > between the fields and the attributes. It seems like a slippery slope > > if > > you start allowing different request types to implement the protocol > > specification differently. > > > > You can also argue that the messages already are and are likely to > > remain a > > special case. For example, there is currently no generality in how > > compressed message sets are represented that would be applicable for > > other > > request types. Some might see this divergence as an unfortunate > > protocol > > deficiency which should be fixed; others might see it as sort of the > > inevitability of needing to optimize where it counts most. I'm probably > > somewhere in between, but I think we probably all share the intuition > > that > > the protocol should be kept as consistent as possible. With that in > > mind, > > here are a few comments: > > > > 1. One thing I found a little odd when reading the current proposal is > > that > > the headers are both represented as an array of bytes and as an array > > of > > key/value pairs. I'd probably suggest something like this: > > > > Headers => [HeaderKey HeaderValue] > > HeaderKey => String > > HeaderValue => Bytes > > > > An array in the Kafka protocol is represented as a 4-byte integer > > indicating the number of elements in the array followed by the > > serialization of the elements. Unless I'm misunderstanding, what you > > have > > instead is the total size of the headers in bytes followed by the > > elements. > > I'm not sure I see any reason for this inconsistency. > > > > 2. In KIP-98, we've introduced variable-length integer fields. > > Effectively, > > we've enriched (or "complicated" as Jay might say ;) the protocol > > specification to include the following types: VarInt, VarLong, > > UnsignedVarInt and UnsignedVarLong. > > > > Along with these primitives, we could introduce the following types: > > > > VarSizeArray => NumberOfItems Item1 Item2 .. ItemN > > NumberOfItems => UnsignedVarInt > > > > VarSizeNullableArray => NumberOfItemsOrNull Item1 Item2 .. ItemN > > NumberOfItemsOrNull => VarInt (-1 means null) > > > > And similarly for the `String` and `Bytes` types. These types can save > > a > > considerable amount of space in this proposal because they can be used > > for > > both the number of headers included in the message and the lengths of > > the > > header keys and values. We could do this instead: > > > > Headers => VarSizeArray[HeaderKey HeaderValue] > > HeaderKey => VarSizeString > > HeaderValue => VarSizeBytes > > > > Combining the savings from the use of variable length fields, the > > benefit > > of using the attributes to represent null seems pretty small. > > > > 3. Whichever way we go (whether we use the attributes or not), we > > should at > > least be consistent between this KIP and KIP-98. It would be very > > strange > > to have two ways to represent null values in the same schema. Either > > way is > > OK with me. I think some message-level optimizations are justifiable, > > but > > the savings here seem minimal (a few bytes per message), so maybe it's > > not > > worth the cost of letting the message diverge even further from the > > rest of > > the protocol. > > > > -Jason > > > > > > On Wed, Feb 15, 2017 at 8:52 AM, radai <radai.rosenbl...@gmail.com> > > wrote: > > > > > I've trimmed the inline contents as this mail is getting too big for > > the > > > apache mailing list software to deliver :-( > > > > > > 1. the important thing for interoperability is for different > > "interested > > > parties" (plugins, infra layers/wrappers, user-code) to be able to > > stick > > > pieces of metadata onto msgs without getting in each other's way. a > > common > > > key scheme (Strings, as of the time of this writing?) is all thats > > required > > > for that. it is assumed that the other end interested in any such > > piece of > > > metadata knows the encoding, and byte[] provides for the most > > flexibility. > > > i believe this is the same logic behind core kafka being > > byte[]/byte[] - > > > Strings are more "usable" but bytes are flexible and so were chosen. > > > Also - core kafka doesnt even do that good of a job on usability of > > the > > > payload (example - i have to specify the nop byte[] "decoders" > > explicitly > > > in conf), and again sacrificies usability for the sake of > > performance (no > > > convenient single-record processing as poll is a batch, lots of > > obscure > > > little config details exposing internals of the batching mechanism, > > etc) > > > > > > this is also why i really dislike the idea of a "type system" for > > header > > > values, it further degrades the usability, adds complexity and will > > > eventually get in people's way, also, it would be the 2nd/3rd > > home-group > > > serialization mechanism in core kafka (counting 2 iterations of the > > "type > > > definition DSL") > > > > > > 2. this is an implementation detail, and not even a very "user > > facing" one? > > > to the best of my understanding the vote process is on proposed > > > API/behaviour. also - since we're willing to go with strings just > > serialize > > > a 0-sized header blob and IIUC you dont need any optionals anymore. > > > > > > 3. yes, we can :-) > > > > > > On Tue, Feb 14, 2017 at 11:56 PM, Michael Pearce < > > michael.pea...@ig.com> > > > wrote: > > > > > > > Hi Jay, > > > > > > > > 1) There was some initial debate on the value part, as youll note > > String, > > > > String headers were discounted early on. The reason for this is > > > flexibility > > > > and keeping in line with the flexibility of key, value of the > > message > > > > object itself. I don’t think it takes away from an ecosystem as > > each > > > plugin > > > > will care for their own key, this way ints, booleans , exotic > > custom > > > binary > > > > can all be catered for=. > > > > a. If you really wanted to push for a typed value interface, I > > wouldn’t > > > > want just String values supported, but the the primatives plus > > string and > > > > also still keeping the ability to have a binary for custom > > binaries that > > > > some organisations may have. > > > > i. I have written this slight alternative here, > > > https://cwiki.apache.org/ > > > > confluence/display/KAFKA/KIP-82+-+Add+Record+Headers+-+Typed > > > > ii. Essentially the value bytes, has a leading byte overhead. > > > > 1. This tells you what type the value is, before reading the rest > > of the > > > > bytes, allowing serialisation/deserialization to and from the > > primitives, > > > > string and byte[]. This is akin to some other messaging systems. > > > > 2) We are making it optional, so that for those not wanting > > headers have > > > 0 > > > > bytes overhead (think of it as a feature flag), I don’t think this > > is > > > > complex, especially if comparing to changes proposed in other kips > > like > > > > kip-98. > > > > a. If you really really don’t like this, we can drop it, but it > > would > > > mean > > > > buying into 4 bytes extra overhead for users who do not want to use > > > headers. > > > > 3) In the summary yes, it is at a higher level, but I think this > > is well > > > > documented in the proposed changes section. > > > > a. Added getHeaders method to Producer/Consumer record (that is it) > > > > b. We’ve also detailed the new Headers class that this method > > returns > > > that > > > > encapsulates the headers protocol and logic. > > > > > > > > Best, > > > > Mike > > > > > > > > ==Original questions from the vote thread from Jay.== > > > > > > > > Couple of things I think we still need to work out: > > > > > > > > 1. I think we agree about the key, but I think we haven't > > talked about > > > > the value yet. I think if our goal is an open ecosystem of these > > > header > > > > spread across many plugins from many systems we should consider > > making > > > > this > > > > a string as well so it can be printed, set via a UI, set in > > config, > > > etc. > > > > Basically encouraging pluggable serialization formats here will > > lead > > > to > > > > a > > > > bit of a tower of babel. > > > > 2. This proposal still includes a pretty big change to our > > > serialization > > > > and protocol definition layer. Essentially it is introducing an > > > optional > > > > type, where the format is data dependent. I think this is > > actually a > > > big > > > > change though it doesn't seem like it. It means you can no > > longer > > > > specify > > > > this type with our type definition DSL, and likewise it requires > > > custom > > > > handling in client libs. This isn't a huge thing, since the > > Record > > > > definition is custom anyway, but I think this kind of protocol > > > > inconsistency is very non-desirable and ties you to hand-coding > > > things. > > > > I > > > > think the type should instead by [Key Value] in our BNF, where > > key and > > > > value are both short strings as used elsewhere. This brings it > > in line > > > > with > > > > the rest of the protocol. > > > > 3. Could we get more specific about the exact Java API change to > > > > ProducerRecord, ConsumerRecord, Record, etc? > > > > > > > > -Jay > > > > > > > > > > > > > The information contained in this email is strictly confidential and for > > the use of the addressee only, unless otherwise indicated. If you are not > > the intended recipient, please do not read, copy, use or disclose to others > > this message or any attachment. Please also notify the sender by replying > > to this email or by telephone (+44(020 7896 0011) and then delete the email > > and any copies of it. Opinions, conclusion (etc) that do not relate to the > > official business of this company shall be understood as neither given nor > > endorsed by it. IG is a trading name of IG Markets Limited (a company > > registered in England and Wales, company number 04008957) and IG Index > > Limited (a company registered in England and Wales, company number > > 01190902). Registered address at Cannon Bridge House, 25 Dowgate Hill, > > London EC4R 2YA. Both IG Markets Limited (register number 195355) and IG > > Index Limited (register number 114059) are authorised and regulated by the > > Financial Conduct Authority. > >