-VarInts I'm one of the people (if not the most) opposed to VarInts. VarInts have a place, but this is not it. (We had a large discussion about them at the beginning of KIP-82 time)
If anybody has real life performance numbers of VarInts improving things or significantly reducing resources I would like to know what that case may be. Yes, you can save some bytes here and there, but this is probably insignificant to the overall system behavior and storage requirements. -- I say this with respect to using VarInts in the protocol itself, not as part of the data. VarInts require you to parse the Int before using it and depending on the encoding they can suffer from aliasing (multiple representations for the same value). Why add complexity? Nacho On Thu, Feb 16, 2017 at 10:29 AM, Colin McCabe <cmcc...@apache.org> wrote: > +1 for varints here-- it would save quite a bit of space. They are > pretty quick to implement as well. > > I think it makes sense for values to be byte arrays. Users might want > to attach arbitrary payloads; they shouldn't be forced to serialize > everything to Java strings. > > best, > Colin > > > On Thu, Feb 16, 2017, at 09:52, Jason Gustafson wrote: >> Hey Michael, >> >> Hmm, I guess the point of representing it as bytes is to allow the broker >> to pass it through opaquely? Is the cost of parsing them a concern, or >> are >> we simply trying to ensure that the broker stays agnostic to the format? >> >> On varints, I think adding support for them makes less sense for an >> isolated use case, but as part of a more holistic change (such as what we >> have proposed in KIP-98), I think they are justifiable. If we add them, >> then the need to use attributes becomes quite a bit weaker, right? The >> other thing I find slightly odd is the fact that null headers has no >> actual >> semantic meaning for the message (unlike null keys and values). It is >> just >> a space optimization. It seems a bit better to always use size 0 to >> indicate having no headers. >> >> Overall, the main point is ensuring that the message schema remains >> consistent, either within the larger protocol, or at a minimum within the >> message itself. >> >> -Jason >> >> On Thu, Feb 16, 2017 at 6:39 AM, Michael Pearce <michael.pea...@ig.com> >> wrote: >> >> > Hi Jason, >> > >> > On point 1) in the message protocol the headers are simply a byte array, >> > as like the key or value, this is to clearly demarcate the header in the >> > core message. Then the header byte array in the core message is an array of >> > key, value pairs. This is what it is denoting. >> > >> > Then this would be I guess in the given notation: >> > >> > Headers => [KeyLength, Key, ValueLength, Value] >> > KeyLength => int32 <-----------------NEW size of the byte[] of the >> > serialised key value >> > Key => bytes <---------------------- NEW serialised string (UTF8) >> > bytes of the header key >> > ValueLength => int32 <-------------- NEW size of the byte[] of the >> > serialised header value >> > Value => bytes <-------------------- NEW serialised form of the header >> > value >> > >> > The key length and value length is matching the way the protocol is >> > defined in the core message currently. >> > >> > >> > >> > >> > On point 2) >> > Var sized ints, this was discussed much earlier on, in fact I had >> > suggested it myself (with Hadoop references), the complexity of this >> > compared to having a simpler protocol was argued and agreed it wasn’t worth >> > the complexity as all other clients in other languages would need to ensure >> > theyre using the right var size algorithm, as there is a few. >> > >> > On point 3) >> > We did the attributes, optional approach as originally there was marked >> > concern that headers would cause a message size overhead for others, who >> > don’t want them. As such this is the clean solution to achieve that. If >> > that no longer holds, and we don’t care that we add 4bytes overhead, then >> > im happy to remove. >> > >> > I’m personally in favour of keeping the message as small as possible so >> > people don’t get shocks in perf and throughputs dues to message size, >> > unless they actively use the feature, as such I do prefer the attribute bit >> > wise feature flag approach myself. >> > >> > >> > >> > >> > On 16/02/2017, 05:40, "Jason Gustafson" <ja...@confluent.io> wrote: >> > >> > We have proposed a few significant changes to the message format in >> > KIP-98 >> > which now seems likely to pass (perhaps with some iterations on >> > implementation details). It would be good to try and coordinate the >> > changes >> > in both of the proposals to make sure they are consistent and >> > compatible. >> > >> > I think using the attributes to indicate null headers is a reasonable >> > approach. We have proposed to do the same thing for the message key and >> > value. That said, I sympathize with Jay's argument. Having multiple >> > ways to >> > specify a null value increases the overall complexity of the protocol. >> > You >> > can see this just from the fact that you need the extra verbiage in the >> > protocol specification in this KIP and in KIP-98 to describe the >> > dependence >> > between the fields and the attributes. It seems like a slippery slope >> > if >> > you start allowing different request types to implement the protocol >> > specification differently. >> > >> > You can also argue that the messages already are and are likely to >> > remain a >> > special case. For example, there is currently no generality in how >> > compressed message sets are represented that would be applicable for >> > other >> > request types. Some might see this divergence as an unfortunate >> > protocol >> > deficiency which should be fixed; others might see it as sort of the >> > inevitability of needing to optimize where it counts most. I'm probably >> > somewhere in between, but I think we probably all share the intuition >> > that >> > the protocol should be kept as consistent as possible. With that in >> > mind, >> > here are a few comments: >> > >> > 1. One thing I found a little odd when reading the current proposal is >> > that >> > the headers are both represented as an array of bytes and as an array >> > of >> > key/value pairs. I'd probably suggest something like this: >> > >> > Headers => [HeaderKey HeaderValue] >> > HeaderKey => String >> > HeaderValue => Bytes >> > >> > An array in the Kafka protocol is represented as a 4-byte integer >> > indicating the number of elements in the array followed by the >> > serialization of the elements. Unless I'm misunderstanding, what you >> > have >> > instead is the total size of the headers in bytes followed by the >> > elements. >> > I'm not sure I see any reason for this inconsistency. >> > >> > 2. In KIP-98, we've introduced variable-length integer fields. >> > Effectively, >> > we've enriched (or "complicated" as Jay might say ;) the protocol >> > specification to include the following types: VarInt, VarLong, >> > UnsignedVarInt and UnsignedVarLong. >> > >> > Along with these primitives, we could introduce the following types: >> > >> > VarSizeArray => NumberOfItems Item1 Item2 .. ItemN >> > NumberOfItems => UnsignedVarInt >> > >> > VarSizeNullableArray => NumberOfItemsOrNull Item1 Item2 .. ItemN >> > NumberOfItemsOrNull => VarInt (-1 means null) >> > >> > And similarly for the `String` and `Bytes` types. These types can save >> > a >> > considerable amount of space in this proposal because they can be used >> > for >> > both the number of headers included in the message and the lengths of >> > the >> > header keys and values. We could do this instead: >> > >> > Headers => VarSizeArray[HeaderKey HeaderValue] >> > HeaderKey => VarSizeString >> > HeaderValue => VarSizeBytes >> > >> > Combining the savings from the use of variable length fields, the >> > benefit >> > of using the attributes to represent null seems pretty small. >> > >> > 3. Whichever way we go (whether we use the attributes or not), we >> > should at >> > least be consistent between this KIP and KIP-98. It would be very >> > strange >> > to have two ways to represent null values in the same schema. Either >> > way is >> > OK with me. I think some message-level optimizations are justifiable, >> > but >> > the savings here seem minimal (a few bytes per message), so maybe it's >> > not >> > worth the cost of letting the message diverge even further from the >> > rest of >> > the protocol. >> > >> > -Jason >> > >> > >> > On Wed, Feb 15, 2017 at 8:52 AM, radai <radai.rosenbl...@gmail.com> >> > wrote: >> > >> > > I've trimmed the inline contents as this mail is getting too big for >> > the >> > > apache mailing list software to deliver :-( >> > > >> > > 1. the important thing for interoperability is for different >> > "interested >> > > parties" (plugins, infra layers/wrappers, user-code) to be able to >> > stick >> > > pieces of metadata onto msgs without getting in each other's way. a >> > common >> > > key scheme (Strings, as of the time of this writing?) is all thats >> > required >> > > for that. it is assumed that the other end interested in any such >> > piece of >> > > metadata knows the encoding, and byte[] provides for the most >> > flexibility. >> > > i believe this is the same logic behind core kafka being >> > byte[]/byte[] - >> > > Strings are more "usable" but bytes are flexible and so were chosen. >> > > Also - core kafka doesnt even do that good of a job on usability of >> > the >> > > payload (example - i have to specify the nop byte[] "decoders" >> > explicitly >> > > in conf), and again sacrificies usability for the sake of >> > performance (no >> > > convenient single-record processing as poll is a batch, lots of >> > obscure >> > > little config details exposing internals of the batching mechanism, >> > etc) >> > > >> > > this is also why i really dislike the idea of a "type system" for >> > header >> > > values, it further degrades the usability, adds complexity and will >> > > eventually get in people's way, also, it would be the 2nd/3rd >> > home-group >> > > serialization mechanism in core kafka (counting 2 iterations of the >> > "type >> > > definition DSL") >> > > >> > > 2. this is an implementation detail, and not even a very "user >> > facing" one? >> > > to the best of my understanding the vote process is on proposed >> > > API/behaviour. also - since we're willing to go with strings just >> > serialize >> > > a 0-sized header blob and IIUC you dont need any optionals anymore. >> > > >> > > 3. yes, we can :-) >> > > >> > > On Tue, Feb 14, 2017 at 11:56 PM, Michael Pearce < >> > michael.pea...@ig.com> >> > > wrote: >> > > >> > > > Hi Jay, >> > > > >> > > > 1) There was some initial debate on the value part, as youll note >> > String, >> > > > String headers were discounted early on. The reason for this is >> > > flexibility >> > > > and keeping in line with the flexibility of key, value of the >> > message >> > > > object itself. I don’t think it takes away from an ecosystem as >> > each >> > > plugin >> > > > will care for their own key, this way ints, booleans , exotic >> > custom >> > > binary >> > > > can all be catered for=. >> > > > a. If you really wanted to push for a typed value interface, I >> > wouldn’t >> > > > want just String values supported, but the the primatives plus >> > string and >> > > > also still keeping the ability to have a binary for custom >> > binaries that >> > > > some organisations may have. >> > > > i. I have written this slight alternative here, >> > > https://cwiki.apache.org/ >> > > > confluence/display/KAFKA/KIP-82+-+Add+Record+Headers+-+Typed >> > > > ii. Essentially the value bytes, has a leading byte overhead. >> > > > 1. This tells you what type the value is, before reading the rest >> > of the >> > > > bytes, allowing serialisation/deserialization to and from the >> > primitives, >> > > > string and byte[]. This is akin to some other messaging systems. >> > > > 2) We are making it optional, so that for those not wanting >> > headers have >> > > 0 >> > > > bytes overhead (think of it as a feature flag), I don’t think this >> > is >> > > > complex, especially if comparing to changes proposed in other kips >> > like >> > > > kip-98. >> > > > a. If you really really don’t like this, we can drop it, but it >> > would >> > > mean >> > > > buying into 4 bytes extra overhead for users who do not want to use >> > > headers. >> > > > 3) In the summary yes, it is at a higher level, but I think this >> > is well >> > > > documented in the proposed changes section. >> > > > a. Added getHeaders method to Producer/Consumer record (that is it) >> > > > b. We’ve also detailed the new Headers class that this method >> > returns >> > > that >> > > > encapsulates the headers protocol and logic. >> > > > >> > > > Best, >> > > > Mike >> > > > >> > > > ==Original questions from the vote thread from Jay.== >> > > > >> > > > Couple of things I think we still need to work out: >> > > > >> > > > 1. I think we agree about the key, but I think we haven't >> > talked about >> > > > the value yet. I think if our goal is an open ecosystem of these >> > > header >> > > > spread across many plugins from many systems we should consider >> > making >> > > > this >> > > > a string as well so it can be printed, set via a UI, set in >> > config, >> > > etc. >> > > > Basically encouraging pluggable serialization formats here will >> > lead >> > > to >> > > > a >> > > > bit of a tower of babel. >> > > > 2. This proposal still includes a pretty big change to our >> > > serialization >> > > > and protocol definition layer. Essentially it is introducing an >> > > optional >> > > > type, where the format is data dependent. I think this is >> > actually a >> > > big >> > > > change though it doesn't seem like it. It means you can no >> > longer >> > > > specify >> > > > this type with our type definition DSL, and likewise it requires >> > > custom >> > > > handling in client libs. This isn't a huge thing, since the >> > Record >> > > > definition is custom anyway, but I think this kind of protocol >> > > > inconsistency is very non-desirable and ties you to hand-coding >> > > things. >> > > > I >> > > > think the type should instead by [Key Value] in our BNF, where >> > key and >> > > > value are both short strings as used elsewhere. This brings it >> > in line >> > > > with >> > > > the rest of the protocol. >> > > > 3. Could we get more specific about the exact Java API change to >> > > > ProducerRecord, ConsumerRecord, Record, etc? >> > > > >> > > > -Jay >> > > > >> > > >> > >> > >> > The information contained in this email is strictly confidential and for >> > the use of the addressee only, unless otherwise indicated. If you are not >> > the intended recipient, please do not read, copy, use or disclose to others >> > this message or any attachment. Please also notify the sender by replying >> > to this email or by telephone (+44(020 7896 0011) and then delete the email >> > and any copies of it. Opinions, conclusion (etc) that do not relate to the >> > official business of this company shall be understood as neither given nor >> > endorsed by it. IG is a trading name of IG Markets Limited (a company >> > registered in England and Wales, company number 04008957) and IG Index >> > Limited (a company registered in England and Wales, company number >> > 01190902). Registered address at Cannon Bridge House, 25 Dowgate Hill, >> > London EC4R 2YA. Both IG Markets Limited (register number 195355) and IG >> > Index Limited (register number 114059) are authorised and regulated by the >> > Financial Conduct Authority. >> > -- Nacho - Ignacio Solis - iso...@igso.net