Re: [DISCUSS] KIP-82 - Add Record Headers

Magnus Edenhill Fri, 17 Feb 2017 11:35:48 -0800

Big +1 on VarInts.
CPUs are fast, memory is slow.
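
For anyone following along, here is a rough sketch of the base-128 varint
scheme being discussed (the protobuf-style one Jason mentions further down).
The function names are made up for illustration and this is not the KIP-98
code. Note that this decoder does not reject over-long encodings, which is
the aliasing concern Nacho raised.

```python
from typing import Tuple


def encode_unsigned_varint(value: int) -> bytes:
    """Encode a non-negative int, 7 bits per byte, least significant group first."""
    if value < 0:
        raise ValueError("unsigned varint cannot encode a negative value")
    out = bytearray()
    while value >= 0x80:
        # Low 7 bits go out with the continuation bit (0x80) set.
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)  # final byte has the continuation bit clear
    return bytes(out)


def decode_unsigned_varint(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode a varint starting at buf[pos]; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7
```

With this scheme any length below 128 takes a single byte instead of the
four bytes of an int32, which is where the savings on small messages come
from.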

I agree with Jason that we'll want to continue verifying messages,
including their headers, so while I appreciate the idea of the opaque
header blob it won't be useful in practice.
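
To make the verification point concrete, here is a minimal sketch of the
kind of check a broker or client could run if it understands the header
layout. The layout is an assumption for illustration only (an unsigned
varint count, then for each header a varint-prefixed UTF-8 key and a
varint-prefixed value), loosely following Jason's VarSizeArray suggestion
below. It is not an agreed wire format, and all names are hypothetical.

```python
from typing import List, Tuple


def read_unsigned_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Read a base-128 varint at buf[pos]; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def read_bytes(buf: bytes, pos: int, what: str) -> Tuple[bytes, int]:
    """Read a varint length followed by that many bytes."""
    length, pos = read_unsigned_varint(buf, pos)
    end = pos + length
    if end > len(buf):
        raise ValueError(f"{what} overruns the header section")
    return buf[pos:end], end


def parse_headers(buf: bytes) -> List[Tuple[str, bytes]]:
    """Parse and validate the assumed header layout; raise ValueError if malformed."""
    count, pos = read_unsigned_varint(buf, 0)
    headers = []
    for _ in range(count):
        raw_key, pos = read_bytes(buf, pos, "header key")
        value, pos = read_bytes(buf, pos, "header value")
        # Invalid UTF-8 raises UnicodeDecodeError, a subclass of ValueError.
        headers.append((raw_key.decode("utf-8"), value))
    if pos != len(buf):
        raise ValueError("trailing bytes after last header")
    return headers
```

With an opaque blob, none of these checks can happen before the data
reaches a consumer.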


/Magnus

2017-02-17 10:41 GMT-08:00 Jason Gustafson <[email protected]>:

> Sorry, my mistake. The consumer interceptor is per batch, though I'm not
> sure that's an actual limitation since you still have access to the
> individual records.
>
> -Jason
>
> On Fri, Feb 17, 2017 at 10:39 AM, Jason Gustafson <[email protected]>
> wrote:
>
> >> Re headers as byte array and future use by broker. This doesn't take away
> >> from that at all. Nor makes it difficult at all in my opinion.
> >
> >
> > Yeah, I didn't say it was difficult, only awkward. You wouldn't write the
> > schema that way if you were planning to use it on the brokers from the
> > beginning. Note also that one of the benefits of letting the broker
> > understand headers is that it can validate that they are properly
> > formatted. If cost is the only concern, we should confirm its impact
> > through performance testing.
> >
> >> One of the key use cases requires access on consume at per event/message
> >> level at the point that message is being processed, as such the batch
> >> interceptors and batch consume api isn't suitable. It needs to be at the
> >> record level.
> >
> >
> > I'm not sure I understand the point about batching. Interceptors are
> > applied per-message, right?
> >
> > My intent on interceptors is to keep the usage of headers well-defined so
> > that they don't start leaking unnecessarily into applications. My guess is
> > that it's probably inevitable, but isolating it in the interceptors would
> > at least give people a second thought before deciding to use it. The main
> > challenge in my mind is figuring out how an MM use case would work. It
> > would be more cumbersome to replicate headers through an interceptor,
> > though arguably MM should be working at a lower level anyway.
> >
> > -Jason
> >
> > On Fri, Feb 17, 2017 at 10:16 AM, Michael Pearce <[email protected]>
> > wrote:
> >
> >> Re headers available on the record vs interceptors only
> >>
> >> One of the key use cases requires access on consume at per event/message
> >> level at the point that message is being processed, as such the batch
> >> interceptors and batch consume api isn't suitable. It needs to be at the
> >> record level.
> >>
> >> This anyhow is similar to jms/http/amqp where headers are available to
> >> consuming applications.
> >>
> >> Re headers as byte array and future use by broker. This doesn't take away
> >> from that at all. Nor makes it difficult at all in my opinion.
> >>
> >>
> >>
> >> ________________________________________
> >> From: Jason Gustafson <[email protected]>
> >> Sent: Friday, February 17, 2017 5:55:42 PM
> >> To: [email protected]
> >> Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
> >>
> >> >
> >> > Would you be proposing in KIP-98 to convert the other message ints (key
> >> > length, value length) also to varint to keep it uniform?
> >> > Also I assume there will be a static or helper method made to write/read
> >> > these in the client and server.
> >>
> >>
> >> Yes, that is what we are proposing, so using varints for headers would be
> >> consistent with the rest of the message. We have used static helper methods
> >> in our prototype implementation.
> >>
> >> > The cost of parsing, we want to parse/interpret the headers lazily (this is
> >> > a key point brought up earlier in discussions)
> >>
> >>
> >> I'm a bit skeptical of this. Has anyone done the performance testing? I can
> >> probably implement it and test it if no one else has. I was also under the
> >> impression that there may be use cases down the road where the broker would
> >> need to interpret headers. That wouldn't be off the table in the future if
> >> it's represented as bytes, but it would be quite a bit more awkward, right?
> >>
> >> By the way, one question I have been wondering about. My understanding is
> >> that headers are primarily for use cases where a third-party component
> >> wants to enrich messages without needing to understand or modify the schema
> >> of the message key and value. For the applications which directly produce
> >> and consume the messages and control the key/value schema directly, it
> >> seems we would rather have them implement headers directly in their own
> >> schema. Supposing for the sake of argument that it was possible, my
> >> question is whether it would be sufficient to expose the headers in the
> >> interceptor API and not in the common API?
> >>
> >> -Jason
> >>
> >> On Fri, Feb 17, 2017 at 3:26 AM, Michael Pearce <[email protected]>
> >> wrote:
> >>
> >> > On the point of varInts
> >> >
> >> > Would you be proposing in KIP-98 to convert the other message ints (key
> >> > length, value length) also to varint to keep it uniform?
> >> > Also I assume there will be a static or helper method made to write/read
> >> > these in the client and server.
> >> >
> >> > Cheers
> >> > Mike
> >> >
> >> >
> >> >
> >> > On 17/02/2017, 11:22, "Michael Pearce" <[email protected]> wrote:
> >> >
> >> >     On the point re: headers in the message protocol being a byte array
> >> >     and not a count of elements followed by the elements. Again this was
> >> >     discussed/argued previously.
> >> >
> >> >     It was agreed on for a few reasons, some of which you have obviously
> >> >     picked up on:
> >> >
> >> >     - Broker is able to pass it through opaquely
> >> >     - The cost of parsing, we want to parse/interpret the headers lazily
> >> >       (this is a key point brought up earlier in discussions)
> >> >     - Headers can be copied from consumer record to producer record (aka
> >> >       mirror makers etc) without parsing if no changes are being made or
> >> >       being looked at.
> >> >     - Keeps the broker agnostic to the format
> >> >     - You need an int32 either for the byte size of the headers, or for the
> >> >       count of elements, so the overheads are the same, but going with an
> >> >       opaque byte array has the above advantages.
> >> >
> >> >     Cheers
> >> >     Mike
> >> >
> >> >
> >> >     On 17/02/2017, 02:50, "Jason Gustafson" <[email protected]> wrote:
> >> >
> >> >         Sorry, should have noted that the performance testing was done using
> >> >         the producer performance tool shipped with Kafka.
> >> >
> >> >         -Jason
> >> >
> >> >         On Thu, Feb 16, 2017 at 6:44 PM, Jason Gustafson <[email protected]> wrote:
> >> >
> >> >         > Hey Nacho,
> >> >         >
> >> >         > I've compared performance of our KIP-98 implementation with and without
> >> >         > varints. For messages around 128 bytes, we see an increase in throughput of
> >> >         > about 30% using the default configuration settings. At 256 bytes, the
> >> >         > increase is around 16%. Obviously the performance converges as messages get
> >> >         > larger, but it seems well worth the cost. Note that we are also seeing a
> >> >         > substantial performance increase against trunk primarily because of the
> >> >         > much more efficient packing that varints provide us. Anything adding to
> >> >         > message overhead, such as record headers, would only increase the relative
> >> >         > difference. (Of course take these numbers with a grain of salt since I have
> >> >         > only used the default settings with both the producer and broker on my
> >> >         > local machine. We intend to provide more extensive performance details as
> >> >         > part of the work for KIP-98.)
> >> >         >
> >> >         > The implementation we are using is from protobuf (
> >> >         > https://developers.google.com/protocol-buffers/docs/encoding), which is
> >> >         > also used in HBase. It is trivial to implement and as far as I know doesn't
> >> >         > suffer from the aliasing problem you are describing. I checked with Magnus
> >> >         > (the author of librdkafka) and he agreed that the savings seemed worth the
> >> >         > cost of implementation.
> >> >         >
> >> >         > -Jason
> >> >         >
> >> >         > On Thu, Feb 16, 2017 at 4:32 PM, Ignacio Solis <[email protected]> wrote:
> >> >         >
> >> >         >> -VarInts
> >> >         >>
> >> >         >> I'm one of the people (if not the most) opposed to VarInts. VarInts
> >> >         >> have a place, but this is not it.   (We had a large discussion about
> >> >         >> them at the beginning of KIP-82 time)
> >> >         >>
> >> >         >> If anybody has real life performance numbers of VarInts improving
> >> >         >> things or significantly reducing resources I would like to know what
> >> >         >> that case may be. Yes, you can save some bytes here and there, but
> >> >         >> this is probably insignificant to the overall system behavior and
> >> >         >> storage requirements.  -- I say this with respect to using VarInts in
> >> >         >> the protocol itself, not as part of the data.
> >> >         >>
> >> >         >> VarInts require you to parse the Int before using it and depending on
> >> >         >> the encoding they can suffer from aliasing (multiple representations
> >> >         >> for the same value).
> >> >         >>
> >> >         >> Why add complexity?
> >> >         >>
> >> >         >> Nacho
> >> >         >>
> >> >         >>
> >> >         >> On Thu, Feb 16, 2017 at 10:29 AM, Colin McCabe <[email protected]>
> >> >         >> wrote:
> >> >         >> > +1 for varints here-- it would save quite a bit of space. They are
> >> >         >> > pretty quick to implement as well.
> >> >         >> >
> >> >         >> > I think it makes sense for values to be byte arrays. Users might want
> >> >         >> > to attach arbitrary payloads; they shouldn't be forced to serialize
> >> >         >> > everything to Java strings.
> >> >         >> >
> >> >         >> > best,
> >> >         >> > Colin
> >> >         >> >
> >> >         >> >
> >> >         >> > On Thu, Feb 16, 2017, at 09:52, Jason Gustafson wrote:
> >> >         >> >> Hey Michael,
> >> >         >> >>
> >> >         >> >> Hmm, I guess the point of representing it as bytes is to allow the broker
> >> >         >> >> to pass it through opaquely? Is the cost of parsing them a concern, or are
> >> >         >> >> we simply trying to ensure that the broker stays agnostic to the format?
> >> >         >> >>
> >> >         >> >> On varints, I think adding support for them makes less sense for an
> >> >         >> >> isolated use case, but as part of a more holistic change (such as what we
> >> >         >> >> have proposed in KIP-98), I think they are justifiable. If we add them,
> >> >         >> >> then the need to use attributes becomes quite a bit weaker, right? The
> >> >         >> >> other thing I find slightly odd is the fact that null headers have no
> >> >         >> >> actual semantic meaning for the message (unlike null keys and values). It
> >> >         >> >> is just a space optimization. It seems a bit better to always use size 0
> >> >         >> >> to indicate having no headers.
> >> >         >> >>
> >> >         >> >> Overall, the main point is ensuring that the message schema remains
> >> >         >> >> consistent, either within the larger protocol, or at a minimum within the
> >> >         >> >> message itself.
> >> >         >> >>
> >> >         >> >> -Jason
> >> >         >> >>
> >> >         >> >> On Thu, Feb 16, 2017 at 6:39 AM, Michael Pearce <[email protected]>
> >> >         >> >> wrote:
> >> >         >> >>
> >> >         >> >> > Hi Jason,
> >> >         >> >> >
> >> >         >> >> > On point 1) in the message protocol the headers are simply a byte array,
> >> >         >> >> > like the key or value; this is to clearly demarcate the header in the
> >> >         >> >> > core message. Then the header byte array in the core message is an array
> >> >         >> >> > of key, value pairs. This is what it is denoting.
> >> >         >> >> >
> >> >         >> >> > Then this would be I guess in the given notation:
> >> >         >> >> >
> >> >         >> >> > Headers => [KeyLength, Key, ValueLength, Value]
> >> >         >> >> >     KeyLength => int32   <-- NEW size of the byte[] of the serialised header key
> >> >         >> >> >     Key => bytes         <-- NEW serialised string (UTF8) bytes of the header key
> >> >         >> >> >     ValueLength => int32 <-- NEW size of the byte[] of the serialised header value
> >> >         >> >> >     Value => bytes       <-- NEW serialised form of the header value
> >> >         >> >> >
> >> >         >> >> > The key length and value length match the way the protocol is
> >> >         >> >> > defined in the core message currently.
> >> >         >> >> >
> >> >         >> >> >
> >> >         >> >> >
> >> >         >> >> >
> >> >         >> >> > On point 2)
> >> >         >> >> > Var sized ints, this was discussed much earlier on, in fact I had
> >> >         >> >> > suggested it myself (with Hadoop references), the complexity of this
> >> >         >> >> > compared to having a simpler protocol was argued and agreed it wasn’t
> >> >         >> >> > worth the complexity as all other clients in other languages would need
> >> >         >> >> > to ensure they're using the right var size algorithm, as there are a few.
> >> >         >> >> >
> >> >         >> >> > On point 3)
> >> >         >> >> > We did the attributes, optional approach as originally there was marked
> >> >         >> >> > concern that headers would cause a message size overhead for others, who
> >> >         >> >> > don’t want them. As such this is the clean solution to achieve that. If
> >> >         >> >> > that no longer holds, and we don’t care that we add 4 bytes overhead,
> >> >         >> >> > then I'm happy to remove.
> >> >         >> >> >
> >> >         >> >> > I’m personally in favour of keeping the message as small as possible so
> >> >         >> >> > people don’t get shocks in perf and throughputs due to message size,
> >> >         >> >> > unless they actively use the feature, as such I do prefer the attribute
> >> >         >> >> > bitwise feature flag approach myself.
> >> >         >> >> >
> >> >         >> >> >
> >> >         >> >> >
> >> >         >> >> >
> >> >         >> >> > On 16/02/2017, 05:40, "Jason Gustafson" <[email protected]> wrote:
> >> >         >> >> >
> >> >         >> >> >     We have proposed a few significant changes to the message format in
> >> >         >> >> >     KIP-98 which now seems likely to pass (perhaps with some iterations on
> >> >         >> >> >     implementation details). It would be good to try and coordinate the
> >> >         >> >> >     changes in both of the proposals to make sure they are consistent and
> >> >         >> >> >     compatible.
> >> >         >> >> >
> >> >         >> >> >     I think using the attributes to indicate null headers is a reasonable
> >> >         >> >> >     approach. We have proposed to do the same thing for the message key and
> >> >         >> >> >     value. That said, I sympathize with Jay's argument. Having multiple ways
> >> >         >> >> >     to specify a null value increases the overall complexity of the
> >> >         >> >> >     protocol. You can see this just from the fact that you need the extra
> >> >         >> >> >     verbiage in the protocol specification in this KIP and in KIP-98 to
> >> >         >> >> >     describe the dependence between the fields and the attributes. It seems
> >> >         >> >> >     like a slippery slope if you start allowing different request types to
> >> >         >> >> >     implement the protocol specification differently.
> >> >         >> >> >
> >> >         >> >> >     You can also argue that the messages already are and are likely to
> >> >         >> >> >     remain a special case. For example, there is currently no generality in
> >> >         >> >> >     how compressed message sets are represented that would be applicable for
> >> >         >> >> >     other request types. Some might see this divergence as an unfortunate
> >> >         >> >> >     protocol deficiency which should be fixed; others might see it as sort
> >> >         >> >> >     of the inevitability of needing to optimize where it counts most. I'm
> >> >         >> >> >     probably somewhere in between, but I think we probably all share the
> >> >         >> >> >     intuition that the protocol should be kept as consistent as possible.
> >> >         >> >> >     With that in mind, here are a few comments:
> >> >         >> >> >
> >> >         >> >> >     1. One thing I found a little odd when reading the current proposal is
> >> >         >> >> >     that the headers are both represented as an array of bytes and as an
> >> >         >> >> >     array of key/value pairs. I'd probably suggest something like this:
> >> >         >> >> >
> >> >         >> >> >     Headers => [HeaderKey HeaderValue]
> >> >         >> >> >      HeaderKey => String
> >> >         >> >> >      HeaderValue => Bytes
> >> >         >> >> >
> >> >         >> >> >     An array in the Kafka protocol is represented as a 4-byte integer
> >> >         >> >> >     indicating the number of elements in the array followed by the
> >> >         >> >> >     serialization of the elements. Unless I'm misunderstanding, what you
> >> >         >> >> >     have instead is the total size of the headers in bytes followed by the
> >> >         >> >> >     elements. I'm not sure I see any reason for this inconsistency.
> >> >         >> >> >
> >> >         >> >> >     2. In KIP-98, we've introduced variable-length integer fields.
> >> >         >> >> >     Effectively, we've enriched (or "complicated" as Jay might say ;) the
> >> >         >> >> >     protocol specification to include the following types: VarInt, VarLong,
> >> >         >> >> >     UnsignedVarInt and UnsignedVarLong.
> >> >         >> >> >
> >> >         >> >> >     Along with these primitives, we could introduce the following types:
> >> >         >> >> >
> >> >         >> >> >     VarSizeArray => NumberOfItems Item1 Item2 .. ItemN
> >> >         >> >> >       NumberOfItems => UnsignedVarInt
> >> >         >> >> >
> >> >         >> >> >     VarSizeNullableArray => NumberOfItemsOrNull Item1 Item2 .. ItemN
> >> >         >> >> >       NumberOfItemsOrNull => VarInt (-1 means null)
> >> >         >> >> >
> >> >         >> >> >     And similarly for the `String` and `Bytes` types. These types can save
> >> >         >> >> >     a considerable amount of space in this proposal because they can be used
> >> >         >> >> >     for both the number of headers included in the message and the lengths
> >> >         >> >> >     of the header keys and values. We could do this instead:
> >> >         >> >> >
> >> >         >> >> >     Headers => VarSizeArray[HeaderKey HeaderValue]
> >> >         >> >> >       HeaderKey => VarSizeString
> >> >         >> >> >       HeaderValue => VarSizeBytes
> >> >         >> >> >
> >> >         >> >> >     Combining the savings from the use of variable length fields, the
> >> >         >> >> >     benefit of using the attributes to represent null seems pretty small.
> >> >         >> >> >
> >> >         >> >> >     3. Whichever way we go (whether we use the attributes or not), we
> >> >         >> >> >     should at least be consistent between this KIP and KIP-98. It would be
> >> >         >> >> >     very strange to have two ways to represent null values in the same
> >> >         >> >> >     schema. Either way is OK with me. I think some message-level
> >> >         >> >> >     optimizations are justifiable, but the savings here seem minimal (a few
> >> >         >> >> >     bytes per message), so maybe it's not worth the cost of letting the
> >> >         >> >> >     message diverge even further from the rest of the protocol.
> >> >         >> >> >
> >> >         >> >> >     -Jason
> >> >         >> >> >
> >> >         >> >> >
> >> >         >> >> >     On Wed, Feb 15, 2017 at 8:52 AM, radai <[email protected]> wrote:
> >> >         >> >> >
> >> >         >> >> >     > I've trimmed the inline contents as this mail is getting too big for
> >> >         >> >> >     > the apache mailing list software to deliver :-(
> >> >         >> >> >     >
> >> >         >> >> >     > 1. the important thing for interoperability is for different
> >> >         >> >> >     > "interested parties" (plugins, infra layers/wrappers, user-code) to be
> >> >         >> >> >     > able to stick pieces of metadata onto msgs without getting in each
> >> >         >> >> >     > other's way. a common key scheme (Strings, as of the time of this
> >> >         >> >> >     > writing?) is all that's required for that. it is assumed that the other
> >> >         >> >> >     > end interested in any such piece of metadata knows the encoding, and
> >> >         >> >> >     > byte[] provides for the most flexibility. i believe this is the same
> >> >         >> >> >     > logic behind core kafka being byte[]/byte[] - Strings are more "usable"
> >> >         >> >> >     > but bytes are flexible and so were chosen.
> >> >         >> >> >     > Also - core kafka doesn't even do that good of a job on usability of
> >> >         >> >> >     > the payload (example - i have to specify the nop byte[] "decoders"
> >> >         >> >> >     > explicitly in conf), and again sacrifices usability for the sake of
> >> >         >> >> >     > performance (no convenient single-record processing as poll is a batch,
> >> >         >> >> >     > lots of obscure little config details exposing internals of the
> >> >         >> >> >     > batching mechanism, etc)
> >> >         >> >> >     >
> >> >         >> >> >     > this is also why i really dislike the idea of a "type system" for
> >> >         >> >> >     > header values, it further degrades the usability, adds complexity and
> >> >         >> >> >     > will eventually get in people's way, also, it would be the 2nd/3rd
> >> >         >> >> >     > home-grown serialization mechanism in core kafka (counting 2 iterations
> >> >         >> >> >     > of the "type definition DSL")
> >> >         >> >> >     >
> >> >         >> >> >     > 2. this is an implementation detail, and not even a very "user facing"
> >> >         >> >> >     > one? to the best of my understanding the vote process is on proposed
> >> >         >> >> >     > API/behaviour. also - since we're willing to go with strings just
> >> >         >> >> >     > serialize a 0-sized header blob and IIUC you don't need any optionals
> >> >         >> >> >     > anymore.
> >> >         >> >> >     >
> >> >         >> >> >     > 3. yes, we can :-)
> >> >         >> >> >     >
> >> >         >> >> >     > On Tue, Feb 14, 2017 at 11:56 PM, Michael Pearce <[email protected]>
> >> >         >> >> >     > wrote:
> >> >         >> >> >     >
> >> >         >> >> >     > > Hi Jay,
> >> >         >> >> >     > >
> >> >         >> >> >     > > 1) There was some initial debate on the value part, as you'll note
> >> >         >> >> >     > > String, String headers were discounted early on. The reason for this
> >> >         >> >> >     > > is flexibility and keeping in line with the flexibility of key, value
> >> >         >> >> >     > > of the message object itself. I don’t think it takes away from an
> >> >         >> >> >     > > ecosystem as each plugin will care for their own key, this way ints,
> >> >         >> >> >     > > booleans, exotic custom binary can all be catered for.
> >> >         >> >> >     > >   a. If you really wanted to push for a typed value interface, I
> >> >         >> >> >     > >   wouldn’t want just String values supported, but the primitives plus
> >> >         >> >> >     > >   string and also still keeping the ability to have a binary for
> >> >         >> >> >     > >   custom binaries that some organisations may have.
> >> >         >> >> >     > >     i. I have written this slight alternative here,
> >> >         >> >> >     > >     https://cwiki.apache.org/confluence/display/KAFKA/KIP-82+-+Add+Record+Headers+-+Typed
> >> >         >> >> >     > >     ii. Essentially the value bytes, has a leading byte overhead.
> >> >         >> >> >     > >       1. This tells you what type the value is, before reading the
> >> >         >> >> >     > >       rest of the bytes, allowing serialisation/deserialization to and
> >> >         >> >> >     > >       from the primitives, string and byte[]. This is akin to some
> >> >         >> >> >     > >       other messaging systems.
> >> >         >> >> >     > > 2) We are making it optional, so that those not wanting headers have 0
> >> >         >> >> >     > > bytes overhead (think of it as a feature flag), I don’t think this is
> >> >         >> >> >     > > complex, especially if comparing to changes proposed in other kips
> >> >         >> >> >     > > like kip-98.
> >> >         >> >> >     > >   a. If you really really don’t like this, we can drop it, but it
> >> >         >> >> >     > >   would mean buying into 4 bytes extra overhead for users who do not
> >> >         >> >> >     > >   want to use headers.
> >> >         >> >> >     > > 3) In the summary yes, it is at a higher level, but I think this is
> >> >         >> >> >     > > well documented in the proposed changes section.
> >> >         >> >> >     > >   a. Added getHeaders method to Producer/Consumer record (that is it)
> >> >         >> >> >     > >   b. We’ve also detailed the new Headers class that this method
> >> >         >> >> >     > >   returns that encapsulates the headers protocol and logic.
> >> >         >> >> >     > >
> >> >         >> >> >     > > Best,
> >> >         >> >> >     > > Mike
> >> >         >> >> >     > >
> >> >         >> >> >     > > ==Original questions from the vote thread from Jay.==
> >> >         >> >> >     > >
> >> >         >> >> >     > > Couple of things I think we still need to work out:
> >> >         >> >> >     > >
> >> >         >> >> >     > >    1. I think we agree about the key, but I think we haven't talked
> >> >         >> >> >     > >    about the value yet. I think if our goal is an open ecosystem of
> >> >         >> >> >     > >    these headers spread across many plugins from many systems we
> >> >         >> >> >     > >    should consider making this a string as well so it can be printed,
> >> >         >> >> >     > >    set via a UI, set in config, etc. Basically encouraging pluggable
> >> >         >> >> >     > >    serialization formats here will lead to a bit of a tower of babel.
> >> >         >> >> >     > >    2. This proposal still includes a pretty big change to our
> >> >         >> >> >     > >    serialization and protocol definition layer. Essentially it is
> >> >         >> >> >     > >    introducing an optional type, where the format is data dependent. I
> >> >         >> >> >     > >    think this is actually a big change though it doesn't seem like it.
> >> >         >> >> >     > >    It means you can no longer specify this type with our type
> >> >         >> >> >     > >    definition DSL, and likewise it requires custom handling in client
> >> >         >> >> >     > >    libs. This isn't a huge thing, since the Record definition is
> >> >         >> >> >     > >    custom anyway, but I think this kind of protocol inconsistency is
> >> >         >> >> >     > >    very non-desirable and ties you to hand-coding things. I think the
> >> >         >> >> >     > >    type should instead be [Key Value] in our BNF, where key and value
> >> >         >> >> >     > >    are both short strings as used elsewhere. This brings it in line
> >> >         >> >> >     > >    with the rest of the protocol.
> >> >         >> >> >     > >    3. Could we get more specific about the exact Java API change to
> >> >         >> >> >     > >    ProducerRecord, ConsumerRecord, Record, etc?
> >> >         >> >> >     > >
> >> >         >> >> >     > > -Jay
> >> >         >> >> >     > >
> >> >         >> >> >     >
> >> >         >> >> >
> >> >         >> >> >
> >> >         >> >> > The information contained in this email is strictly
> >> > confidential and
> >> >         >> for
> >> >         >> >> > the use of the addressee only, unless otherwise
> >> indicated.
> >> > If you
> >> >         >> are not
> >> >         >> >> > the intended recipient, please do not read, copy, use
> or
> >> > disclose to
> >> >         >> others
> >> >         >> >> > this message or any attachment. Please also notify the
> >> > sender by
> >> >         >> replying
> >> >         >> >> > to this email or by telephone (+44(020 7896 0011) and
> >> then
> >> > delete
> >> >         >> the email
> >> >         >> >> > and any copies of it. Opinions, conclusion (etc) that
> do
> >> > not relate
> >> >         >> to the
> >> >         >> >> > official business of this company shall be understood
> as
> >> > neither
> >> >         >> given nor
> >> >         >> >> > endorsed by it. IG is a trading name of IG Markets
> >> Limited
> >> > (a company
> >> >         >> >> > registered in England and Wales, company number
> >> 04008957)
> >> > and IG
> >> >         >> Index
> >> >         >> >> > Limited (a company registered in England and Wales,
> >> > company number
> >> >         >> >> > 01190902). Registered address at Cannon Bridge House,
> 25
> >> > Dowgate
> >> >         >> Hill,
> >> >         >> >> > London EC4R 2YA. Both IG Markets Limited (register
> >> number
> >> > 195355)
> >> >         >> and IG
> >> >         >> >> > Index Limited (register number 114059) are authorised
> >> and
> >> > regulated
> >> >         >> by the
> >> >         >> >> > Financial Conduct Authority.
> >> >         >> >> >
> >> >         >>
> >> >         >>
> >> >         >>
> >> >         >> --
> >> >         >> Nacho - Ignacio Solis - [email protected]
> >> >         >>
> >> >         >
> >> >         >
> >> >
> >> >
> >> >
> >> >
> >> >
> >>
> >
> >
>
