Re: [DISCUSS] KIP-82 - Add Record Headers

Michael Pearce Fri, 17 Feb 2017 03:27:40 -0800

On the point of varInts

Would you be proposing in KIP-98 to convert the other message int’s (key 
length, value length) also to varint to keep it uniform.
Also I assume there will be a static or helper method made to write/read these 
in the client and server.


Cheers
Mike



On 17/02/2017, 11:22, "Michael Pearce" <[email protected]> wrote:

    On the point re: headers in the message protocol being a byte array and not 
a count of elements followed by the elements. Again this was discussed/argued 
previously.

    It was agreed on for a few reasons some of which you have obviously picked 
up on:

    Broker is able to pass it through opaquely
    The cost of parsing, we want to parse/interpret the headers lazily (this is 
a key point brought up earlier in discussions)
    Headers can be copied from consumer record to producer record (aka mirror 
makers etc) without parsing if no changes are being made or being looked at.
    Keeps the broker agnostic to the format
    You need an int32 either for the byte size of the headers, or for the count 
of elements, so overheads are the same, but with going with an opaque byte 
array has the above advantages.

    Cheers
    Mike


    On 17/02/2017, 02:50, "Jason Gustafson" <[email protected]> wrote:

        Sorry, should have noted that the performance testing was done using the
        producer performance tool shipped with Kafka.

        -Jason

        On Thu, Feb 16, 2017 at 6:44 PM, Jason Gustafson <[email protected]> 
wrote:

        > Hey Nacho,
        >
        > I've compared performance of our KIP-98 implementation with and 
without
        > varints. For messages around 128 bytes, we see an increase in 
throughput of
        > about 30% using the default configuration settings. At 256 bytes, the
        > increase is around 16%. Obviously the performance converge as 
messages get
        > larger, but it seems well worth the cost. Note that we are also 
seeing a
        > substantial performance increase against trunk primarily because of 
the
        > much more efficient packing that varints provide us. Anything adding 
to
        > message overhead, such as record headers, would only increase the 
relative
        > difference. (Of course take these numbers with a grain of salt since 
I have
        > only used the default settings with both the producer and broker on my
        > local machine. We intend to provide more extensive performance 
details as
        > part of the work for KIP-98.)
        >
        > The implementation we are using is from protobuf (
        > https://developers.google.com/protocol-buffers/docs/encoding), which 
is
        > also used in HBase. It is trivial to implement and as far as I know 
doesn't
        > suffer from the aliasing problem you are describing. I checked with 
Magnus
        > (the author of librdkafka) and he agreed that the savings seemed 
worth the
        > cost of implementation.
        >
        > -Jason
        >
        > On Thu, Feb 16, 2017 at 4:32 PM, Ignacio Solis <[email protected]> 
wrote:
        >
        >> -VarInts
        >>
        >> I'm one of the people (if not the most) opposed to VarInts.  VarInts
        >> have a place, but this is not it.   (We had a large discussion about
        >> them at the beginning of KIP-82 time)
        >>
        >> If anybody has real life performance numbers of VarInts improving
        >> things or significantly reducing resources I would like to know what
        >> that case may be. Yes, you can save some bytes here and there, but
        >> this is probably insignificant to the overall system behavior and
        >> storage requirements.  -- I say this with respect to using VarInts in
        >> the protocol itself, not as part of the data.
        >>
        >> VarInts require you to parse the Int before using it and depending on
        >> the encoding they can suffer from aliasing (multiple representations
        >> for the same value).
        >>
        >> Why add complexity?
        >>
        >> Nacho
        >>
        >>
        >> On Thu, Feb 16, 2017 at 10:29 AM, Colin McCabe <[email protected]>
        >> wrote:
        >> > +1 for varints here-- it would save quite a bit of space.  They are
        >> > pretty quick to implement as well.
        >> >
        >> > I think it makes sense for values to be byte arrays.  Users might 
want
        >> > to attach arbitrary payloads; they shouldn't be forced to serialize
        >> > everything to Java strings.
        >> >
        >> > best,
        >> > Colin
        >> >
        >> >
        >> > On Thu, Feb 16, 2017, at 09:52, Jason Gustafson wrote:
        >> >> Hey Michael,
        >> >>
        >> >> Hmm, I guess the point of representing it as bytes is to allow the
        >> broker
        >> >> to pass it through opaquely? Is the cost of parsing them a 
concern, or
        >> >> are
        >> >> we simply trying to ensure that the broker stays agnostic to the
        >> format?
        >> >>
        >> >> On varints, I think adding support for them makes less sense for 
an
        >> >> isolated use case, but as part of a more holistic change (such as 
what
        >> we
        >> >> have proposed in KIP-98), I think they are justifiable. If we add 
them,
        >> >> then the need to use attributes becomes quite a bit weaker, 
right? The
        >> >> other thing I find slightly odd is the fact that null headers has 
no
        >> >> actual
        >> >> semantic meaning for the message (unlike null keys and values). 
It is
        >> >> just
        >> >> a space optimization. It seems a bit better to always use size 0 
to
        >> >> indicate having no headers.
        >> >>
        >> >> Overall, the main point is ensuring that the message schema 
remains
        >> >> consistent, either within the larger protocol, or at a minimum 
within
        >> the
        >> >> message itself.
        >> >>
        >> >> -Jason
        >> >>
        >> >> On Thu, Feb 16, 2017 at 6:39 AM, Michael Pearce 
<[email protected]
        >> >
        >> >> wrote:
        >> >>
        >> >> > Hi Jason,
        >> >> >
        >> >> > On point 1) in the message protocol the headers are simply a 
byte
        >> array,
        >> >> > as like the key or value, this is to clearly demarcate the 
header in
        >> the
        >> >> > core message. Then the header byte array in the core message is 
an
        >> array of
        >> >> > key, value pairs. This is what it is denoting.
        >> >> >
        >> >> > Then this would be I guess in the given notation:
        >> >> >
        >> >> > Headers => [KeyLength, Key, ValueLength, Value]
        >> >> >     KeyLength => int32 <-----------------NEW size of the byte[] 
of
        >> the
        >> >> > serialised key value
        >> >> >     Key => bytes <---------------------- NEW serialised string 
(UTF8)
        >> >> > bytes of the header key
        >> >> >     ValueLength => int32 <-------------- NEW size of the byte[] 
of
        >> the
        >> >> > serialised header value
        >> >> >     Value => bytes <-------------------- NEW serialised form of 
the
        >> header
        >> >> > value
        >> >> >
        >> >> > The key length and value length is matching the way the 
protocol is
        >> >> > defined in the core message currently.
        >> >> >
        >> >> >
        >> >> >
        >> >> >
        >> >> > On point 2)
        >> >> > Var sized ints, this was discussed much earlier on, in fact I 
had
        >> >> > suggested it myself (with Hadoop references), the complexity of 
this
        >> >> > compared to having a simpler protocol was argued and agreed it
        >> wasn’t worth
        >> >> > the complexity as all other clients in other languages would 
need to
        >> ensure
        >> >> > theyre using the right var size algorithm, as there is a few.
        >> >> >
        >> >> > On point 3)
        >> >> > We did the attributes, optional approach as originally there was
        >> marked
        >> >> > concern that headers would cause a message size overhead for 
others,
        >> who
        >> >> > don’t want them. As such this is the clean solution to achieve 
that.
        >> If
        >> >> > that no longer holds, and we don’t care that we add 4bytes 
overhead,
        >> then
        >> >> > im happy to remove.
        >> >> >
        >> >> > I’m personally in favour of keeping the message as small as 
possible
        >> so
        >> >> > people don’t get shocks in perf and throughputs dues to message 
size,
        >> >> > unless they actively use the feature, as such I do prefer the
        >> attribute bit
        >> >> > wise feature flag approach myself.
        >> >> >
        >> >> >
        >> >> >
        >> >> >
        >> >> > On 16/02/2017, 05:40, "Jason Gustafson" <[email protected]> 
wrote:
        >> >> >
        >> >> >     We have proposed a few significant changes to the message 
format
        >> in
        >> >> > KIP-98
        >> >> >     which now seems likely to pass (perhaps with some 
iterations on
        >> >> >     implementation details). It would be good to try and 
coordinate
        >> the
        >> >> > changes
        >> >> >     in both of the proposals to make sure they are consistent 
and
        >> >> > compatible.
        >> >> >
        >> >> >     I think using the attributes to indicate null headers is a
        >> reasonable
        >> >> >     approach. We have proposed to do the same thing for the 
message
        >> key and
        >> >> >     value. That said, I sympathize with Jay's argument. Having
        >> multiple
        >> >> > ways to
        >> >> >     specify a null value increases the overall complexity of the
        >> protocol.
        >> >> > You
        >> >> >     can see this just from the fact that you need the extra 
verbiage
        >> in the
        >> >> >     protocol specification in this KIP and in KIP-98 to 
describe the
        >> >> > dependence
        >> >> >     between the fields and the attributes. It seems like a 
slippery
        >> slope
        >> >> > if
        >> >> >     you start allowing different request types to implement the
        >> protocol
        >> >> >     specification differently.
        >> >> >
        >> >> >     You can also argue that the messages already are and are 
likely
        >> to
        >> >> > remain a
        >> >> >     special case. For example, there is currently no generality 
in
        >> how
        >> >> >     compressed message sets are represented that would be 
applicable
        >> for
        >> >> > other
        >> >> >     request types. Some might see this divergence as an 
unfortunate
        >> >> > protocol
        >> >> >     deficiency which should be fixed; others might see it as 
sort of
        >> the
        >> >> >     inevitability of needing to optimize where it counts most. 
I'm
        >> probably
        >> >> >     somewhere in between, but I think we probably all share the
        >> intuition
        >> >> > that
        >> >> >     the protocol should be kept as consistent as possible. With 
that
        >> in
        >> >> > mind,
        >> >> >     here are a few comments:
        >> >> >
        >> >> >     1. One thing I found a little odd when reading the current
        >> proposal is
        >> >> > that
        >> >> >     the headers are both represented as an array of bytes and 
as an
        >> array
        >> >> > of
        >> >> >     key/value pairs. I'd probably suggest something like this:
        >> >> >
        >> >> >     Headers => [HeaderKey HeaderValue]
        >> >> >      HeaderKey => String
        >> >> >      HeaderValue => Bytes
        >> >> >
        >> >> >     An array in the Kafka protocol is represented as a 4-byte 
integer
        >> >> >     indicating the number of elements in the array followed by 
the
        >> >> >     serialization of the elements. Unless I'm misunderstanding, 
what
        >> you
        >> >> > have
        >> >> >     instead is the total size of the headers in bytes followed 
by the
        >> >> > elements.
        >> >> >     I'm not sure I see any reason for this inconsistency.
        >> >> >
        >> >> >     2. In KIP-98, we've introduced variable-length integer 
fields.
        >> >> > Effectively,
        >> >> >     we've enriched (or "complicated" as Jay might say ;) the 
protocol
        >> >> >     specification to include the following types: VarInt, 
VarLong,
        >> >> >     UnsignedVarInt and UnsignedVarLong.
        >> >> >
        >> >> >     Along with these primitives, we could introduce the 
following
        >> types:
        >> >> >
        >> >> >     VarSizeArray => NumberOfItems Item1 Item2 .. ItemN
        >> >> >       NumberOfItems => UnsignedVarInt
        >> >> >
        >> >> >     VarSizeNullableArray => NumberOfItemsOrNull Item1 Item2 .. 
ItemN
        >> >> >       NumberOfItemsOrNull => VarInt (-1 means null)
        >> >> >
        >> >> >     And similarly for the `String` and `Bytes` types. These 
types
        >> can save
        >> >> > a
        >> >> >     considerable amount of space in this proposal because they 
can
        >> be used
        >> >> > for
        >> >> >     both the number of headers included in the message and the
        >> lengths of
        >> >> > the
        >> >> >     header keys and values. We could do this instead:
        >> >> >
        >> >> >     Headers => VarSizeArray[HeaderKey HeaderValue]
        >> >> >       HeaderKey => VarSizeString
        >> >> >       HeaderValue => VarSizeBytes
        >> >> >
        >> >> >     Combining the savings from the use of variable length 
fields, the
        >> >> > benefit
        >> >> >     of using the attributes to represent null seems pretty 
small.
        >> >> >
        >> >> >     3. Whichever way we go (whether we use the attributes or 
not), we
        >> >> > should at
        >> >> >     least be consistent between this KIP and KIP-98. It would 
be very
        >> >> > strange
        >> >> >     to have two ways to represent null values in the same 
schema.
        >> Either
        >> >> > way is
        >> >> >     OK with me. I think some message-level optimizations are
        >> justifiable,
        >> >> > but
        >> >> >     the savings here seem minimal (a few bytes per message), so
        >> maybe it's
        >> >> > not
        >> >> >     worth the cost of letting the message diverge even further 
from
        >> the
        >> >> > rest of
        >> >> >     the protocol.
        >> >> >
        >> >> >     -Jason
        >> >> >
        >> >> >
        >> >> >     On Wed, Feb 15, 2017 at 8:52 AM, radai <
        >> [email protected]>
        >> >> > wrote:
        >> >> >
        >> >> >     > I've trimmed the inline contents as this mail is getting 
too
        >> big for
        >> >> > the
        >> >> >     > apache mailing list software to deliver :-(
        >> >> >     >
        >> >> >     > 1. the important thing for interoperability is for 
different
        >> >> > "interested
        >> >> >     > parties" (plugins, infra layers/wrappers, user-code) to be
        >> able to
        >> >> > stick
        >> >> >     > pieces of metadata onto msgs without getting in each 
other's
        >> way. a
        >> >> > common
        >> >> >     > key scheme (Strings, as of the time of this writing?) is 
all
        >> thats
        >> >> > required
        >> >> >     > for that. it is assumed that the other end interested in 
any
        >> such
        >> >> > piece of
        >> >> >     > metadata knows the encoding, and byte[] provides for the 
most
        >> >> > flexibility.
        >> >> >     > i believe this is the same logic behind core kafka being
        >> >> > byte[]/byte[] -
        >> >> >     > Strings are more "usable" but bytes are flexible and so 
were
        >> chosen.
        >> >> >     > Also - core kafka doesnt even do that good of a job on
        >> usability of
        >> >> > the
        >> >> >     > payload (example - i have to specify the nop byte[] 
"decoders"
        >> >> > explicitly
        >> >> >     > in conf), and again sacrificies usability for the sake of
        >> >> > performance (no
        >> >> >     > convenient single-record processing as poll is a batch, 
lots of
        >> >> > obscure
        >> >> >     > little config details exposing internals of the batching
        >> mechanism,
        >> >> > etc)
        >> >> >     >
        >> >> >     > this is also why i really dislike the idea of a "type 
system"
        >> for
        >> >> > header
        >> >> >     > values, it further degrades the usability, adds 
complexity and
        >> will
        >> >> >     > eventually get in people's way, also, it would be the 
2nd/3rd
        >> >> > home-group
        >> >> >     > serialization mechanism in core kafka (counting 2 
iterations
        >> of the
        >> >> > "type
        >> >> >     > definition DSL")
        >> >> >     >
        >> >> >     > 2. this is an implementation detail, and not even a very 
"user
        >> >> > facing" one?
        >> >> >     > to the best of my understanding the vote process is on 
proposed
        >> >> >     > API/behaviour. also - since we're willing to go with 
strings
        >> just
        >> >> > serialize
        >> >> >     > a 0-sized header blob and IIUC you dont need any optionals
        >> anymore.
        >> >> >     >
        >> >> >     > 3. yes, we can :-)
        >> >> >     >
        >> >> >     > On Tue, Feb 14, 2017 at 11:56 PM, Michael Pearce <
        >> >> > [email protected]>
        >> >> >     > wrote:
        >> >> >     >
        >> >> >     > > Hi Jay,
        >> >> >     > >
        >> >> >     > > 1) There was some initial debate on the value part, as 
youll
        >> note
        >> >> > String,
        >> >> >     > > String headers were discounted early on. The reason for 
this
        >> is
        >> >> >     > flexibility
        >> >> >     > > and keeping in line with the flexibility of key, value 
of the
        >> >> > message
        >> >> >     > > object itself. I don’t think it takes away from an 
ecosystem
        >> as
        >> >> > each
        >> >> >     > plugin
        >> >> >     > > will care for their own key, this way ints, booleans , 
exotic
        >> >> > custom
        >> >> >     > binary
        >> >> >     > > can all be catered for=.
        >> >> >     > > a. If you really wanted to push for a typed value 
interface,
        >> I
        >> >> > wouldn’t
        >> >> >     > > want just String values supported, but the the 
primatives
        >> plus
        >> >> > string and
        >> >> >     > > also still keeping the ability to have a binary for 
custom
        >> >> > binaries that
        >> >> >     > > some organisations may have.
        >> >> >     > > i. I have written this slight alternative here,
        >> >> >     > https://cwiki.apache.org/
        >> >> >     > > 
confluence/display/KAFKA/KIP-82+-+Add+Record+Headers+-+Typed
        >> >> >     > > ii. Essentially the value bytes, has a leading byte 
overhead.
        >> >> >     > > 1.  This tells you what type the value is, before 
reading
        >> the rest
        >> >> > of the
        >> >> >     > > bytes, allowing serialisation/deserialization to and 
from the
        >> >> > primitives,
        >> >> >     > > string and byte[]. This is akin to some other messaging
        >> systems.
        >> >> >     > > 2) We are making it optional, so that for those not 
wanting
        >> >> > headers have
        >> >> >     > 0
        >> >> >     > > bytes overhead (think of it as a feature flag), I don’t
        >> think this
        >> >> > is
        >> >> >     > > complex, especially if comparing to changes proposed in
        >> other kips
        >> >> > like
        >> >> >     > > kip-98.
        >> >> >     > > a. If you really really don’t like this, we can drop 
it, but
        >> it
        >> >> > would
        >> >> >     > mean
        >> >> >     > > buying into 4 bytes extra overhead for users who do not 
want
        >> to use
        >> >> >     > headers.
        >> >> >     > > 3) In the summary yes, it is at a higher level, but I 
think
        >> this
        >> >> > is well
        >> >> >     > > documented in the proposed changes section.
        >> >> >     > > a. Added getHeaders method to Producer/Consumer record 
(that
        >> is it)
        >> >> >     > > b. We’ve also detailed the new Headers class that this 
method
        >> >> > returns
        >> >> >     > that
        >> >> >     > > encapsulates the headers protocol and logic.
        >> >> >     > >
        >> >> >     > > Best,
        >> >> >     > > Mike
        >> >> >     > >
        >> >> >     > > ==Original questions from the vote thread from Jay.==
        >> >> >     > >
        >> >> >     > > Couple of things I think we still need to work out:
        >> >> >     > >
        >> >> >     > >    1. I think we agree about the key, but I think we 
haven't
        >> >> > talked about
        >> >> >     > >    the value yet. I think if our goal is an open 
ecosystem
        >> of these
        >> >> >     > header
        >> >> >     > >    spread across many plugins from many systems we 
should
        >> consider
        >> >> > making
        >> >> >     > > this
        >> >> >     > >    a string as well so it can be printed, set via a UI, 
set
        >> in
        >> >> > config,
        >> >> >     > etc.
        >> >> >     > >    Basically encouraging pluggable serialization formats
        >> here will
        >> >> > lead
        >> >> >     > to
        >> >> >     > > a
        >> >> >     > >    bit of a tower of babel.
        >> >> >     > >    2. This proposal still includes a pretty big change 
to our
        >> >> >     > serialization
        >> >> >     > >    and protocol definition layer. Essentially it is
        >> introducing an
        >> >> >     > optional
        >> >> >     > >    type, where the format is data dependent. I think 
this is
        >> >> > actually a
        >> >> >     > big
        >> >> >     > >    change though it doesn't seem like it. It means you 
can no
        >> >> > longer
        >> >> >     > > specify
        >> >> >     > >    this type with our type definition DSL, and likewise 
it
        >> requires
        >> >> >     > custom
        >> >> >     > >    handling in client libs. This isn't a huge thing, 
since
        >> the
        >> >> > Record
        >> >> >     > >    definition is custom anyway, but I think this kind of
        >> protocol
        >> >> >     > >    inconsistency is very non-desirable and ties you to
        >> hand-coding
        >> >> >     > things.
        >> >> >     > > I
        >> >> >     > >    think the type should instead by [Key Value] in our 
BNF,
        >> where
        >> >> > key and
        >> >> >     > >    value are both short strings as used elsewhere. This
        >> brings it
        >> >> > in line
        >> >> >     > > with
        >> >> >     > >    the rest of the protocol.
        >> >> >     > >    3. Could we get more specific about the exact Java 
API
        >> change to
        >> >> >     > >    ProducerRecord, ConsumerRecord, Record, etc?
        >> >> >     > >
        >> >> >     > > -Jay
        >> >> >     > >
        >> >> >     >
        >> >> >
        >> >> >
        >> >> > The information contained in this email is strictly 
confidential and
        >> for
        >> >> > the use of the addressee only, unless otherwise indicated. If 
you
        >> are not
        >> >> > the intended recipient, please do not read, copy, use or 
disclose to
        >> others
        >> >> > this message or any attachment. Please also notify the sender by
        >> replying
        >> >> > to this email or by telephone (+44(020 7896 0011) and then 
delete
        >> the email
        >> >> > and any copies of it. Opinions, conclusion (etc) that do not 
relate
        >> to the
        >> >> > official business of this company shall be understood as neither
        >> given nor
        >> >> > endorsed by it. IG is a trading name of IG Markets Limited (a 
company
        >> >> > registered in England and Wales, company number 04008957) and IG
        >> Index
        >> >> > Limited (a company registered in England and Wales, company 
number
        >> >> > 01190902). Registered address at Cannon Bridge House, 25 Dowgate
        >> Hill,
        >> >> > London EC4R 2YA. Both IG Markets Limited (register number 
195355)
        >> and IG
        >> >> > Index Limited (register number 114059) are authorised and 
regulated
        >> by the
        >> >> > Financial Conduct Authority.
        >> >> >
        >>
        >>
        >>
        >> --
        >> Nacho - Ignacio Solis - [email protected]
        >>
        >
        >




The information contained in this email is strictly confidential and for the 
use of the addressee only, unless otherwise indicated. If you are not the 
intended recipient, please do not read, copy, use or disclose to others this 
message or any attachment. Please also notify the sender by replying to this 
email or by telephone (+44(020 7896 0011) and then delete the email and any 
copies of it. Opinions, conclusion (etc) that do not relate to the official 
business of this company shall be understood as neither given nor endorsed by 
it. IG is a trading name of IG Markets Limited (a company registered in England 
and Wales, company number 04008957) and IG Index Limited (a company registered 
in England and Wales, company number 01190902). Registered address at Cannon 
Bridge House, 25 Dowgate Hill, London EC4R 2YA. Both IG Markets Limited 
(register number 195355) and IG Index Limited (register number 114059) are 
authorised and regulated by the Financial Conduct Authority.

Re: [DISCUSS] KIP-82 - Add Record Headers

Reply via email to