Hi Ismael,
The response below was in regards to your comments, got my wires crossed,
Hi Jun,
I’m happy with the change, I see Jason updated our KIP, many thanks for this,
and thanks for implementing for us ☺
On 20/03/2017, 13:19, "Michael Pearce" wrote:
Hi Jun,
Thanks for the comments. I’ve updated the KIP a little where there is agreement.
My comments:
1) Good point, removed from the interface. See updated KIP
2) I think, Radai’s suggested header(String key) is a cleaner method name, but
happy to change if community believe lastHeader is better. I’ll keep
Jun, the message format you proposed seems reasonable to me. I have a few
minor comments with regards to the user facing API:
1. Do we want to expose the `close()` method in the Headers interface? It
seems that this method should only be called by the producer after the
headers have been passed to
Hi, Everyone,
Jason has been working on the new message format related to EOS (
https://github.com/apache/kafka/pull/2614). He has included the header
changes proposed in the KIP, which reduces the overhead for supporting an
additional message format change if done separately. Since the message
Thanks Radai. Great to have a concrete example of the intended usage.
Regarding performance, we would need to benchmark, as you said. But there
would be a lot of reuse (in essence, we are copying 5 references plus a new
object header), so I'd be surprised if that would be the bottleneck
compared t
the common "stack" we envision at linkedin would consist of (at least) the
following components that add headers to every outgoing request:
1. auditing/"lineage" - appends a header containing "node" (hostname etc),
time (UTC time) and destination (cluster/topic). these accumulate as
requests get m
> >> >>> > storing
> >> > >> >> >>> > >> the schema id together with serialized bytes in the
> value
> >> is
> >> > >> >> better?
> >> > >> >> >>> > >>
> >> > >> >
Hi Ismael,
Yes, it makes sense to do benchmark. My concern was based on the
observation in KAFKA-3994 where we saw GC problem when creating new lists
in the purgatory.
Jiangjie (Becket) Qin
On Fri, Mar 10, 2017 at 8:54 AM, Ismael Juma wrote:
> Hi Becket,
> Sorry for the delay and t
Hi Becket,
Sorry for the delay and thanks for your comments. Comments inline.
On Wed, Mar 1, 2017 at 8:59 PM, Becket Qin wrote:
> > The difference is that the user chooses the value type. They are free to
> > choose a mutable or immutable type. A generic interceptor cannot mutate
> the
> > valu
just to clarify - ListIterator is a nice API, and doesn't constrain the
implementation a lot more than Iterator (especially if we implement
previous() very inefficiently :-) ), but changing
Iterable<Header> headers(String key)
to
ListIterator<Header> headers(String key)
would lose us the ability to easily write w
where do you see insert-in-the-middle/replace being commonly used?
lineage tracing, as you call it, would probably be implemented by way of:
1. every "stop" along the way appending itself (at the end)
2. some replication technologies, instead of just doing #1, may clear out
everything when they re
As others have mentioned, it seems clear that we want to preserve the
ordering of message headers, so that we can implement things like
lineage tracing. (For example, each stage could add a "lineage:"
header.) I also think that we want the ability to add and remove
headers as needed. It would be
Hi Ismael,
Thanks for the reply. Please see the comments inline.
On Wed, Mar 1, 2017 at 6:47 AM, Ismael Juma wrote:
> Hi Becket,
> Thanks for sharing your thoughts. More inline.
> On Wed, Mar 1, 2017 at 2:54 AM, Becket Qin wrote:
> > As you can imagine if the ProducerRecord has a value a
i used void because i'm used to java beans. thinking about it, i don't see
much use for returning false from adding a header: if the headers are
read-only you should probably throw an IllegalStateException because let's
face it, 99% of users don't check return values.
returning "this" is
Hi Becket,
Thanks for sharing your thoughts. More inline.
On Wed, Mar 1, 2017 at 2:54 AM, Becket Qin wrote:
> As you can imagine if the ProducerRecord has a value as a List and the
> Interceptor.onSend() can actually add an element to the List. If the
> producer.send() is called on the same Pro
f the
operation succeeded?
From: Michael Pearce
Sent: Wednesday, March 1, 2017 5:55 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
Hi Radai:
Header header(String key) - returns JUST ONE (the very last)
ne is needed:
From: Becket Qin
Sent: Wednesday, March 1, 2017 2:54 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
Hi Ismael,
Yes, there is a difference between Batch and Headers. I was just trying to
see if that would work.
Hi Ismael,
Yes, there is a difference between Batch and Headers. I was just trying to
see if that would work. Good point about sending the same ProducerRecord
twice, but in fact in that case any reuse of objects would cause problem.
As you can imagine if the ProducerRecord has a value as a List a
I will settle for any API really, but just wanted to point out that as it
stands right now the API targets the most "advanced" (hence obscure and
rare) use cases, at the expense of the simple and common ones. i'd suggest
(as the minimal set):
Header header(String key) - returns JUST ONE (the very
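For illustration only, here is a minimal sketch of the small API surface discussed in this thread: `header(String key)` returning just the last match, `headers(String key)` returning all matches in order, and `add` failing loudly once the headers are read-only. The class names `HeaderEntry` and `HeadersSketch` are hypothetical, and using `close()` as the switch to read-only is an assumption made for the sketch, not the KIP's final interface.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical header type for the sketch; not Kafka's Header interface.
final class HeaderEntry {
    final String key;
    final byte[] value;

    HeaderEntry(String key, byte[] value) {
        this.key = key;
        this.value = value;
    }
}

// Hypothetical ordered, append-only header collection.
final class HeadersSketch {
    private final List<HeaderEntry> entries = new ArrayList<>();
    private boolean readOnly = false;

    // Appends a header at the end, preserving insertion order.
    // Returns this so calls can be chained; throws instead of returning false.
    HeadersSketch add(String key, byte[] value) {
        if (readOnly) {
            throw new IllegalStateException("headers are read-only");
        }
        entries.add(new HeaderEntry(key, value));
        return this;
    }

    // Returns JUST ONE header: the last one added with this key, or null.
    HeaderEntry header(String key) {
        for (int i = entries.size() - 1; i >= 0; i--) {
            if (entries.get(i).key.equals(key)) {
                return entries.get(i);
            }
        }
        return null;
    }

    // Returns every header with this key, in insertion order.
    Iterable<HeaderEntry> headers(String key) {
        List<HeaderEntry> matches = new ArrayList<>();
        for (HeaderEntry e : entries) {
            if (e.key.equals(key)) {
                matches.add(e);
            }
        }
        return matches;
    }

    // Assumed for the sketch: marks the headers read-only from now on.
    void close() {
        readOnly = true;
    }
}
```

The simple case reads as `headers.header("lineage")`, while callers that need the full history iterate over `headers.headers("lineage")`.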
Hi Becket,
Comments inline.
On Sat, Feb 25, 2017 at 10:33 PM, Becket Qin wrote:
> 1. Regarding the mutability.
> I think it would be a big convenience to have headers mutable during
> certain stage in the message life cycle for the use cases you mentioned. I
> agree there is a material benef
u? Or any other ideas?
From: Jason Gustafson
Sent: Tuesday, February 28, 2017 1:38 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
If I understand correctly, the suggestion is to let headers be mutable on
the producer side basically until after they'
> release and we're almost now there on getting this KIP to the state
> everyone is happy. As you note address that later if theres the need.
> Ill leave it 24hrs and update the kip if no strong objections based on
> your solution for 1 & 2.
> Cheers
> Mike
if theres the need.
Ill leave it 24hrs and update the kip if no strong objections based on your
solution for 1 & 2.
__ __
From: Becket Qin
Sent: Saturday, February 25, 2017 10:33 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-82 -
ke the other arguements of let's implement
> simple and
> > then we
> > > can always add pattern later as well if it's found it's
> needed. (As
> > noted
> > > it's easier to add me
> it's easier to add methods than to take away)
> >
> > Great I'll update kip with extra methods on producerecord and a
> note
> > that new objects are returned by method calls.
> >
> >
> >
> > Sent using OWA for iPhone
> > > > will be a dominant use case. We want a user friendly
> > Having as
> > > a user
> > > > having to code this instead of having the headers handle
> this
and a note
> that new objects are returned by method calls.
> Sent using OWA for iPhone
> From: Jason Gustafson
> Sent: Friday, February 24, 2017 6:51:45 PM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
Sent: Friday, February 24, 2017 6:51:45 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
The APIs in the current KIP look good to me. Just a couple questions: why
does filter not return Headers? Also would it be useful if the key is a
> > > unnecessarily
> > > (i.e. if they were not accessed). And note that making
> the
> > Headers
> > > immutable doesn't necessarily mean that they need to be
> > copied:
> > > you
> If the argument for not having a map holding the key, value
> > pairs is due
> > > to garbage creation of HashMap entry's, forcing the
> creation of
> > a whole new
> > > producer record to simply add a head, surely is creating
> > to garbage creation of HashMap entry's, forcing the creation of
> a whole new
> > producer record to simply add a head, surely is creating a-lot
> more?
> > ____________________
> > From: J
ly add a head, surely is creating a-lot more?
> From: Jason Gustafson
> Sent: Wednesday, February 22, 2017 10:09 PM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
y, February 22, 2017 10:09 PM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
> The current producer interceptor API is this:
> ProducerRecord<K, V> onSend(ProducerRecord<K, V> record);
> So adding a header means creatin
Sent: Wednesday, February 22, 2017 10:09 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
The current producer interceptor API is this:
ProducerRecord<K, V> onSend(ProducerRecord<K, V> record);
So adding a header means creating a new ProducerRecord with a new header
added to the cu
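To make the trade-off concrete, here is a hedged sketch of what "adding a header means creating a new record" could look like with immutable records. `SketchRecord`, `SketchHeader`, `AuditInterceptorSketch` and `withHeader` are hypothetical stand-ins invented for this example, not Kafka classes; the point is only that the new record reuses the existing topic, key and value references and copies the header list.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical header type for the sketch.
final class SketchHeader {
    final String key;
    final byte[] value;

    SketchHeader(String key, byte[] value) {
        this.key = key;
        this.value = value;
    }
}

// Hypothetical immutable record; not Kafka's ProducerRecord.
final class SketchRecord<K, V> {
    final String topic;
    final K key;
    final V value;
    final List<SketchHeader> headers; // unmodifiable view of a private copy

    SketchRecord(String topic, K key, V value, List<SketchHeader> headers) {
        this.topic = topic;
        this.key = key;
        this.value = value;
        this.headers = Collections.unmodifiableList(new ArrayList<>(headers));
    }

    // Returns a new record sharing the topic, key and value references,
    // with one more header appended to a copy of the header list.
    SketchRecord<K, V> withHeader(String headerKey, byte[] headerValue) {
        List<SketchHeader> copy = new ArrayList<>(headers);
        copy.add(new SketchHeader(headerKey, headerValue));
        return new SketchRecord<>(topic, key, value, copy);
    }
}

// An interceptor in the style of onSend: it cannot mutate its input,
// so it returns a new record carrying the extra header.
final class AuditInterceptorSketch<K, V> {
    SketchRecord<K, V> onSend(SketchRecord<K, V> record) {
        return record.withHeader("audit-host", "host-1".getBytes(StandardCharsets.UTF_8));
    }
}
```

Whether the extra allocations matter in practice is exactly what the thread suggests benchmarking; the sketch makes no claim either way.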
> So how would you have this work if not mutable where interceptors would
> add headers?
> Sent using OWA for iPhone
> From: Jason Gustafson
> Sent: Wednesday, February 22, 2017 8:42:27 PM
> To: dev@kafka.apache.org
> Su
So how would you have this work if not mutable where interceptors would add
Sent using OWA for iPhone
From: Jason Gustafson
Sent: Wednesday, February 22, 2017 8:42:27 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
I think the point on the mutability of Headers is worth discussing a little
more. As far as I can tell, once the ProducerRecord (or ConsumerRecord) is
constructed, there should be no need to further change the headers. Is that
correct? If so, then why not enforce that that is the case through the A
Hi Ismael
On point 1,
Sure makes sense will update shortly.
On point 2,
Setters/getters typical of properties/headers APIs are traditionally map-styled
interfaces, which I believe is the most expected style, thus the Key, Value
Also it would mean rather than an interface, we would be ma
Hi all,
Great to see the progress that has been achieved on this one. :) A few
comments regarding the APIs (I'm still reviewing the message format
1. Nit: `getHeaders` in `ProducerRecord` and `ConsumerRecord` should be
named `headers` (we avoid the `get` prefix in Kafka)
2. The `Header
Hi Jason,
Have converted the interface/api bullets into interface code snippets.
Agreed implementation won’t take too long. We have early versions already.
Maybe a week before you think about merging I would assume it would be more
stabilised? I was thinking then we could fork from your conflue
Hey Michael,
Awesome. I have a minor request. The APIs are currently documented as a
wiki list. Would you mind adding a code snippet instead? It's a bit easier
to process.
How will be best to manage this, as we will obviously build off your KIP’s
> protocol changes, to avoid a merge hell, should
> > > > to
> > > > > serialize
> > > > > their own headers, and it would be more
> > ideal if we can
> > > avoid
> > > > > leaking
> > > > >
oposal for that problem.
> > > >
> > > > -Jason
> > > >
> > > >
> > > >
> > >
ay’s requests:
> > > >
> > > > “2. I think we should think about creating the
> > lazily to
> > > avoid
> > > > parsing out all the headers into little obj
> > > wrote:
> > >
> > > > I am happy to move the definition of the header into the
> > message
> > > body, but
> > > > would cause us not to lazy initialise/parse the headers,
> as
> > > obviously, we
> > > > would have to traverse these reading the message.
> > > >
think we should think about creating the Map
> lazily to
> > avoid
> > > parsing out all the headers into little objects.
> HashMaps
> > themselves
> > > are
> > > kind of expensive and the consumer is very perf
> sensitive so
> >
> >
> > Yes exactly we have access to the records thus why the
> header should
> > be accessible via it and not hidden for only interceptors to
> access.
> >
> > Sent using OWA for iPhone
> > idea.”
> >
> >
> >
> >
> >
> > On 17/02/2017, 19:44, "Michael Pearce"
> wrote:
> >
> > Yes exactly we have access to the records thus why the
> header should
> > be accessible v
> From: Magnus Edenhill
> Sent: Friday, February 17, 2017 7:34:49 PM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
> Big +1 on VarInts.
> CPUs are fast, memory is slow.
Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
Big +1 on VarInts.
CPUs are fast, memory is slow.
I agree with Jason that we'll want to continue verifying messages,
including their headers, so while I appreciate the idea of the opaque
header blob it won't be useful in practice.
. It needs to be at the
> >> record level.
> >>
> >> This anyhow is similar to jms/http/amqp where headers are available to
> >> consuming applications.
> >>
> >> Re headers as byte array and future use by broker. This doesn't take
> away
. This doesn't take away
>> from that at all. Nor makes it difficult at all in my opinion.
>> Sent using OWA for iPhone
>> From: Jason Gustafson
>> Sent: Friday, February 17, 2017 5:55:42 PM
for iPhone
From: Jason Gustafson
Sent: Friday, February 17, 2017 5:55:42 PM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
> Would you be proposing in KIP-98 to convert the other message int’s (key
> length, value length) also to varint to keep it uniform.
> Would you be proposing in KIP-98 to convert the other message int’s (key
> length, value length) also to varint to keep it uniform.
> Also I assume there will be a static or helper method made to write/read
> these in the client and server.
Yes, that is what we are proposing, so using varints
On the point of varInts
Would you be proposing in KIP-98 to convert the other message int’s (key
length, value length) also to varint to keep it uniform.
Also I assume there will be a static or helper method made to write/read these
in the client and server.
On 17/02/2017, 11:22,
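A rough sketch of the kind of static write/read helper asked about here, for illustration only. It assumes an unsigned base-128 encoding (7 data bits per byte, with the high bit as a continuation flag); `VarintSketch` and its method names are hypothetical and are not taken from KIP-98 or the Kafka codebase.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

// Hypothetical helper for the sketch: unsigned base-128 varints.
final class VarintSketch {
    private VarintSketch() {}

    // Writes the value 7 bits at a time, lowest bits first.
    // A set high bit means "more bytes follow".
    static void writeUnsignedVarint(int value, ByteArrayOutputStream out) {
        while ((value & 0xFFFFFF80) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Reads a value written by writeUnsignedVarint.
    static int readUnsignedVarint(ByteBuffer in) {
        int value = 0;
        int shift = 0;
        int b;
        while (((b = in.get()) & 0x80) != 0) {
            value |= (b & 0x7F) << shift;
            shift += 7;
            if (shift > 28) {
                // A 32-bit value never needs more than five bytes.
                throw new IllegalArgumentException("varint is too long");
            }
        }
        return value | (b << shift);
    }

    // Convenience wrapper returning the encoded bytes.
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeUnsignedVarint(value, out);
        return out.toByteArray();
    }
}
```

With this encoding a length below 128 takes one byte instead of the four of a fixed int, which is where the saving on small messages would come from.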
On the point re: headers in the message protocol being a byte array and not a
count of elements followed by the elements. Again this was discussed/argued
It was agreed on for a few reasons some of which you have obviously picked up
Broker is able to pass it through opaquely
Sorry, should have noted that the performance testing was done using the
producer performance tool shipped with Kafka.
On Thu, Feb 16, 2017 at 6:44 PM, Jason Gustafson wrote:
> Hey Nacho,
> I've compared performance of our KIP-98 implementation with and without
> varints. For messages
Hey Nacho,
I've compared performance of our KIP-98 implementation with and without
varints. For messages around 128 bytes, we see an increase in throughput of
about 30% using the default configuration settings. At 256 bytes, the
increase is around 16%. Obviously the performance converges as message
I'm one of the people (if not the most) opposed to VarInts. VarInts
have a place, but this is not it. (We had a large discussion about
them at the beginning of KIP-82 time)
If anybody has real life performance numbers of VarInts improving
things or significantly reducing resources I w
+1 for varints here-- it would save quite a bit of space. They are
pretty quick to implement as well.
I think it makes sense for values to be byte arrays. Users might want
to attach arbitrary payloads; they shouldn't be forced to serialize
everything to Java strings.
On Thu, Feb 1
Hey Michael,
Hmm, I guess the point of representing it as bytes is to allow the broker
to pass it through opaquely? Is the cost of parsing them a concern, or are
we simply trying to ensure that the broker stays agnostic to the format?
On varints, I think adding support for them makes less sense f
Hi Jason,
On point 1) in the message protocol the headers are simply a byte array, just
like the key or value; this is to clearly demarcate the header in the core
message. The header byte array in the core message is then an array of key,
value pairs. This is what it is denoting.
Then this would
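As a sketch of the idea that the headers travel as one opaque byte array which is itself a sequence of key, value pairs: the layout below (a 4-byte key length, UTF-8 key bytes, a 4-byte value length, value bytes, repeated until the blob ends) is an assumption made for illustration and is not the wire format defined by the KIP. `HeaderBlobSketch` is a hypothetical name.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical encoder/decoder for the sketch.
final class HeaderBlobSketch {
    private HeaderBlobSketch() {}

    // Assumed layout per header: [int32 keyLen][key UTF-8][int32 valueLen][value].
    // There is no element count; the blob length itself delimits the pairs.
    static byte[] encode(List<Map.Entry<String, byte[]>> headers) {
        List<byte[]> keys = new ArrayList<>();
        int size = 0;
        for (Map.Entry<String, byte[]> h : headers) {
            byte[] k = h.getKey().getBytes(StandardCharsets.UTF_8);
            keys.add(k);
            size += Integer.BYTES + k.length + Integer.BYTES + h.getValue().length;
        }
        ByteBuffer buf = ByteBuffer.allocate(size);
        for (int i = 0; i < headers.size(); i++) {
            byte[] k = keys.get(i);
            byte[] v = headers.get(i).getValue();
            buf.putInt(k.length).put(k).putInt(v.length).put(v);
        }
        return buf.array();
    }

    // Parses pairs until the blob is exhausted, preserving their order.
    static List<Map.Entry<String, byte[]>> decode(byte[] blob) {
        ByteBuffer buf = ByteBuffer.wrap(blob);
        List<Map.Entry<String, byte[]>> headers = new ArrayList<>();
        while (buf.hasRemaining()) {
            byte[] k = new byte[buf.getInt()];
            buf.get(k);
            byte[] v = new byte[buf.getInt()];
            buf.get(v);
            headers.add(new SimpleEntry<>(new String(k, StandardCharsets.UTF_8), v));
        }
        return headers;
    }
}
```

A broker that only forwards the blob never needs to call `decode`; only clients that care about individual headers pay the parsing cost.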
We have proposed a few significant changes to the message format in KIP-98
which now seems likely to pass (perhaps with some iterations on
implementation details). It would be good to try and coordinate the changes
in both of the proposals to make sure they are consistent and compatible.
I think u
I've trimmed the inline contents as this mail is getting too big for the
apache mailing list software to deliver :-(
1. the important thing for interoperability is for different "interested
parties" (plugins, infra layers/wrappers, user-code) to be able to stick
pieces of metadata onto msgs withou
while HTTP-style (string, string) are the most common and most familiar,
there is a very significant impact on msg size, especially given that some
payloads are literally a few integers (think stock quotes) and would be
dwarfed by an http-like header segment.
I think we're ok with not allowing for
The details about headers for control messages are still to define. But
yes, the idea is to have some common default behavior that clients would
need to implement.
The point is, that "regular headers" add meta data to regular messages.
Thus, those messages will be returned to the user via .poll().
Thanks for your input. I'm +1 on control messages as they seem to be the
simplest way to implement watermarks (
https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-102), a
feature that would add a lot of value to Kafka Streams IMHO.
Your argument that the control-message ind
@Matthias - oh.
I think over the course of this thread enough use cases have been presented
for things that can be done/solved with headers that even if every single
potential use case has a better custom implementation (which I dont
believe) headers are clearly one of the best possible kafka modi
Hi, Michael,
Thanks for the response.
100. Is there any other metadata associated with the uuid that APM sends to
the central coordinator? What kind of things could you do once the tracing
is embedded in each message?
103. How do you preserve the per key ordering when switching to a different
Yes and no. I did overload the term "control message".
EOS control messages are for client-broker communication and thus never
exposed to any application. And I think this is a good design because
broker needs to understand those control messages. Thus, this should be
a protocol change.
The type
arent control messages getting pushed as their own top level protocol
change (and a fairly massive one) for the transactions KIP ?
On Tue, Dec 13, 2016 at 5:54 PM, Matthias J. Sax
> Hi,
> I want to add a completely new angle to this discussion. For this, I
> want to propose an extension
I want to add a completely new angle to this discussion. For this, I
want to propose an extension for the headers feature that enables new
use cases -- and those new use cases might convince people to support
headers (of course including the larger scoped proposal).
Extended Proposal:
Hi, Michael,
Thanks for the reply. I find it very helpful.
Data lineage:
100. I'd like to understand the APM use case a bit more. It sounds like
those APM plugins can generate a transaction id that we could
potentially put in the header of every message. How would you typically
make use of s
for iPhone
From: James Cheng
Sent: Monday, December 5, 2016 8:50:30 AM
To: dev@kafka.apache.org
Subject: Re: [DISCUSS] KIP-82 - Add Record Headers
> On Dec 2, 2016, at 4:57 PM, Michael Pearce wrote:
> Hi Jun.
> RE Mirroring,
> [...]
> Lastly around mirroring we have a partionKey field, as the key used for
> partitioning logic != compaction key all the time but we want to preserve it
> for when we mirror so that if source cluster
> > >> > >> >> >>> > >> C. per message encryption
> > >> > >> >> >>> > >> One drawback of this approach is that this
> >
> >> > >> in
> > >> > >> >> the
> > >> > >> >> >>> > >> third-party use case category.
> > >> > >> >> >>> > >>
> >
. schema id
> >> > >> >> >>> > >> Since the value is mostly useless without schema id, it
> >> > seems
> >> > >> that
> >> > >> >> >>> > storing
> >> > >> >> >>> > >> the schema id together with serialized bytes in the
> value
> >> is
>> > on
>> > >> >> >>> > >> the storage system (e.g. LUKS) for at rest encryption.
>> > >> >> >>> > >>
>> > >> >> >>> > >> D. cluster ID for mirroring across Kafka clusters
L for wire
>> > encryption
>> > >> and
>> > >> >> >>> rely
>> > >> >> >>> > on
>> > >> >> >>> > >> the storage system (e.g. LUKS) for at rest encryption.
>> > >
> > >> >> >>> > >> the producing cluster ID in the header. MirrorMaker could
> > then
> > >> >> avoid
> > >> >> >>> > >> mirroring messages to a cluster if they are tagged with
> the
> > >> same
> >> >> >>> > >> though since the same key may show up in different
> partitions.
> >> >> >>> > >>
> >> >> >>> > >> E. record-level lineage
> >> >> >>> > &g
> how widely useful record-level lineage is though since the
>> >> overhead
>> >> >>> > could
>> >> >>> > >> be significant.
>> >> >>> > >>
>> >> >>> > >> F. auditing metadat
>>> > send
> >> >>> > >> the producerId -> metadata mapping independently. KIP-98 is
> >> actually
> >> >>> > >> proposing including such a producerId natively in the message.
> >> >>> > >>
> >> >>> > >> So, ove
> Currently, those systems just deal with key/value pairs. Should we
>> >>> > expose a
>> >>> > >> third thing header there too or somehow map header to key or
>> value?
>> >>> > >>
>> >>> > >> Thanks