Re: [DISCUSS] Control Messages - [Was: KIP-82 - Add Record Headers]

Matthias J. Sax Wed, 21 Dec 2016 01:55:22 -0800

I agree with all. Just want to elaborate a few things:

3. There are two different use cases:
   (a) the one you describe -- I want to shut down NOW and don't want to
wait -- I agree with your observations etc.
   (b) we intentionally want to "drain" the stream processing topology
before shutting down -- yes, if I have a lot of intermediate data this
might take some time, but I want/need a clean shutdown like this
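
To make the difference between 3(a) and 3(b) concrete, here is a rough
sketch in plain Java. It is only an illustration under my own assumptions:
IntermediateRecord and DownstreamSubtopology are made-up names (not Kafka or
Kafka Streams types), the intermediate topic is modeled as an in-memory
list, and the out-of-band stop signal is just a BooleanSupplier.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for a record in an intermediate topic (not a Kafka type).
final class IntermediateRecord {
    final String value;
    final boolean shutdownMarker;

    private IntermediateRecord(String value, boolean shutdownMarker) {
        this.value = value;
        this.shutdownMarker = shutdownMarker;
    }

    static IntermediateRecord data(String value) {
        return new IntermediateRecord(value, false);
    }

    static IntermediateRecord marker() {
        return new IntermediateRecord(null, true);
    }
}

// Hypothetical downstream subtopology reading an in-memory "topic".
final class DownstreamSubtopology {
    final List<String> processed = new ArrayList<>();

    // Case 3(a): check an out-of-band stop signal before every record and
    // return how many records were left unprocessed.
    int runUntilStopped(List<IntermediateRecord> topic, BooleanSupplier stopRequested) {
        int next = 0;
        while (next < topic.size() && !stopRequested.getAsBoolean()) {
            IntermediateRecord record = topic.get(next++);
            if (!record.shutdownMarker) {
                processed.add(record.value);
            }
        }
        return topic.size() - next;
    }

    // Case 3(b): process everything in front of the embedded shutdown marker.
    // Returns true once the marker was seen, i.e. the topic is drained.
    boolean drainUntilMarker(List<IntermediateRecord> topic) {
        for (IntermediateRecord record : topic) {
            if (record.shutdownMarker) {
                return true;
            }
            processed.add(record.value);
        }
        return false; // marker not written yet; a real system would keep polling
    }
}
```

With a large backlog in the intermediate topic, runUntilStopped returns
right away and leaves data behind, while drainUntilMarker takes as long as
the backlog takes but leaves nothing in flight.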


Case 3(b) is currently not possible and is exactly what we need for the
"Incremental Batch KIP" -- there are other use cases for 3(b), too.


4. The point about "it's just a client thing" is true, but it should work
for clients that are not aware of the messages, too. I.e., we need an
opt-in mechanism -- so some changes are required -- not to the brokers
though -- but it cannot be done "external" to the clients -- otherwise
people would need to change their client code.
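
A rough sketch of what I mean by opt-in, again in plain Java under my own
assumptions: HeaderRecord and OptInConsumer are made-up names, the
"__control" header key is a hypothetical reserved key (KIP-82 does not
define it), and filter() stands in for what the consumer client would do
before returning records from poll().

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical record with string headers (not the Kafka client API).
final class HeaderRecord {
    final Map<String, String> headers;
    final String value;

    HeaderRecord(Map<String, String> headers, String value) {
        this.headers = headers;
        this.value = value;
    }
}

// Hypothetical client-side filter: control messages are dropped by default
// and only returned for control types the application opted in to.
final class OptInConsumer {
    // Assumed reserved header key; its value names the control message type.
    static final String CONTROL_KEY = "__control";

    private final Set<String> optedInTypes = new HashSet<>();

    void optIn(String controlType) {
        optedInTypes.add(controlType);
    }

    // Stand-in for the filtering a consumer client would apply before
    // handing fetched records to the application.
    List<HeaderRecord> filter(List<HeaderRecord> fetched) {
        List<HeaderRecord> visible = new ArrayList<>();
        for (HeaderRecord record : fetched) {
            String controlType = record.headers.get(CONTROL_KEY);
            if (controlType == null || optedInTypes.contains(controlType)) {
                visible.add(record);
            }
        }
        return visible;
    }
}
```

An application that never calls optIn() sees only regular records and does
not need any code change, which is the whole point of doing this inside the
client rather than on top of it.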



About "embedded control message" vs "extra control message stream":
IMHO, there are use cases for both, and the two approaches complement each
other (they are not conflicting).


-Matthias



On 12/14/16 8:36 PM, Ignacio Solis wrote:
> I'm renaming this thread in case we start deep diving.
> 
> I'm in favor of so called "control messages", at least the notion of
> those.  However, I'm not sure about the design.
> 
> What I understood from the original mail:
> 
> A. Provide a message that does not get returned by poll()
> B. Provide a way for applications to consume these messages (sign up?)
> C. Control messages would be associated with a topic.
> D. Control messages should be _in_ the topic.
> 
> 
> 
> 1. The first thing to point out is that this can be done with headers.
> I assume that's why you sent it on the header thread. As you state, if
> we had headers, you would not require a separate KIP.  So, in a way,
> you're trying to provide a concrete use case for headers.  I wanted to
> separate the discussion to a separate thread mostly because while I
> like the idea, and I like the fact that it can be done by headers,
> people might want to discuss alternatives.
> 
> 2. I'm also assuming that you're intentionally trying to preserve
> order. Headers could do this natively of course. You could also
> achieve this with the separate topic given identifiers, sequence
> numbers, headers, etc.  However...
> 
> 3. There are a few use cases where ordering is important but
> out-of-band is even more important. We have a few large workloads
> where this is of interest to us.  Obviously we can achieve this with a
> separate topic, but having a control channel for a topic that can send
> high priority data would be interesting.   And yes, we would learn a
> lot from the TCP experiences with the urgent pointer (
> https://tools.ietf.org/html/rfc6093 ) and other out-of-band
> communication techniques.
> 
> You have an example of a "shutdown marker".  This works ok as a
> terminator, however, it is not very fast.  If I have 4 TB of data
> because of asynchronous processing, then a shutdown marker at the end
> of the 4TB is not as useful as having an out-of-band message that will
> tell me immediately that those 4TB should not be processed.   So, from
> this perspective, I prefer to have a separate topic and not embed
> control messages with the data.
> 
> If the messages are part of the data, or associated to specific data,
> then they should be in the data. If they are about process, we need an
> out-of-band mechanism.
> 
> 
> 4. The general feeling I have gotten from a few people on the list is:
> Why not just do this above the kafka clients?  After all, you could
> have a system to ignore certain schemas.
> 
> Effectively, if we had headers, it would be done from a client
> perspective, without the need to modify anything major.
> 
> If we wanted to do it with a separate topic, that could also be done
> without any broker changes. But you could imagine wanting some broker
> changes if the broker understands that 2 streams are tied together
> then it may make decisions based on that.  This would be similar to
> the handling of file system forks (
> https://en.wikipedia.org/wiki/Fork_(file_system) )
> 
> 
> 5. Also heard on discussions about headers: we don't know if this is
> generally useful. Maybe only a couple of institutions?  It may not be
> worth it to modify the whole stack for that.
> 
> I would again say that with headers you could pull it off easily, even
> if only a subset of clients/applications wanted to use it.
> 
> 
> So, in summary. I like the idea.  I see benefits in implementing it
> through headers, but I also see benefits of having it as a separate
> stream.  I'm not too in favor of having a separate message handling
> pipeline for the same topic though.
> 
> Nacho
> 
> 
> 
> 
> 
> On Wed, Dec 14, 2016 at 9:51 AM, Matthias J. Sax <matth...@confluent.io> 
> wrote:
>> Yes and no. I did overload the term "control message".
>>
>> EOS control messages are for client-broker communication and thus never
>> exposed to any application. And I think this is a good design because
>> the broker needs to understand those control messages. Thus, this should
>> be a protocol change.
>>
>> The type of control messages I have in mind are for client-client
>> (application-application) communication and the broker is agnostic to
>> them. Thus, it should not be a protocol change.
>>
>>
>> -Matthias
>>
>>
>>
>> On 12/14/16 9:42 AM, radai wrote:
>>> arent control messages getting pushed as their own top level protocol
>>> change (and a fairly massive one) for the transactions KIP ?
>>>
>>> On Tue, Dec 13, 2016 at 5:54 PM, Matthias J. Sax <matth...@confluent.io>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I want to add a completely new angle to this discussion. For this, I
>>>> want to propose an extension for the headers feature that enables new
>>>> use cases -- and those new use cases might convince people to support
>>>> headers (of course including the larger scoped proposal).
>>>>
>>>> Extended Proposal:
>>>>
>>>> Allow messages with a certain header key to be special "control
>>>> messages" (with or without payload) that are not exposed to an
>>>> application via .poll().
>>>>
>>>> Thus, a consumer client would automatically skip over those messages. If
>>>> an application knows about embedded control messages, it can "sign up"
>>>> for those messages with the consumer client and either get a callback, or
>>>> the consumer's auto-drop for these messages gets disabled (allowing it to
>>>> consume those messages via poll()).
>>>>
>>>> (The details need further considerations/discussion. I just want to
>>>> sketch the main idea.)
>>>>
>>>> Usage:
>>>>
>>>> There is a shared topic (ie, used by multiple applications) and a
>>>> producer application wants to embed a special message in the topic for a
>>>> dedicated consumer application. Because only one application will
>>>> understand this message, it cannot be a regular message as this would
>>>> break all applications that do not understand this message. The producer
>>>> application would set a special metadata key and no consumer application
>>>> would see this control message by default because they did not enable
>>>> their consumer client to return this message in poll() (and the client
>>>> would just drop this message with special metadata key). Only the single
>>>> application that should receive this message, will subscribe to this
>>>> message on its consumer client and process it.
>>>>
>>>>
>>>> Concrete Use Case: Kafka Streams
>>>>
>>>> In Kafka Streams, we would like to propagate "control messages" from
>>>> subtopology to subtopology. There are multiple scenarios for which this
>>>> would be useful. For example, currently we do not guarantee a
>>>> "consistent shutdown" of an application. By this, I mean that input
>>>> records might not be completely processed by the whole topology because
>>>> the application shutdown happens "in between" and an intermediate result
>>>> gets "stuck" in an intermediate topic. Thus, a user would see a
>>>> committed offset of the source topic of the application, but no
>>>> corresponding result record in the output topic.
>>>>
>>>> Having "shutdown markers" would allow us to first stop the upstream
>>>> subtopology and write this marker into the intermediate topic, and the
>>>> downstream subtopology would only shut itself down after it sees the
>>>> "shutdown marker". Thus, we can guarantee on shutdown that no
>>>> "in-flight" messages get stuck in intermediate topics.
>>>>
>>>>
>>>> A similar usage would be for KIP-95 (Incremental Batch Processing).
>>>> There was a discussion about the proposed metadata topic, and we could
>>>> avoid this metadata topic if we would have "control messages".
>>>>
>>>>
>>>> Right now, we cannot insert an "application control message" because
>>>> Kafka Streams does not own all topics it reads/writes and thus might
>>>> break other consumer applications (as described above) if we inject
>>>> random messages that are not understood by other apps.
>>>>
>>>>
>>>> Of course, one can work around "embedded control messages" by using an
>>>> additional topic to propagate control messages between applications (as
>>>> suggested in KIP-95 via a metadata topic for Kafka Streams). But there
>>>> are major concerns about adding this metadata topic in the KIP, and this
>>>> shows that other applications that need a similar pattern might profit
>>>> from topic-embedded "control messages", too.
>>>>
>>>>
>>>> One last important consideration: those "control messages" are used for
>>>> client-to-client communication and are not understood by the broker.
>>>> Thus, those messages should not be enabled within the message format
>>>> (cf. tombstone flag -- KIP-87). However, "client land" record headers
>>>> would be a nice way to implement them. Because KIP-82 did consider key
>>>> namespaces for metadata keys, this extension should not be a separate KIP
>>>> but should be included in KIP-82 to reserve a namespace for "control
>>>> messages" in the first place.
>>>>
>>>>
>>>> Sorry for the long email... Looking forward to your feedback.
>>>>
>>>>
>>>> -Matthias
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 12/8/16 12:12 AM, Michael Pearce wrote:
>>>>> Hi Jun
>>>>>
>>>>> 100) Each time a transaction exits a JVM for a remote system (HTTP/JMS/
>>>>> hopefully one day Kafka), the APM tools stitch in a unique id (though I
>>>>> believe it contains the end2end uuid embedded in this id). On receiving
>>>>> the message at the receiving JVM, the APM code takes this out and
>>>>> continues its tracing on that new thread. Both JVMs (and other languages
>>>>> the APM tool supports) send this data async back to the central
>>>>> controllers where the stitching together occurs. For this they need some
>>>>> header space to put this id.
>>>>>
>>>>> 101) Yes indeed we have a business transaction Id in the payload. Though
>>>> this is a system level tracing, that we need to have marry up. Also as per
>>>> note on end2end encryption we’d be unable to prove the flow if the payload
>>>> is encrypted as we’d not have access to this at certain points of the flow
>>>> through the infrastructure/platform.
>>>>>
>>>>>
>>>>> 103) As said we use this mechanism in IG very successfully, as stated
>>>> per key we guarantee the transaction producing app to handle the
>>>> transaction of a key at one DC unless at point of critical failure where we
>>>> have to flip processing to another. We care about key ordering.
>>>>> I disagree on the offset comment for the partition solution unless you
>>>> do full ISR, or expensive full XA transactions even with partitions you
>>>> cannot fully guarantee offsets would match.
>>>>>
>>>>> 105) Very much so, I need to have access at the platform level to the
>>>> other meta data all mentioned, without having to need to have access to the
>>>> encryption keys of the payload.
>>>>>
>>>>> 106)
>>>>> Technically yes for AZ/Region/Cluster, but then we’d need to have a
>>>> global producerId register which would be very hard to enforce/ensure is
>>>> current and correct, just to understand the message origins of its
>>>> region/az/cluster for routing.
>>>>> The client wrapper version, producerId can be the same, as obviously the
>>>> producer could upgrade its wrapper, as such we need to know what wrapper
>>>> version the message is created with.
>>>>> Likewise the IP address, as stated we can have our producer move, where
>>>> its IP would change.
>>>>>
>>>>> 107)
>>>>> UUID is set on the message by interceptors before actual producer
>>>> transport send. This is for platform level message dedupe guarantee, the
>>>> business payload should be agnostic to this. Please see
>>>> https://activemq.apache.org/artemis/docs/1.5.0/duplicate-detection.html
>>>> note this is not touching business payloads.
>>>>>
>>>>>
>>>>>
>>>>> On 06/12/2016, 18:22, "Jun Rao" <j...@confluent.io> wrote:
>>>>>
>>>>>     Hi, Michael,
>>>>>
>>>>>     Thanks for the reply. I find it very helpful.
>>>>>
>>>>>     Data lineage:
>>>>>     100. I'd like to understand the APM use case a bit more. It sounds
>>>> like
>>>>>     that those APM plugins can generate a transaction id that we could
>>>>>     potentially put in the header of every message. How would you
>>>> typically
>>>>>     make use of such transaction ids? Are there other metadata
>>>> associated with
>>>>>     the transaction id and if so, how are they propagated downstream?
>>>>>
>>>>>     101. For the finance use case, if the concept of transaction is
>>>> important,
>>>>>     wouldn't it be typically included in the message payload instead of
>>>> as an
>>>>>     optional header field?
>>>>>
>>>>>     102. The data lineage that Atlas and Navigator support seems to be
>>>> at the
>>>>>     dataset level, not per record level? So, not sure if per message
>>>> headers
>>>>>     are relevant there.
>>>>>
>>>>>     Mirroring:
>>>>>     103. The benefit of using separate partitions is that it potentially
>>>> makes
>>>>>     it easy to preserve offsets during mirroring. This will make it
>>>> easier for
>>>>>     consumer to switch clusters. Currently, the consumers can switch
>>>> clusters
>>>>>     by using the timestampToOffset() api, but it has to deal with
>>>> duplicates.
>>>>>     Good point on the issue with log compact and I am not sure how to
>>>> address
>>>>>     this. However, even if we mirror into the existing partitions, the
>>>> ordering
>>>>>     for messages generated from different clusters seems
>>>> non-deterministic
>>>>>     anyway. So, it seems that the consumers already have to deal with
>>>> that? If
>>>>>     a topic is compacted, does that mean which messages are preserved is
>>>> also
>>>>>     non-deterministic across clusters?
>>>>>
>>>>>     104. Good point on partition key.
>>>>>
>>>>>     End-to-end encryption:
>>>>>     105. So, it seems end-to-end encryption is useful. Are headers
>>>> useful there?
>>>>>
>>>>>     Auditing:
>>>>>     106. It seems other than the UUID, all other metadata are per
>>>> producer?
>>>>>
>>>>>     EOS:
>>>>>     107. How are those UUIDs generated? I am not sure if they can be
>>>> generated
>>>>>     in the producer library. An application may send messages through a
>>>> load
>>>>>     balancer and on retry, the same message could be routed to a
>>>> different
>>>>>     producer instance. So, it seems that the application has to generate
>>>> the
>>>>>     UUIDs. In that case, shouldn't the application just put the UUID in
>>>> the
>>>>>     payload?
>>>>>
>>>>>     Thanks,
>>>>>
>>>>>     Jun
>>>>>
>>>>>
>>>>>     On Fri, Dec 2, 2016 at 4:57 PM, Michael Pearce <
>>>> michael.pea...@ig.com>
>>>>>     wrote:
>>>>>
>>>>>     > Hi Jun.
>>>>>     >
>>>>>     > Per Transaction Tracing / Data Lineage.
>>>>>     >
>>>>>     > As Stated in the KIP this has the first use case of how many APM
>>>> tools now
>>>>>     > work.
>>>>>     > I would find it impossible for any one to argue this is not
>>>> important or a
>>>>>     > niche market as it has its own gartner report for this space. Such
>>>>>     > companies as AppDynamics, New Relic, Dynatrace, Hawkular are but a
>>>> few.
>>>>>     >
>>>>>     > Likewise these APM tools can help very rapidly track down issues
>>>> and
>>>>>     > automatically capture metrics, perform actions based on unexpected
>>>> behavior
>>>>>     > to auto recover services.
>>>>>     >
>>>>>     > Before mentioning looking at aggregated stats, in these cases where
>>>>>     > actually on critical flows we cannot afford to have aggregated
>>>> rolled up
>>>>>     > stats only.
>>>>>     >
>>>>>     > With the APM tool we use its actually able to detect a single
>>>> transaction
>>>>>     > failure and capture the thread traces in the JVM where it failed
>>>> and
>>>>>     > everything for us, to the point it sends us alerts where we have
>>>> this
>>>>>     > giving the line number of the code that caused it, the transaction
>>>> trace
>>>>>     > through all the services and endpoints (supported) upto the point
>>>> of
>>>>>     > failure, it can also capture the data in and out (so we can
>>>> replay).
>>>>>     > Because atm Kafka doesn’t support us being able to stitch in these
>>>> tracing
>>>>>     > transaction ids natively, we cannot get these benefits as such is
>>>> limiting
>>>>>     > our ability support apps and monitor them to the same standards we
>>>> come to
>>>>>     > expect when on a kafka flow.
>>>>>     >
>>>>>     > This actually ties in with Data Lineage, as the same tracing can
>>>> be used
>>>>>     > to back stitch this. Essentially many times due to the sums of money
>>>>>     > involved there are disputes, and typically as a financial
>>>> institute the
>>>>>     > easiest and cleanest way to prove when disputes arise is to
>>>> present the
>>>>>     > actual flow and processes involved in a transaction.
>>>>>     >
>>>>>     > Likewise as Hadoop matures its evident this case is important, as
>>>> tools
>>>>>     > such as Atlas (Hortonworks led) and Navigator (cloudera led) are
>>>> evident
>>>>>     > also I believe the importance here is very much NOT just a
>>>> financial issue.
>>>>>     >
>>>>>     > From a MDM point of view any company wanting to care about Data
>>>> Quality
>>>>>     > and Data Governance - Data Lineage is a key piece in this puzzle.
>>>>>     >
>>>>>     >
>>>>>     >
>>>>>     > RE Mirroring,
>>>>>     >
>>>>>     > As per the KIP in-fact this is exactly what we do re cluster id,
>>>> to mirror
>>>>>     > a network of clusters between AZ’s / Regions. We know a
>>>> transaction for a
>>>>>     > key will be done within a  AZ/Region, as such we know the write to
>>>> kafka
>>>>>     > would be ordered per key. But we need eventual view of that across
>>>> in our
>>>>>     > other regions/az’s. When we have complete AZ or Region failure we
>>>> know
>>>>>     > there will be a brief interruption whilst those transactions are
>>>> moved to
>>>>>     > another region but we expect after it to continue.
>>>>>     >
>>>>>     > As mentioned having separate partitions to do this starts to get
>>>>>     > ugly/complicated for us:
>>>>>     > how would I do compaction where a key is in two partitions?
>>>>>     > How do we balance consumers so where multiple partitions with the
>>>> same key
>>>>>     > goto the same consumer
>>>>>     > What do you do if cluster 1 has 5 partitions but cluster 20 has 10
>>>> because
>>>>>     > its larger kit in our more core DC’s, as such key to partition
>>>> mappings for
>>>>>     > consumers get even more complicated.
>>>>>     > What do you do if we add or remove a complete region
>>>>>     >
>>>>>     > Where as simple mirror will work we just need to ensure we don’t
>>>> have a
>>>>>     > cycle which we can do with clusterId.
>>>>>     >
>>>>>     > We even have started to look at shortest path mirror routing based
>>>> on
>>>>>     > clusterId, if we also had the region and az info on the originating
>>>>>     > message, this we have not implemented but some ideas come from
>>>> network
>>>>>     > routing, and also the dispatcher router in apache qpid.
>>>>>     >
>>>>>     > Also we need to have data perimeters e.g. certain data cannot leave
>>>>>     > certain countries borders. We want this all automated so that at
>>>> the
>>>>>     > platform level without having to touch or look at the business
>>>> data inside
>>>>>     > we can have headers we can put tags into so that we can ensure
>>>> this doesn’t
>>>>>     > occur when we mirror. (actually links in to data lineage / tracing
>>>> as again
>>>>>     > we need to tag messages at a platform level) Examples are we are
>>>> not
>>>>>     > allowed Private customer details to leave Switzerland, yet we need
>>>> those
>>>>>     > systems integrated.
>>>>>     >
>>>>>     > Lastly around mirroring we have a partionKey field, as the key
>>>> used for
>>>>>     > partitioning logic != compaction key all the time but we want to
>>>> preserve it
>>>>>     > for when we mirror so that if source cluster partition count !=
>>>> destination
>>>>>     > cluster partition count we can honour the same partitioning logic.
>>>>>     >
>>>>>     >
>>>>>     >
>>>>>     > RE End 2 End encryption
>>>>>     >
>>>>>     > As I believe mentioned just before, the solution you mention just
>>>> doesn’t
>>>>>     > cut the mustard these days with many regulators. An operations
>>>> person with
>>>>>     > access to the box should not be able to have access to the data.
>>>> Many now
>>>>>     > actually impose quite literally the implementation expected being
>>>> end2end
>>>>>     > encryption for certain data (Singapore for us is one that I am
>>>> most aware
>>>>>     > of). In fact we’re even now needing encrypt the data and store the
>>>> keys in
>>>>>     > HSM modules.
>>>>>     >
>>>>>     > Likewise the performance penalty on encrypting decrypting as you
>>>> produce
>>>>>     > over wire, then again encrypt decrypt as the data is stored on the
>>>> brokers
>>>>>     > disks and back again, then again encrypted and decrypted back over
>>>> the wire
>>>>>     > each time for each consumer all adds up, ignoring this doubling
>>>> with mirror
>>>>>     > makers etc. simply encrypting the value once on write by the
>>>> client and
>>>>>     > again decrypting on consume by the consumer is far more
>>>> performant, but
>>>>>     > then the routing and platform meta data needs to be separate (thus
>>>> headers)
>>>>>     >
>>>>>     >
>>>>>     >
>>>>>     > RE Auditing:
>>>>>     >
>>>>>     > Our Auditing needs are:
>>>>>     > Producer Id,
>>>>>     > Origin Cluster Id that message first produced into
>>>>>     > Origin AZ – agreed we can derive this if we have cluster id, but
>>>> it makes
>>>>>     > resolving this for audit reporting a lot easier.
>>>>>     > Origin Region – agreed we can derive this if we have cluster id,
>>>> but it
>>>>>     > makes resolving this for audit reporting a lot easier.
>>>>>     > Unique Message Identification (this is not the same as transaction
>>>>>     > tracing) – note offset and partition are not the same, as when we
>>>> mirror or
>>>>>     > have for what ever system failure duplicate send,
>>>>>     > Custom Client wrapper version (where organizations have to wrap
>>>> the kafka
>>>>>     > client for added features) so we know what version of the wrapper
>>>> is used
>>>>>     > Producer IP address (in case of clients being in our vm/open stack
>>>> infra
>>>>>     > where they can move around, producer id will stay the same but
>>>> this would
>>>>>     > change)
>>>>>     >
>>>>>     >
>>>>>     >
>>>>>     > RE Once and only once delivery case
>>>>>     >
>>>>>     > Using the same Message UUID for auditing we can achieve this quite
>>>> simply.
>>>>>     >
>>>>>     > As per how some other brokers do this (cough qpid, artemis)
>>>> message uuid
>>>>>     > are used to dedupe where message is sent and produced but the
>>>> client didn’t
>>>>>     > receive the ack, and therefore replays the send, by having a
>>>> unique message
>>>>>     > id per message, this can be filtered out, on consumers where
>>>> message
>>>>>     > delivery may occur twice for what ever reasons a message uuid can
>>>> be used
>>>>>     > to remove duplicates being delivered; likewise we can do this in
>>>> the
>>>>>     > mirrormakers so if we detect a dupe message we can avoid
>>>> replicating it.
>>>>>     >
>>>>>     >
>>>>>     >
>>>>>     >
>>>>>     > Cheers
>>>>>     > Mike
>>>>>     >
>>>>>     >
>>>>>     >
>>>>>     > On 02/12/2016, 22:09, "Jun Rao" <j...@confluent.io> wrote:
>>>>>     >
>>>>>     >     Since this KIP affects message format, wire protocol, apis, I
>>>> think
>>>>>     > it's
>>>>>     >     worth spending a bit more time to nail down the concrete use
>>>> cases. It
>>>>>     >     would be bad if we add this feature, but when start
>>>> implementing it
>>>>>     > for say
>>>>>     >     mirroring, we then realize that header is not the best
>>>> approach.
>>>>>     > Initially,
>>>>>     >     I thought I was convinced of the use cases of headers and was
>>>> trying to
>>>>>     >     write down a few use cases to convince others. That's when I
>>>> became
>>>>>     > less
>>>>>     >     certain. For me to be convinced, I just want to see two strong
>>>> use
>>>>>     > cases
>>>>>     >     (instead of 10 maybe use cases) in the third-party space. The
>>>> reason is
>>>>>     >     that when we discussed the use cases within a company, often
>>>> it ends
>>>>>     > with
>>>>>     >     "we can't force everyone to use this standard since we may
>>>> have to
>>>>>     >     integrate with third-party tools".
>>>>>     >
>>>>>     >     At present, I am not sure why headers are useful for things
>>>> like
>>>>>     > schemaId
>>>>>     >     or encryption. In order to do anything useful to the value,
>>>> one needs
>>>>>     > to
>>>>>     >     know the schemaId or how data is encrypted, but header is
>>>> optional.
>>>>>     > But, I
>>>>>     >     can be convinced if someone (Radai, Sean, Todd?) provides more
>>>> details
>>>>>     > on
>>>>>     >     the argument.
>>>>>     >
>>>>>     >     I am not very sure header is the best approach for mirroring
>>>> either. If
>>>>>     >     someone has thought about this more, I'd be happy to hear.
>>>>>     >
>>>>>     >     I can see the data lineage use case. I am just not sure how
>>>> widely
>>>>>     >     applicable this is. If someone familiar with this space can
>>>> justify
>>>>>     > this is
>>>>>     >     a significant use case, say in the finance industry, this
>>>> would be a
>>>>>     > strong
>>>>>     >     use case.
>>>>>     >
>>>>>     >     I can see the auditing use case. I am just not sure if a native
>>>>>     > producer id
>>>>>     >     solves that problem. If there are additional metadata that's
>>>> worth
>>>>>     >     collecting but not covered by the producer id, that would make
>>>> this a
>>>>>     >     strong use case.
>>>>>     >
>>>>>     >     Thanks,
>>>>>     >
>>>>>     >     Jun
>>>>>     >
>>>>>     >
>>>>>     >     On Fri, Dec 2, 2016 at 1:41 PM, radai <
>>>> radai.rosenbl...@gmail.com>
>>>>>     > wrote:
>>>>>     >
>>>>>     >     > this KIP is about enabling headers, nothing more nothing
>>>> less - so
>>>>>     > no,
>>>>>     >     > broker-side use of headers is not in the KIP scope.
>>>>>     >     >
>>>>>     >     > obviously though, once you have headers potential use cases
>>>> could
>>>>>     > include
>>>>>     >     > broker-side header-aware interceptors (which would be the
>>>> topic of
>>>>>     > other
>>>>>     >     > future KIPs).
>>>>>     >     >
>>>>>     >     > a trivially clear use case (to me) would be using such
>>>> broker-side
>>>>>     >     > interceptors to enforce compliance with organizational
>>>> policies - it
>>>>>     > would
>>>>>     >     > make our SREs lives much easier if instead of retroactively
>>>>>     > discovering
>>>>>     >     > "rogue" topics/users those messages would have been rejected
>>>>>     > up-front.
>>>>>     >     >
>>>>>     >     > the kafka broker code is lacking any such extensibility
>>>> support
>>>>>     > (beyond
>>>>>     >     > maybe authorizer) which is why these use cases were left out
>>>> of the
>>>>>     > "case
>>>>>     >     > for headers" doc - broker extensibility is a separate
>>>> discussion.
>>>>>     >     >
>>>>>     >     > On Fri, Dec 2, 2016 at 12:59 PM, Gwen Shapira <
>>>> g...@confluent.io>
>>>>>     > wrote:
>>>>>     >     >
>>>>>     >     > > Woah, I wasn't aware this is something we'll do. It wasn't
>>>> in the
>>>>>     > KIP,
>>>>>     >     > > right?
>>>>>     >     > >
>>>>>     >     > > I guess we could do it the same way ACLs currently work.
>>>>>     >     > > I had in mind something that will allow admins to apply
>>>> rules to
>>>>>     > the
>>>>>     >     > > new create/delete/config topic APIs. So Todd can decide to
>>>> reject
>>>>>     >     > > "create topic" requests that ask for more than 40
>>>> partitions, or
>>>>>     >     > > require exactly 3 replicas, or no more than 50GB partition
>>>> size,
>>>>>     > etc.
>>>>>     >     > >
>>>>>     >     > > ACLs were added a bit ad-hoc, if we are planning to apply
>>>> more
>>>>>     > rules
>>>>>     >     > > to requests (and I think we should), we may want a bit
>>>> more generic
>>>>>     >     > > design around that.
>>>>>     >     > >
>>>>>     >     > > On Fri, Dec 2, 2016 at 7:16 AM, radai <
>>>> radai.rosenbl...@gmail.com>
>>>>>     >     > wrote:
>>>>>     >     > > > "wouldn't you be in the business of making sure everyone
>>>> uses
>>>>>     > them
>>>>>     >     > > > properly?"
>>>>>     >     > > >
>>>>>     >     > > > thats where a broker-side plugin would come handy - any
>>>> incoming
>>>>>     >     > message
>>>>>     >     > > > that does not conform to org policy (read - does not
>>>> have the
>>>>>     > proper
>>>>>     >     > > > headers) gets thrown out (with an error returned to user)
>>>>>     >     > > >
>>>>>     >     > > > On Thu, Dec 1, 2016 at 8:44 PM, Todd Palino <
>>>> tpal...@gmail.com>
>>>>>     > wrote:
>>>>>     >     > > >
>>>>>     >     > > >> Come on, I’ve done at least 2 talks on this one :)
>>>>>     >     > > >>
>>>>>     >     > > >> Producing counts to a topic is part of it, but that’s
>>>> only
>>>>>     > part. So
>>>>>     >     > you
>>>>>     >     > > >> count you have 100 messages in topic A. When you mirror
>>>> topic A
>>>>>     > to
>>>>>     >     > > another
>>>>>     >     > > >> cluster, you have 99 messages. Where was your problem?
>>>> Or
>>>>>     > worse, you
>>>>>     >     > > have
>>>>>     >     > > >> 100 messages, but one producer duplicated messages and
>>>> another
>>>>>     > one
>>>>>     >     > lost
>>>>>     >     > > >> messages. You need details about where the message came
>>>> from in
>>>>>     > order
>>>>>     >     > to
>>>>>     >     > > >> pinpoint problems when they happen. Source producer
>>>> info, where
>>>>>     > it was
>>>>>     >     > > >> produced into your infrastructure, and when it was
>>>> produced.
>>>>>     > This
>>>>>     >     > > requires
>>>>>     >     > > >> you to add the information to the message.
>>>>>     >     > > >>
>>>>>     >     > > >> And yes, you still need to maintain your clients. So
>>>> maybe my
>>>>>     > original
>>>>>     >     > > >> example was not the best. My thoughts on not wanting to
>>>> be
>>>>>     > responsible
>>>>>     >     > > for
>>>>>     >     > > >> message formats stands, because that’s very much
>>>> separate from
>>>>>     > the
>>>>>     >     > > client.
>>>>>     >     > > >> As you know, we have our own internal client library
>>>> that can
>>>>>     > insert
>>>>>     >     > the
>>>>>     >     > > >> right headers, and right now inserts the right audit
>>>>>     > information into
>>>>>     >     > > the
>>>>>     >     > > >> message fields. If they exist, and assuming the message
>>>> is Avro
>>>>>     >     > encoded.
>>>>>     >     > > >> What if someone wants to use JSON instead for a good
>>>> reason?
>>>>>     > What if
>>>>>     >     > > user X
>>>>>     >     > > >> wants to encrypt messages, but user Y does not?
>>>> Maintaining the
>>>>>     > client
>>>>>     >     > > >> library is still much easier than maintaining the
>>>> message
>>>>>     > formats.
>>>>>     >     > > >>
>>>>>     >     > > >> -Todd
>>>>>     >     > > >>
>>>>>     >     > > >>
>>>>>     >     > > >> On Thu, Dec 1, 2016 at 6:21 PM, Gwen Shapira <
>>>> g...@confluent.io
>>>>>     > >
>>>>>     >     > wrote:
>>>>>     >     > > >>
>>>>>     >     > > >> > Based on your last sentence, consider me convinced :)
>>>>>     >     > > >> >
>>>>>     >     > > >> > I get why headers are critical for Mirroring (you
>>>> need tags to
>>>>>     >     > prevent
>>>>>     >     > > >> > loops and sometimes to route messages to the correct
>>>>>     > destination).
>>>>>     >     > > >> > But why do you need headers to audit? We are auditing
>>>> by
>>>>>     > producing
>>>>>     >     > > >> > counts to a side topic (and I was under the
>>>> impression you do
>>>>>     > the
>>>>>     >     > > >> > same), so we never need to modify the message.
>>>>>     >     > > >> >
>>>>>     >     > > >> > Another thing - after we added headers, wouldn't you
>>>> be in the
>>>>>     >     > > >> > business of making sure everyone uses them properly?
>>>> Making
>>>>>     > sure
>>>>>     >     > > >> > everyone includes the right headers you need, not
>>>> using the
>>>>>     > header
>>>>>     >     > > >> > names you intend to use, etc. I don't think the
>>>> "policing"
>>>>>     > business
>>>>>     >     > > >> > will ever go away.
>>>>>     >     > > >> >
>>>>>     >     > > >> > On Thu, Dec 1, 2016 at 5:25 PM, Todd Palino <
>>>>>     > tpal...@gmail.com>
>>>>>     >     > > wrote:
>>>>>     >     > > >> > > Got it. As an ops guy, I'm not very happy with the
>>>>>     > workaround.
>>>>>     >     > Avro
>>>>>     >     > > >> means
>>>>>     >     > > >> > > that I have to be concerned with the format of the
>>>> messages
>>>>>     > in
>>>>>     >     > > order to
>>>>>     >     > > >> > run
>>>>>     >     > > >> > > the infrastructure (audit, mirroring, etc.). That
>>>> means
>>>>>     > that I
>>>>>     >     > have
>>>>>     >     > > to
>>>>>     >     > > >> > > handle the schemas, and I have to enforce rules
>>>> about good
>>>>>     >     > formats.
>>>>>     >     > > >> This
>>>>>     >     > > >> > is
>>>>>     >     > > >> > > not something I want to be in the business of,
>>>> because I
>>>>>     > should be
>>>>>     >     > > able
>>>>>     >     > > >> > to
>>>>>     >     > > >> > > run a service infrastructure without needing to be
>>>> in the
>>>>>     > weeds of
>>>>>     >     > > >> > dealing
>>>>>     >     > > >> > > with customer data formats.
>>>>>     >     > > >> > >
>>>>>     >     > > >> > > Trust me, a sizable portion of my support time is
>>>> spent
>>>>>     > dealing
>>>>>     >     > with
>>>>>     >     > > >> > schema
>>>>>     >     > > >> > > issues. I really would like to get away from that.
>>>> Maybe
>>>>>     > I'd have
>>>>>     >     > > more
>>>>>     >     > > >> > time
>>>>>     >     > > >> > > for other hobbies. Like writing. ;)
>>>>>     >     > > >> > >
>>>>>     >     > > >> > > -Todd
>>>>>     >     > > >> > >
>>>>>     >     > > >> > > On Thu, Dec 1, 2016 at 4:04 PM Gwen Shapira <
>>>>>     > g...@confluent.io>
>>>>>     >     > > wrote:
>>>>>     >     > > >> > >
>>>>>     >     > > >> > >> I'm pretty satisfied with the current workarounds
>>>> (Avro
>>>>>     > container
>>>>>     >     > > >> > >> format), so I'm not too excited about the extra
>>>> work
>>>>>     > required to
>>>>>     >     > do
>>>>>     >     > > >> > >> headers in Kafka. I absolutely don't mind it if
>>>> you do
>>>>>     > it...
>>>>>     >     > > >> > >> I think the Apache convention for "good idea, but
>>>> not
>>>>>     > willing to
>>>>>     >     > > put
>>>>>     >     > > >> > >> any work toward it" is +0.5? anyway, that's what I
>>>> was
>>>>>     > trying to
>>>>>     >     > > >> > >> convey :)
>>>>>     >     > > >> > >>
>>>>>     >     > > >> > >> On Thu, Dec 1, 2016 at 3:05 PM, Todd Palino <
>>>>>     > tpal...@gmail.com>
>>>>>     >     > > >> wrote:
>>>>>     >     > > >> > >> > Well I guess my question for you, then, is what
>>>> is
>>>>>     > holding you
>>>>>     >     > > back
>>>>>     >     > > >> > from
>>>>>     >     > > >> > >> > full support for headers? What’s the bit that
>>>> you’re
>>>>>     > missing
>>>>>     >     > that
>>>>>     >     > > >> has
>>>>>     >     > > >> > you
>>>>>     >     > > >> > >> > under a full +1?
>>>>>     >     > > >> > >> >
>>>>>     >     > > >> > >> > -Todd
>>>>>     >     > > >> > >> >
>>>>>     >     > > >> > >> >
>>>>>     >     > > >> > >> > On Thu, Dec 1, 2016 at 1:59 PM, Gwen Shapira <
>>>>>     >     > g...@confluent.io>
>>>>>     >     > > >> > wrote:
>>>>>     >     > > >> > >> >
>>>>>     >     > > >> > >> >> I know why people who support headers support
>>>> them, and
>>>>>     > I've
>>>>>     >     > > seen
>>>>>     >     > > >> > what
>>>>>     >     > > >> > >> >> the discussion is like.
>>>>>     >     > > >> > >> >>
>>>>>     >     > > >> > >> >> This is why I'm asking people who are against
>>>> headers
>>>>>     >     > > (especially
>>>>>     >     > > >> > >> >> committers) what will make them change their
>>>> mind - so
>>>>>     > we can
>>>>>     >     > > get
>>>>>     >     > > >> > this
>>>>>     >     > > >> > >> >> part over one way or another.
>>>>>     >     > > >> > >> >>
>>>>>     >     > > >> > >> >> If I sound frustrated it is not at Radai, Jun
>>>> or you
>>>>>     > (Todd)...
>>>>>     >     > > I am
>>>>>     >     > > >> > >> >> just looking for something concrete we can do
>>>> to move
>>>>>     > the
>>>>>     >     > > >> discussion
>>>>>     >     > > >> > >> >> along to the yummy design details (which is the
>>>>>     > argument I
>>>>>     >     > > really
>>>>>     >     > > >> am
>>>>>     >     > > >> > >> >> looking forward to).
>>>>>     >     > > >> > >> >>
>>>>>     >     > > >> > >> >> On Thu, Dec 1, 2016 at 1:53 PM, Todd Palino <
>>>>>     >     > tpal...@gmail.com>
>>>>>     >     > > >> > wrote:
>>>>>     >     > > >> > >> >> > So, Gwen, to your question (even though I’m
>>>> not a
>>>>>     >     > > committer)...
>>>>>     >     > > >> > >> >> >
>>>>>     >     > > >> > >> >> > I have always been a strong supporter of
>>>> introducing
>>>>>     > the
>>>>>     >     > > concept
>>>>>     >     > > >> > of an
>>>>>     >     > > >> > >> >> > envelope to messages, which headers
>>>> accomplishes. The
>>>>>     >     > message
>>>>>     >     > > key
>>>>>     >     > > >> > is
>>>>>     >     > > >> > >> >> > already an example of a piece of envelope
>>>>>     > information. By
>>>>>     >     > > >> > providing a
>>>>>     >     > > >> > >> >> means
>>>>>     >     > > >> > >> >> > to do this within Kafka itself, and not
>>>> relying on
>>>>>     > use-case
>>>>>     >     > > >> > specific
>>>>>     >     > > >> > >> >> > implementations, you make it much easier for
>>>>>     > components to
>>>>>     >     > > >> > >> interoperate.
>>>>>     >     > > >> > >> >> It
>>>>>     >     > > >> > >> >> > simplifies development of all these things
>>>> (message
>>>>>     > routing,
>>>>>     >     > > >> > auditing,
>>>>>     >     > > >> > >> >> > encryption, etc.) because each one does not
>>>> have to
>>>>>     > reinvent
>>>>>     >     > > the
>>>>>     >     > > >> > >> wheel.
>>>>>     >     > > >> > >> >> >
>>>>>     >     > > >> > >> >> > It also makes it much easier from a client
>>>> point of
>>>>>     > view if
>>>>>     >     > > the
>>>>>     >     > > >> > >> headers
>>>>>     >     > > >> > >> >> are
>>>>>     >     > > >> > >> >> > defined as part of the protocol and/or
>>>> message format
>>>>>     > in
>>>>>     >     > > general
>>>>>     >     > > >> > >> because
>>>>>     >     > > >> > >> >> > you can easily produce and consume messages
>>>> without
>>>>>     > having
>>>>>     >     > to
>>>>>     >     > > >> take
>>>>>     >     > > >> > >> into
>>>>>     >     > > >> > >> >> > account specific cases. For example, I want
>>>> to route
>>>>>     >     > messages,
>>>>>     >     > > >> but
>>>>>     >     > > >> > >> >> client A
>>>>>     >     > > >> > >> >> > doesn’t support the way audit implemented
>>>> headers, and
>>>>>     >     > client
>>>>>     >     > > B
>>>>>     >     > > >> > >> doesn’t
>>>>>     >     > > >> > >> >> > support the way encryption or routing
>>>> implemented
>>>>>     > headers,
>>>>>     >     > so
>>>>>     >     > > now
>>>>>     >     > > >> > my
>>>>>     >     > > >> > >> >> > application has to create some really fragile
>>>> (my
>>>>>     >     > autocorrect
>>>>>     >     > > >> just
>>>>>     >     > > >> > >> tried
>>>>>     >     > > >> > >> >> to
>>>>>     >     > > >> > >> >> > make that “tragic”, which is probably
>>>> appropriate
>>>>>     > too) code
>>>>>     >     > to
>>>>>     >     > > >> > strip
>>>>>     >     > > >> > >> >> > everything off, rather than just consuming the
>>>>>     > messages,
>>>>>     >     > > picking
>>>>>     >     > > >> > out
>>>>>     >     > > >> > >> the
>>>>>     >     > > >> > >> >> 1
>>>>>     >     > > >> > >> >> > or 2 headers it’s interested in, and
>>>> performing its
>>>>>     >     > function.
>>>>>     >     > > >> > >> >> >
>>>>>     >     > > >> > >> >> > Honestly, this discussion has been going on
>>>> for a
>>>>>     > long time,
>>>>>     >     > > and
>>>>>     >     > > >> > it’s
>>>>>     >     > > >> > >> >> > always “Oh, you came up with 2 use cases, and
>>>> yeah,
>>>>>     > those
>>>>>     >     > use
>>>>>     >     > > >> cases
>>>>>     >     > > >> > >> are
>>>>>     >     > > >> > >> >> > real things that someone would want to do.
>>>> Here’s an
>>>>>     >     > alternate
>>>>>     >     > > >> way
>>>>>     >     > > >> > to
>>>>>     >     > > >> > >> >> > implement them so let’s not do headers.” If
>>>> we have a
>>>>>     > few
>>>>>     >     > use
>>>>>     >     > > >> cases
>>>>>     >     > > >> > >> that
>>>>>     >     > > >> > >> >> we
>>>>>     >     > > >> > >> >> > actually came up with, you can be sure that
>>>> over the
>>>>>     > next
>>>>>     >     > year
>>>>>     >     > > >> > >> there’s a
>>>>>     >     > > >> > >> >> > dozen others that we didn’t think of that
>>>> someone
>>>>>     > would like
>>>>>     >     > > to
>>>>>     >     > > >> > do. I
>>>>>     >     > > >> > >> >> > really think it’s time to stop rehashing this
>>>>>     > discussion and
>>>>>     >     > > >> > instead
>>>>>     >     > > >> > >> >> focus
>>>>>     >     > > >> > >> >> > on a workable standard that we can adopt.
>>>>>     >     > > >> > >> >> >
>>>>>     >     > > >> > >> >> > -Todd
>>>>>     >     > > >> > >> >> >
>>>>>     >     > > >> > >> >> >
>>>>>     >     > > >> > >> >> > On Thu, Dec 1, 2016 at 1:39 PM, Todd Palino <
>>>>>     >     > > tpal...@gmail.com>
>>>>>     >     > > >> > >> wrote:
>>>>>     >     > > >> > >> >> >
>>>>>     >     > > >> > >> >> >> C. per message encryption
>>>>>     >     > > >> > >> >> >>> One drawback of this approach is that this
>>>>>     > significantly
>>>>>     >     > > reduces
>>>>>     >     > > >> > the
>>>>>     >     > > >> > >> >> >>> effectiveness of compression, which happens
>>>> on a
>>>>>     > set of
>>>>>     >     > > >> > serialized
>>>>>     >     > > >> > >> >> >>> messages. An alternative is to enable SSL
>>>> for wire
>>>>>     >     > > encryption
>>>>>     >     > > >> and
>>>>>     >     > > >> > >> rely
>>>>>     >     > > >> > >> >> on
>>>>>     >     > > >> > >> >> >>> the storage system (e.g. LUKS) for at rest
>>>>>     > encryption.
>>>>>     >     > > >> > >> >> >>
>>>>>     >     > > >> > >> >> >>
>>>>>     >     > > >> > >> >> >> Jun, this is not sufficient. While this does
>>>> cover
>>>>>     > the case
>>>>>     >     > > of
>>>>>     >     > > >> > >> removing
>>>>>     >     > > >> > >> >> a
>>>>>     >     > > >> > >> >> >> drive from the system, it will not satisfy
>>>> most
>>>>>     > compliance
>>>>>     >     > > >> > >> requirements
>>>>>     >     > > >> > >> >> for
>>>>>     >     > > >> > >> >> >> encryption of data as whoever has access to
>>>> the
>>>>>     > broker
>>>>>     >     > itself
>>>>>     >     > > >> > still
>>>>>     >     > > >> > >> has
>>>>>     >     > > >> > >> >> >> access to the unencrypted data. For
>>>> end-to-end
>>>>>     > encryption
>>>>>     >     > you
>>>>>     >     > > >> > need to
>>>>>     >     > > >> > >> >> >> encrypt at the producer, before it enters the
>>>>>     > system, and
>>>>>     >     > > >> decrypt
>>>>>     >     > > >> > at
>>>>>     >     > > >> > >> the
>>>>>     >     > > >> > >> >> >> consumer, after it exits the system.
>>>>>     >     > > >> > >> >> >>
>>>>>     >     > > >> > >> >> >> -Todd
>>>>>     >     > > >> > >> >> >>
>>>>>     >     > > >> > >> >> >>
>>>>>     >     > > >> > >> >> >> On Thu, Dec 1, 2016 at 1:03 PM, radai <
>>>>>     >     > > >> radai.rosenbl...@gmail.com
>>>>>     >     > > >> > >
>>>>>     >     > > >> > >> >> wrote:
>>>>>     >     > > >> > >> >> >>
>>>>>     >     > > >> > >> >> >>> another big plus of headers in the protocol
>>>> is that
>>>>>     > it
>>>>>     >     > would
>>>>>     >     > > >> > enable
>>>>>     >     > > >> > >> >> rapid
>>>>>     >     > > >> > >> >> >>> iteration on ideas outside of core kafka
>>>> and would
>>>>>     > reduce
>>>>>     >     > > the
>>>>>     >     > > >> > >> number of
>>>>>     >     > > >> > >> >> >>> future wire format changes required.
>>>>>     >     > > >> > >> >> >>>
>>>>>     >     > > >> > >> >> >>> a lot of what is currently a KIP represents
>>>> use
>>>>>     > cases that
>>>>>     >     > > are
>>>>>     >     > > >> > not
>>>>>     >     > > >> > >> 100%
>>>>>     >     > > >> > >> >> >>> relevant to all users, and some of them
>>>> require
>>>>>     > rather
>>>>>     >     > > invasive
>>>>>     >     > > >> > wire
>>>>>     >     > > >> > >> >> >>> protocol changes. i think a good recent
>>>> example of
>>>>>     > this is
>>>>>     >     > > >> > kip-98.
>>>>>     >     > > >> > >> >> >>> tx-utilizing traffic is expected to be a
>>>> very small
>>>>>     >     > > fraction of
>>>>>     >     > > >> > >> total
>>>>>     >     > > >> > >> >> >>> traffic and yet the changes are invasive.
>>>>>     >     > > >> > >> >> >>>
>>>>>     >     > > >> > >> >> >>> every such wire format change translates
>>>> into
>>>>>     > painful and
>>>>>     >     > > slow
>>>>>     >     > > >> > >> >> adoption of
>>>>>     >     > > >> > >> >> >>> new versions.
>>>>>     >     > > >> > >> >> >>>
>>>>>     >     > > >> > >> >> >>> i think a lot of functionality currently in
>>>> KIPs
>>>>>     > could be
>>>>>     >     > > "spun
>>>>>     >     > > >> > out"
>>>>>     >     > > >> > >> >> and
>>>>>     >     > > >> > >> >> >>> implemented as opt-in plugins transmitting
>>>> data over
>>>>>     >     > > headers.
>>>>>     >     > > >> > this
>>>>>     >     > > >> > >> >> would
>>>>>     >     > > >> > >> >> >>> keep the core wire format stable(r), core
>>>> codebase
>>>>>     >     > smaller,
>>>>>     >     > > and
>>>>>     >     > > >> > >> avoid
>>>>>     >     > > >> > >> >> the
>>>>>     >     > > >> > >> >> >>> "burden of proof" thats sometimes required
>>>> to prove
>>>>>     > a
>>>>>     >     > > certain
>>>>>     >     > > >> > >> feature
>>>>>     >     > > >> > >> >> is
>>>>>     >     > > >> > >> >> >>> useful enough for a wide-enough audience to
>>>> warrant
>>>>>     > a wire
>>>>>     >     > > >> format
>>>>>     >     > > >> > >> >> change
>>>>>     >     > > >> > >> >> >>> and code complexity additions.
>>>>>     >     > > >> > >> >> >>>
>>>>>     >     > > >> > >> >> >>> (to be clear - kip-98 goes beyond "mere"
>>>> wire format
>>>>>     >     > changes
>>>>>     >     > > >> and
>>>>>     >     > > >> > im
>>>>>     >     > > >> > >> not
>>>>>     >     > > >> > >> >> >>> saying it could have been completely done
>>>> with
>>>>>     > headers,
>>>>>     >     > but
>>>>>     >     > > >> > >> >> exactly-once
>>>>>     >     > > >> > >> >> >>> delivery certainly could)
>>>>>     >     > > >> > >> >> >>>
>>>>>     >     > > >> > >> >> >>> On Thu, Dec 1, 2016 at 11:20 AM, Gwen
>>>> Shapira <
>>>>>     >     > > >> g...@confluent.io
>>>>>     >     > > >> > >
>>>>>     >     > > >> > >> >> wrote:
>>>>>     >     > > >> > >> >> >>>
>>>>>     >     > > >> > >> >> >>> > On Thu, Dec 1, 2016 at 10:24 AM, radai <
>>>>>     >     > > >> > >> radai.rosenbl...@gmail.com>
>>>>>     >     > > >> > >> >> >>> wrote:
>>>>>     >     > > >> > >> >> >>> > > "For use cases within an organization,
>>>> one could
>>>>>     >     > always
>>>>>     >     > > use
>>>>>     >     > > >> > >> other
>>>>>     >     > > >> > >> >> >>> > > approaches such as company-wise
>>>> containers"
>>>>>     >     > > >> > >> >> >>> > > this is what linkedin has traditionally
>>>> done
>>>>>     > but there
>>>>>     >     > > are
>>>>>     >     > > >> > now
>>>>>     >     > > >> > >> >> cases
>>>>>     >     > > >> > >> >> >>> > (read
>>>>>     >     > > >> > >> >> >>> > > - topics) where this is not acceptable.
>>>> this
>>>>>     > makes
>>>>>     >     > > headers
>>>>>     >     > > >> > >> useful
>>>>>     >     > > >> > >> >> even
>>>>>     >     > > >> > >> >> >>> > > within single orgs for cases where
>>>>>     >     > > one-container-fits-all
>>>>>     >     > > >> > cannot
>>>>>     >     > > >> > >> >> >>> apply.
>>>>>     >     > > >> > >> >> >>> > >
>>>>>     >     > > >> > >> >> >>> > > as for the particular use cases listed,
>>>> i dont
>>>>>     > want
>>>>>     >     > > this to
>>>>>     >     > > >> > >> devolve
>>>>>     >     > > >> > >> >> >>> to a
>>>>>     >     > > >> > >> >> >>> > > discussion of particular use cases - i
>>>> think its
>>>>>     >     > enough
>>>>>     >     > > >> that
>>>>>     >     > > >> > >> some
>>>>>     >     > > >> > >> >> of
>>>>>     >     > > >> > >> >> >>> them
>>>>>     >     > > >> > >> >> >>> >
>>>>>     >     > > >> > >> >> >>> > I think a main point of contention is
>>>> that: We
>>>>>     >     > identified
>>>>>     >     > > a few
>>>>>     >     > > >> > >> >> >>> > use-cases where headers are useful, do we
>>>> want
>>>>>     > Kafka to
>>>>>     >     > > be a
>>>>>     >     > > >> > >> system
>>>>>     >     > > >> > >> >> >>> > that supports those use-cases?
>>>>>     >     > > >> > >> >> >>> >
>>>>>     >     > > >> > >> >> >>> > For example, Jun said:
>>>>>     >     > > >> > >> >> >>> > "Not sure how widely useful record-level
>>>> lineage
>>>>>     > is
>>>>>     >     > though
>>>>>     >     > > >> > since
>>>>>     >     > > >> > >> the
>>>>>     >     > > >> > >> >> >>> > overhead could
>>>>>     >     > > >> > >> >> >>> > be significant."
>>>>>     >     > > >> > >> >> >>> >
>>>>>     >     > > >> > >> >> >>> > We know NiFi supports record level
>>>> lineage. I
>>>>>     > don't
>>>>>     >     > think
>>>>>     >     > > it
>>>>>     >     > > >> > was
>>>>>     >     > > >> > >> >> >>> > developed for lols, I think it is safe to
>>>> assume
>>>>>     > that
>>>>>     >     > the
>>>>>     >     > > NSA
>>>>>     >     > > >> > >> needed
>>>>>     >     > > >> > >> >> >>> > that functionality. We also know that
>>>> certain
>>>>>     > financial
>>>>>     >     > > >> > institutions
>>>>>     >     > > >> > >> >> >>> > need to track tampering with records at a
>>>> record
>>>>>     > level
>>>>>     >     > and
>>>>>     >     > > >> > there
>>>>>     >     > > >> > >> are
>>>>>     >     > > >> > >> >> >>> > federal regulations that absolutely
>>>> require
>>>>>     > this.  They
>>>>>     >     > > also
>>>>>     >     > > >> > need
>>>>>     >     > > >> > >> to
>>>>>     >     > > >> > >> >> >>> > prove that routing apps that "touch" the
>>>>>     > messages and
>>>>>     >     > > >> either
>>>>>     >     > > >> > >> reads
>>>>>     >     > > >> > >> >> >>> > or updates headers couldn't have possibly
>>>>>     > modified the
>>>>>     >     > > >> payload
>>>>>     >     > > >> > >> >> itself.
>>>>>     >     > > >> > >> >> >>> > They use record level encryption to do
>>>> that -
>>>>>     > apps can
>>>>>     >     > > read
>>>>>     >     > > >> and
>>>>>     >     > > >> > >> >> >>> > (sometimes) modify headers but can't
>>>> touch the
>>>>>     > payload.
>>>>>     >     > > >> > >> >> >>> >
>>>>>     >     > > >> > >> >> >>> > We can totally say "those are corner
>>>> cases and
>>>>>     > not worth
>>>>>     >     > > >> adding
>>>>>     >     > > >> > >> >> >>> > headers to Kafka for", they should use a
>>>> different
>>>>>     >     > pubsub
>>>>>     >     > > >> > message
>>>>>     >     > > >> > >> for
>>>>>     >     > > >> > >> >> >>> > that (Nifi or one of the other 1000 that
>>>> cater
>>>>>     >     > > specifically
>>>>>     >     > > >> to
>>>>>     >     > > >> > the
>>>>>     >     > > >> > >> >> >>> > financial industry).
>>>>>     >     > > >> > >> >> >>> >
>>>>>     >     > > >> > >> >> >>> > But this gets us into a catch 22:
>>>>>     >     > > >> > >> >> >>> > If we discuss a specific use-case,
>>>> someone can
>>>>>     > always
>>>>>     >     > say
>>>>>     >     > > it
>>>>>     >     > > >> > isn't
>>>>>     >     > > >> > >> >> >>> > interesting enough for Kafka. If we
>>>> discuss more
>>>>>     > general
>>>>>     >     > > >> > trends,
>>>>>     >     > > >> > >> >> >>> > others can say "well, we are not sure any
>>>> of them
>>>>>     > really
>>>>>     >     > > >> needs
>>>>>     >     > > >> > >> >> headers
>>>>>     >     > > >> > >> >> >>> > specifically. This is just hand waving
>>>> and not
>>>>>     >     > > interesting.".
>>>>>     >     > > >> > >> >> >>> >
>>>>>     >     > > >> > >> >> >>> > I think discussing use-cases in specifics
>>>> is super
>>>>>     >     > > important
>>>>>     >     > > >> to
>>>>>     >     > > >> > >> >> decide
>>>>>     >     > > >> > >> >> >>> > implementation details for headers (my
>>>> use-cases
>>>>>     > lean
>>>>>     >     > > toward
>>>>>     >     > > >> > >> >> numerical
>>>>>     >     > > >> > >> >> >>> > keys with namespaces and object values,
>>>> others
>>>>>     > differ),
>>>>>     >     > > but I
>>>>>     >     > > >> > >> think
>>>>>     >     > > >> > >> >> we
>>>>>     >     > > >> > >> >> >>> > need to answer the general "Are we going
>>>> to have
>>>>>     >     > headers"
>>>>>     >     > > >> > question
>>>>>     >     > > >> > >> >> >>> > first.
>>>>>     >     > > >> > >> >> >>> >
>>>>>     >     > > >> > >> >> >>> > I'd love to hear from the other
>>>> committers in the
>>>>>     >     > > discussion:
>>>>>     >     > > >> > >> >> >>> > What would it take to convince you that
>>>> headers
>>>>>     > in Kafka
>>>>>     >     > > are
>>>>>     >     > > >> a
>>>>>     >     > > >> > >> good
>>>>>     >     > > >> > >> >> >>> > idea in general, so we can move ahead and
>>>> try to
>>>>>     > agree
>>>>>     >     > on
>>>>>     >     > > the
>>>>>     >     > > >> > >> >> details?
>>>>>     >     > > >> > >> >> >>> >
>>>>>     >     > > >> > >> >> >>> > I feel like we keep moving the goal posts
>>>> and
>>>>>     > this is
>>>>>     >     > > truly
>>>>>     >     > > >> > >> >> exhausting.
>>>>>     >     > > >> > >> >> >>> >
>>>>>     >     > > >> > >> >> >>> > For the record, I mildly support adding
>>>> headers
>>>>>     > to Kafka
>>>>>     >     > > >> > (+0.5?).
>>>>>     >     > > >> > >> >> >>> > The community can continue to find
>>>> workarounds to
>>>>>     > the
>>>>>     >     > > issue
>>>>>     >     > > >> and
>>>>>     >     > > >> > >> there
>>>>>     >     > > >> > >> >> >>> > are some benefits to keeping the message
>>>> format
>>>>>     > and
>>>>>     >     > > clients
>>>>>     >     > > >> > >> simpler.
>>>>>     >     > > >> > >> >> >>> > But I see the usefulness of headers to
>>>> many
>>>>>     > use-cases
>>>>>     >     > and
>>>>>     >     > > if
>>>>>     >     > > >> we
>>>>>     >     > > >> > >> can
>>>>>     >     > > >> > >> >> >>> > find a good and generally useful way to
>>>> add it to
>>>>>     > Kafka,
>>>>>     >     > > it
>>>>>     >     > > >> > will
>>>>>     >     > > >> > >> make
>>>>>     >     > > >> > >> >> >>> > Kafka easier to use for many - worthy
>>>> goal in my
>>>>>     > eyes.
>>>>>     >     > > >> > >> >> >>> >
>>>>>     >     > > >> > >> >> >>> > > are interesting/feasible, but:
>>>>>     >     > > >> > >> >> >>> > > A+B. i think there are use cases for
>>>> polyglot
>>>>>     > topics.
>>>>>     >     > > >> > >> especially if
>>>>>     >     > > >> > >> >> >>> kafka
>>>>>     >     > > >> > >> >> >>> > > is being used to "trunk" something else.
>>>>>     >     > > >> > >> >> >>> > > D. multiple topics would make it harder
>>>> to write
>>>>>     >     > > portable
>>>>>     >     > > >> > >> consumer
>>>>>     >     > > >> > >> >> >>> code.
>>>>>     >     > > >> > >> >> >>> > > partition remapping would mess with
>>>> locality of
>>>>>     >     > > consumption
>>>>>     >     > > >> > >> >> >>> guarantees.
>>>>>     >     > > >> > >> >> >>> > > E+F. a use case I see for
>>>> lineage/metadata is
>>>>>     >     > > >> > >> billing/chargeback.
>>>>>     >     > > >> > >> >> for
>>>>>     >     > > >> > >> >> >>> > that
>>>>>     >     > > >> > >> >> >>> > > use case it is not enough to simply
>>>> record the
>>>>>     > point
>>>>>     >     > of
>>>>>     >     > > >> > origin,
>>>>>     >     > > >> > >> but
>>>>>     >     > > >> > >> >> >>> every
>>>>>     >     > > >> > >> >> >>> > > replication stop (think mirror maker)
>>>> must also
>>>>>     > add a
>>>>>     >     > > >> record
>>>>>     >     > > >> > to
>>>>>     >     > > >> > >> >> form a
>>>>>     >     > > >> > >> >> >>> > > "transit log".
>>>>>     >     > > >> > >> >> >>> > >
>>>>>     >     > > >> > >> >> >>> > > as for stream processing on top of
>>>> kafka - i
>>>>>     > know
>>>>>     >     > samza
>>>>>     >     > > >> has a
>>>>>     >     > > >> > >> >> metadata
>>>>>     >     > > >> > >> >> >>> > map
>>>>>     >     > > >> > >> >> >>> > > which they carry around in addition to
>>>> user
>>>>>     > values.
>>>>>     >     > > headers
>>>>>     >     > > >> > are
>>>>>     >     > > >> > >> the
>>>>>     >     > > >> > >> >> >>> > perfect
>>>>>     >     > > >> > >> >> >>> > > fit for these things.
>>>>>     >     > > >> > >> >> >>> > >
>>>>>     >     > > >> > >> >> >>> > >
>>>>>     >     > > >> > >> >> >>> > >
>>>>>     >     > > >> > >> >> >>> > > On Wed, Nov 30, 2016 at 6:50 PM, Jun
>>>> Rao <
>>>>>     >     > > j...@confluent.io
>>>>>     >     > > >> >
>>>>>     >     > > >> > >> wrote:
>>>>>     >     > > >> > >> >> >>> > >
>>>>>     >     > > >> > >> >> >>> > >> Hi, Michael,
>>>>>     >     > > >> > >> >> >>> > >>
>>>>>     >     > > >> > >> >> >>> > >> In order to answer the first two
>>>> questions, it
>>>>>     > would
>>>>>     >     > be
>>>>>     >     > > >> > helpful
>>>>>     >     > > >> > >> >> if we
>>>>>     >     > > >> > >> >> >>> > could
>>>>>     >     > > >> > >> >> >>> > >> identify 1 or 2 strong use cases for
>>>> headers
>>>>>     > in the
>>>>>     >     > > space
>>>>>     >     > > >> > for
>>>>>     >     > > >> > >> >> >>> > third-party
>>>>>     >     > > >> > >> >> >>> > >> vendors. For use cases within an
>>>> organization,
>>>>>     > one
>>>>>     >     > > could
>>>>>     >     > > >> > always
>>>>>     >     > > >> > >> >> use
>>>>>     >     > > >> > >> >> >>> > other
>>>>>     >     > > >> > >> >> >>> > >> approaches such as company-wise
>>>> containers to
>>>>>     > get
>>>>>     >     > > around
>>>>>     >     > > >> w/o
>>>>>     >     > > >> > >> >> >>> headers. I
>>>>>     >     > > >> > >> >> >>> > >> went through the use cases in the KIP
>>>> and in
>>>>>     > Radai's
>>>>>     >     > > wiki
>>>>>     >     > > >> (
>>>>>     >     > > >> > >> >> >>> > >> https://cwiki.apache.org/confluence/display/KAFKA/A+Case+for+Kafka+Headers
>>>>>     >     > > >> > >> >> >>> > >> ).
>>>>>     >     > > >> > >> >> >>> > >> The following are the ones that I
>>>>>     > understand and
>>>>>     >     > > >> could
>>>>>     >     > > >> > be
>>>>>     >     > > >> > >> in
>>>>>     >     > > >> > >> >> the
>>>>>     >     > > >> > >> >> >>> > >> third-party use case category.
>>>>>     >     > > >> > >> >> >>> > >>
>>>>>     >     > > >> > >> >> >>> > >> A. content-type
>>>>>     >     > > >> > >> >> >>> > >> It seems that in general, content-type
>>>> should
>>>>>     > be set
>>>>>     >     > at
>>>>>     >     > > >> the
>>>>>     >     > > >> > >> topic
>>>>>     >     > > >> > >> >> >>> level.
>>>>>     >     > > >> > >> >> >>> > >> Not sure if mixing messages with
>>>> different
>>>>>     > content
>>>>>     >     > > types
>>>>>     >     > > >> > >> should be
>>>>>     >     > > >> > >> >> >>> > >> encouraged.
>>>>>     >     > > >> > >> >> >>> > >>
>>>>>     >     > > >> > >> >> >>> > >> B. schema id
>>>>>     >     > > >> > >> >> >>> > >> Since the value is mostly useless
>>>> without
>>>>>     > schema id,
>>>>>     >     > it
>>>>>     >     > > >> > seems
>>>>>     >     > > >> > >> that
>>>>>     >     > > >> > >> >> >>> > storing
>>>>>     >     > > >> > >> >> >>> > >> the schema id together with serialized
>>>> bytes
>>>>>     > in the
>>>>>     >     > > value
>>>>>     >     > > >> is
>>>>>     >     > > >> > >> >> better?
>>>>>     >     > > >> > >> >> >>> > >>
>>>>>     >     > > >> > >> >> >>> > >> C. per message encryption
>>>>>     >     > > >> > >> >> >>> > >> One drawback of this approach is that
>>>> this
>>>>>     >     > > significantly
>>>>>     >     > > >> > reduces
>>>>>     >     > > >> > >> >> the
>>>>>     >     > > >> > >> >> >>> > >> effectiveness of compression, which
>>>> happens on
>>>>>     > a set
>>>>>     >     > of
>>>>>     >     > > >> > >> serialized
>>>>>     >     > > >> > >> >> >>> > >> messages. An alternative is to enable
>>>> SSL for
>>>>>     > wire
>>>>>     >     > > >> > encryption
>>>>>     >     > > >> > >> and
>>>>>     >     > > >> > >> >> >>> rely
>>>>>     >     > > >> > >> >> >>> > on
>>>>>     >     > > >> > >> >> >>> > >> the storage system (e.g. LUKS) for at
>>>> rest
>>>>>     >     > encryption.
>>>>>     >     > > >> > >> >> >>> > >>
>>>>>     >     > > >> > >> >> >>> > >> D. cluster ID for mirroring across
>>>> Kafka
>>>>>     > clusters
>>>>>     >     > > >> > >> >> >>> > >> This is actually interesting. Today,
>>>> to avoid
>>>>>     >     > > introducing
>>>>>     >     > > >> > >> cycles
>>>>>     >     > > >> > >> >> when
>>>>>     >     > > >> > >> >> >>> > doing
>>>>>     >     > > >> > >> >> >>> > >> mirroring across data centers, one
>>>> would
>>>>>     > either have
>>>>>     >     > to
>>>>>     >     > > >> set
>>>>>     >     > > >> > up
>>>>>     >     > > >> > >> two
>>>>>     >     > > >> > >> >> >>> Kafka
>>>>>     >     > > >> > >> >> >>> > >> clusters (a local and an aggregate)
>>>> per data
>>>>>     > center
>>>>>     >     > or
>>>>>     >     > > >> > rename
>>>>>     >     > > >> > >> >> topics.
>>>>>     >     > > >> > >> >> >>> > >> Neither is ideal. With headers, the
>>>> producer
>>>>>     > could
>>>>>     >     > tag
>>>>>     >     > > >> each
>>>>>     >     > > >> > >> >> message
>>>>>     >     > > >> > >> >> >>> with
>>>>>     >     > > >> > >> >> >>> > >> the producing cluster ID in the header.
>>>>>     > MirrorMaker
>>>>>     >     > > could
>>>>>     >     > > >> > then
>>>>>     >     > > >> > >> >> avoid
>>>>>     >     > > >> > >> >> >>> > >> mirroring messages to a cluster if
>>>> they are
>>>>>     > tagged
>>>>>     >     > with
>>>>>     >     > > >> the
>>>>>     >     > > >> > >> same
>>>>>     >     > > >> > >> >> >>> cluster
>>>>>     >     > > >> > >> >> >>> > >> id.
>>>>>     >     > > >> > >> >> >>> > >>
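The cycle-avoidance idea in (D) can be sketched as follows. This is only an illustration under stated assumptions: the header key `origin-cluster-id`, the dict-of-bytes header representation, and both function names are hypothetical and do not come from the KIP or from MirrorMaker.

```python
# Hypothetical header key; the thread does not settle on a name or number.
CLUSTER_ID_HEADER = "origin-cluster-id"


def tag_with_cluster(headers, cluster_id):
    """Producer side: record the producing cluster unless already tagged."""
    tagged = dict(headers)
    # setdefault keeps the original tag when a record is re-produced by a mirror.
    tagged.setdefault(CLUSTER_ID_HEADER, cluster_id.encode("utf-8"))
    return tagged


def should_mirror(headers, target_cluster_id):
    """Mirror side: skip records that originated in the target cluster."""
    origin = headers.get(CLUSTER_ID_HEADER)
    return origin != target_cluster_id.encode("utf-8")
```

A record tagged in cluster "dc1" would be mirrored to "dc2" but not back to "dc1"; untagged records are always mirrored in this sketch.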
>>>>>     >     > > >> > >> >> >>> > >> However, an alternative approach is to
>>>>>     > introduce sth
>>>>>     >     > > like
>>>>>     >     > > >> > >> >> >>> hierarchical
>>>>>     >     > > >> > >> >> >>> > >> topic and store messages from different
>>>>>     > clusters in
>>>>>     >     > > >> > different
>>>>>     >     > > >> > >> >> >>> partitions
>>>>>     >     > > >> > >> >> >>> > >> under the same topic. This approach
>>>> avoids
>>>>>     > filtering
>>>>>     >     > > out
>>>>>     >     > > >> > >> unneeded
>>>>>     >     > > >> > >> >> >>> data
>>>>>     >     > > >> > >> >> >>> > and
>>>>>     >     > > >> > >> >> >>> > >> makes offset preserving easier to
>>>> support. It
>>>>>     > may
>>>>>     >     > make
>>>>>     >     > > >> > >> compaction
>>>>>     >     > > >> > >> >> >>> > trickier
>>>>>     >     > > >> > >> >> >>> > >> though since the same key may show up
>>>> in
>>>>>     > different
>>>>>     >     > > >> > partitions.
>>>>>     >     > > >> > >> >> >>> > >>
>>>>>     >     > > >> > >> >> >>> > >> E. record-level lineage
>>>>>     >     > > >> > >> >> >>> > >> For example, a source connector could
>>>> store in
>>>>>     > the
>>>>>     >     > > message
>>>>>     >     > > >> > the
>>>>>     >     > > >> > >> >> >>> metadata
>>>>>     >     > > >> > >> >> >>> > >> (e.g. UUID) of the source record.
>>>> Similarly,
>>>>>     > if a
>>>>>     >     > > stream
>>>>>     >     > > >> job
>>>>>     >     > > >> > >> >> >>> transforms
>>>>>     >     > > >> > >> >> >>> > >> messages from topic A to topic B, the
>>>> library
>>>>>     > could
>>>>>     >     > > >> include
>>>>>     >     > > >> > the
>>>>>     >     > > >> > >> >> >>> source
>>>>>     >     > > >> > >> >> >>> > >> message offset in each of the
>>>> transformed
>>>>>     > message in
>>>>>     >     > > the
>>>>>     >     > > >> > >> header.
>>>>>     >     > > >> > >> >> Not
>>>>>     >     > > >> > >> >> >>> > sure
>>>>>     >     > > >> > >> >> >>> > >> how widely useful record-level lineage
>>>> is
>>>>>     > though
>>>>>     >     > since
>>>>>     >     > > the
>>>>>     >     > > >> > >> >> overhead
>>>>>     >     > > >> > >> >> >>> > could
>>>>>     >     > > >> > >> >> >>> > >> be significant.
>>>>>     >     > > >> > >> >> >>> > >>
>>>>>     >     > > >> > >> >> >>> > >> F. auditing metadata
>>>>>     >     > > >> > >> >> >>> > >> We could put things like
>>>> clientId/host/user in
>>>>>     > the
>>>>>     >     > > header
>>>>>     >     > > >> in
>>>>>     >     > > >> > >> each
>>>>>     >     > > >> > >> >> >>> > message
>>>>>     >     > > >> > >> >> >>> > >> for auditing. These metadata are
>>>> really at the
>>>>>     >     > producer
>>>>>     >     > > >> > level
>>>>>     >     > > >> > >> >> though.
>>>>>     >     > > >> > >> >> >>> > So, a
>>>>>     >     > > >> > >> >> >>> > >> more efficient way is to only include a
>>>>>     > "producerId"
>>>>>     >     > > per
>>>>>     >     > > >> > >> message
>>>>>     >     > > >> > >> >> and
>>>>>     >     > > >> > >> >> >>> > send
>>>>>     >     > > >> > >> >> >>> > >> the producerId -> metadata mapping
>>>>>     > independently.
>>>>>     >     > > KIP-98
>>>>>     >     > > >> is
>>>>>     >     > > >> > >> >> actually
>>>>>     >     > > >> > >> >> >>> > >> proposing including such a producerId
>>>> natively
>>>>>     > in the
>>>>>     >     > > >> > message.
>>>>>     >     > > >> > >> >> >>> > >>
>>>>>     >     > > >> > >> >> >>> > >> So, overall, I am not sure that I am fully
>>>>>     > convinced of
>>>>>     >     > > the
>>>>>     >     > > >> > strong
>>>>>     >     > > >> > >> >> >>> > third-party
>>>>>     >     > > >> > >> >> >>> > >> use cases of headers yet. Perhaps we
>>>> could
>>>>>     > discuss a
>>>>>     >     > > bit
>>>>>     >     > > >> > more
>>>>>     >     > > >> > >> to
>>>>>     >     > > >> > >> >> make
>>>>>     >     > > >> > >> >> >>> > one
>>>>>     >     > > >> > >> >> >>> > >> or two really convincing use cases.
>>>>>     >     > > >> > >> >> >>> > >>
>>>>>     >     > > >> > >> >> >>> > >> Another orthogonal question is
>>>> whether header
>>>>>     > should
>>>>>     >     > > be
>>>>>     >     > > >> > >> exposed
>>>>>     >     > > >> > >> >> in
>>>>>     >     > > >> > >> >> >>> > stream
>>>>>     >     > > >> > >> >> >>> > >> processing systems such as Kafka Streams,
>>>> Samza,
>>>>>     > and
>>>>>     >     > Spark
>>>>>     >     > > >> > >> streaming.
>>>>>     >     > > >> > >> >> >>> > >> Currently, those systems just deal with
>>>>>     > key/value
>>>>>     >     > > pairs.
>>>>>     >     > > >> > >> Should we
>>>>>     >     > > >> > >> >> >>> > expose a
>>>>>     >     > > >> > >> >> >>> > >> third thing header there too or
>>>> somehow map
>>>>>     > header to
>>>>>     >     > > key
>>>>>     >     > > >> or
>>>>>     >     > > >> > >> >> value?
>>>>>     >     > > >> > >> >> >>> > >>
>>>>>     >     > > >> > >> >> >>> > >> Thanks,
>>>>>     >     > > >> > >> >> >>> > >>
>>>>>     >     > > >> > >> >> >>> > >> Jun
>>>>>     >     > > >> > >> >> >>> > >>
>>>>>     >     > > >> > >> >> >>> > >>
>>>>>     >     > > >> > >> >> >>> > >> On Tue, Nov 29, 2016 at 3:35 AM,
>>>> Michael
>>>>>     > Pearce <
>>>>>     >     > > >> > >> >> >>> michael.pea...@ig.com>
>>>>>     >     > > >> > >> >> >>> > >> wrote:
>>>>>     >     > > >> > >> >> >>> > >>
>>>>>     >     > > >> > >> >> >>> > >> > I assume, that after a period of a
>>>> week,
>>>>>     > that there
>>>>>     >     > > are
>>>>>     >     > > >> no
>>>>>     >     > > >> > >> >> concerns
>>>>>     >     > > >> > >> >> >>> now
>>>>>     >     > > >> > >> >> >>> > >> > with points 1, and 2 and now we have
>>>>>     > agreement that
>>>>>     >     > > >> > headers
>>>>>     >     > > >> > >> are
>>>>>     >     > > >> > >> >> >>> useful
>>>>>     >     > > >> > >> >> >>> > >> and
>>>>>     >     > > >> > >> >> >>> > >> > needed in Kafka. As such if put to a
>>>> KIP
>>>>>     > vote, this
>>>>>     >     > > >> > wouldn’t
>>>>>     >     > > >> > >> be
>>>>>     >     > > >> > >> >> a
>>>>>     >     > > >> > >> >> >>> > reason
>>>>>     >     > > >> > >> >> >>> > >> to
>>>>>     >     > > >> > >> >> >>> > >> > reject.
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> > @
>>>>>     >     > > >> > >> >> >>> > >> > Ignacio on point 4).
>>>>>     >     > > >> > >> >> >>> > >> > I think for purpose of getting this
>>>> KIP
>>>>>     > moving past
>>>>>     >     > > >> this,
>>>>>     >     > > >> > we
>>>>>     >     > > >> > >> can
>>>>>     >     > > >> > >> >> >>> state
>>>>>     >     > > >> > >> >> >>> > >> the
>>>>>     >     > > >> > >> >> >>> > >> > key will be a 4-byte space that
>>>> will be
>>>>>     >     > > naturally
>>>>>     >     > > >> > >> >> interpreted
>>>>>     >     > > >> > >> >> >>> as
>>>>>     >     > > >> > >> >> >>> > an
>>>>>     >     > > >> > >> >> >>> > >> > Int32 (if namespacing is later
>>>> wanted you can
>>>>>     >     > easily
>>>>>     >     > > >> split
>>>>>     >     > > >> > >> this
>>>>>     >     > > >> > >> >> >>> into
>>>>>     >     > > >> > >> >> >>> > two
>>>>>     >     > > >> > >> >> >>> > >> > int16 spaces), from the wire protocol
>>>>>     >     > implementation
>>>>>     >     > > >> this
>>>>>     >     > > >> > >> makes
>>>>>     >     > > >> > >> >> no
>>>>>     >     > > >> > >> >> >>> > >> > difference I don’t believe. Is this
>>>>>     > reasonable to
>>>>>     >     > > all?
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> > On 5) as per point 4, therefore happy
>>>> we keep
>>>>>     > with 32
>>>>>     >     > > >> bits.
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> > On 18/11/2016, 20:34, "
>>>>>     > ignacio.so...@gmail.com on
>>>>>     >     > > >> behalf
>>>>>     >     > > >> > of
>>>>>     >     > > >> > >> >> >>> Ignacio
>>>>>     >     > > >> > >> >> >>> > >> > Solis" <ignacio.so...@gmail.com on
>>>> behalf of
>>>>>     >     > > >> > iso...@igso.net
>>>>>     >     > > >> > >> >
>>>>>     >     > > >> > >> >> >>> wrote:
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >     Summary:
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >     3) Yes - Header value as byte[]
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >     4a) Int,Int - No
>>>>>     >     > > >> > >> >> >>> > >> >     4b) Int - Yes
>>>>>     >     > > >> > >> >> >>> > >> >     4c) String - Reluctant maybe
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >     5) I believe the header system
>>>> should
>>>>>     > take a
>>>>>     >     > > single
>>>>>     >     > > >> > >> int.  I
>>>>>     >     > > >> > >> >> >>> think
>>>>>     >     > > >> > >> >> >>> > >> > 32bits is
>>>>>     >     > > >> > >> >> >>> > >> >     a good size, if you want to
>>>> interpret
>>>>>     > this as
>>>>>     >     > two
>>>>>     >     > > >> 16bit
>>>>>     >     > > >> > >> >> numbers
>>>>>     >     > > >> > >> >> >>> in
>>>>>     >     > > >> > >> >> >>> > the
>>>>>     >     > > >> > >> >> >>> > >> > layer
>>>>>     >     > > >> > >> >> >>> > >> >     above go right ahead.  If
>>>> somebody wants
>>>>>     > to
>>>>>     >     > argue
>>>>>     >     > > >> for
>>>>>     >     > > >> > 16
>>>>>     >     > > >> > >> >> bits
>>>>>     >     > > >> > >> >> >>> or
>>>>>     >     > > >> > >> >> >>> > 64
>>>>>     >     > > >> > >> >> >>> > >> > bits of
>>>>>     >     > > >> > >> >> >>> > >> >     header key space I would listen.
>>>>>     >     > > >> > >> >> >>> > >> >
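The point in (5), that a single 32-bit key can be read as two 16-bit numbers by a layer above, can be sketched as below. The names `split_key` and `join_key` are hypothetical, and the key is treated as an unsigned 32-bit bit pattern, which is an assumption made for illustration rather than something the thread specifies.

```python
def join_key(namespace, ident):
    """Pack two 16-bit values into one 32-bit header key."""
    if not (0 <= namespace <= 0xFFFF and 0 <= ident <= 0xFFFF):
        raise ValueError("namespace and ident must each fit in 16 bits")
    return (namespace << 16) | ident


def split_key(key):
    """View a 32-bit header key as a (namespace, ident) pair of 16-bit values."""
    if not 0 <= key <= 0xFFFFFFFF:
        raise ValueError("key must fit in 32 bits")
    return (key >> 16) & 0xFFFF, key & 0xFFFF
```

Either way the wire format carries the same 4 bytes, and matching on the key remains a plain integer equality check; the split exists only in the interpreting layer.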
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >     Discussion:
>>>>>     >     > > >> > >> >> >>> > >> >     Dividing the key space into
>>>> sub_key_1 and
>>>>>     >     > > sub_key_2
>>>>>     >     > > >> > >> makes no
>>>>>     >     > > >> > >> >> >>> > sense to
>>>>>     >     > > >> > >> >> >>> > >> > me at
>>>>>     >     > > >> > >> >> >>> > >> >     this layer.  Are we going to
>>>> start
>>>>>     > providing
>>>>>     >     > > APIs to
>>>>>     >     > > >> > get
>>>>>     >     > > >> > >> all
>>>>>     >     > > >> > >> >> >>> the
>>>>>     >     > > >> > >> >> >>> > >> >     sub_key_1s? or all the
>>>> sub_key_2s?  If
>>>>>     > there is
>>>>>     >     > > no
>>>>>     >     > > >> > >> >> >>> distinguishing
>>>>>     >     > > >> > >> >> >>> > >> > functions
>>>>>     >     > > >> > >> >> >>> > >> >     that are applied to each one
>>>> then they
>>>>>     > should
>>>>>     >     > be
>>>>>     >     > > a
>>>>>     >     > > >> > single
>>>>>     >     > > >> > >> >> >>> value.
>>>>>     >     > > >> > >> >> >>> > At
>>>>>     >     > > >> > >> >> >>> > >> > this
>>>>>     >     > > >> > >> >> >>> > >> >     layer all we're doing is
>>>> equality.
>>>>>     >     > > >> > >> >> >>> > >> >     If the above layer wants to
>>>> interpret
>>>>>     > this as
>>>>>     >     > 2,
>>>>>     >     > > 3
>>>>>     >     > > >> or
>>>>>     >     > > >> > >> more
>>>>>     >     > > >> > >> >> >>> values
>>>>>     >     > > >> > >> >> >>> > >> > that's a
>>>>>     >     > > >> > >> >> >>> > >> >     different question.  I
>>>> personally think
>>>>>     > it's
>>>>>     >     > all
>>>>>     >     > > one
>>>>>     >     > > >> > >> >> keyspace
>>>>>     >     > > >> > >> >> >>> > that is
>>>>>     >     > > >> > >> >> >>> > >> >     getting assigned using some
>>>> structure,
>>>>>     > but if
>>>>>     >     > you
>>>>>     >     > > >> > want to
>>>>>     >     > > >> > >> >> >>> > sub-assign
>>>>>     >     > > >> > >> >> >>> > >> > parts
>>>>>     >     > > >> > >> >> >>> > >> >     of it then that's fine.
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >     The same discussion applies to
>>>> strings.
>>>>>     > If
>>>>>     >     > > somebody
>>>>>     >     > > >> > >> argued
>>>>>     >     > > >> > >> >> for
>>>>>     >     > > >> > >> >> >>> > >> > strings,
>>>>>     >     > > >> > >> >> >>> > >> >     would we be arguing to divide the
>>>>>     > strings with
>>>>>     >     > > dots
>>>>>     >     > > >> > ('.')
>>>>>     >     > > >> > >> >> as a
>>>>>     >     > > >> > >> >> >>> > >> > requirement?
>>>>>     >     > > >> > >> >> >>> > >> >     Would we want them to give us the
>>>>>     > different
>>>>>     >     > name
>>>>>     >     > > >> > segments
>>>>>     >     > > >> > >> >> >>> > separately?
>>>>>     >     > > >> > >> >> >>> > >> >     Would we be performing any
>>>> actions on
>>>>>     > this key
>>>>>     >     > > other
>>>>>     >     > > >> > than
>>>>>     >     > > >> > >> >> >>> > matching?
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >     Nacho
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >     On Fri, Nov 18, 2016 at 9:30 AM,
>>>> Michael
>>>>>     >     > Pearce <
>>>>>     >     > > >> > >> >> >>> > >> michael.pea...@ig.com
>>>>>     >     > > >> > >> >> >>> > >> > >
>>>>>     >     > > >> > >> >> >>> > >> >     wrote:
>>>>>     >     > > >> > >> >> >>> > >> >
>>>>>     >     > > >> > >> >> >>> > >> >     > #jay #jun any concerns on 1
>>>> and 2
>>>>>     > still?
>>>>>     >     > > >> > >> >> >>> > >> >     >
>>>>>     >     > > >> > >> >> >>> > >> >     > @all
>>>>>     >     > > >> > >> >> >>> > >> >     > To get this moving along a bit
>>>> more
>>>>>     > I'd also
>>>>>     >     > > like
>>>>>     >     > > >> to
>>>>>     >     > > >> > >> ask
>>>>>     >     > > >> > >> >> to
>>>>>     >     > > >> > >> >> >>> get
>>>>>     >     > > >> > >> >> >>> > >> > clarity on
>>>>>     >     > > >> > >> >> >>> > >> >     > the below last points:
>>>>>     >     > > >> > >> >> >>> > >> >     >
>>>>>     >     > > >> > >> >> >>> > >> >     > 3) I believe we're all roughly
>>>> happy
>>>>>     > with the
>>>>>     >     > > >> header
>>>>>     >     > > >> > >> value
>>>>>     >     > > >> > >> >> >>> > being a
>>>>>     >     > > >> > >> >> >>> > >> > byte[]?
>>>>>     >     > > >> > >> >> >>> > >> >     >
>>>>>     >     > > >> > >> >> >>> > >> >     > 4) I believe consensus has
>>>> been for an
>>>>>     >     > > namespace
>>>>>     >     > > >> > based
>>>>>     >     > > >> > >> int
>>>>>     >     > > >> > >> >> >>> > approach
>>>>>     >     > > >> > >> >> >>> > >> >     > {int,int} for the key. Any
>>>> objections
>>>>>     > if this
>>>>>     >     > > is
>>>>>     >     > > >> > what
>>>>>     >     > > >> > >> we
>>>>>     >     > > >> > >> >> go
>>>>>     >     > > >> > >> >> >>> > with?
>>>>>     >     > > >> > >> >> >>> > >> >     >
>>>>>     >     > > >> > >> >> >>> > >> >     > 5) as we have, if the assumption in
>>>> (4) is
>>>>>     >     > correct,
>>>>>     >     > > >> > >> {int,int}
>>>>>     >     > > >> > >> >> >>> keys.
>>>>>     >     > > >> > >> >> >>> > >> >     > Should both int's be int16 or
>>>> int32?
>>>>>     >     > > >> > >> >> >>> > >> >     > I'm for them being int16(2
>>>> bytes) as
>>>>>     > combined
>>>>>     >     > > is
>>>>>     >     > > >> > space
>>>>>     >     > > >> > >> of
>>>>>     >     > > >> > >> >> >>> > 4bytes as
>>>>>     >     > > >> > >> >> >>> > >> > per
>>>>>     >     > > >> > >> >> >>> > >> >     > original and gives plenty of
>>>>>     > combinations for
>>>>>     >     > > the
>>>>>     >     > > >> > >> >> >>> foreseeable,
>>>>>     >     > > >> > >> >> >>> > and
>>>>>     >     > > >> > >> >> >>> > >> > keeps
>>>>>     >     > > >> > >> >> >>> > >> >     > the overhead small.
>>>>>     >     > > >> > >> >> >>> > >> >     >
>>>>>     >     > > >> > >> >> >>> > >> >     > Do we see any benefit in
>>>> another kip
>>>>>     > call to
>>>>>     >     > > >> discuss
>>>>>     >     > > >> > >> >> these at
>>>>>     >     > > >> > >> >> >>> > all?
>>>>>     >     > > >> > >> >> >>> > >> >     >
>>>>>     >     > > >> > >> >> >>> > >> >     > Cheers
>>>>>     >     > > >> > >> >> >>> > >> >     > Mike
>>>>>     >     > > >> > >> >> >>> > >> >     > ______________________________
>>>>>     > __________
>>>>>     >     > > >> > >> >> >>> > >> >     > From: K Burstev <
>>>> k.burs...@yandex.com>
>>>>>     >     > > >> > >> >> >>> > >> >     > Sent: Friday, November 18, 2016
>>>>>     > 7:07:07 AM
>>>>>     >     > > >> > >> >> >>> > >> >     > To: dev@kafka.apache.org
>>>>>     >     > > >> > >> >> >>> > >> >     > Subject: Re: [DISCUSS] KIP-82
>>>> - Add
>>>>>     > Record
>>>>>     >     > > Headers
>>>>>     >     > > >> > >> >> >>> > >> >     >
>>>>>     >     > > >> > >> >> >>> > >> >     > For what it is worth, I also
>>>> agree. As
>>>>>     > a user:
>>>>>     >     > > >> > >> >> >>> > >> >     >
>>>>>     >     > > >> > >> >> >>> > >> >     >  1) Yes - Headers are
>>>> worthwhile
>>>>>     >     > > >> > >> >> >>> > >> >     >  2) Yes - Headers should be a
>>>> top level
>>>>>     >     > option
>>>>>     >     > > >> > >> >> >>> > >> >     >
>>>>>     >     > > >> > >> >> >>> > >> >     > 14.11.2016, 21:15, "Ignacio
>>>> Solis" <
>>>>>     >     > > >> iso...@igso.net
>>>>>     >     > > >> > >:
>>>>>     >     > > >> > >> >> >>> > >> >     > > 1) Yes - Headers are
>>>> worthwhile
>>>>>     >     > > >> > >> >> >>> > >> >     > > 2) Yes - Headers should be a
>>>> top
>>>>>     > level
>>>>>     >     > option
>>>>>     >     > > >> > >> >> >>> > >> >     > >
>>>>>     >     > > >> > >> >> >>> > >> >     > > On Mon, Nov 14, 2016 at 9:16
>>>> AM,
>>>>>     > Michael
>>>>>     >     > > Pearce
>>>>>     >     > > >> <
>>>>>     >     > > >> > >> >> >>> > >> > michael.pea...@ig.com>
>>>>>     >     > > >> > >> >> >>> > >> >     > > wrote:
>>>>>     >     > > >> > >> >> >>> > >> >     > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  Hi Roger,
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  The kip details/examples
>>>> the
>>>>>     > original
>>>>>     >     > > proposal
>>>>>     >     > > >> > for
>>>>>     >     > > >> > >> key
>>>>>     >     > > >> > >> >> >>> > spacing
>>>>>     >     > > >> > >> >> >>> > >> ,
>>>>>     >     > > >> > >> >> >>> > >> > not
>>>>>     >     > > >> > >> >> >>> > >> >     > the
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  new mentioned as per
>>>> discussion
>>>>>     > namespace
>>>>>     >     > > >> idea.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  We will need to update the
>>>> kip,
>>>>>     > when we
>>>>>     >     > get
>>>>>     >     > > >> > >> agreement
>>>>>     >     > > >> > >> >> >>> this
>>>>>     >     > > >> > >> >> >>> > is a
>>>>>     >     > > >> > >> >> >>> > >> > better
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  approach (which seems to
>>>> be the
>>>>>     > case if I
>>>>>     >     > > have
>>>>>     >     > > >> > >> >> understood
>>>>>     >     > > >> > >> >> >>> > the
>>>>>     >     > > >> > >> >> >>> > >> > general
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  feeling in the
>>>> conversation)
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  Re the variable ints, at
>>>> very
>>>>>     > early stage
>>>>>     >     > > we
>>>>>     >     > > >> did
>>>>>     >     > > >> > >> think
>>>>>     >     > > >> > >> >> >>> about
>>>>>     >     > > >> > >> >> >>> > >> > this. I
>>>>>     >     > > >> > >> >> >>> > >> >     > think
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  the added complexity for
>>>> the
>>>>>     > saving isn't
>>>>>     >     > > >> worth
>>>>>     >     > > >> > it.
>>>>>     >     > > >> > >> >> I'd
>>>>>     >     > > >> > >> >> >>> > rather
>>>>>     >     > > >> > >> >> >>> > >> go
>>>>>     >     > > >> > >> >> >>> > >> >     > with, if
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  we want to reduce
>>>> overheads and
>>>>>     > size
>>>>>     >     > int16
>>>>>     >     > > >> > (2bytes)
>>>>>     >     > > >> > >> >> keys
>>>>>     >     > > >> > >> >> >>> as
>>>>>     >     > > >> > >> >> >>> > it
>>>>>     >     > > >> > >> >> >>> > >> > keeps it
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  simple.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  On the note of no headers,
>>>> there
>>>>>     > is as
>>>>>     >     > per
>>>>>     >     > > the
>>>>>     >     > > >> > kip
>>>>>     >     > > >> > >> as
>>>>>     >     > > >> > >> >> we
>>>>>     >     > > >> > >> >> >>> > use an
>>>>>     >     > > >> > >> >> >>> > >> >     > attribute
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  bit to denote if headers
>>>> are
>>>>>     > present or
>>>>>     >     > > not as
>>>>>     >     > > >> > such
>>>>>     >     > > >> > >> >> >>> > provides a
>>>>>     >     > > >> > >> >> >>> > >> > zero
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  overhead currently if
>>>> headers are
>>>>>     > not
>>>>>     >     > used.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  I think as radai mentions
>>>> would be
>>>>>     > good
>>>>>     >     > > first
>>>>>     >     > > >> > if we
>>>>>     >     > > >> > >> >> can
>>>>>     >     > > >> > >> >> >>> get
>>>>>     >     > > >> > >> >> >>> > >> > clarity if
>>>>>     >     > > >> > >> >> >>> > >> >     > do
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  we now have general
>>>> consensus that
>>>>>     > (1)
>>>>>     >     > > headers
>>>>>     >     > > >> > are
>>>>>     >     > > >> > >> >> >>> > worthwhile
>>>>>     >     > > >> > >> >> >>> > >> and
>>>>>     >     > > >> > >> >> >>> > >> >     > useful,
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  and (2) we want it as a
>>>> top level
>>>>>     > entity.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  Just to state the obvious I
>>>>>     > believe (1)
>>>>>     >     > > >> headers
>>>>>     >     > > >> > are
>>>>>     >     > > >> > >> >> >>> > worthwhile
>>>>>     >     > > >> > >> >> >>> > >> > and (2)
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  agree as a top level
>>>> entity.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  Cheers
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  Mike
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>> ______________________________
>>>>>     > __________
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  From: Roger Hoover <
>>>>>     >     > roger.hoo...@gmail.com
>>>>>     >     > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  Sent: Wednesday, November
>>>> 9, 2016
>>>>>     > 9:10:47
>>>>>     >     > > PM
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  To: dev@kafka.apache.org
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  Subject: Re: [DISCUSS]
>>>> KIP-82 - Add
>>>>>     >     > Record
>>>>>     >     > > >> > Headers
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  Sorry for going a little
>>>> in the
>>>>>     > weeds but
>>>>>     >     > > >> thanks
>>>>>     >     > > >> > >> for
>>>>>     >     > > >> > >> >> the
>>>>>     >     > > >> > >> >> >>> > >> replies
>>>>>     >     > > >> > >> >> >>> > >> >     > regarding
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  varint.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  Agreed that a prefix and
>>>> {int,
>>>>>     > int} can
>>>>>     >     > be
>>>>>     >     > > the
>>>>>     >     > > >> > >> same.
>>>>>     >     > > >> > >> >> It
>>>>>     >     > > >> > >> >> >>> > doesn't
>>>>>     >     > > >> > >> >> >>> > >> > look
>>>>>     >     > > >> > >> >> >>> > >> >     > like
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  that's what the KIP is
>>>> saying in the
>>>>>     > "Open"
>>>>>     >     > > >> > section.
>>>>>     >     > > >> > >> The
>>>>>     >     > > >> > >> >> >>> > example
>>>>>     >     > > >> > >> >> >>> > >> > shows
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  2100001
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  for New Relic and 210002
>>>> for App
>>>>>     > Dynamics
>>>>>     >     > > >> > implying
>>>>>     >     > > >> > >> >> that
>>>>>     >     > > >> > >> >> >>> the
>>>>>     >     > > >> > >> >> >>> > New
>>>>>     >     > > >> > >> >> >>> > >> > Relic
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  organization will have
>>>> only a
>>>>>     > single
>>>>>     >     > > header id
>>>>>     >     > > >> > to
>>>>>     >     > > >> > >> work
>>>>>     >     > > >> > >> >> >>> > with. Or
>>>>>     >     > > >> > >> >> >>> > >> > is
>>>>>     >     > > >> > >> >> >>> > >> >     > 2100001
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  a prefix? The main point
>>>> of a
>>>>>     > namespace
>>>>>     >     > or
>>>>>     >     > > >> > prefix
>>>>>     >     > > >> > >> is
>>>>>     >     > > >> > >> >> to
>>>>>     >     > > >> > >> >> >>> > reduce
>>>>>     >     > > >> > >> >> >>> > >> > the
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  overhead of config mapping
>>>> or
>>>>>     >     > registration
>>>>>     >     > > >> > >> depending
>>>>>     >     > > >> > >> >> on
>>>>>     >     > > >> > >> >> >>> how
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  namespaces/prefixes are
>>>> managed.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  Would love to hear more
>>>> feedback
>>>>>     > on the
>>>>>     >     > > >> > >> higher-level
>>>>>     >     > > >> > >> >> >>> > questions
>>>>>     >     > > >> > >> >> >>> > >> >     > though...
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  Cheers,
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  Roger
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  On Wed, Nov 9, 2016 at
>>>> 11:38 AM,
>>>>>     > radai <
>>>>>     >     > > >> > >> >> >>> > >> > radai.rosenbl...@gmail.com>
>>>>>     >     > > >> > >> >> >>> > >> >     > wrote:
>>>>>     >     > > >> > >> >> >>> > >> >     > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > I think this discussion
>>>> is
>>>>>     > getting a
>>>>>     >     > bit
>>>>>     >     > > >> into
>>>>>     >     > > >> > the
>>>>>     >     > > >> > >> >> >>> weeds on
>>>>>     >     > > >> > >> >> >>> > >> > technical
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > implementation details.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > I'd like to step back a
>>>> minute
>>>>>     > and try
>>>>>     >     > > and
>>>>>     >     > > >> > >> establish
>>>>>     >     > > >> > >> >> >>> > where we
>>>>>     >     > > >> > >> >> >>> > >> > are in
>>>>>     >     > > >> > >> >> >>> > >> >     > the
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > larger picture:
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > (re-wording nacho's last
>>>>>     > paragraph)
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > 1. are we all in
>>>> agreement that
>>>>>     > headers
>>>>>     >     > > are
>>>>>     >     > > >> a
>>>>>     >     > > >> > >> >> >>> worthwhile
>>>>>     >     > > >> > >> >> >>> > and
>>>>>     >     > > >> > >> >> >>> > >> > useful
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > addition to have? this
>>>> was
>>>>>     > contested
>>>>>     >     > > early
>>>>>     >     > > >> on
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > 2. are we all in
>>>> agreement on
>>>>>     > headers
>>>>>     >     > as
>>>>>     >     > > top
>>>>>     >     > > >> > >> level
>>>>>     >     > > >> > >> >> >>> entity
>>>>>     >     > > >> > >> >> >>> > vs
>>>>>     >     > > >> > >> >> >>> > >> > headers
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > squirreled-away in V?
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > if there are still
>>>> concerns
>>>>>     > around
>>>>>     >     > these
>>>>>     >     > > #2
>>>>>     >     > > >> > >> points
>>>>>     >     > > >> > >> >> >>> (#jay?
>>>>>     >     > > >> > >> >> >>> > >> > #jun?)?
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > (and now back to our
>>>> normal
>>>>>     > programming
>>>>>     >     > > ...)
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > varints are nice. having
>>>> said
>>>>>     > that, it's
>>>>>     >     > > >> adding
>>>>>     >     > > >> > >> >> >>> complexity
>>>>>     >     > > >> > >> >> >>> > >> (see
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>> https://github.com/addthis/
>>>>>     >     > > >> > >> >> stream-lib/blob/master/src/
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>> main/java/com/clearspring/
>>>>>     >     > > >> > >> >> analytics/util/Varint.java
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > as 1st google result)
>>>> and would
>>>>>     > require
>>>>>     >     > > >> anyone
>>>>>     >     > > >> > >> >> writing
>>>>>     >     > > >> > >> >> >>> > other
>>>>>     >     > > >> > >> >> >>> > >> > clients
>>>>>     >     > > >> > >> >> >>> > >> >     > (C?
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > Python? Go? Bash? ;-) )
>>>> to
>>>>>     >     > get/implement
>>>>>     >     > > the
>>>>>     >     > > >> > >> same,
>>>>>     >     > > >> > >> >> and
>>>>>     >     > > >> > >> >> >>> for
>>>>>     >     > > >> > >> >> >>> > >> > relatively
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > little gain (int vs
>>>> string is
>>>>>     > order of
>>>>>     >     > > >> > magnitude,
>>>>>     >     > > >> > >> >> this
>>>>>     >     > > >> > >> >> >>> > isnt).
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
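For reference on the complexity being weighed here, a minimal sketch of the base-128 varint scheme described in the Protocol Buffers encoding documentation linked elsewhere in this thread. The function names are hypothetical and this is not code from the KIP or from any Kafka client; it handles unsigned values only.

```python
def encode_varint(value):
    """Encode a non-negative integer, 7 bits per byte, low-order group first."""
    if value < 0:
        raise ValueError("only non-negative values are handled in this sketch")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data, offset=0):
    """Decode one varint from data; return (value, offset after the varint)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, offset
        shift += 7
```

Values below 128 take a single byte, which is where the saving for small header ids and lengths would come from; every client implementation would need an equivalent of these two functions.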
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > int namespacing vs {int,
>>>> int}
>>>>>     >     > namespacing
>>>>>     >     > > >> are
>>>>>     >     > > >> > >> >> basically
>>>>>     >     > > >> > >> >> >>> > the
>>>>>     >     > > >> > >> >> >>> > >> > same
>>>>>     >     > > >> > >> >> >>> > >> >     > thing -
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > youre just namespacing
>>>> an int64
>>>>>     > and
>>>>>     >     > > giving
>>>>>     >     > > >> > people
>>>>>     >     > > >> > >> >> whole
>>>>>     >     > > >> > >> >> >>> > 2^32
>>>>>     >     > > >> > >> >> >>> > >> > ranges
>>>>>     >     > > >> > >> >> >>> > >> >     > at a
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > time. the part i like
>>>> about this
>>>>>     > is
>>>>>     >     > > letting
>>>>>     >     > > >> > >> people
>>>>>     >     > > >> > >> >> >>> have a
>>>>>     >     > > >> > >> >> >>> > >> large
>>>>>     >     > > >> > >> >> >>> > >> >     > swath of
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > numbers with one
>>>> registration so
>>>>>     > they
>>>>>     >     > > dont
>>>>>     >     > > >> > have
>>>>>     >     > > >> > >> to
>>>>>     >     > > >> > >> >> come
>>>>>     >     > > >> > >> >> >>> > back
>>>>>     >     > > >> > >> >> >>> > >> > for
>>>>>     >     > > >> > >> >> >>> > >> >     > every
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > single plugin/header
>>>> they want to
>>>>>     >     > > "reserve".
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
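A minimal sketch of the equivalence described above, assuming both the namespace and the id are unsigned 32-bit values: an {int, int} pair carries the same information as one int64 whose upper half is the namespace. The function names are hypothetical and not part of KIP-82.

```python
from typing import Tuple

MASK32 = 0xFFFFFFFF  # largest unsigned 32-bit value


def pack_header_key(namespace: int, header_id: int) -> int:
    """Combine a 32-bit namespace and a 32-bit id into one 64-bit key."""
    if not (0 <= namespace <= MASK32 and 0 <= header_id <= MASK32):
        raise ValueError("namespace and header_id must each fit in 32 bits")
    return (namespace << 32) | header_id


def unpack_header_key(key: int) -> Tuple[int, int]:
    """Split a 64-bit key back into (namespace, header_id)."""
    return key >> 32, key & MASK32
```

Registering namespace 3 once would then reserve every key from pack_header_key(3, 0) through pack_header_key(3, MASK32), which is the "large swath of numbers with one registration" point.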
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > On Wed, Nov 9, 2016 at
>>>> 11:01 AM,
>>>>>     > Roger
>>>>>     >     > > >> Hoover
>>>>>     >     > > >> > <
>>>>>     >     > > >> > >> >> >>> > >> >     > roger.hoo...@gmail.com>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > wrote:
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > Since some of the
>>>> debate has
>>>>>     > been
>>>>>     >     > about
>>>>>     >     > > >> > >> overhead +
>>>>>     >     > > >> > >> >> >>> > >> > performance, I'm
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > wondering if we have
>>>>>     > considered a
>>>>>     >     > > varint
>>>>>     >     > > >> > >> encoding
>>>>>     >     > > >> > >> >> (
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > >
>>>> https://developers.google.com/
>>>>>     >     > > >> > >> >> protocol-buffers/docs/
>>>>>     >     > > >> > >> >> >>> > >> >     > encoding#varints)
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > for
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > the header length
>>>> field (int32
>>>>>     > in the
>>>>>     >     > > >> > proposal)
>>>>>     >     > > >> > >> >> and
>>>>>     >     > > >> > >> >> >>> for
>>>>>     >     > > >> > >> >> >>> > >> > header
>>>>>     >     > > >> > >> >> >>> > >> >     > ids? If
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > you
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > don't use headers, the
>>>>>     > overhead would
>>>>>     >     > > be a
>>>>>     >     > > >> > >> single
>>>>>     >     > > >> > >> >> >>> byte
>>>>>     >     > > >> > >> >> >>> > and
>>>>>     >     > > >> > >> >> >>> > >> > for each
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > header
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > id < 128 would also
>>>> need only a
>>>>>     >     > single
>>>>>     >     > > >> byte?
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > >
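A rough sketch of the base-128 varint scheme from the linked protocol-buffers page, to make the "single byte" claim concrete. It assumes unsigned values only; whether KIP-82 would adopt this encoding is still an open question in this thread, and the function names are hypothetical.

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative int, 7 bits per byte, least significant group first."""
    if value < 0:
        raise ValueError("this sketch only handles unsigned values")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def decode_varint(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Return (value, position just past the varint)."""
    result = 0
    shift = 0
    pos = offset
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7
```

With this encoding a header length of 0 and any header id below 128 each take one byte, while an id of 128 or more needs at least two.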
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > On Wed, Nov 9, 2016 at
>>>> 6:43 AM,
>>>>>     >     > radai <
>>>>>     >     > > >> > >> >> >>> > >> > radai.rosenbl...@gmail.com>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > wrote:
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > @magnus - and very
>>>> dangerous
>>>>>     > (youre
>>>>>     >     > > >> > >> essentially
>>>>>     >     > > >> > >> >> >>> > >> > downloading and
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > executing
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > arbitrary code off
>>>> the
>>>>>     > internet on
>>>>>     >     > > your
>>>>>     >     > > >> > >> servers
>>>>>     >     > > >> > >> >> ...
>>>>>     >     > > >> > >> >> >>> > bad
>>>>>     >     > > >> > >> >> >>> > >> > idea
>>>>>     >     > > >> > >> >> >>> > >> >     > without
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  a
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > sandbox, even with)
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > as for it being a
>>>> purely
>>>>>     >     > > administrative
>>>>>     >     > > >> > task
>>>>>     >     > > >> > >> - i
>>>>>     >     > > >> > >> >> >>> > >> disagree.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > i wish it would,
>>>> really,
>>>>>     > because
>>>>>     >     > > then my
>>>>>     >     > > >> > >> earlier
>>>>>     >     > > >> > >> >> >>> > point on
>>>>>     >     > > >> > >> >> >>> > >> > the
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > complexity
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > of
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > the remapping
>>>> process would
>>>>>     > be
>>>>>     >     > > invalid,
>>>>>     >     > > >> > but
>>>>>     >     > > >> > >> at
>>>>>     >     > > >> > >> >> >>> > linkedin,
>>>>>     >     > > >> > >> >> >>> > >> > for
>>>>>     >     > > >> > >> >> >>> > >> >     > example,
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > we
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > (the team im in) run
>>>> kafka
>>>>>     > as a
>>>>>     >     > > service.
>>>>>     >     > > >> > we
>>>>>     >     > > >> > >> dont
>>>>>     >     > > >> > >> >> >>> > really
>>>>>     >     > > >> > >> >> >>> > >> > know
>>>>>     >     > > >> > >> >> >>> > >> >     > what our
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > users
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > (developing
>>>> applications
>>>>>     > that use
>>>>>     >     > > kafka)
>>>>>     >     > > >> > are
>>>>>     >     > > >> > >> up
>>>>>     >     > > >> > >> >> to
>>>>>     >     > > >> > >> >> >>> at
>>>>>     >     > > >> > >> >> >>> > any
>>>>>     >     > > >> > >> >> >>> > >> > given
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  moment.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > it
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > is very possible
>>>> (given the
>>>>>     >     > > existence of
>>>>>     >     > > >> > >> headers
>>>>>     >     > > >> > >> >> >>> and a
>>>>>     >     > > >> > >> >> >>> > >> >     > corresponding
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > plugin
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > ecosystem) for some
>>>>>     > application to
>>>>>     >     > > >> "equip"
>>>>>     >     > > >> > >> their
>>>>>     >     > > >> > >> >> >>> > >> producers
>>>>>     >     > > >> > >> >> >>> > >> > and
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > consumers
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > with the required
>>>> plugin
>>>>>     > without us
>>>>>     >     > > >> > knowing.
>>>>>     >     > > >> > >> i
>>>>>     >     > > >> > >> >> dont
>>>>>     >     > > >> > >> >> >>> > mean
>>>>>     >     > > >> > >> >> >>> > >> > to imply
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  thats
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > bad, i just want to
>>>> make the
>>>>>     > point
>>>>>     >     > > that
>>>>>     >     > > >> > its
>>>>>     >     > > >> > >> not
>>>>>     >     > > >> > >> >> as
>>>>>     >     > > >> > >> >> >>> > simple as
>>>>>     >     > > >> > >> >> >>> > >> >     > keeping it
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  in
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > sync across a
>>>> large-enough
>>>>>     >     > > organization.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > On Wed, Nov 9, 2016
>>>> at 6:17
>>>>>     > AM,
>>>>>     >     > > Magnus
>>>>>     >     > > >> > >> Edenhill
>>>>>     >     > > >> > >> >> <
>>>>>     >     > > >> > >> >> >>> > >> >     > mag...@edenhill.se>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > wrote:
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > I think there is a
>>>> piece
>>>>>     > missing
>>>>>     >     > in
>>>>>     >     > > >> the
>>>>>     >     > > >> > >> >> Strings
>>>>>     >     > > >> > >> >> >>> > >> > discussion,
>>>>>     >     > > >> > >> >> >>> > >> >     > where
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > pro-Stringers
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > reason that by
>>>> providing
>>>>>     > unique
>>>>>     >     > > string
>>>>>     >     > > >> > >> >> >>> identifiers
>>>>>     >     > > >> > >> >> >>> > for
>>>>>     >     > > >> > >> >> >>> > >> > each
>>>>>     >     > > >> > >> >> >>> > >> >     > header
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > everything will
>>>> just
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > magically work for
>>>> all
>>>>>     > parts of
>>>>>     >     > the
>>>>>     >     > > >> > stream
>>>>>     >     > > >> > >> >> >>> pipeline.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > But the strings
>>>> dont mean
>>>>>     >     > anything
>>>>>     >     > > by
>>>>>     >     > > >> > >> >> themselves,
>>>>>     >     > > >> > >> >> >>> > and
>>>>>     >     > > >> > >> >> >>> > >> > while we
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  could
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > probably envision
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > some auto plugin
>>>> loader
>>>>>     > that
>>>>>     >     > > >> downloads,
>>>>>     >     > > >> > >> >> compiles,
>>>>>     >     > > >> > >> >> >>> > links
>>>>>     >     > > >> > >> >> >>> > >> > and
>>>>>     >     > > >> > >> >> >>> > >> >     > runs
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > plugins
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > on-demand
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > as soon as they're
>>>> seen by
>>>>>     > a
>>>>>     >     > > >> consumer, I
>>>>>     >     > > >> > >> dont
>>>>>     >     > > >> > >> >> >>> really
>>>>>     >     > > >> > >> >> >>> > >> see
>>>>>     >     > > >> > >> >> >>> > >> > a
>>>>>     >     > > >> > >> >> >>> > >> >     > use-case
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > for
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > something
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > so dynamic (and
>>>> fragile) in
>>>>>     >     > > practice.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > In the real world
>>>> an
>>>>>     > application
>>>>>     >     > > will
>>>>>     >     > > >> be
>>>>>     >     > > >> > >> >> >>> configured
>>>>>     >     > > >> > >> >> >>> > >> with
>>>>>     >     > > >> > >> >> >>> > >> > a set
>>>>>     >     > > >> > >> >> >>> > >> >     > of
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > plugins
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > to either add
>>>> (producer)
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > or read (consumer)
>>>> headers.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > This is an
>>>> administrative
>>>>>     > task
>>>>>     >     > > based
>>>>>     >     > > >> on
>>>>>     >     > > >> > >> what
>>>>>     >     > > >> > >> >> >>> > features a
>>>>>     >     > > >> > >> >> >>> > >> > client
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > needs/provides and
>>>> results
>>>>>     > in
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > some sort of
>>>> configuration
>>>>>     > to
>>>>>     >     > > enable
>>>>>     >     > > >> and
>>>>>     >     > > >> > >> >> >>> configure
>>>>>     >     > > >> > >> >> >>> > the
>>>>>     >     > > >> > >> >> >>> > >> > desired
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > plugins.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > Since this needs
>>>> to be kept
>>>>>     >     > > somewhat
>>>>>     >     > > >> in
>>>>>     >     > > >> > >> sync
>>>>>     >     > > >> > >> >> >>> across
>>>>>     >     > > >> > >> >> >>> > an
>>>>>     >     > > >> > >> >> >>> > >> >     > organisation
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > (there
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > is no point in
>>>> having
>>>>>     > producers
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > add headers no
>>>> consumers
>>>>>     > will
>>>>>     >     > read,
>>>>>     >     > > >> and
>>>>>     >     > > >> > >> vice
>>>>>     >     > > >> > >> >> >>> versa),
>>>>>     >     > > >> > >> >> >>> > >> the
>>>>>     >     > > >> > >> >> >>> > >> > added
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > complexity
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > of assigning an id
>>>>>     > namespace
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > for each plugin as
>>>> it is
>>>>>     > being
>>>>>     >     > > >> > configured
>>>>>     >     > > >> > >> >> should
>>>>>     >     > > >> > >> >> >>> be
>>>>>     >     > > >> > >> >> >>> > >> > tolerable.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > /Magnus
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > 2016-11-09 13:06
>>>> GMT+01:00
>>>>>     >     > Michael
>>>>>     >     > > >> > Pearce <
>>>>>     >     > > >> > >> >> >>> > >> >     > michael.pea...@ig.com>:
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > Just
>>>> following/catching
>>>>>     > up on
>>>>>     >     > > what
>>>>>     >     > > >> > seems
>>>>>     >     > > >> > >> to
>>>>>     >     > > >> > >> >> be
>>>>>     >     > > >> > >> >> >>> an
>>>>>     >     > > >> > >> >> >>> > >> > active
>>>>>     >     > > >> > >> >> >>> > >> >     > night :)
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > @Radai sorry if
>>>> it may
>>>>>     > seem
>>>>>     >     > > obvious
>>>>>     >     > > >> > but
>>>>>     >     > > >> > >> what
>>>>>     >     > > >> > >> >> >>> does
>>>>>     >     > > >> > >> >> >>> > MD
>>>>>     >     > > >> > >> >> >>> > >> > stand
>>>>>     >     > > >> > >> >> >>> > >> >     > for?
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > My take on
>>>> String vs Int:
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > I will state
>>>> first I am
>>>>>     > pro Int
>>>>>     >     > > (16
>>>>>     >     > > >> or
>>>>>     >     > > >> > >> 32).
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > I do though
>>>> playing
>>>>>     > devils
>>>>>     >     > > advocate
>>>>>     >     > > >> > see a
>>>>>     >     > > >> > >> >> big
>>>>>     >     > > >> > >> >> >>> plus
>>>>>     >     > > >> > >> >> >>> > >> > with the
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > argument
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > of
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > String keys,
>>>> this is
>>>>>     > around
>>>>>     >     > > >> > integrating
>>>>>     >     > > >> > >> >> into an
>>>>>     >     > > >> > >> >> >>> > >> > existing
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > eco-system.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > As many other
>>>> systems use
>>>>>     >     > String
>>>>>     >     > > >> based
>>>>>     >     > > >> > >> >> headers
>>>>>     >     > > >> > >> >> >>> > >> (Flume,
>>>>>     >     > > >> > >> >> >>> > >> > JMS)
>>>>>     >     > > >> > >> >> >>> > >> >     > it
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > makes
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > it
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > much easier for
>>>> these to
>>>>>     > be
>>>>>     >     > > >> > >> >> >>> > incorporated/integrated
>>>>>     >     > > >> > >> >> >>> > >> > into.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > How with Int
>>>> based
>>>>>     > headers
>>>>>     >     > could
>>>>>     >     > > we
>>>>>     >     > > >> > >> provide
>>>>>     >     > > >> > >> >> a
>>>>>     >     > > >> > >> >> >>> > >> > way/guidance to
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  make
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > this
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > integration
>>>> simple /
>>>>>     > easy with
>>>>>     >     > > >> > transition
>>>>>     >     > > >> > >> >> flows
>>>>>     >     > > >> > >> >> >>> > over
>>>>>     >     > > >> > >> >> >>> > >> to
>>>>>     >     > > >> > >> >> >>> > >> >     > kafka?
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > * tough luck
>>>> buddy
>>>>>     > you're on
>>>>>     >     > your
>>>>>     >     > > >> own
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > * simply hash
>>>> the string
>>>>>     > into
>>>>>     >     > int
>>>>>     >     > > >> code
>>>>>     >     > > >> > >> and
>>>>>     >     > > >> > >> >> hope
>>>>>     >     > > >> > >> >> >>> > for
>>>>>     >     > > >> > >> >> >>> > >> no
>>>>>     >     > > >> > >> >> >>> > >> >     > collisions
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > (how
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > to
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > convert back
>>>> though?)
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > * http2 style as
>>>>>     > mentioned by
>>>>>     >     > > nacho.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
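A small sketch of the "hash the string into an int" option, assuming CRC32 as the hash (chosen only for illustration; nothing in this thread settles on a particular hash). It shows why converting back needs a separately maintained table of known names, and how a collision would surface. All names here are hypothetical.

```python
import zlib
from typing import Dict, Iterable


def header_id_for(name: str) -> int:
    """Map a string header name to an unsigned 32-bit id."""
    return zlib.crc32(name.encode("utf-8"))


def build_reverse_table(names: Iterable[str]) -> Dict[int, str]:
    """Build the id -> name table needed to convert back; fail on a collision."""
    table: Dict[int, str] = {}
    for name in names:
        hid = header_id_for(name)
        if hid in table and table[hid] != name:
            raise ValueError(
                f"collision: {table[hid]!r} and {name!r} both map to {hid}"
            )
        table[hid] = name
    return table
```

The hash itself is one-way, so a consumer can only recover the original string for names that were registered in such a table ahead of time.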
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > cheers,
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > Mike
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>>     > ______________________________
>>>>>     >     > > >> > __________
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > From: radai <
>>>>>     >     > > >> > radai.rosenbl...@gmail.com>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > Sent: Wednesday,
>>>>>     > November 9,
>>>>>     >     > 2016
>>>>>     >     > > >> > 8:12 AM
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > To:
>>>> dev@kafka.apache.org
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > Subject: Re:
>>>> [DISCUSS]
>>>>>     > KIP-82 -
>>>>>     >     > > Add
>>>>>     >     > > >> > >> Record
>>>>>     >     > > >> > >> >> >>> Headers
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > thinking about
>>>> it some
>>>>>     > more,
>>>>>     >     > the
>>>>>     >     > > >> best
>>>>>     >     > > >> > >> way to
>>>>>     >     > > >> > >> >> >>> > transmit
>>>>>     >     > > >> > >> >> >>> > >> > the
>>>>>     >     > > >> > >> >> >>> > >> >     > header
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > remapping
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > data to
>>>> consumers would
>>>>>     > be to
>>>>>     >     > > put it
>>>>>     >     > > >> > in
>>>>>     >     > > >> > >> the
>>>>>     >     > > >> > >> >> MD
>>>>>     >     > > >> > >> >> >>> > >> response
>>>>>     >     > > >> > >> >> >>> > >> >     > payload,
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  so
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > maybe
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > it should be
>>>> discussed
>>>>>     > now.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > On Wed, Nov 9,
>>>> 2016 at
>>>>>     > 12:09
>>>>>     >     > AM,
>>>>>     >     > > >> > radai <
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  radai.rosenbl...@gmail.com
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > wrote:
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > im not opposed
>>>> to the
>>>>>     > idea of
>>>>>     >     > > >> > namespace
>>>>>     >     > > >> > >> >> >>> mapping.
>>>>>     >     > > >> > >> >> >>> > >> all
>>>>>     >     > > >> > >> >> >>> > >> > im
>>>>>     >     > > >> > >> >> >>> > >> >     > saying
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  is
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > that
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > its
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > not part of
>>>> the "mvp"
>>>>>     > and,
>>>>>     >     > > since
>>>>>     >     > > >> it
>>>>>     >     > > >> > >> >> requires
>>>>>     >     > > >> > >> >> >>> no
>>>>>     >     > > >> > >> >> >>> > >> wire
>>>>>     >     > > >> > >> >> >>> > >> > format
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > change,
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > can
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > always be
>>>> added later.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > also, its not
>>>> as
>>>>>     > simple as
>>>>>     >     > just
>>>>>     >     > > >> > >> >> configuring
>>>>>     >     > > >> > >> >> >>> MM
>>>>>     >     > > >> > >> >> >>> > to
>>>>>     >     > > >> > >> >> >>> > >> do
>>>>>     >     > > >> > >> >> >>> > >> > the
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > transform:
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > lets
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > say i've
>>>> implemented
>>>>>     > large
>>>>>     >     > > message
>>>>>     >     > > >> > >> >> support as
>>>>>     >     > > >> > >> >> >>> > >> > {666,1} and
>>>>>     >     > > >> > >> >> >>> > >> >     > on
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  some
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > mirror
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > target cluster
>>>> its been
>>>>>     >     > > remapped
>>>>>     >     > > >> to
>>>>>     >     > > >> > >> >> {999,1}.
>>>>>     >     > > >> > >> >> >>> the
>>>>>     >     > > >> > >> >> >>> > >> > consumer
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  plugin
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > code
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > would
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > also need to
>>>> be told
>>>>>     > to look
>>>>>     >     > > for
>>>>>     >     > > >> the
>>>>>     >     > > >> > >> large
>>>>>     >     > > >> > >> >> >>> > message
>>>>>     >     > > >> > >> >> >>> > >> > "part X
>>>>>     >     > > >> > >> >> >>> > >> >     > of
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  Y"
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > header
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > under {999,1}.
>>>> doable,
>>>>>     > but
>>>>>     >     > > tricky.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >
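A toy sketch of the remapping problem described above, assuming headers are modelled as a dict keyed by (namespace, id). The class, the function, and the choice of id 1 for the "part X of Y" header are hypothetical and only illustrate why the consumer plugin must also learn the new namespace.

```python
from typing import Dict, Optional, Tuple

HeaderKey = Tuple[int, int]
PART_HEADER_ID = 1  # hypothetical id of the "part X of Y" header


class LargeMessagePlugin:
    """Consumer-side plugin that looks for its header under a configured namespace."""

    def __init__(self, namespace: int = 666) -> None:
        self.namespace = namespace

    def part_info(self, headers: Dict[HeaderKey, bytes]) -> Optional[bytes]:
        return headers.get((self.namespace, PART_HEADER_ID))


def remap_namespace(
    headers: Dict[HeaderKey, bytes], mapping: Dict[int, int]
) -> Dict[HeaderKey, bytes]:
    """What a mirroring step might do: rewrite namespaces, leave ids untouched."""
    return {(mapping.get(ns, ns), hid): value for (ns, hid), value in headers.items()}
```

After remap_namespace(headers, {666: 999}), a plugin left at its default namespace finds nothing, while LargeMessagePlugin(namespace=999) still sees the header: the mirror configuration and the consumer configuration have to change together.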
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > > On Tue, Nov 8,
>>>> 2016 at
>>>>>     > 10:29
>>>>>     >     > > PM,
>>>>>     >     > > >> > Gwen
>>>>>     >     > > >> > >> >> >>> Shapira <
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  g...@confluent.io
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > wrote:
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> While you can
>>>> do
>>>>>     > whatever
>>>>>     >     > you
>>>>>     >     > > >> want
>>>>>     >     > > >> > >> with a
>>>>>     >     > > >> > >> >> >>> > >> namespace
>>>>>     >     > > >> > >> >> >>> > >> > and
>>>>>     >     > > >> > >> >> >>> > >> >     > your
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > code,
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> what I'd
>>>> expect is
>>>>>     > for each
>>>>>     >     > > app
>>>>>     >     > > >> to make
>>>>>     >     > > >> > >> >> >>> namespaces
>>>>>     >     > > >> > >> >> >>> > >> >     > configurable...
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> So if I
>>>> accidentally
>>>>>     > used
>>>>>     >     > 666
>>>>>     >     > > for
>>>>>     >     > > >> > my
>>>>>     >     > > >> > >> HR
>>>>>     >     > > >> > >> >> >>> > >> department,
>>>>>     >     > > >> > >> >> >>> > >> > and
>>>>>     >     > > >> > >> >> >>> > >> >     > still
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > want
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > to
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> run RadaiApp,
>>>> I can
>>>>>     > config
>>>>>     >     > > >> > >> "namespace=42"
>>>>>     >     > > >> > >> >> >>> for
>>>>>     >     > > >> > >> >> >>> > >> > RadaiApp and
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > everything
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> will look
>>>> normal.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> This means
>>>> you only
>>>>>     > need to
>>>>>     >     > > sync
>>>>>     >     > > >> > usage
>>>>>     >     > > >> > >> >> >>> inside
>>>>>     >     > > >> > >> >> >>> > your
>>>>>     >     > > >> > >> >> >>> > >> > own
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > organization.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> Still hard,
>>>> but
>>>>>     > somewhat
>>>>>     >     > > easier
>>>>>     >     > > >> > than
>>>>>     >     > > >> > >> >> syncing
>>>>>     >     > > >> > >> >> >>> > with
>>>>>     >     > > >> > >> >> >>> > >> > the
>>>>>     >     > > >> > >> >> >>> > >> >     > entire
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > world.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> On Tue, Nov
>>>> 8, 2016
>>>>>     > at 10:07
>>>>>     >     > > PM,
>>>>>     >     > > >> > >> radai <
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > >
>>>> radai.rosenbl...@gmail.com>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> wrote:
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > and we can
>>>> start
>>>>>     > with
>>>>>     >     > > >> {namespace,
>>>>>     >     > > >> > >> id}
>>>>>     >     > > >> > >> >> and
>>>>>     >     > > >> > >> >> >>> no
>>>>>     >     > > >> > >> >> >>> > >> > re-mapping
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > support
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > and
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> always
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > add it
>>>> later on
>>>>>     > if/when
>>>>>     >     > > >> > collisions
>>>>>     >     > > >> > >> >> >>> actually
>>>>>     >     > > >> > >> >> >>> > >> > happen (i
>>>>>     >     > > >> > >> >> >>> > >> >     > dont
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > think
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > they'd
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> be
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > a problem).
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > every
>>>> interested
>>>>>     > party (so
>>>>>     >     > > orgs
>>>>>     >     > > >> > or
>>>>>     >     > > >> > >> >> >>> > individuals)
>>>>>     >     > > >> > >> >> >>> > >> > could
>>>>>     >     > > >> > >> >> >>> > >> >     > then
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > register
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > a
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > prefix (0 =
>>>>>     > reserved, 1 =
>>>>>     >     > > >> > confluent
>>>>>     >     > > >> > >> ...
>>>>>     >     > > >> > >> >> >>> 666
>>>>>     >     > > >> > >> >> >>> > = me
>>>>>     >     > > >> > >> >> >>> > >> > :-) )
>>>>>     >     > > >> > >> >> >>> > >> >     > and
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  do
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > whatever
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> with
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > the 2nd ID
>>>> - so once
>>>>>     >     > > linkedin
>>>>>     >     > > >> > >> >> registers,
>>>>>     >     > > >> > >> >> >>> say
>>>>>     >     > > >> > >> >> >>> > 3,
>>>>>     >     > > >> > >> >> >>> > >> > then
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  linkedin
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > devs
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > are
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> free
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > to use {3,
>>>> *} with a
>>>>>     >     > > reasonable
>>>>>     >     > > >> > >> >> >>> expectation
>>>>>     >     > > >> > >> >> >>> > to
>>>>>     >     > > >> > >> >> >>> > >> not
>>>>>     >     > > >> > >> >> >>> > >> >     > collide
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  with
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > anything
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > else.
>>>> further
>>>>>     > partitioning
>>>>>     >     > > of
>>>>>     >     > > >> > that *
>>>>>     >     > > >> > >> >> >>> becomes
>>>>>     >     > > >> > >> >> >>> > >> > linkedin's
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > problem,
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > but
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > the
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > "upstream
>>>>>     > registration"
>>>>>     >     > of a
>>>>>     >     > > >> > >> namespace
>>>>>     >     > > >> > >> >> >>> only
>>>>>     >     > > >> > >> >> >>> > has
>>>>>     >     > > >> > >> >> >>> > >> to
>>>>>     >     > > >> > >> >> >>> > >> >     > happen
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > once.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> > On Tue, Nov
>>>> 8, 2016
>>>>>     > at
>>>>>     >     > 9:03
>>>>>     >     > > PM,
>>>>>     >     > > >> > >> James
>>>>>     >     > > >> > >> >> >>> Cheng <
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > wushuja...@gmail.com
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> wrote:
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > On Nov
>>>> 8, 2016,
>>>>>     > at 5:54
>>>>>     >     > > PM,
>>>>>     >     > > >> > Gwen
>>>>>     >     > > >> > >> >> >>> Shapira <
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > g...@confluent.io>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > wrote:
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > Thank you so much for this clear and fair summary of the arguments.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > I'm in favor of ints. Not a deal-breaker, but in favor.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > Even more in favor of Magnus's decentralized suggestion with Roger's
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > tweak: add a namespace for headers. This will allow each app to just
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > use whatever IDs it wants internally, and then let the admin deploying
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > the app figure out an available namespace ID for the app to live in.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > So io.confluent.schema-registry can be namespace 0x01 on my deployment
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > and 0x57 on yours, and the poor guys developing the app don't need to
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > worry about that.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> Gwen, if I understand your example right, an application deployer might
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> decide to use 0x01 in one deployment, and that means that once the message
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> is written into the broker, it will be saved on the broker with that
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> specific namespace (0x01).
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> If you were to mirror that message into another cluster, the 0x01 would
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> accompany the message, right? What if the deployers of the same app in the
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> other cluster uses 0x57? They won't understand each other?
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> I'm not sure that's an avoidable problem. I think it simply means that in
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> order to share data, you have to also have a shared (agreed upon)
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> understanding of what the namespaces mean. Which I think makes sense,
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> because the alternate (sharing *nothing* at all) would mean that there
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> would be no way to understand each other.
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> -James
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >>
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > Gwen
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> > On Tue, Nov 8, 2016 at 4:23 PM, radai <radai.rosenbl...@gmail.com> wrote:
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >> +1 for sean's document. it covers pretty much all the trade-offs and
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >> provides concrete figures to argue about :-)
>>>>>     >     > > >> > >> >> >>> > >> >     > >>  > > > > > >> >> >> (nit-picking - used the same xkcd twice, also trove has been superceded
>>>>>     >     > > >> > >> >
>>>>>     >     > > >> >
>>>>>     >     > > >> >
>>>>>     >     > > >> >
>>>>>     >     > > >> > --
>>>>>     >     > > >> > Gwen Shapira
>>>>>     >     > > >> > Product Manager | Confluent
>>>>>     >     > > >> > 650.450.2760 | @gwenshap
>>>>>     >     > > >> > Follow us: Twitter | blog
>>>>>     >     > > >> >
>>>>>     >     > > >>
>>>>>     >     > > >>
>>>>>     >     > > >>
>>>>>     >     > > >> --
>>>>>     >     > > >> *Todd Palino*
>>>>>     >     > > >> Staff Site Reliability Engineer
>>>>>     >     > > >> Data Infrastructure Streaming
>>>>>     >     > > >>
>>>>>     >     > > >>
>>>>>     >     > > >>
>>>>>     >     > > >> linkedin.com/in/toddpalino
>>>>>     >     > > >>
>>>>>     >     > >
>>>>>     >     > >
>>>>>     >     > >
>>>>>     >     > >
>>>>>     >     >
>>>>>     >
>>>>>     >
>>>>>     > The information contained in this email is strictly confidential and for
>>>>>     > the use of the addressee only, unless otherwise indicated. If you are not
>>>>>     > the intended recipient, please do not read, copy, use or disclose to others
>>>>>     > this message or any attachment. Please also notify the sender by replying
>>>>>     > to this email or by telephone (+44(020 7896 0011) and then delete the email
>>>>>     > and any copies of it. Opinions, conclusion (etc) that do not relate to the
>>>>>     > official business of this company shall be understood as neither given nor
>>>>>     > endorsed by it. IG is a trading name of IG Markets Limited (a company
>>>>>     > registered in England and Wales, company number 04008957) and IG Index
>>>>>     > Limited (a company registered in England and Wales, company number
>>>>>     > 01190902). Registered address at Cannon Bridge House, 25 Dowgate Hill,
>>>>>     > London EC4R 2YA. Both IG Markets Limited (register number 195355) and IG
>>>>>     > Index Limited (register number 114059) are authorised and regulated by the
>>>>>     > Financial Conduct Authority.
>>>>>     >
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
> 
> 
>
