Re: [DISCUSS] KIP-382: MirrorMaker 2.0
Hi Ryanne, Thanks for the KIP. I am also curious why you did not use the uReplicator design as the foundation, given it already resolves some of the fundamental issues in the current MirrorMaker: updating the configs on the fly and running the MirrorMaker agents in a worker model which can be deployed in Mesos or container orchestrations. If possible, can you document in the rejected alternatives the missing parts that made you consider a new design from the ground up? Thanks, Harsha On Wed, Oct 17, 2018, at 8:34 AM, Ryanne Dolan wrote: > Jan, these are two separate issues. > > 1) consumer coordination should not, ideally, involve unreliable or slow > connections. Naively, a KafkaSourceConnector would coordinate via the > source cluster. We can do better than this, but I'm deferring this > optimization for now. > > 2) exactly-once between two clusters is mind-bending. But keep in mind that > transactions are managed by the producer, not the consumer. In fact, it's > the producer that requests that offsets be committed for the current > transaction. Obviously, these offsets are committed in whatever cluster the > producer is sending to. > > These two issues are closely related. They are both resolved by not > coordinating or committing via the source cluster. And in fact, this is the > general model of SourceConnectors anyway, since most SourceConnectors > _only_ have a destination cluster. > > If there is a lot of interest here, I can expound further on this aspect of > MM2, but again I think this is premature until this first KIP is approved. > I intend to address each of these in separate KIPs following this one. > > Ryanne > > On Wed, Oct 17, 2018 at 7:09 AM Jan Filipiak > wrote: > > > This is not a performance optimisation. It's a fundamental design choice. > > > > > > I never really took a look how streams does exactly once. (it's a trap > > anyways and you usually can deal with at least once downstream pretty > > easily). But I am very certain it's not gonna get anywhere if offset > > commit and record produce cluster are not the same. > > > > Pretty sure without this _design choice_ you can skip on that exactly > > once already > > > > Best Jan > > > > On 16.10.2018 18:16, Ryanne Dolan wrote: > > > > But one big obstacle in this was > > > always that group coordination happened on the source cluster. > > > > > > Jan, thank you for bringing up this issue with legacy MirrorMaker. I > > > totally agree with you. This is one of several problems with MirrorMaker > > > I intend to solve in MM2, and I already have a design and prototype that > > > solves this and related issues. But as you pointed out, this KIP is > > > already rather complex, and I want to focus on the core feature set > > > rather than performance optimizations for now. If we can agree on what > > > MM2 looks like, it will be very easy to agree to improve its performance > > > and reliability. > > > > > > That said, I look forward to your support on a subsequent KIP that > > > addresses consumer coordination and rebalance issues. Stay tuned! > > > > > > Ryanne > > > > > > On Tue, Oct 16, 2018 at 6:58 AM Jan Filipiak > > <mailto:jan.filip...@trivago.com>> wrote: > > > > > > Hi, > > > > > > Currently MirrorMaker is usually run collocated with the target > > > cluster. > > > This is all nice and good. But one big obstacle in this was > > > always that group coordination happened on the source cluster. So > > when > > > the network was congested, you sometimes lose group membership and > > > have to rebalance and all this.
> > > > > > So one big request from us would be the support of having > > coordination > > > cluster != source cluster. > > > > > > I would generally say a LAN is better than a WAN for doing group > > > coordination and there is no reason we couldn't have a group consuming > > > topics from a different cluster and committing offsets to another > > > one right? > > > > > > Other than that. It feels like the KIP has too many features where > > many > > > of them are not really wanted and counterproductive but I will just > > > wait and see how the discussion goes. > > > > > > Best Jan > > > > > > > > > On 15.10.2018 18:16, Ryanne Dolan wrote: > > > > Hey y'all! > > > > > > > > Please take a look at KIP-382: > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0 > > > > > > > > Thanks for your feedback and support. > > > > > > > > Ryanne > > > > > > > > >
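To make Ryanne's point above concrete: in the Java client it is the transactional producer that commits consumer offsets, and it commits them into whatever cluster it produces to. A minimal sketch, assuming the 2.x Java client API; the cluster addresses, topic, and group names are illustrative, and this is a sketch of the general pattern, not MM2's actual implementation:

    import java.time.Duration;
    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.consumer.OffsetAndMetadata;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.TopicPartition;

    public class TransactionalCopySketch {
        public static void main(String[] args) {
            Properties cp = new Properties();
            cp.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "source-cluster:9092"); // illustrative
            cp.put(ConsumerConfig.GROUP_ID_CONFIG, "copy-example");
            cp.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
            cp.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.ByteArrayDeserializer");
            cp.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.ByteArrayDeserializer");

            Properties pp = new Properties();
            pp.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "target-cluster:9092"); // illustrative
            pp.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "copy-example-txn");
            pp.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.ByteArraySerializer");
            pp.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.ByteArraySerializer");

            try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(cp);
                 KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(pp)) {
                consumer.subscribe(Collections.singletonList("events"));
                producer.initTransactions();
                while (true) {
                    ConsumerRecords<byte[], byte[]> records = consumer.poll(Duration.ofSeconds(1));
                    if (records.isEmpty()) continue;
                    producer.beginTransaction();
                    Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
                    for (ConsumerRecord<byte[], byte[]> r : records) {
                        producer.send(new ProducerRecord<>(r.topic(), r.key(), r.value()));
                        offsets.put(new TopicPartition(r.topic(), r.partition()), new OffsetAndMetadata(r.offset() + 1));
                    }
                    // The *producer* commits the offsets, atomically with the records,
                    // and they land in the cluster the producer writes to (the target).
                    // A restarted copier must therefore recover its position from the
                    // target cluster and seek(), not from the source's consumer offsets.
                    producer.sendOffsetsToTransaction(offsets, "copy-example");
                    producer.commitTransaction();
                }
            }
        }
    }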
Re: [ANNOUNCE] New Committer: Manikumar Reddy
Congrats Mani!! Very well deserved. --Harsha On Tue, Oct 16, 2018, at 5:20 PM, Attila Sasvari wrote: > Congratulations Manikumar! Keep up the good work. > > On Tue, Oct 16, 2018 at 12:30 AM Jungtaek Lim wrote: > > > Congrats Mani! > > On Tue, 16 Oct 2018 at 1:45 PM Abhimanyu Nagrath < > > abhimanyunagr...@gmail.com> > > wrote: > > > > > Congratulations Manikumar > > > > > > On Tue, Oct 16, 2018 at 10:09 AM Satish Duggana < > > satish.dugg...@gmail.com> > > > wrote: > > > > > > > Congratulations Mani! > > > > > > > > > > > > On Fri, Oct 12, 2018 at 9:41 PM Colin McCabe > > wrote: > > > > > > > > > > Congratulations, Manikumar! Well done. > > > > > > > > > > best, > > > > > Colin > > > > > > > > > > > > > > > On Fri, Oct 12, 2018, at 01:25, Edoardo Comar wrote: > > > > > > Well done Manikumar ! > > > > > > -- > > > > > > > > > > > > Edoardo Comar > > > > > > > > > > > > IBM Event Streams > > > > > > IBM UK Ltd, Hursley Park, SO21 2JN > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From: "Matthias J. Sax" > > > > > > To: dev > > > > > > Cc: users > > > > > > Date: 11/10/2018 23:41 > > > > > > Subject:Re: [ANNOUNCE] New Committer: Manikumar Reddy > > > > > > > > > > > > > > > > > > > > > > > > Congrats! > > > > > > > > > > > > > > > > > > On 10/11/18 2:31 PM, Yishun Guan wrote: > > > > > > > Congrats Manikumar! > > > > > > > On Thu, Oct 11, 2018 at 1:20 PM Sönke Liebau > > > > > > > wrote: > > > > > > >> > > > > > > >> Great news, congratulations Manikumar!! > > > > > > >> > > > > > > >> On Thu, Oct 11, 2018 at 9:08 PM Vahid Hashemian > > > > > > > > > > > > >> wrote: > > > > > > >> > > > > > > >>> Congrats Manikumar! > > > > > > >>> > > > > > > >>> On Thu, Oct 11, 2018 at 11:49 AM Ryanne Dolan < > > > > ryannedo...@gmail.com> > > > > > > >>> wrote: > > > > > > >>> > > > > > > >>>> Bravo! > > > > > > >>>> > > > > > > >>>> On Thu, Oct 11, 2018 at 1:48 PM Ismael Juma < > > ism...@juma.me.uk> > > > > > > wrote: > > > > > > >>>> > > > > > > >>>>> Congratulations Manikumar! Thanks for your continued > > > > contributions. > > > > > > >>>>> > > > > > > >>>>> Ismael > > > > > > >>>>> > > > > > > >>>>> On Thu, Oct 11, 2018 at 10:39 AM Jason Gustafson > > > > > > > > > > > > >>>>> wrote: > > > > > > >>>>> > > > > > > >>>>>> Hi all, > > > > > > >>>>>> > > > > > > >>>>>> The PMC for Apache Kafka has invited Manikumar Reddy as a > > > > committer > > > > > > >>> and > > > > > > >>>>> we > > > > > > >>>>>> are > > > > > > >>>>>> pleased to announce that he has accepted! 
> > > > > > >>>>>> > > > > > > >>>>>> Manikumar has contributed 134 commits including significant > > > > work to > > > > > > >>> add > > > > > > >>>>>> support for delegation tokens in Kafka: > > > > > > >>>>>> > > > > > > >>>>>> KIP-48: > > > > > > >>>>>> > > > > > > >>>>>> > > > > > > >>>>> > > > > > > >>>> > > > > > > >>> > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-48+Delegation+token+support+for+Kafka > > > > > > > > > > > > >>>>>> KIP-249 > > > > > > >>>>>> < > > > > > > >>>>> > > > > > > >>>> > > > > > > >>> > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-48+Delegation+token+support+for+KafkaKIP-249 > > > > > > > > > > > > >>>>>> > > > > > > >>>>>> : > > > > > > >>>>>> > > > > > > >>>>>> > > > > > > >>>>> > > > > > > >>>> > > > > > > >>> > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-249%3A+Add+Delegation+Token+Operations+to+KafkaAdminClient > > > > > > > > > > > > >>>>>> > > > > > > >>>>>> He has broad experience working with many of the core > > > > components in > > > > > > >>>> Kafka > > > > > > >>>>>> and he has reviewed over 80 PRs. He has also made huge > > > progress > > > > > > >>>>> addressing > > > > > > >>>>>> some of our technical debt. > > > > > > >>>>>> > > > > > > >>>>>> We appreciate the contributions and we are looking forward > > to > > > > more. > > > > > > >>>>>> Congrats Manikumar! > > > > > > >>>>>> > > > > > > >>>>>> Jason, on behalf of the Apache Kafka PMC > > > > > > >>>>>> > > > > > > >>>>> > > > > > > >>>> > > > > > > >>> > > > > > > >> > > > > > > >> > > > > > > >> -- > > > > > > >> Sönke Liebau > > > > > > >> Partner > > > > > > >> Tel. +49 179 7940878 > > > > > > >> OpenCore GmbH & Co. KG - Thomas-Mann-Straße 8 - 22880 Wedel - > > > > Germany > > > > > > > > > > > > [attachment "signature.asc" deleted by Edoardo Comar/UK/IBM] > > > > > > > > > > > > > > > > > > Unless stated otherwise above: > > > > > > IBM United Kingdom Limited - Registered in England and Wales with > > > > number > > > > > > 741598. > > > > > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire > > > PO6 > > > > 3AU > > > > > > > > > > > > -- > -- > Attila Sasvari > Software Engineer > <http://www.cloudera.com/>
Re: [DISCUSS] KIP-383 Pluggable interface for SSL Factory
Hi, Thanks for the KIP. Curious to understand why the ChannelBuilder interface doesn't address the reasons stated in the Motivation section. Thanks, Harsha On Wed, Oct 17, 2018, at 12:10 PM, Pellerin, Clement wrote: > I would like feedback on this proposal to make it possible to replace > SslFactory with a custom implementation. > https://cwiki.apache.org/confluence/display/KAFKA/KIP-383%3A++Pluggable+interface+for+SSL+Factory >
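For readers following along: KIP-383 proposes making the component that creates SSLEngine instances replaceable. A hypothetical shape for such a plug-in point is sketched below; the interface name, its methods, and whatever config key selects the implementation are illustrative assumptions here, not the KIP's final API:

    import java.util.Map;
    import javax.net.ssl.SSLEngine;

    // Hypothetical shape only: the KIP will define the real interface,
    // methods, and the config key used to plug an implementation in.
    public interface PluggableSslFactory {
        // Receives the ssl.* configs for the listener or client.
        void configure(Map<String, ?> configs);

        // Called per connection; peer host/port support SNI and
        // endpoint identification.
        SSLEngine createSslEngine(String peerHost, int peerPort);
    }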
Re: Kafka 1.1.0 no longer available for download?
Older releases are moved off the main download mirrors but remain available from the Apache archive. You can download 1.1.0 from https://archive.apache.org/dist/kafka/1.1.0/kafka_2.12-1.1.0.tgz (current releases are listed at https://kafka.apache.org/downloads/). On Wed, Dec 5, 2018, at 2:12 PM, David Glasser wrote: > It looks like 1.1.0 is no longer available at > https://www.apache.org/dist/kafka/ > > Is this intentional? While we'd like to upgrade, losing this version broke > our CI today (and upgrading Kafka is always a process we undertake with > care). > > --dave
Re: [DISCUSS] KIP-373: Allow users to create delegation tokens for other users
Hi Mani, Overall the KIP looks good to me. Can we call this impersonation support, since that is what the KIP is doing? Also, instead of using super.users as the config, which essentially gives cluster-wide privileges to those users, we can introduce impersonation.users as a config, and users listed in that config are allowed to impersonate other users. Thanks, Harsha On Fri, Dec 7, 2018, at 3:58 AM, Manikumar wrote: > Bump up! to get some attention. > > BTW, recently Apache Spark added support for Kafka delegation tokens. > https://issues.apache.org/jira/browse/SPARK-25501 > > On Fri, Dec 7, 2018 at 5:27 PM Manikumar wrote: > > > Bump up! to get some attention. > > > > BTW, recently Apache Spark added support for Kafka delegation tokens. > > https://issues.apache.org/jira/browse/SPARK-25501 > > > > On Tue, Sep 25, 2018 at 9:56 PM Manikumar > > wrote: > > > >> Hi all, > >> > >> I have created a KIP that proposes to allow users to create delegation > >> tokens for other users. > >> > >> > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-373%3A+Allow+users+to+create+delegation+tokens+for+other+users > >> > >> Please take a look when you get a chance. > >> > >> Thanks, > >> Manikumar > >> > >
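For context, KIP-249 already added delegation token operations to the AdminClient; what KIP-373 adds is letting a privileged user create a token owned by someone else. A rough sketch assuming the 2.x AdminClient; the broker address and principals are illustrative, and the owner(...) option shown in the comment is the KIP's proposal, not an existing API:

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.CreateDelegationTokenOptions;
    import org.apache.kafka.common.security.auth.KafkaPrincipal;
    import org.apache.kafka.common.security.token.delegation.DelegationToken;

    public class CreateTokenSketch {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092"); // illustrative
            // ... SASL configs authenticating the privileged (impersonating) user ...
            try (AdminClient admin = AdminClient.create(props)) {
                CreateDelegationTokenOptions options = new CreateDelegationTokenOptions()
                        .renewers(Collections.singletonList(new KafkaPrincipal(KafkaPrincipal.USER_TYPE, "bob")));
                // KIP-373's proposed addition (does not exist yet): create the
                // token on behalf of another user, e.g.
                // options.owner(new KafkaPrincipal(KafkaPrincipal.USER_TYPE, "alice"));
                DelegationToken token = admin.createDelegationToken(options).delegationToken().get();
                System.out.println("created token " + token.tokenInfo().tokenId());
            }
        }
    }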
Re: [DISCUSS] KIP-398: Support reading trust store from classpath
Hi Noa, Based on the KIP's motivation section: "If we had the ability to load a trust store from the classpath as well as from a file, the trust store could be shipped in a jar that could be declared as a dependency and piggyback on the distribution infrastructure already in place." It looks like you are making the assumption that distributing a jar is better than distributing a file. I am not sure why one is better than the other. There are other use-cases where one can make a call to a local "daemon" over a Unix socket to fetch a certificate as well. Just supporting a "classpath" option might work for a few users, but it's not generic enough to support a wide variety of other infrastructures. My suggestion, if the KIP's motivation is to make the certificate/truststore available via different mechanisms: let's make an interface that allows users to roll their own based on their infrastructure, and support File as the default mechanism so that existing users keep working. -Harsha On Sat, Dec 8, 2018, at 7:03 AM, Noa Resare wrote: > > > > On 6 Dec 2018, at 20:16, Rajini Sivaram wrote: > > > > Hi Noa, > > > > Thanks for the KIP. A few comments/questions: > > > > 1. If we support filenames starting with `classpath:` by requiring > > `file:`prefix, > > then we are presumably not supporting files starting `file:`. Not > > necessarily an issue, but we do need to document any restrictions. > > I think that it would be trivial to support ‘file:’ as a prefix in a > filesystem path > by just asking the users who really want that to add it twice: > > The config value "file:file:my_weird_file_name" would map to the > filesystem path "file:my_weird_file_name" > > > > 2. On the broker-side, trust stores are dynamically updatable. And we use > > file modification time to decide whether trust store needs to be reloaded. > > This is less of an issue once we implement > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-339%3A+Create+a+new+IncrementalAlterConfigs+API, > > but at the moment, we are relying on actual files on the file system for > > which we can compare modification times. > > > 3. On the client-side, trust stores are not currently updatable. And we > > don't have an API to make them updatable. By using class path, we preclude > > the use of file modification times in future to detect key or trust store > > updates for clients. It will be good to get feedback from the community on > > whether this is a reasonable longer-term restriction. > > Interesting. I think that it is a reasonable graceful degradation to > simply not pick up on changed truststores > read from the classpath as long as it is documented, but if we really > want we could save a checksum of > the truststore, re-read and compare to determine any changes. > > > 4. It will be good to get more feedback from the community on whether > > loading trust stores from CLASSPATH is a feature that is likely to be > > widely adopted. If not, perhaps > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-383%3A++Pluggable+interface+for+SSL+Factory > > will be sufficient to enable custom factories that do load trust store from > > the CLASSPATH. > > While this generic extension point would make it possible to do all > kinds of things, I think that the simplicity > of allowing for this fairly small modification would benefit the kafka > user that doesn’t feel comfortable > writing their own SSL Factory.
That said, I would be thrilled to have > the opportunity to provide functionality > to get rid of the truststore file format and have future kafka users use > both the loading of CA certs and client > certs with PEM encoded cert and key files as the rest of the world has > for some time. > > > > > Regards, > > > > Rajini > > > > > > On Tue, Dec 4, 2018 at 7:17 PM Sönke Liebau > > wrote: > > > >> Hi Neo, > >> > >> thanks for the KIP, the proposal sounds useful! > >> Also I agree on both assumptions that you made: > >> - users whose current truststore location starts with classpath: should be > >> very few and extremely far between (and arguably made questionable choices > >> when naming their files/directories), I personally think it is safe to > >> ignore these > >> - this could also be useful for loading keystores, not just truststores > >> > >> One additional idea maybe, looking at the Spring documentation they seem to > >> support filesystem, classpath and URL resources. Would it make sense to add > >> something to allow loading the trustst
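A small sketch of the prefix dispatch being discussed, including the "add it twice" escaping Noa describes; this is an illustrative reconstruction of the semantics, not the KIP's actual implementation:

    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStream;

    public final class TrustStoreLocationSketch {
        // "classpath:x" -> load x as a classpath resource;
        // "file:x"      -> load the literal path x, so "file:classpath:weird"
        //                  and "file:file:weird" escape the prefixes, per the
        //                  "add it twice" convention discussed above;
        // anything else -> a plain filesystem path, as today.
        public static InputStream open(String location) throws IOException {
            if (location.startsWith("classpath:")) {
                String resource = location.substring("classpath:".length());
                InputStream in = TrustStoreLocationSketch.class.getClassLoader().getResourceAsStream(resource);
                if (in == null) throw new IOException("classpath resource not found: " + resource);
                return in;
            }
            if (location.startsWith("file:")) {
                return new FileInputStream(location.substring("file:".length()));
            }
            return new FileInputStream(location);
        }
    }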
Re: [VOTE] KIP-394: Require member.id for initial join group request
+1. Thanks for the KIP. This is very much needed. -Harsha On Mon, Dec 10, 2018, at 11:00 AM, Guozhang Wang wrote: > +1. Thanks Boyang! > > > Guozhang > > On Mon, Dec 10, 2018 at 10:29 AM Jason Gustafson wrote: > > > +1 Thanks for the KIP, Boyang! > > > > -Jason > > > > On Mon, Dec 10, 2018 at 10:07 AM Boyang Chen wrote: > > > > > Thanks for voting my friends. Could someone give one more binding vote > > > here? > > > > > > Best, > > > Boyang > > > > > > From: Bill Bejeck > > > Sent: Thursday, December 6, 2018 2:45 AM > > > To: dev@kafka.apache.org > > > Subject: Re: [VOTE] KIP-394: Require member.id for initial join group > > > request > > > > > > +1 > > > Thanks for the KIP. > > > > > > -Bill > > > > > > On Wed, Dec 5, 2018 at 1:43 PM Matthias J. Sax > > > wrote: > > > > > > > Thanks for the KIP. > > > > > > > > +1 (binding) > > > > > > > > -Matthias > > > > > > > > > > > > On 12/5/18 7:53 AM, Mayuresh Gharat wrote: > > > > > +1 (non-binding) > > > > > > > > > > Thanks, > > > > > > > > > > Mayuresh > > > > > > > > > > > > > > > On Wed, Dec 5, 2018 at 3:59 AM Boyang Chen > > > wrote: > > > > > > > > > >> Hey friends, > > > > >> > > > > >> I would like to start a vote for KIP-394< > > > > >> > > > > > > > > > https://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcwiki.apache.org%2Fconfluence%2Fdisplay%2FKAFKA%2FKIP-394%253A%2BRequire%2Bmember.id%2Bfor%2Binitial%2Bjoin%2Bgroup%2Brequest&data=02%7C01%7C%7C78066780826b40ed5e7d08d65ae1eda1%7C84df9e7fe9f640afb435%7C1%7C0%7C636796323725452670&sdata=Gp%2FVlUuezVVck81fMXH7yaQ7zKd0WaJ9Kc7GhtJW2Qo%3D&reserved=0 > > > > >. > > > > >> The goal of this KIP is to improve broker stability by fencing > > invalid > > > > join > > > > >> group requests. > > > > >> > > > > >> Best, > > > > >> Boyang > > > > >> > > > > >> > > > > > > > > > > > > > > > > > > > > > -- > -- Guozhang
Re: [DISCUSS] KIP-402: Improve fairness in SocketServer processors
Hi Rajini, Overall the KIP looks good to me. Is it possible to use the max.connections config that we already have, although it's per IP? A broker-level max.connections would also be good to have, to guard against DoS'ing a broker. Either way, having a constant like 20 without a configurable option doesn't sound right, and since the KIP states that one can use num.network.threads to increase this capacity, that's still not a viable option: most of the time users tend to keep the number of network threads minimal, and given that this configuration will only be needed when a burst of connections comes through, allowing users to choose that ceiling would be beneficial. Can you add any details on why 20 is sufficient? With the default num.network.threads of 3, if one broker is getting more than 60 simultaneous new connections, this would result in perceived slower responses from the client side, right? Thanks, Harsha On Tue, Dec 11, 2018, at 2:48 AM, Rajini Sivaram wrote: > Hi all, > > I have submitted a KIP to improve fairness in channel processing in > SocketServer to protect brokers from connection storms: > >- > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-402%3A+Improve+fairness+in+SocketServer+processors > > Feedback and suggestions welcome. > > Thank you, > > Rajini
Re: [DISCUSS] KIP-402: Improve fairness in SocketServer processors
Thanks for the details Rajini. It would be great if you could add a few details to the KIP on how many connections you are able to handle in your cluster with the value 20, to give some context. Thanks, Harsha On Tue, Dec 11, 2018, at 10:22 AM, Rajini Sivaram wrote: > Hi Harsha, > > Thanks for reviewing the KIP. > > 1) Yes, agree that we also need a max.connections configuration per-broker. > I was thinking of doing that in a separate KIP, but I could add that here > as well. > 2) The number of connections processed in each iteration doesn't feel like > an externalizable config. It is not a limit on connection rate, it is simply > ensuring that existing connections are processed by each Processor after > at most every 20 new connections. It will be hard to describe this > configuration for users to enable configuring this in a way that is > suitable for a connection flood since it would depend on a number of > factors like existing connection count etc. It feels like we should come up > with a number that works well. We have been running with this code for a > while and so far haven't run into any noticeable degradations with 20. > > > > On Tue, Dec 11, 2018 at 6:03 PM Harsha wrote: > > > Hi Rajini, > >Overall the KIP looks good to me. Is it possible to use > > the max.connections config that we already have, although it's per IP? > > A broker-level max.connections would also be good to have, to guard against > > DoS'ing a broker. > > Either way, having a constant like 20 without a configurable option doesn't > > sound right, and since the KIP states that one can use num.network.threads to > > increase this capacity, that's still not a viable option. Most of the time > > users tend to keep network threads minimal and given this configuration > > will only be needed when a burst of connections comes through, allowing users to > > choose that ceiling would be beneficial. Can you add any details on why 20 > > is sufficient? With the default num.network.threads of 3, if one broker is > > getting more than 60 simultaneous new connections, this would result in > > perceived slower responses from the client side, right? > > > > Thanks, > > Harsha > > > > > > On Tue, Dec 11, 2018, at 2:48 AM, Rajini Sivaram wrote: > > > Hi all, > > > > > > I have submitted a KIP to improve fairness in channel processing in > > > SocketServer to protect brokers from connection storms: > > > > > >- > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-402%3A+Improve+fairness+in+SocketServer+processors > > > > > > Feedback and suggestions welcome. > > > > > > Thank you, > > > > > > Rajini > >
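The behaviour Rajini describes can be sketched as follows; this is an illustrative reconstruction of the idea, not the actual SocketServer code:

    import java.nio.channels.SocketChannel;
    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    final class ProcessorLoopSketch {
        private static final int MAX_NEW_CONNECTIONS_PER_ITERATION = 20; // the constant under discussion
        private final BlockingQueue<SocketChannel> newConnections = new ArrayBlockingQueue<>(20);

        void runOnce() {
            // Register at most 20 newly accepted connections per iteration...
            int configured = 0;
            SocketChannel channel;
            while (configured < MAX_NEW_CONNECTIONS_PER_ITERATION && (channel = newConnections.poll()) != null) {
                register(channel);
                configured++;
            }
            // ...then always give existing connections a turn, so a connection
            // storm cannot starve established clients indefinitely.
            pollExistingConnections();
        }

        private void register(SocketChannel channel) { /* selector registration elided */ }
        private void pollExistingConnections() { /* selector poll and request handoff elided */ }
    }

So the 20 is not a rate limit on new connections; it bounds how long established connections can go unserviced while a flood of new ones is being accepted.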
Re: what happens when a vote ends with no votes?
Hi, It might have slipped through. You can try calling a VOTE again on the KIP. Thanks, Harsha On Fri, Dec 14, 2018, at 12:19 PM, Pellerin, Clement wrote: > I called a vote on KIP-383 more than 72h ago but it attracted no votes > and no comments. > The rule requires lazy majority which demands at least 3 binding votes. > What happens next? > > -Original Message- > From: Pellerin, Clement > Sent: Tuesday, December 11, 2018 2:17 PM > To: dev@kafka.apache.org > Subject: RE: [VOTE] KIP-383 Pluggable interface for SSL Factory > > Since there was no objections, I'm calling a vote on KIP-383. >
Re: [VOTE] [REMINDER] KIP-383 Pluggable interface for SSL Factory
Overall LGTM. +1. Thanks, Harsha On Fri, Dec 14, 2018, at 12:52 PM, Pellerin, Clement wrote: > So far, there are no votes on this KIP. Please help me fix KAFKA-6654 by > voting for this fix. > Improvement comments are also welcome. > > -Original Message- > From: Pellerin, Clement > Sent: Tuesday, December 11, 2018 2:17 PM > To: dev@kafka.apache.org > Subject: RE: [VOTE] KIP-383 Pluggable interface for SSL Factory > > Since there was no objections, I'm calling a vote on KIP-383. >
Re: [VOTE] [REMINDER] KIP-383 Pluggable interface for SSL Factory
Yes. +1 binding. Thanks, Harsha On Mon, Dec 17, 2018, at 5:21 AM, Pellerin, Clement wrote: > I'm new here. Is this vote binding or not? > > -Original Message- > From: Harsha [mailto:ka...@harsha.io] > Sent: Saturday, December 15, 2018 1:59 PM > To: dev@kafka.apache.org > Subject: Re: [VOTE] [REMINDER] KIP-383 Pluggable interface for SSL Factory > > Overall LGTM. +1. > > Thanks, > Harsha
Re: [VOTE] [REMINDER] KIP-383 Pluggable interface for SSL Factory
Damian, This is the VOTE thread; there was a separate DISCUSS thread, which has concluded. -Harsha On Wed, Dec 19, 2018, at 5:04 AM, Pellerin, Clement wrote: > I did that and nobody came. > https://lists.apache.org/list.html?dev@kafka.apache.org:lte=1M:kip-383 > I don't understand why this feature is not more popular. > It's the solution to one Jira and a work-around for a handful more Jiras. > > -Original Message- > From: Damian Guy [mailto:damian@gmail.com] > Sent: Wednesday, December 19, 2018 7:38 AM > To: dev > Subject: Re: [VOTE] [REMINDER] KIP-383 Pluggable interface for SSL Factory > > Hi Clement, > > You should start a separate thread for the vote, i.e., one with a subject > of [VOTE] KIP-383 ... > > Looks like you haven't done this?
Re: [VOTE] KIP-382 MirrorMaker 2.0
+1 (binding). Nice work Ryanne. -Harsha On Fri, Dec 21, 2018, at 8:14 AM, Andrew Schofield wrote: > +1 (non-binding) > > Andrew Schofield > IBM Event Streams > > On 21/12/2018, 01:23, "Srinivas Reddy" wrote: > > +1 (non binding) > > Thank you Ryanne for the KIP, let me know if you need support in > implementing > it. > > - > Srinivas > > - Typed on tiny keys. pls ignore typos.{mobile app} > > > On Fri, 21 Dec, 2018, 08:26 Ryanne Dolan > > Thanks for the votes so far! > > > > Due to recent discussions, I've removed the high-level REST API from the > > KIP. > > > > On Thu, Dec 20, 2018 at 12:42 PM Paul Davidson > > > wrote: > > > > > +1 > > > > > > Would be great to see the community build on the basic approach we > took > > > with Mirus. Thanks Ryanne. > > > > > > On Thu, Dec 20, 2018 at 9:01 AM Andrew Psaltis > > > > > > wrote: > > > > > > > +1 > > > > > > > > Really looking forward to this and to helping in any way I can. > Thanks > > > for > > > > kicking this off Ryanne. > > > > > > > > On Thu, Dec 20, 2018 at 10:18 PM Andrew Otto > > wrote: > > > > > > > > > +1 > > > > > > > > > > This looks like a huge project! Wikimedia would be very excited to > > have > > > > > this. Thanks! > > > > > > > > > > On Thu, Dec 20, 2018 at 9:52 AM Ryanne Dolan > > > > > > wrote: > > > > > > > > > > > Hey y'all, please vote to adopt KIP-382 by replying +1 to this > > > thread. > > > > > > > > > > > > For your reference, here are the highlights of the proposal: > > > > > > > > > > > > - Leverages the Kafka Connect framework and ecosystem. > > > > > > - Includes both source and sink connectors. > > > > > > - Includes a high-level driver that manages connectors in a > > dedicated > > > > > > cluster. > > > > > > - High-level REST API abstracts over connectors between multiple > > > Kafka > > > > > > clusters. > > > > > > - Detects new topics, partitions. > > > > > > - Automatically syncs topic configuration between clusters. > > > > > > - Manages downstream topic ACL. > > > > > > - Supports "active/active" cluster pairs, as well as any number > of > > > > active > > > > > > clusters. > > > > > > - Supports cross-data center replication, aggregation, and other > > > > complex > > > > > > topologies. > > > > > > - Provides new metrics including end-to-end replication latency > > > across > > > > > > multiple data centers/clusters. > > > > > > - Emits offsets required to migrate consumers between clusters. > > > > > > - Tooling for offset translation. > > > > > > - MirrorMaker-compatible legacy mode. > > > > > > > > > > > > Thanks, and happy holidays! > > > > > > Ryanne > > > > > > > > > > > > > > > > > > > > > > > > -- > > > Paul Davidson > > > Principal Engineer, Ajna Team > > > Big Data & Monitoring > > > > > > >
Re: [VOTE] KIP-345: Introduce static membership protocol to reduce consumer rebalances
+1 (binding). Thanks, Harsha On Wed, Jan 2, 2019, at 9:59 AM, Boyang Chen wrote: > Thanks Jason for the comment! I answered it on the discuss thread. > > Folks, could we continue the vote for this KIP? This is a very critical > improvement for our streaming system > stability and we need to get things rolling right at the start of 2019. > > Thank you for your time! > Boyang > > > From: Jason Gustafson > Sent: Tuesday, December 18, 2018 7:40 AM > To: dev > Subject: Re: [VOTE] KIP-345: Introduce static membership protocol to > reduce consumer rebalances > > Hi Boyang, > > Thanks, the KIP looks good. Just one comment. > > The new schema for the LeaveGroup request is slightly odd since it is > handling both the single consumer use case and the administrative use case. > I wonder we could make it consistent from a batching perspective. > > In other words, instead of this: > LeaveGroupRequest => GroupId MemberId [GroupInstanceId] > > Maybe we could do this: > LeaveGroupRequest => GroupId [GroupInstanceId MemberId] > > For dynamic members, GroupInstanceId could be empty, which is consistent > with JoinGroup. What do you think? > > Also, just for clarification, what is the expected behavior if the current > memberId of a static member is passed to LeaveGroup? Will the static member > be removed? I know the consumer will not do this, but we'll still have to > handle the case on the broker. > > Best, > Jason > > > On Mon, Dec 10, 2018 at 11:54 PM Boyang Chen wrote: > > > Thanks Stanislav! > > > > Get Outlook for iOS<https://aka.ms/o0ukef> > > > > > > From: Stanislav Kozlovski > > Sent: Monday, December 10, 2018 11:28 PM > > To: dev@kafka.apache.org > > Subject: Re: [VOTE] KIP-345: Introduce static membership protocol to > > reduce consumer rebalances > > > > This is great work, Boyang. Thank you very much. > > > > +1 (non-binding) > > > > On Mon, Dec 10, 2018 at 6:09 PM Boyang Chen wrote: > > > > > Hey there, could I get more votes on this thread? 
> > > > > > Thanks for the vote from Mayuresh and Mike :) > > > > > > Best, > > > Boyang > > > > > > From: Mayuresh Gharat > > > Sent: Thursday, December 6, 2018 10:53 AM > > > To: dev@kafka.apache.org > > > Subject: Re: [VOTE] KIP-345: Introduce static membership protocol to > > > reduce consumer rebalances > > > > > > +1 (non-binding) > > > > > > Thanks, > > > > > > Mayuresh > > > > > > On Tue, Dec 4, 2018 at 6:58 AM Mike Freyberger < > > mike.freyber...@xandr.com> > > > wrote: > > > > > > > +1 (non binding) > > > > > > > > On 12/4/18, 9:43 AM, "Patrick Williams" < > > patrick.willi...@storageos.com > > > > > > > > wrote: > > > > > > > > Pls take me off this VOTE list > > > > > > > > Best, > > > > > > > > Patrick Williams > > > > > > > > Sales Manager, UK & Ireland, Nordics & Israel > > > > StorageOS > > > > +44 (0)7549 676279 > > > > patrick.willi...@storageos.com > > > > > > > > 20 Midtown > > > > 20 Proctor Street > > > > Holborn > > > > London WC1V 6NX > > > > > > > > Twitter: @patch37 > > > > LinkedIn: linkedin.com/in/patrickwilliams4 < > > > > > > > > > https://nam02.safelinks.protection.outlook.com/?url=http%3A%2F%2Flinkedin.com%2Fin%2Fpatrickwilliams4&data=02%7C01%7C%7C9b12ec4ce9ae4454db8a08d65f3a4862%7C84df9e7fe9f640afb435%7C1%7C0%7C636801101252994092&sdata=ipDTX%2FGARrFkwZfRuOY0M5m3iJ%2Bnkxovv6u9bBDaTyc%3D&reserved=0 > > > > > > > > > > > > > > > > > https://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fslack.storageos.com%2F&data=02%7C01%7C%7C9b12ec4ce9ae4454db8a08d65f3a4862%7C84df9e7fe9f640afb435%7C1%7C0%7C636801101252994092&sdata=hxuKU6aZdQU%2FpxpqaaThR6IjpEmwIP5%2F3NhYzMYijkw%3D&reserved=0 > > > > > > > > > > > > > > > > On 03/12/2018, 17:34, "Guozhang Wang" wrote: > > > > > > > > Hello Boyang, > > > > > > > > I've browsed through the new wiki and there are still a couple of > > > > minor > > > > thi
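For context, on the client side KIP-345 surfaces as a single new consumer config, group.instance.id. A minimal sketch, assuming the Java consumer; the broker address, group, topic, and instance id are illustrative, and the config key is written as a raw string because no ConsumerConfig constant exists for it yet:

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class StaticMemberSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092"); // illustrative
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "orders-app");
            // The new config from KIP-345: a stable id per member (e.g. derived
            // from the pod or host name) so that a restart of this process does
            // not trigger a rebalance of the whole group.
            props.put("group.instance.id", "orders-app-instance-1");
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringDeserializer");
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("orders"));
                // poll loop elided
            }
        }
    }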
Re: [VOTE] [REMINDER] KIP-383 Pluggable interface for SSL Factory
Hi All, We are looking forward to this KIP. We would appreciate it if others could take a look at the KIP and vote on this thread. Thanks, Harsha On Fri, Dec 21, 2018, at 4:41 AM, Damian Guy wrote: > must be my gmail playing up. This appears to be the DISCUSS thread to me... > e > On Thu, 20 Dec 2018 at 18:42, Harsha wrote: > > > Damian, > >This is the VOTE thread; there was a separate DISCUSS thread, which > > has concluded. > > > > -Harsha > > > > > > On Wed, Dec 19, 2018, at 5:04 AM, Pellerin, Clement wrote: > > > I did that and nobody came. > > > https://lists.apache.org/list.html?dev@kafka.apache.org:lte=1M:kip-383 > > > I don't understand why this feature is not more popular. > > > It's the solution to one Jira and a work-around for a handful more Jiras. > > > > > > -Original Message- > > > From: Damian Guy [mailto:damian@gmail.com] > > > Sent: Wednesday, December 19, 2018 7:38 AM > > > To: dev > > > Subject: Re: [VOTE] [REMINDER] KIP-383 Pluggable interface for SSL > > Factory > > > > > > Hi Clement, > > > > > > You should start a separate thread for the vote, i.e., one with a subject > > > of [VOTE] KIP-383 ... > > > > > > Looks like you haven't done this? > >
Re: [DISCUSS] KIP-402: Improve fairness in SocketServer processors
+1 to include in 2.2. Thanks Rajini for sharing the details. On Mon, Jan 14, 2019, at 9:10 AM, Rajini Sivaram wrote: > This is the result of the tests Gardner did before deploying the patch > (thanks Gardner!): > > *We used the Trogdor `ConnectionStress` workload to test the lazy buffer > allocation patch (which has already been merged to AK) and the bounded > acceptor queue patch. We didn't see any performance difference between the > two patch sets in `connectsPerSec` and there was nothing outstanding in the > Kafka JMX metrics. We were able to get pre-patch AK to run OOM reliably > though during the `ConnectionStress` test. When running the same > configuration post-patch, we were not able to get Kafka to run OOM.* > > I won't have time to do any further performance tests before 2.2 KIP > freeze. Are there any concerns about including this KIP in 2.2? If not, I > will start voting later this week. > > Thanks, > > Rajini > > On Fri, Dec 14, 2018 at 12:13 PM Rajini Sivaram > wrote: > > > Hi Harsha, > > > > I am not sure if we have numbers for connection bursts. But since we have > > the code, I can run some tests with and without the change and provided the > > results. > > > > Hi Edo, > > > > There is no reason why we can't make num.network.threads a listener > > config that allows different listeners to use different number of threads. > > Have you run into any issues with the limitation of a single value for the > > broker? It will be good to get feedback from the community on whether this > > will be a useful change. Perhaps we could do it as a follow-on KIP if > > required. > > > > On Fri, Dec 14, 2018 at 10:33 AM Edoardo Comar wrote: > > > >> Hi Rajini > >> > >> thanks for the KIP! > >> I noticed (from the KIP text) the new > >> > Config option: Name: max.connections > >> > The config may be prefixed with listener prefix to specify different > >> limits for different listeners, enabling inter-broker connections to be > >> created even if there are a large number of client connections on a > >> different listener. > >> > >> do you think it would make sense to also allow the `num.network.threads` > >> to have an optional per-listener prefix ? > >> > >> ciao, > >> Edo > >> -- > >> > >> Edoardo Comar > >> > >> IBM Event Streams > >> IBM UK Ltd, Hursley Park, SO21 2JN > >> > >> > >> Rajini Sivaram wrote on 11/12/2018 18:22:03: > >> > >> > From: Rajini Sivaram > >> > To: dev > >> > Date: 11/12/2018 18:22 > >> > Subject: Re: [DISCUSS] KIP-402: Improve fairness in SocketServer > >> processors > >> > > >> > Hi Harsha, > >> > > >> > Thanks for reviewing the KIP. > >> > > >> > 1) Yes, agree that we also need a max.connections configuration > >> per-broker. > >> > I was thinking of doing that in a separate KIP, but I could add that > >> here > >> > as well. > >> > 2) The number of connections processed in each iteration doesn't feel > >> like > >> > an externalizable config.It is not a limit on connection rate, it is > >> simply > >> > ensuring that existing connections are processed by each Processor after > >> > atmost every 20 new connections. It will be hard to describe this > >> > configuration for users to enable configuring this in a way that is > >> > suitable for a connection flood since it would depend on the number of > >> > factors like existing connection count etc. It feels like we should > >> come > >> up > >> > with a number that works well. We have been running with this code for a > >> > while and so far haven't run into any noticeable degradations with 20. 
> >> > > >> > > >> > > >> > On Tue, Dec 11, 2018 at 6:03 PM Harsha wrote: > >> > > >> > > Hi Rajini, > >> > >Overall KIP looks good to me. Is it possible to use > >> > > max.connections config that we already have, althought its per IP. > >> > > But broker level max.connections would also be good have to guard > >> against > >> > > DOS'ing a broker. > >> > > Eitherway having constant like 20 without a configurable option > >> doesn't > >> > > s
Re: [VOTE] KIP-402: Improve fairness in SocketServer processors
+1 (binding). Thanks, Harsha On Tue, Jan 15, 2019, at 3:38 PM, Rajini Sivaram wrote: > Hi all, > > I would like to start vote on KIP-402 to improve fairness in channel > processing in SocketServer to protect brokers from connection storms and > limit the total number of connections in brokers to avoid OOM. The KIP is > here: > >- > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-402%3A+Improve+fairness+in+SocketServer+processors > > > Thanks, > > Rajini
Re: [ANNOUNCE] New Committer: Vahid Hashemian
Congrats!! -Harsha On Tue, Jan 15, 2019, at 4:00 PM, Mayuresh Gharat wrote: > congrats !! > > On Tue, Jan 15, 2019 at 3:42 PM Matthias J. Sax > wrote: > > > Congrats! > > > > On 1/15/19 3:34 PM, Boyang Chen wrote: > > > This is exciting moment! Congrats Vahid! > > > > > > Boyang > > > > > > > > > From: Rajini Sivaram > > > Sent: Wednesday, January 16, 2019 6:50 AM > > > To: Users > > > Cc: dev > > > Subject: Re: [ANNOUNCE] New Committer: Vahid Hashemian > > > > > > Congratulations, Vahid! Well deserved!! > > > > > > Regards, > > > > > > Rajini > > > > > > On Tue, Jan 15, 2019 at 10:45 PM Jason Gustafson > > wrote: > > > > > >> Hi All, > > >> > > >> The PMC for Apache Kafka has invited Vahid Hashemian as a project > > >> committer and > > >> we are > > >> pleased to announce that he has accepted! > > >> > > >> Vahid has made numerous contributions to the Kafka community over the > > past > > >> few years. He has authored 13 KIPs with core improvements to the > > consumer > > >> and the tooling around it. He has also contributed nearly 100 patches > > >> affecting all parts of the codebase. Additionally, Vahid puts a lot of > > >> effort into community engagement, helping others on the mail lists and > > >> sharing his experience at conferences and meetups. > > >> > > >> We appreciate the contributions and we are looking forward to more. > > >> Congrats Vahid! > > >> > > >> Jason, on behalf of the Apache Kafka PMC > > >> > > > > > > > > > -- > -Regards, > Mayuresh R. Gharat > (862) 250-7125
Re: [VOTE] KIP-291: Have separate queues for control requests and data requests
+1 -Harsha On Wed, Jun 20, 2018, at 5:15 AM, Thomas Crayford wrote: > +1 (non-binding) > > On Tue, Jun 19, 2018 at 8:20 PM, Lucas Wang wrote: > > > Hi Jun, Ismael, > > > > Can you please take a look when you get a chance? Thanks! > > > > Lucas > > > > On Mon, Jun 18, 2018 at 1:47 PM, Ted Yu wrote: > > > > > +1 > > > > > > On Mon, Jun 18, 2018 at 1:04 PM, Lucas Wang > > wrote: > > > > > > > Hi All, > > > > > > > > I've addressed a couple of comments in the discussion thread for > > KIP-291, > > > > and > > > > got no objections after making the changes. Therefore I would like to > > > start > > > > the voting thread. > > > > > > > > KIP: > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP- > > > > 291%3A+Have+separate+queues+for+control+requests+and+data+requests > > > > > > > > Thanks for your time! > > > > Lucas > > > > > > > > >
Re: [DISCUSS] KIP-291: Have separate queues for control requests and data requests
Hi Lucas, One more question: any thoughts on making this configurable, and also on allowing a subset of data requests to be prioritized? For example, we notice in our cluster that when we take out a broker and bring in a new one, it will try to become a follower and send a lot of fetch requests to the other leaders in the cluster. This will negatively affect the application/client requests. We are also exploring a similar solution to de-prioritize fetch requests when a new replica comes in: we are OK with the replica taking more time, but the leaders should prioritize the client requests. Thanks, Harsha On Fri, Jun 22nd, 2018 at 11:35 AM Lucas Wang wrote: > > > > Hi Eno, > > Sorry for the delayed response. > - I haven't implemented the feature yet, so no experimental results so > far. > And I plan to test it out in the following days. > > - You are absolutely right that the priority queue does not completely > prevent > data requests being processed ahead of controller requests. > That being said, I expect it to greatly mitigate the effect of stale > metadata. > In any case, I'll try it out and post the results when I have it. > > Regards, > Lucas > > On Wed, Jun 20, 2018 at 5:44 AM, Eno Thereska < eno.there...@gmail.com > > wrote: > > > Hi Lucas, > > > > Sorry for the delay, just had a look at this. A couple of questions: > > - did you notice any positive change after implementing this KIP? I'm > > wondering if you have any experimental results that show the benefit of > the > > two queues. > > > > - priority is usually not sufficient in addressing the problem the KIP > > identifies. Even with priority queues, you will sometimes (often?) have > the > > case that data plane requests will be ahead of the control plane > requests. > > This happens because the system might have already started processing > the > > data plane requests before the control plane ones arrived. So it would > be > > good to know what % of the problem this KIP addresses. > > > > Thanks > > Eno > > > > > > On Fri, Jun 15, 2018 at 4:44 PM, Ted Yu < yuzhih...@gmail.com > wrote: > > > > > Change looks good. > > > > > > Thanks > > > > > > On Fri, Jun 15, 2018 at 8:42 AM, Lucas Wang < lucasatu...@gmail.com > > > wrote: > > > > > > > Hi Ted, > > > > > > > > Thanks for the suggestion. I've updated the KIP. Please take another > > > > look. > > > > > > > > Lucas > > > > > > > > On Thu, Jun 14, 2018 at 6:34 PM, Ted Yu < yuzhih...@gmail.com > > wrote: > > > > > > > > > Currently in KafkaConfig.scala : > > > > > > > > > > val QueuedMaxRequests = 500 > > > > > > > > > > It would be good if you can include the default value for this new > > > > config > > > > > in the KIP. > > > > > > > > > > Thanks > > > > > > > > > > On Thu, Jun 14, 2018 at 4:28 PM, Lucas Wang < lucasatu...@gmail.com > > > > > > wrote: > > > > > > > > > > > Hi Ted, Dong > > > > > > > > > > > > I've updated the KIP by adding a new config, instead of reusing > the > > > > > > existing one. > > > > > > Please take another look when you have time. Thanks a lot! > > > > > > > > > > > > Lucas > > > > > > > > > > > > On Thu, Jun 14, 2018 at 2:33 PM, Ted Yu < yuzhih...@gmail.com > > > wrote: > > > > > > > > > > > > > bq. that's a waste of resource if control request rate is low > > > > > > > > > > > > > > I don't know if control request rate can get to 100,000, > likely > > > not. > > > > > Then > > > > > > > using the same bound as that for data requests seems high.
> > > > > > > > > > > > > > On Wed, Jun 13, 2018 at 10:13 PM, Lucas Wang < > > > lucasatu...@gmail.com > > > > > > > > wrote: > > > > > > > > > > > > > > > Hi Ted, > > > > > > > > > > > > > > > > Thanks for taking a look at this KIP. > > > > > > > > Let's say today the setting of "queued.max.requests" in > > cluster A > > > > is > &g
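The core idea discussed in this thread, a separate prioritized path for controller requests, can be sketched as two bounded queues drained control-first. An illustrative reconstruction, not the actual KIP-291 implementation:

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.TimeUnit;

    final class TwoQueueSketch {
        // A small queue for controller traffic (LeaderAndIsr, UpdateMetadata,
        // StopReplica) alongside the usual data queue (queued.max.requests).
        private final BlockingQueue<Runnable> controlQueue = new ArrayBlockingQueue<>(20);
        private final BlockingQueue<Runnable> dataQueue = new ArrayBlockingQueue<>(500);

        Runnable nextRequest() throws InterruptedException {
            while (true) {
                Runnable request = controlQueue.poll();
                if (request != null) return request; // control requests always win
                // The timeout bounds how long a newly arrived control request
                // can sit behind this blocking poll on the data queue.
                request = dataQueue.poll(300, TimeUnit.MILLISECONDS);
                if (request != null) return request;
            }
        }
    }

The timeout on the data-queue poll bounds the residual gap Eno points out above: a data request already being processed, or already claimed by the poll, can still run ahead of a control request, but only briefly.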
Re: [DISCUSS] KIP-308: Support dynamic update of max.connections.per.ip/max.connections.per.ip.overrides configs
This is very useful. LGTM. Thanks, Harsha On Mon, Jun 25th, 2018 at 10:20 AM, Dong Lin wrote: > > > > Hey Manikumar, > > Thanks much for the KIP. It looks pretty good. > > Thanks, > Dong > > On Thu, Jun 21, 2018 at 11:38 PM, Manikumar < manikumar.re...@gmail.com > > wrote: > > > Hi all, > > > > I have created a KIP to add support for dynamic update of > > max.connections.per.ip/max.connections.per.ip.overrides configs > > > > * https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=85474993 > > > < https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=85474993 > > > >* > > > > Any feedback is appreciated. > > > > Thanks > > > > > >
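What the KIP enables is updating these two settings on a running broker, e.g. via kafka-configs.sh or the AdminClient, instead of a rolling restart. A sketch using the (pre-KIP-339) alterConfigs API; the broker address and values are illustrative:

    import java.util.Arrays;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.Config;
    import org.apache.kafka.clients.admin.ConfigEntry;
    import org.apache.kafka.common.config.ConfigResource;

    public class DynamicConnectionLimitsSketch {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092"); // illustrative
            try (AdminClient admin = AdminClient.create(props)) {
                // Empty resource name = the cluster-wide dynamic default
                // (the equivalent of kafka-configs.sh --entity-default).
                ConfigResource allBrokers = new ConfigResource(ConfigResource.Type.BROKER, "");
                Config update = new Config(Arrays.asList(
                        new ConfigEntry("max.connections.per.ip", "100"),
                        new ConfigEntry("max.connections.per.ip.overrides", "10.0.0.5:200"))); // illustrative values
                // Note: the pre-KIP-339 alterConfigs call replaces the resource's
                // existing dynamic config set rather than merging into it.
                admin.alterConfigs(Collections.singletonMap(allBrokers, update)).all().get();
            }
        }
    }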
Re: [DISCUSS] KIP-291: Have separate queues for control requests and data requests
Thanks for the pointer. Will take a look; it might suit our requirements better. Thanks, Harsha On Mon, Jun 25th, 2018 at 2:52 PM, Lucas Wang wrote: > > > > Hi Harsha, > > If I understand correctly, the replication quota mechanism proposed in > KIP-73 can be helpful in that scenario. > Have you tried it out? > > Thanks, > Lucas > > > > On Sun, Jun 24, 2018 at 8:28 AM, Harsha < ka...@harsha.io > wrote: > > > Hi Lucas, > > One more question: any thoughts on making this configurable > > and also on allowing a subset of data requests to be prioritized? For example, > > > we notice in our cluster that when we take out a broker and bring in a new one it > > > will try to become a follower and send a lot of fetch requests to the other > leaders > > in the cluster. This will negatively affect the application/client > requests. > > We are also exploring a similar solution to de-prioritize fetch requests when a new > > replica comes in: we are OK with the replica taking > > more time, but the leaders should prioritize the client requests. > > > > > > Thanks, > > Harsha > > > > On Fri, Jun 22nd, 2018 at 11:35 AM Lucas Wang wrote: > > > > > > > > > > > > > > Hi Eno, > > > > > > Sorry for the delayed response. > > > - I haven't implemented the feature yet, so no experimental results so > > > > far. > > > And I plan to test it out in the following days. > > > > > > - You are absolutely right that the priority queue does not completely > > > > prevent > > > data requests being processed ahead of controller requests. > > > That being said, I expect it to greatly mitigate the effect of stale > > > metadata. > > > In any case, I'll try it out and post the results when I have it. > > > > > > Regards, > > > Lucas > > > > > > On Wed, Jun 20, 2018 at 5:44 AM, Eno Thereska < eno.there...@gmail.com > > > > > wrote: > > > > > > > Hi Lucas, > > > > > > > > Sorry for the delay, just had a look at this. A couple of questions: > > > > > - did you notice any positive change after implementing this KIP? > I'm > > > > wondering if you have any experimental results that show the benefit > of > > > the > > > > two queues. > > > > > > > > - priority is usually not sufficient in addressing the problem the > KIP > > > > identifies. Even with priority queues, you will sometimes (often?) > have > > > the > > > > case that data plane requests will be ahead of the control plane > > > requests. > > > > This happens because the system might have already started > processing > > > the > > > > data plane requests before the control plane ones arrived. So it > would > > > be > > > > good to know what % of the problem this KIP addresses. > > > > > > > > Thanks > > > > Eno > > > > > > > > > > > > > > On Fri, Jun 15, 2018 at 4:44 PM, Ted Yu < yuzhih...@gmail.com > > wrote: > > > > > > > > > Change looks good. > > > > > > > > > > Thanks > > > > > > > > > > On Fri, Jun 15, 2018 at 8:42 AM, Lucas Wang < lucasatu...@gmail.com > > > > > > > > wrote: > > > > > > > > > > > Hi Ted, > > > > > > > > > > > > Thanks for the suggestion. I've updated the KIP. Please take > > another > > > > > > > > look. > > > > > > > > > > > > Lucas > > > > > > > > > > > > On Thu, Jun 14, 2018 at 6:34 PM, Ted Yu < yuzhih...@gmail.com > > > > wrote: > > > > > > > > > > > > > Currently in KafkaConfig.scala : > > > > > > > > > > > > > > val QueuedMaxRequests = 500 > > > > > > > > > > > > > > It would be good if you can include the default value for this > > > new > > > > > > > > config > > > > > > > in the KIP.
> > > > > > > > > > > > > > Thanks > > > > > > > > > > > > > > On Thu, Jun 14, 2018 at 4:28 PM, Lucas Wang < > > lucasatu...@gmail.com > > > > > > > > > > wrote: > >
Re: [VOTE] KIP-324: Add method to get metrics() in AdminClient
+1 (binding) Thanks, Harsha On Wed, Jun 27th, 2018 at 10:56 AM, Damian Guy wrote: > > > > +1 (binding) > > Thanks > > > > On Wed, 27 Jun 2018 at 18:50 Bill Bejeck < bbej...@gmail.com > wrote: > > > +1 > > > > -Bill > > > > On Wed, Jun 27, 2018 at 12:47 PM Manikumar < manikumar.re...@gmail.com > > > > wrote: > > > > > +1 (non-binding) > > > > > > Thanks. > > > > > > On Wed, Jun 27, 2018 at 10:15 PM Matthias J. Sax < matth...@confluent.io > > > > > wrote: > > > > > > > +1 (binding) > > > > > > > > On 6/26/18 2:33 PM, Guozhang Wang wrote: > > > > > +1. Thanks. > > > > > > > > > > On Tue, Jun 26, 2018 at 2:31 PM, Yishun Guan < gyis...@gmail.com > > > > > wrote: > > > > > > > > > >> Hi All, > > > > >> > > > > >> I am starting a vote on this KIP: > > > > >> > > > > >> https://cwiki.apache.org/confluence/x/lQg0BQ > > > > >> > > > > >> Thanks, > > > > >> Yishun > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
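The KIP's surface area is small: it mirrors the metrics() accessor that KafkaProducer and KafkaConsumer already expose. A usage sketch, assuming the method lands as proposed:

    import java.util.Map;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.common.Metric;
    import org.apache.kafka.common.MetricName;

    final class AdminMetricsSketch {
        // Mirrors KafkaProducer.metrics() / KafkaConsumer.metrics();
        // adminClient.metrics() is the accessor the KIP proposes to add.
        static void dump(AdminClient admin) {
            Map<MetricName, ? extends Metric> metrics = admin.metrics();
            metrics.forEach((name, metric) ->
                    System.out.println(name.group() + "/" + name.name() + " = " + metric.metricValue()));
        }
    }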
Re: [kafka-clients] [VOTE] 1.0.2 RC1
+1. 1) Ran unit tests. 2) Set up a 3-node cluster and tested basic operations. Thanks, Harsha On Mon, Jul 2nd, 2018 at 11:57 AM, Jun Rao wrote: > > > > Hi, Matthias, > > Thanks for running the release. Verified quickstart on scala 2.12 > binary. +1 > > Jun > > On Fri, Jun 29, 2018 at 10:02 PM, Matthias J. Sax < matth...@confluent.io > > > wrote: > > > Hello Kafka users, developers and client-developers, > > > > This is the second candidate for release of Apache Kafka 1.0.2. > > > > This is a bug fix release addressing 27 tickets: > > https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+1.0.2 > > > > Release notes for the 1.0.2 release: > > http://home.apache.org/~mjsax/kafka-1.0.2-rc1/RELEASE_NOTES.html > > > > *** Please download, test and vote by end of next week (7/6/18). > > > > Kafka's KEYS file containing PGP keys we use to sign the release: > > http://kafka.apache.org/KEYS > > > > * Release artifacts to be voted upon (source and binary): > > http://home.apache.org/~mjsax/kafka-1.0.2-rc1/ > > > > * Maven artifacts to be voted upon: > > https://repository.apache.org/content/groups/staging/ > > > > * Javadoc: > > http://home.apache.org/~mjsax/kafka-1.0.2-rc1/javadoc/ > > > > * Tag to be voted upon (off 1.0 branch) is the 1.0.2 tag: > > https://github.com/apache/kafka/releases/tag/1.0.2-rc1 > > > > * Documentation: > > http://kafka.apache.org/10/documentation.html > > > > * Protocol: > > http://kafka.apache.org/10/protocol.html > > > > * Successful Jenkins builds for the 1.0 branch: > > Unit/integration tests: https://builds.apache.org/job/kafka-1.0-jdk7/214/ > > > System tests: > > https://jenkins.confluent.io/job/system-test-kafka/job/1.0/225/ > > > > /** > > > > Thanks, > > -Matthias > > > > > > -- > > You received this message because you are subscribed to the Google > Groups > > "kafka-clients" group. > > To unsubscribe from this group and stop receiving emails from it, send > an > > email to kafka-clients+ unsubscr...@googlegroups.com. > > To post to this group, send email to kafka-clie...@googlegroups.com. > > Visit this group at https://groups.google.com/group/kafka-clients. > > To view this discussion on the web visit https://groups.google.com/d/ > > msgid/kafka-clients/ca183ad4-9285-e423-3850-261f9dfec044%40confluent.io. > > > For more options, visit https://groups.google.com/d/optout. > > > > > >
Re: [VOTE] 2.0.0 RC1
+1. 1) Ran unit tests. 2) Set up a 3-node cluster and tested basic operations. Thanks, Harsha On Mon, Jul 2nd, 2018 at 11:13 AM, "Vahid S Hashemian" wrote: > > > > +1 (non-binding) > > Built from source and ran quickstart successfully on Ubuntu (with Java 8). > > > Minor: It seems this doc update PR is not included in the RC: > https://github.com/apache/kafka/pull/5280 > Guozhang seems to have wanted to cherry-pick it to 2.0. > > Thanks Rajini! > --Vahid > > > > > From: Rajini Sivaram < rajinisiva...@gmail.com > > To: dev < dev@kafka.apache.org >, Users < us...@kafka.apache.org >, > kafka-clients < kafka-clie...@googlegroups.com > > Date: 06/29/2018 11:36 AM > Subject: [VOTE] 2.0.0 RC1 > > > > Hello Kafka users, developers and client-developers, > > > This is the second candidate for release of Apache Kafka 2.0.0. > > > This is a major version release of Apache Kafka. It includes 40 new KIPs > and > > several critical bug fixes. Please see the 2.0.0 release plan for more > details: > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=80448820 > > > > A few notable highlights: > > - Prefixed wildcard ACLs (KIP-290), Fine grained ACLs for CreateTopics > (KIP-277) > - SASL/OAUTHBEARER implementation (KIP-255) > - Improved quota communication and customization of quotas (KIP-219, > KIP-257) > - Efficient memory usage for down conversion (KIP-283) > - Fix log divergence between leader and follower during fast leader > failover (KIP-279) > - Drop support for Java 7 and remove deprecated code including old > scala > clients > - Connect REST extension plugin, support for externalizing secrets and > improved error handling (KIP-285, KIP-297, KIP-298 etc.) > - Scala API for Kafka Streams and other Streams API improvements > (KIP-270, KIP-150, KIP-245, KIP-251 etc.) > > Release notes for the 2.0.0 release: > > http://home.apache.org/~rsivaram/kafka-2.0.0-rc1/RELEASE_NOTES.html > > > > > *** Please download, test and vote by Tuesday, July 3rd, 4pm PT > > > Kafka's KEYS file containing PGP keys we use to sign the release: > > http://kafka.apache.org/KEYS > > > > * Release artifacts to be voted upon (source and binary): > > http://home.apache.org/~rsivaram/kafka-2.0.0-rc1/ > > > > * Maven artifacts to be voted upon: > > https://repository.apache.org/content/groups/staging/ > > > > * Javadoc: > > http://home.apache.org/~rsivaram/kafka-2.0.0-rc1/javadoc/ > > > > * Tag to be voted upon (off 2.0 branch) is the 2.0.0 tag: > > https://github.com/apache/kafka/tree/2.0.0-rc1 > > > > * Documentation: > > http://kafka.apache.org/20/documentation.html > > > > * Protocol: > > http://kafka.apache.org/20/protocol.html > > > > * Successful Jenkins builds for the 2.0 branch: > > Unit/integration tests: > https://builds.apache.org/job/kafka-2.0-jdk8/66/ > > > System tests: > https://jenkins.confluent.io/job/system-test-kafka/job/2.0/15/ > > > > > Please test and verify the release artifacts and submit a vote for this RC > > or report any issues so that we can fix them and roll out a new RC ASAP! > > Although this release vote requires PMC votes to pass, testing, votes, and > > bug > reports are valuable and appreciated from everyone. > > > Thanks, > > > Rajini > > > > > > > >
Re: [VOTE] KIP-322: Return new error code for DeleteTopics API when topic deletion disabled.
+1. Thanks, Harsha On Tue, Jul 3rd, 2018 at 9:22 AM, Ted Yu wrote: > > > > +1 > > On Tue, Jul 3, 2018 at 9:05 AM, Mickael Maison < mickael.mai...@gmail.com > > > wrote: > > > +1 (non binding) > > Thanks for the KIP > > > > On Tue, Jul 3, 2018 at 4:59 PM, Vahid S Hashemian > > < vahidhashem...@us.ibm.com > wrote: > > > +1 (non-binding) > > > > > > --Vahid > > > > > > > > > > > > From: Gwen Shapira < g...@confluent.io > > > > To: dev < dev@kafka.apache.org > > > > Date: 07/03/2018 08:49 AM > > > Subject: Re: [VOTE] KIP-322: Return new error code for > > DeleteTopics > > > API when topic deletion disabled. > > > > > > > > > > > > +1 > > > > > > On Tue, Jul 3, 2018 at 8:24 AM, Manikumar < manikumar.re...@gmail.com > > > > > wrote: > > > > > >> Manikumar < manikumar.re...@gmail.com > > > >> Fri, Jun 29, 7:59 PM (4 days ago) > > >> to dev > > >> Hi All, > > >> > > >> I would like to start voting on KIP-322 which would return new error > > > code > > >> for DeleteTopics API when topic deletion disabled. > > >> > > >> > > > https://cwiki.apache.org/confluence/pages/viewpage. > > action?pageId=87295558 > > > > > >> > > >> Thanks, > > >> > > > > > > > > > > > > -- > > > *Gwen Shapira* > > > Product Manager | Confluent > > > 650.450.2760 | @gwenshap > > > Follow us: Twitter < > > > https://twitter.com/ConfluentInc > > >> | blog > > > < > > > http://www.confluent.io/blog > > >> > > > > > > > > > > > > > > > > > >
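For client authors, the effect of the KIP is that a DeleteTopics call against a broker running with delete.topic.enable=false can fail fast with a dedicated error instead of an opaque one. A hedged sketch of how a caller might handle it; the exception name in the comment is the KIP's proposal:

    import java.util.Collections;
    import java.util.concurrent.ExecutionException;
    import org.apache.kafka.clients.admin.AdminClient;

    final class DeleteTopicSketch {
        static void delete(AdminClient admin, String topic) throws InterruptedException {
            try {
                admin.deleteTopics(Collections.singletonList(topic)).all().get();
            } catch (ExecutionException e) {
                // With the KIP, a broker running with delete.topic.enable=false
                // would surface a dedicated error here (e.g. a
                // TopicDeletionDisabledException, per the proposal), instead of
                // the request appearing to hang or fail with an unrelated error.
                System.err.println("delete failed: " + e.getCause());
            }
        }
    }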
Re: [VOTE] KIP-231: Improve the Required ACL of ListGroups API
+1. Thanks, Harsha On Fri, Jun 1st, 2018 at 10:21 AM, "Vahid S Hashemian" wrote: > > > > I'm bumping this vote thread up as the KIP requires only one binding +1 to > > pass. > The KIP is very similar in nature to the recently approved KIP-277 ( > https://cwiki.apache.org/confluence/display/KAFKA/KIP-277+-+Fine+Grained+ACL+for+CreateTopics+API > > ) and proposes a small improvement to make APIs' minimum required > permissions more consistent. > > Thanks. > --Vahid > > > > > From: Vahid S Hashemian/Silicon Valley/IBM > To: dev < dev@kafka.apache.org > > Date: 12/19/2017 11:30 AM > Subject: [VOTE] KIP-231: Improve the Required ACL of ListGroups API > > > I believe the concerns on this KIP have been addressed so far. > Therefore, I'd like to start a vote. > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-231%3A+Improve+the+Required+ACL+of+ListGroups+API > > > Thanks. > --Vahid > > > > > > >
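For context, KIP-231 relaxes the ListGroups requirement from Describe on the Cluster resource to Describe on the Group resources, so users see the groups they are authorized to describe. A sketch of granting that ACL with the 2.0 AdminClient; the principal and the wildcard pattern are illustrative:

    import java.util.Collections;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.common.acl.AccessControlEntry;
    import org.apache.kafka.common.acl.AclBinding;
    import org.apache.kafka.common.acl.AclOperation;
    import org.apache.kafka.common.acl.AclPermissionType;
    import org.apache.kafka.common.resource.PatternType;
    import org.apache.kafka.common.resource.ResourcePattern;
    import org.apache.kafka.common.resource.ResourceType;

    final class ListGroupsAclSketch {
        // Grants Describe on all groups; under the KIP this is enough for the
        // principal to see those groups in ListGroups, without needing
        // Describe on the Cluster resource.
        static void grant(AdminClient admin, String principal /* e.g. "User:alice" */) throws Exception {
            AclBinding binding = new AclBinding(
                    new ResourcePattern(ResourceType.GROUP, "*", PatternType.LITERAL),
                    new AccessControlEntry(principal, "*", AclOperation.DESCRIBE, AclPermissionType.ALLOW));
            admin.createAcls(Collections.singletonList(binding)).all().get();
        }
    }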
Re: [VOTE] 1.1.1 RC3
+1. * Ran unit tests * Installed in a cluster and ran simple tests Thanks, Harsha On Mon, Jul 9th, 2018 at 6:38 AM, Ted Yu wrote: > > > > +1 > > Ran test suite. > > Checked signatures. > > > > On Sun, Jul 8, 2018 at 3:36 PM Dong Lin < lindon...@gmail.com > wrote: > > > Hello Kafka users, developers and client-developers, > > > > This is the fourth candidate for release of Apache Kafka 1.1.1. > > > > Apache Kafka 1.1.1 is a bug-fix release for the 1.1 branch that was > first > > released with 1.1.0 about 3 months ago. We have fixed about 25 issues > since > > that release. A few of the more significant fixes include: > > > > KAFKA-6925 < https://issues.apache.org/jira/browse/KAFKA-6925> - Fix > memory > > leak in StreamsMetricsThreadImpl > > KAFKA-6937 < https://issues.apache.org/jira/browse/KAFKA-6937> - In-sync > > > replica delayed during fetch if replica throttle is exceeded > > KAFKA-6917 < https://issues.apache.org/jira/browse/KAFKA-6917> - Process > > > txn > > completion asynchronously to avoid deadlock > > KAFKA-6893 < https://issues.apache.org/jira/browse/KAFKA-6893> - Create > > processors before starting acceptor to avoid ArithmeticException > > KAFKA-6870 < https://issues.apache.org/jira/browse/KAFKA-6870> - > > Fix ConcurrentModificationException in SampledStat > > KAFKA-6878 < https://issues.apache.org/jira/browse/KAFKA-6878> - Fix > > NullPointerException when querying global state store > > KAFKA-6879 < https://issues.apache.org/jira/browse/KAFKA-6879> - Invoke > > session init callbacks outside lock to avoid Controller deadlock > > KAFKA-6857 < https://issues.apache.org/jira/browse/KAFKA-6857> - Prevent > > > follower from truncating to the wrong offset if undefined leader epoch > is > > requested > > KAFKA-6854 < https://issues.apache.org/jira/browse/KAFKA-6854> - Log > > cleaner > > fails with transaction markers that are deleted during clean > > KAFKA-6747 < https://issues.apache.org/jira/browse/KAFKA-6747> - Check > > whether there is in-flight transaction before aborting transaction > > KAFKA-6748 < https://issues.apache.org/jira/browse/KAFKA-6748> - Double > > check before scheduling a new task after the punctuate call > > KAFKA-6739 < https://issues.apache.org/jira/browse/KAFKA-6739> - > > Fix IllegalArgumentException when down-converting from V2 to V0/V1 > > KAFKA-6728 < https://issues.apache.org/jira/browse/KAFKA-6728> - > > Fix NullPointerException when instantiating the HeaderConverter > > > > Kafka 1.1.1 release plan: > > https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+1.1.1 > > > > Release notes for the 1.1.1 release: > > http://home.apache.org/~lindong/kafka-1.1.1-rc3/RELEASE_NOTES.html > > > > *** Please download, test and vote by Thursday, July 12, 12pm PT *** > > > > Kafka's KEYS file containing PGP keys we use to sign the release: > > http://kafka.apache.org/KEYS > > > > * Release artifacts to be voted upon (source and binary): > > http://home.apache.org/~lindong/kafka-1.1.1-rc3/ > > > > * Maven artifacts to be voted upon: > > https://repository.apache.org/content/groups/staging/ > > > > * Javadoc: > > http://home.apache.org/~lindong/kafka-1.1.1-rc3/javadoc/ > > > > * Tag to be voted upon (off 1.1 branch) is the 1.1.1-rc3 tag: > > https://github.com/apache/kafka/tree/1.1.1-rc3 > > > > * Documentation: > > http://kafka.apache.org/11/documentation.html > > > > * Protocol: > > http://kafka.apache.org/11/protocol.html > > > > * Successful Jenkins builds for the 1.1 branch: > > Unit/integration tests: * 
https://builds.apache.org/job/kafka-1.1-jdk7/162 > > > < https://builds.apache.org/job/kafka-1.1-jdk7/162>* > > System tests: > > https://jenkins.confluent.io/job/system-test-kafka/job/1.1/156/ > > > > Please test and verify the release artifacts and submit a vote for this > RC, > > or report any issues so we can fix them and get a new RC out ASAP. > Although > > this release vote requires PMC votes to pass, testing, votes, and bug > > reports are valuable and appreciated from everyone. > > > > > > Regards, > > Dong > > > > > > > >
Re: [VOTE] 2.0.0 RC2
+1 1. Ran unit tests 2. Tested few use cases through 3-node cluster. Thanks, Harsha On Thu, Jul 12, 2018, at 9:33 AM, Mickael Maison wrote: > +1 non-binding > Built from source, ran tests, ran quickstart and check signatures > > Thanks! > > > On Wed, Jul 11, 2018 at 10:48 PM, Jakub Scholz wrote: > > +1 (non-binbding) ... I built the RC2 from source, run tests and used it > > with several of my applications without any problems. > > > > Thanks & Regards > > Jakub > > > > On Tue, Jul 10, 2018 at 7:17 PM Rajini Sivaram > > wrote: > > > >> Hello Kafka users, developers and client-developers, > >> > >> > >> This is the third candidate for release of Apache Kafka 2.0.0. > >> > >> > >> This is a major version release of Apache Kafka. It includes 40 new KIPs > >> and > >> > >> several critical bug fixes. Please see the 2.0.0 release plan for more > >> details: > >> > >> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=80448820 > >> > >> > >> A few notable highlights: > >> > >>- Prefixed wildcard ACLs (KIP-290), Fine grained ACLs for CreateTopics > >>(KIP-277) > >>- SASL/OAUTHBEARER implementation (KIP-255) > >>- Improved quota communication and customization of quotas (KIP-219, > >>KIP-257) > >>- Efficient memory usage for down conversion (KIP-283) > >>- Fix log divergence between leader and follower during fast leader > >>failover (KIP-279) > >>- Drop support for Java 7 and remove deprecated code including old scala > >>clients > >>- Connect REST extension plugin, support for externalizing secrets and > >>improved error handling (KIP-285, KIP-297, KIP-298 etc.) > >>- Scala API for Kafka Streams and other Streams API improvements > >>(KIP-270, KIP-150, KIP-245, KIP-251 etc.) > >> > >> > >> Release notes for the 2.0.0 release: > >> > >> http://home.apache.org/~rsivaram/kafka-2.0.0-rc2/RELEASE_NOTES.html > >> > >> > >> *** Please download, test and vote by Friday, July 13, 4pm PT > >> > >> > >> Kafka's KEYS file containing PGP keys we use to sign the release: > >> > >> http://kafka.apache.org/KEYS > >> > >> > >> * Release artifacts to be voted upon (source and binary): > >> > >> http://home.apache.org/~rsivaram/kafka-2.0.0-rc2/ > >> > >> > >> * Maven artifacts to be voted upon: > >> > >> https://repository.apache.org/content/groups/staging/ > >> > >> > >> * Javadoc: > >> > >> http://home.apache.org/~rsivaram/kafka-2.0.0-rc2/javadoc/ > >> > >> > >> * Tag to be voted upon (off 2.0 branch) is the 2.0.0 tag: > >> > >> https://github.com/apache/kafka/tree/2.0.0-rc2 > >> > >> > >> > >> * Documentation: > >> > >> http://kafka.apache.org/20/documentation.html > >> > >> > >> * Protocol: > >> > >> http://kafka.apache.org/20/protocol.html > >> > >> > >> * Successful Jenkins builds for the 2.0 branch: > >> > >> Unit/integration tests: https://builds.apache.org/job/kafka-2.0-jdk8/72/ > >> > >> System tests: > >> https://jenkins.confluent.io/job/system-test-kafka/job/2.0/27/ > >> > >> > >> /** > >> > >> > >> Thanks, > >> > >> > >> Rajini > >>
Re: KIP-327: Add describe all topics API to AdminClient
Very useful. LGTM. Thanks, Harsha On Thu, Jul 12, 2018, at 9:56 AM, Manikumar wrote: > Hi all, > > I have created a KIP to add describe all topics API to AdminClient . > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-327%3A+Add+describe+all+topics+API+to+AdminClient > > Please take a look. > > Thanks,
Re: [VOTE] 1.1.1 RC3
+1. 1. Ran unit tests 2. Ran 3 node cluster to run few tests. Thanks, Harsha On Thu, Jul 12, 2018, at 7:29 AM, Manikumar wrote: > +1 (non-binding) Ran tests, Verified quick start, producer/consumer perf > tests > > > > On Thu, Jul 12, 2018 at 11:06 AM Brett Rann > wrote: > > > +1 (non binding) > > rolling upgrade of shared staging multitenacy (200+ consumer groups) > > cluster from 1.1.0 to 1.1.1-rc3 using the kafka_2.11-1.1.1.tgz artifact. > > cluster looks healthy after upgrade. Lack of burrow lag suggests consumers > > are still happy, and incoming messages remains the same. > > > > On Mon, Jul 9, 2018 at 8:36 AM Dong Lin wrote: > > > > > Hello Kafka users, developers and client-developers, > > > > > > This is the fourth candidate for release of Apache Kafka 1.1.1. > > > > > > Apache Kafka 1.1.1 is a bug-fix release for the 1.1 branch that was first > > > released with 1.1.0 about 3 months ago. We have fixed about 25 issues > > since > > > that release. A few of the more significant fixes include: > > > > > > KAFKA-6925 <https://issues.apache.org/jira/browse/KAFKA-6925 > > > <https://issues.apache.org/jira/browse/KAFKA-6925>> - Fix memory > > > leak in StreamsMetricsThreadImpl > > > KAFKA-6937 <https://issues.apache.org/jira/browse/KAFKA-6937 > > > <https://issues.apache.org/jira/browse/KAFKA-6937>> - In-sync > > > replica delayed during fetch if replica throttle is exceeded > > > KAFKA-6917 <https://issues.apache.org/jira/browse/KAFKA-6917 > > > <https://issues.apache.org/jira/browse/KAFKA-6917>> - Process txn > > > completion asynchronously to avoid deadlock > > > KAFKA-6893 <https://issues.apache.org/jira/browse/KAFKA-6893 > > > <https://issues.apache.org/jira/browse/KAFKA-6893>> - Create > > > processors before starting acceptor to avoid ArithmeticException > > > KAFKA-6870 <https://issues.apache.org/jira/browse/KAFKA-6870 > > > <https://issues.apache.org/jira/browse/KAFKA-6870>> - > > > Fix ConcurrentModificationException in SampledStat > > > KAFKA-6878 <https://issues.apache.org/jira/browse/KAFKA-6878 > > > <https://issues.apache.org/jira/browse/KAFKA-6878>> - Fix > > > NullPointerException when querying global state store > > > KAFKA-6879 <https://issues.apache.org/jira/browse/KAFKA-6879 > > > <https://issues.apache.org/jira/browse/KAFKA-6879>> - Invoke > > > session init callbacks outside lock to avoid Controller deadlock > > > KAFKA-6857 <https://issues.apache.org/jira/browse/KAFKA-6857 > > > <https://issues.apache.org/jira/browse/KAFKA-6857>> - Prevent > > > follower from truncating to the wrong offset if undefined leader epoch is > > > requested > > > KAFKA-6854 <https://issues.apache.org/jira/browse/KAFKA-6854 > > > <https://issues.apache.org/jira/browse/KAFKA-6854>> - Log cleaner > > > fails with transaction markers that are deleted during clean > > > KAFKA-6747 <https://issues.apache.org/jira/browse/KAFKA-6747 > > > <https://issues.apache.org/jira/browse/KAFKA-6747>> - Check > > > whether there is in-flight transaction before aborting transaction > > > KAFKA-6748 <https://issues.apache.org/jira/browse/KAFKA-6748 > > > <https://issues.apache.org/jira/browse/KAFKA-6748>> - Double > > > check before scheduling a new task after the punctuate call > > > KAFKA-6739 <https://issues.apache.org/jira/browse/KAFKA-6739 > > > <https://issues.apache.org/jira/browse/KAFKA-6739>> - > > > Fix IllegalArgumentException when down-converting from V2 to V0/V1 > > > KAFKA-6728 <https://issues.apache.org/jira/browse/KAFKA-6728 > > > <https://issues.apache.org/jira/browse/KAFKA-6728>> - > > > Fix 
NullPointerException when instantiating the HeaderConverter > > > > > > Kafka 1.1.1 release plan: > > > https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+1.1.1 > > > <https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+1.1.1> > > > > > > Release notes for the 1.1.1 release: > > > http://home.apache.org/~lindong/kafka-1.1.1-rc3/RELEASE_NOTES.html > > > <http://home.apache.org/~lindong/kafka-1.1.1-rc3/RELEASE_NOTES.html> > > > > > > *** Please download, test and vote by Thursday, July 12, 12pm PT *** > > > > > > Kafka's KEYS file containing PGP keys we use to sign the release: > > > http://kafka
Re: [VOTE] 1.1.0 RC2
+1 Ran tests Ran a 3 node cluster to test basic operations. Thanks, Harsha On Wed, Mar 14, 2018, at 9:04 AM, Ted Yu wrote: > +1 > > Ran test suite - passed (apart from testMetricsLeak which is flaky). > > On Wed, Mar 14, 2018 at 3:30 AM, Damian Guy wrote: > > > Thanks for pointing out Satish. Links updated: > > > > > > > > Hello Kafka users, developers and client-developers, > > > > This is the third candidate for release of Apache Kafka 1.1.0. > > > > This is minor version release of Apache Kakfa. It Includes 29 new KIPs. > > Please see the release plan for more details: > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75957546 > > > > A few highlights: > > > > * Significant Controller improvements (much faster and session expiration > > edge cases fixed) > > * Data balancing across log directories (JBOD) > > * More efficient replication when the number of partitions is large > > * Dynamic Broker Configs > > * Delegation tokens (KIP-48) > > * Kafka Streams API improvements (KIP-205 / 210 / 220 / 224 / 239) > > > > Release notes for the 1.1.0 release: > > http://home.apache.org/~damianguy/kafka-1.1.0-rc2/RELEASE_NOTES.html > > > > *** Please download, test and vote by Friday, March 16, 1pm PDT> > > > > Kafka's KEYS file containing PGP keys we use to sign the release: > > http://kafka.apache.org/KEYS > > > > * Release artifacts to be voted upon (source and binary): > > http://home.apache.org/~damianguy/kafka-1.1.0-rc2/ > > > > * Maven artifacts to be voted upon: > > https://repository.apache.org/content/groups/staging/ > > > > * Javadoc: > > http://home.apache.org/~damianguy/kafka-1.1.0-rc2/javadoc/ > > > > * Tag to be voted upon (off 1.1 branch) is the 1.1.0 tag: > > https://github.com/apache/kafka/tree/1.1.0-rc2 > > > > > > * Documentation: > > http://kafka.apache.org/11/documentation.html > > <http://kafka.apache.org/1/documentation.html> > > > > * Protocol: > > http://kafka.apache.org/11/protocol.html > > <http://kafka.apache.org/1/protocol.html> > > > > * Successful Jenkins builds for the 1.1 branch: > > Unit/integration tests: https://builds.apache.org/job/kafka-1.1-jdk7/78 > > System tests: https://jenkins.confluent.io/job/system-test-kafka/job/1.1/ > > 38/ > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75957546 > > > > On Wed, 14 Mar 2018 at 04:41 Satish Duggana > > wrote: > > > > > Hi Damian, > > > Thanks for starting vote thread for 1.1.0 release. > > > > > > There may be a typo on the tag to be voted upon for this release > > candidate. > > > I guess it should be https://github.com/apache/kafka/tree/1.1.0-rc2 > > > instead > > > of https://github.com/apache/kafka/tree/1.1.0-rc. > > > > > > On Wed, Mar 14, 2018 at 8:27 AM, Satish Duggana < > > satish.dugg...@gmail.com> > > > wrote: > > > > > > > Hi Damian, > > > > Given release plan link in earlier mail is about 1.0 release. You may > > > want > > > > to replace that with 1.1.0 release plan link[1]. > > > > > > > > 1 - https://cwiki.apache.org/confluence/pages/viewpage. > > > > action?pageId=75957546 > > > > > > > > Thanks, > > > > Satish. > > > > > > > > On Wed, Mar 14, 2018 at 12:47 AM, Damian Guy > > > wrote: > > > > > > > >> Hello Kafka users, developers and client-developers, > > > >> > > > >> This is the third candidate for release of Apache Kafka 1.1.0. > > > >> > > > >> This is minor version release of Apache Kakfa. It Includes 29 new > > KIPs. > > > >> Please see the release plan for more details: > > > >> > > > >> > > > https://cwiki.apache.org/confluence/pages/viewpage. 
> > action?pageId=71764913 > > > >> > > > >> A few highlights: > > > >> > > > >> * Significant Controller improvements (much faster and session > > > expiration > > > >> edge cases fixed) > > > >> * Data balancing across log directories (JBOD) > > > >> * More efficient replication when the number of partitions is large > > > >> * Dynamic Broker Configs > > > >> * Delegation tokens (KIP-48) > > > >>
Re: [VOTE] KIP-346 - Improve LogCleaner behavior on error
+1 (binding) Thanks, Harsha On Tue, Aug 7, 2018, at 10:22 AM, Manikumar wrote: > +1 (non-binding) > > Thanks for the KIP. > > On Tue, Aug 7, 2018 at 10:42 PM Ray Chiang wrote: > > > +1 (non-binding) > > > > -Ray > > > > On 8/7/18 9:26 AM, Ted Yu wrote: > > > +1 > > > > > > On Tue, Aug 7, 2018 at 5:25 AM Thomas Becker > > wrote: > > > > > >> +1 (non-binding) > > >> > > >> We've hit issues with the log cleaner in the past, and this would be a > > >> great improvement. > > >> On Tue, 2018-08-07 at 12:19 +0100, Stanislav Kozlovski wrote: > > >> > > >> Hey everybody, > > >> > > >> I'm starting a vote on KIP-346 > > >> > > >> < > > >> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-346+-+Improve+LogCleaner+behavior+on+error > > >> > > >> > > >> > > >> > > >> This email and any attachments may contain confidential and privileged > > >> material for the sole use of the intended recipient. Any review, > > copying, > > >> or distribution of this email (or any attachments) by others is > > prohibited. > > >> If you are not the intended recipient, please contact the sender > > >> immediately and permanently delete this email and any attachments. No > > >> employee or agent of TiVo Inc. is authorized to conclude any binding > > >> agreement on behalf of TiVo Inc. by email. Binding agreements with TiVo > > >> Inc. may only be made by a signed written agreement. > > >> > > > >
Re: [DISCUSS] KIP-357: Add support to list ACLs per principal
+1 (binding) Thanks, Harsha On Wed, Aug 22, 2018, at 9:15 AM, Manikumar wrote: > Hi Viktor, > We already have a method in Authorizer interface to get acls for a given > principal. > We will use this method to fetch acls and filter the results for requested > Resources. > Authorizer { >def getAcls(principal: KafkaPrincipal): Map[Resource, Set[Acl]] > } > Currently AdminClient API doesn't have a API to fetch acls for a given > principal. > So while using AclCommand with AdminClient API (KIP-332), we just filter > the results returned > from describeAcls API. We can add new AdminClient API/new > DescribeAclsRequest if required in future. > > Updated the KIP. Thanks for the review. > > Thanks, > > On Wed, Aug 22, 2018 at 5:53 PM Viktor Somogyi-Vass > wrote: > > > Hi Manikumar, > > > > Implementation-wise is it just a filter over the returned ACL listing or do > > you plan to add new methods to the Authorizer as well? > > > > Thanks, > > Viktor > > > > On Fri, Aug 17, 2018 at 9:18 PM Priyank Shah > > wrote: > > > > > +1(non-binding) > > > > > > Thanks. > > > Priyank > > > > > > On 8/16/18, 6:01 AM, "Manikumar" wrote: > > > > > > Hi all, > > > > > > I have created a minor KIP to add support to list ACLs per principal > > > using > > > AclCommand (kafka-acls.sh) > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-357%3A++Add+support+to+list+ACLs+per+principal > > > > > > Please take a look. > > > > > > Thanks, > > > > > > > > > > >
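A short sketch of the filtering approach described above, using the public AdminClient API; the bootstrap address and the "User:alice" principal are illustrative assumptions, not part of the KIP:

    import java.util.Collection;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.common.acl.AccessControlEntryFilter;
    import org.apache.kafka.common.acl.AclBinding;
    import org.apache.kafka.common.acl.AclBindingFilter;
    import org.apache.kafka.common.acl.AclOperation;
    import org.apache.kafka.common.acl.AclPermissionType;
    import org.apache.kafka.common.resource.ResourcePatternFilter;

    public class AclsForPrincipal {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // illustrative
            try (AdminClient admin = AdminClient.create(props)) {
                // Match any resource, but only ACL entries for one principal;
                // this is the "filter the describeAcls results" idea above.
                AclBindingFilter filter = new AclBindingFilter(
                    ResourcePatternFilter.ANY,
                    new AccessControlEntryFilter("User:alice", null,
                        AclOperation.ANY, AclPermissionType.ANY));
                Collection<AclBinding> acls =
                    admin.describeAcls(filter).values().get();
                acls.forEach(System.out::println);
            }
        }
    }

This keeps AclCommand on the existing DescribeAclsRequest, leaving room for a dedicated per-principal AdminClient API later, as the thread notes.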
Re: [VOTE] KIP-357: Add support to list ACLs per principal
+1 (binding) -Harsha On Mon, Aug 27, 2018, at 12:46 PM, Jakub Scholz wrote: > +1 (non-binding) > > On Mon, Aug 27, 2018 at 6:24 PM Manikumar wrote: > > > Hi All, > > > > I would like to start voting on KIP-357 which allows to list ACLs per > > principal using AclCommand (kafka-acls.sh) > > > > KIP: > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-357%3A++Add+support+to+list+ACLs+per+principal > > > > Discussion Thread: > > > > https://lists.apache.org/thread.html/dc7f6005845a372a0a48a40872a32d9ece03807a4fb1bb89d3645afb@%3Cdev.kafka.apache.org%3E > > > > Thanks, > > Manikumar > >
Re: [VOTE] KIP-336: Consolidate ExtendedSerializer/Serializer and ExtendedDeserializer/Deserializer
+1. Thanks, Harsha On Thu, Aug 30, 2018, at 4:19 AM, Attila Sasvári wrote: > Thanks for the KIP and the updates Viktor! > > +1 (non-binding) > > > > On Wed, Aug 29, 2018 at 10:44 AM Manikumar > wrote: > > > +1 (non-binding) > > > > Thanks for the KIP. > > > > On Wed, Aug 29, 2018 at 1:41 AM Jason Gustafson > > wrote: > > > > > +1 Thanks for the updates. > > > > > > On Tue, Aug 28, 2018 at 1:15 AM, Viktor Somogyi-Vass < > > > viktorsomo...@gmail.com> wrote: > > > > > > > Sure, I've added it. I'll also do the testing today. > > > > > > > > On Mon, Aug 27, 2018 at 5:03 PM Ismael Juma wrote: > > > > > > > > > Thanks Viktor. I think it would be good to verify that existing > > > > > ExtendedSerializer implementations work without recompiling. This > > could > > > > be > > > > > done as a manual test. If you agree, I suggest adding it to the > > testing > > > > > plan section. > > > > > > > > > > Ismael > > > > > > > > > > On Mon, Aug 27, 2018 at 7:57 AM Viktor Somogyi-Vass < > > > > > viktorsomo...@gmail.com> > > > > > wrote: > > > > > > > > > > > Thanks guys, I've updated my KIP with this info (so to keep > > solution > > > > #1). > > > > > > If you find it good enough, please vote as well or let me know if > > you > > > > > think > > > > > > something is missing. > > > > > > > > > > > > On Sat, Aug 25, 2018 at 1:14 AM Ismael Juma > > > wrote: > > > > > > > > > > > > > I'm OK with 1 too. It makes me a bit sad that we don't have a > > path > > > > for > > > > > > > removing the method without headers, but it seems like the > > simplest > > > > and > > > > > > > least confusing option (I am assuming that headers are not needed > > > in > > > > > the > > > > > > > serializers in the common case). > > > > > > > > > > > > > > Ismael > > > > > > > > > > > > > > On Fri, Aug 24, 2018 at 2:42 PM Jason Gustafson < > > > ja...@confluent.io> > > > > > > > wrote: > > > > > > > > > > > > > > > Hey Viktor, > > > > > > > > > > > > > > > > Good summary. I agree that option 1) seems like the simplest > > > choice > > > > > > and, > > > > > > > as > > > > > > > > you note, we can always add the default implementation later. > > > I'll > > > > > > leave > > > > > > > > Ismael to make a case for the circular forwarding approach ;) > > > > > > > > > > > > > > > > -Jason > > > > > > > > > > > > > > > > On Fri, Aug 24, 2018 at 3:02 AM, Viktor Somogyi-Vass < > > > > > > > > viktorsomo...@gmail.com> wrote: > > > > > > > > > > > > > > > > > I think in the first draft I didn't provide an implementation > > > for > > > > > > them > > > > > > > as > > > > > > > > > it seemed very simple and straightforward. I looked up a > > couple > > > > of > > > > > > > > > implementations of the ExtendedSerializers on github and the > > > > > general > > > > > > > > > behavior seems to be that they delegate to the 2 argument > > > > > > (headerless) > > > > > > > > > method: > > > > > > > > > > > > > > > > > > https://github.com/khoitnm/practice-spring-kafka-grpc/blob/ > > > > > > > > > a6fc9b3395762c4889807baedd822f4653d5dcdd/kafka-common/src/ > > > > > > > > > main/java/org/tnmk/common/kafka/serialization/protobuf/ > > > > > > > > > ProtobufSerializer.java > > > > > > > > > > > > > > > > > > https://github.com/hong-zhu/nxgen/blob/5cf1427d4e1a8f5c7fab47955af32a > > > > > > > > > 0d4f4873af/nxgen-kafka-client/src/main/java/nxgen/kafka/ &g
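For context, the delegation pattern the thread converged on (option 1) can be sketched as a default method, so existing headerless implementations keep compiling unchanged. A sketch of the interface shape only, not the exact final API:

    import java.io.Closeable;
    import java.util.Map;
    import org.apache.kafka.common.header.Headers;

    // Sketch: the Headers-aware variant defaults to forwarding to the classic
    // two-argument method, mirroring what most ExtendedSerializer
    // implementations already do by hand, as noted above.
    public interface MySerializer<T> extends Closeable {
        default void configure(Map<String, ?> configs, boolean isKey) { }

        byte[] serialize(String topic, T data);

        default byte[] serialize(String topic, Headers headers, T data) {
            return serialize(topic, data); // headers ignored unless overridden
        }

        @Override
        default void close() { }
    }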
Re: [VOTE] KIP-110: Add Codec for ZStandard Compression
+1 (binding). Thanks, Harsha On Wed, Sep 12, 2018, at 4:56 PM, Jason Gustafson wrote: > Great contribution! +1 > > On Wed, Sep 12, 2018 at 10:20 AM, Manikumar > wrote: > > > +1 (non-binding). > > > > Thanks for the KIP. > > > > On Wed, Sep 12, 2018 at 10:44 PM Ismael Juma wrote: > > > > > Thanks for the KIP, +1 (binding). > > > > > > Ismael > > > > > > On Wed, Sep 12, 2018 at 10:02 AM Dongjin Lee wrote: > > > > > > > Hello, I would like to start a VOTE on KIP-110: Add Codec for ZStandard > > > > Compression. > > > > > > > > The KIP: > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP- > > 110%3A+Add+Codec+for+ZStandard+Compression > > > > Discussion thread: > > > > https://www.mail-archive.com/dev@kafka.apache.org/msg88673.html > > > > > > > > Thanks, > > > > Dongjin > > > > > > > > -- > > > > *Dongjin Lee* > > > > > > > > *A hitchhiker in the mathematical world.* > > > > > > > > *github: <http://goog_969573159/>github.com/dongjinleekr > > > > <http://github.com/dongjinleekr>linkedin: > > > kr.linkedin.com/in/dongjinleekr > > > > <http://kr.linkedin.com/in/dongjinleekr>slideshare: > > > > www.slideshare.net/dongjinleekr > > > > <http://www.slideshare.net/dongjinleekr>* > > > > > > > > >
Re: [VOTE] KIP-367 Introduce close(Duration) to Producer and AdminClient instead of close(long, TimeUnit)
+1 (Binding). Thanks, Harsha On Wed, Sep 12, 2018, at 9:06 PM, vito jeng wrote: > +1 > > > > --- > Vito > > On Mon, Sep 10, 2018 at 4:52 PM, Dongjin Lee wrote: > > > +1. (Non-binding) > > > > On Mon, Sep 10, 2018 at 4:13 AM Matthias J. Sax > > wrote: > > > > > Thanks a lot for the KIP. > > > > > > +1 (binding) > > > > > > > > > -Matthias > > > > > > > > > On 9/8/18 11:27 AM, Chia-Ping Tsai wrote: > > > > Hi All, > > > > > > > > I'd like to put KIP-367 to the vote. > > > > > > > > > > > https://cwiki.apache.org/confluence/pages/viewpage. > > action?pageId=89070496 > > > > > > > > -- > > > > Chia-Ping > > > > > > > > > > > > > > -- > > *Dongjin Lee* > > > > *A hitchhiker in the mathematical world.* > > > > *github: <http://goog_969573159/>github.com/dongjinleekr > > <http://github.com/dongjinleekr>linkedin: kr.linkedin.com/in/dongjinleekr > > <http://kr.linkedin.com/in/dongjinleekr>slideshare: > > www.slideshare.net/dongjinleekr > > <http://www.slideshare.net/dongjinleekr>* > >
Re: [DISCUSS] KIP-371: Add a configuration to build custom SSL principal name
Hi Manikumar, I am interested to know the reason for exposing this config, given that a user has access to the PrincipalBuilder interface to build their own interpretation of an identity from the X509 certificates. Is this to simplify extraction of the identity? Also, there are other use cases where users will extract the SubjectAltName to construct the identity; I guess that's not going to be supported by this method. Thanks, Harsha On Tue, Sep 18, 2018, at 8:25 AM, Manikumar wrote: > Hi Rajini, > > I don't have strong reasons for rejecting Option 2. I just felt Option 1 is > sufficient for > the common use-cases (extracting single field, like CN etc..). > > We are open to go with Option 2, for more flexible mapping mechanism. > Let us know, your preference. > > Thanks, > > > On Tue, Sep 18, 2018 at 8:05 PM Rajini Sivaram > wrote: > > > Hi Manikumar, > > > > It wasn't entirely clear to me why Option 2 was rejected. > > > > On Tue, Sep 18, 2018 at 7:47 AM, Manikumar > > wrote: > > > > > Hi All, > > > > > > We would like to go with Option 1, which adds a new configuration > > parameter > > > pair of the form: > > > ssl.principal.mapping.pattern, ssl.principal.mapping.value. This will > > > fulfill the requirement for most of the common use cases. > > > > > > We would like to include the KIP in the upcoming release. If there no > > > concerns, would like to start vote on this KIP. > > > > > > Thanks, > > > > > > On Fri, Sep 14, 2018 at 11:32 PM Priyank Shah > > > wrote: > > > > > > > Definitely a helpful change. +1 for Option 2. > > > > > > > > On 9/14/18, 10:52 AM, "Manikumar" wrote: > > > > > > > > Hi Eno, > > > > > > > > Thanks for the review. > > > > > > > > Most often users want to extract one of the fields (eg. CN). CN is > > the > > > > commonly used field. > > > > For this simple change, users need to build and maintain the custom > > > > principal builder class > > > > and also package and deploy to the all brokers. Having configurable > > > > rules > > > > will be useful. > > > > > > > > Proposed mapping rules works on string representation of the X.500 > > > > distinguished name(RFC2253 format) [1]. > > > > Mapping rules can use the attribute types keywords defined in RFC > > > 2253 > > > > (CN, > > > > L, ST, O, OU, C, STREET, DC, UID). > > > > > > > > Any additional/custom attribute types are emitted as OIDs. To emit > > > > additional attribute type keys, > > > > we need to have OID -> attribute type keyword String mapping.[2] > > > > > > > > For example, String representation of X500Principal("CN=Duke, > > > > OU=JavaSoft, > > > > O=Sun Microsystems, C=US, EMAILADDRESS=t...@test.com") > > > > will be "CN=Duke,OU=JavaSoft,O=Sun > > > > Microsystems,C=US,1.2.840.113549.1.9.1=# > > > 160d7465737440746573742e636f6d" > > > > > > > > If we have the OID - key mapping ("1.2.840.113549.1.9.1", > > > > "emailAddress"), > > > > the string will be > > > > "CN=Duke,OU=JavaSoft,O=Sun Microsystems,C=US,emailAddress= > > > > t...@test.com" > > > > > > > > Since we are not passing this mapping, we can not extract using > > > > additional > > > > attribute type keyword string. > > > > If the user wants to extract additional attribute keys, we need to > > > write > > > > a custom principal builder class. > > > > > > > > Hope the above helps. Updated the KIP. 
> > > > > > > > [1] > > > > > > > > https://docs.oracle.com/javase/7/docs/api/javax/security/auth/x500/ > > > X500Principal.html#getName(java.lang.String) > > > > > > > > [2] > > > > > > > > https://docs.oracle.com/javase/7/docs/api/javax/security/auth/x500/ > > > X500Principal.html#getName(java.lang.String,%20java.util.Map) > > > > > > > > Thanks > > > > > > > > On Fri, Sep 14, 2018 at 7:44 PM Eno Thereska < > > eno.there...@gmail.com > > > > > > > > wrote: > > > >
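Manikumar's OID behavior can be reproduced directly against the JDK. A minimal sketch, where the class name and the duke@example.com address are illustrative stand-ins (the obfuscated address above is left untouched):

    import java.util.Collections;
    import java.util.Map;
    import javax.security.auth.x500.X500Principal;

    public class DnMappingDemo {
        public static void main(String[] args) {
            // "EMAILADDRESS" is not a keyword the X500Principal constructor
            // recognizes, so a keyword -> OID map is needed just to parse the DN.
            Map<String, String> keywordMap =
                Collections.singletonMap("EMAILADDRESS", "1.2.840.113549.1.9.1");
            X500Principal principal = new X500Principal(
                "CN=Duke, OU=JavaSoft, O=Sun Microsystems, C=US, "
                    + "EMAILADDRESS=duke@example.com", keywordMap);

            // Without an OID -> keyword map, the extra attribute is emitted as
            // a raw OID with a hex-encoded value, exactly as described above.
            System.out.println(principal.getName(X500Principal.RFC2253));

            // With the reverse map, the friendly keyword is preserved.
            Map<String, String> oidMap =
                Collections.singletonMap("1.2.840.113549.1.9.1", "emailAddress");
            System.out.println(principal.getName(X500Principal.RFC2253, oidMap));
        }
    }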
Re: [DISCUSS] KIP-371: Add a configuration to build custom SSL principal name
Thanks. I am also leaning towards option 2, as it will help the consistency of expressing such mapping between sasl and ssl. -Harsha On Tue, Sep 18, 2018, at 8:27 PM, Manikumar wrote: > Hi Harsha, > > Thanks for the review. Yes, As mentioned on the motivation section, this is > to simply extracting fields from the certificates > for the common use cases. Yes, we are not supporting extracting > SubjectAltName using this KIP. > > Thanks, > > > On Wed, Sep 19, 2018 at 8:29 AM Harsha wrote: > > > Hi Manikumar, > > I am interested to know the reason for exposing this config, given > > a user has access to PrincipalBuilder interface to build their > > interpretation of an identity from the X509 certificates. Is this to > > simplify extraction of identity? and also there are other use cases where > > user's will extract SubjectAltName to construct the identity I guess thats > > not going to supported by this method. > > > > Thanks, > > Harsha > > > > On Tue, Sep 18, 2018, at 8:25 AM, Manikumar wrote: > > > Hi Rajini, > > > > > > I don't have strong reasons for rejecting Option 2. I just felt Option 1 > > is > > > sufficient for > > > the common use-cases (extracting single field, like CN etc..). > > > > > > We are open to go with Option 2, for more flexible mapping mechanism. > > > Let us know, your preference. > > > > > > Thanks, > > > > > > > > > On Tue, Sep 18, 2018 at 8:05 PM Rajini Sivaram > > > wrote: > > > > > > > Hi Manikumar, > > > > > > > > It wasn't entirely clear to me why Option 2 was rejected. > > > > > > > > On Tue, Sep 18, 2018 at 7:47 AM, Manikumar > > > > wrote: > > > > > > > > > Hi All, > > > > > > > > > > We would like to go with Option 1, which adds a new configuration > > > > parameter > > > > > pair of the form: > > > > > ssl.principal.mapping.pattern, ssl.principal.mapping.value. This will > > > > > fulfill the requirement for most of the common use cases. > > > > > > > > > > We would like to include the KIP in the upcoming release. If there no > > > > > concerns, would like to start vote on this KIP. > > > > > > > > > > Thanks, > > > > > > > > > > On Fri, Sep 14, 2018 at 11:32 PM Priyank Shah > > > > > > > wrote: > > > > > > > > > > > Definitely a helpful change. +1 for Option 2. > > > > > > > > > > > > On 9/14/18, 10:52 AM, "Manikumar" > > wrote: > > > > > > > > > > > > Hi Eno, > > > > > > > > > > > > Thanks for the review. > > > > > > > > > > > > Most often users want to extract one of the field (eg. CN). CN > > is > > > > the > > > > > > commonly used field. > > > > > > For this simple change, users need to build and maintain the > > custom > > > > > > principal builder class > > > > > > and also package and deploy to the all brokers. Having > > configurable > > > > > > rules > > > > > > will be useful. > > > > > > > > > > > > Proposed mapping rules works on string representation of the > > X.500 > > > > > > distinguished name(RFC2253 format) [1]. > > > > > > Mapping rules can use the attribute types keywords defined in > > RFC > > > > > 2253 > > > > > > (CN, > > > > > > L, ST, O, OU, C, STREET, DC, UID). > > > > > > > > > > > > Any additional/custom attribute types are emitted as OIDs. 
To > > emit > > > > > > additional attribute type keys, > > > > > > we need to have OID -> attribute type keyword String > > mapping.[2] > > > > > > > > > > > > For example, String representation of X500Principal("CN=Duke, > > > > > > OU=JavaSoft, > > > > > > O=Sun Microsystems, C=US, EMAILADDRESS=t...@test.com") > > > > > > will be "CN=Duke,OU=JavaSoft,O=Sun > > > > > > Microsystems,C=US,1.2.840.113549.1.9.1=# > > > > > 160d7465737440746573742e636f6d" &g
Re: [VOTE] KIP 368: Allow SASL Connections to Periodically Re-Authenticate
KIP looks good. +1 (binding) Thanks, Harsha On Wed, Sep 19, 2018, at 7:44 AM, Rajini Sivaram wrote: > Hi Ron, > > Thanks for the KIP! > > +1 (binding) > > On Tue, Sep 18, 2018 at 6:24 PM, Konstantin Chukhlomin > wrote: > > > +1 (non binding) > > > > > On Sep 18, 2018, at 1:18 PM, michael.kamin...@nytimes.com wrote: > > > > > > > > > > > > On 2018/09/18 14:59:09, Ron Dagostino wrote: > > >> Hi everyone. I would like to start the vote for KIP-368: > > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP- > > 368%3A+Allow+SASL+Connections+to+Periodically+Re-Authenticate > > >> > > >> This KIP proposes adding the ability for SASL clients (and brokers when > > a > > >> SASL mechanism is the inter-broker protocol) to re-authenticate their > > >> connections to brokers and for brokers to close connections that > > continue > > >> to use expired sessions. > > >> > > >> Ron > > >> > > > > > > +1 (non binding) > > > >
Re: Issue in Creating the Kafka Consumer
It looks like you are trying to connect to a SASL-enabled Kafka broker. If that's the case, make sure you follow the doc http://kafka.apache.org/documentation.html#security_jaas_client to pass in a JAAS config with the KafkaClient section to your consumer. -Harsha

On Thu, Sep 20, 2018, at 8:31 AM, Sravanthi Gottam wrote:
> Hi Team,
>
> I am facing an issue in creating the Kafka consumer. I am giving the jaas config file in the build path:
>
> 10:14:00,765 WARN [org.springframework.web.context.support.AnnotationConfigWebApplicationContext] (ServerService Thread Pool -- 84) Exception encountered during context initialization - cancelling refresh attempt: org.springframework.context.ApplicationContextException: Failed to start bean 'org.springframework.kafka.config.internalKafkaListenerEndpointRegistry'; nested exception is org.apache.kafka.common.KafkaException: Failed to construct kafka consumer
> 10:14:00,789 ERROR [org.springframework.web.servlet.DispatcherServlet] (ServerService Thread Pool -- 84) Context initialization failed: org.springframework.context.ApplicationContextException: Failed to start bean 'org.springframework.kafka.config.internalKafkaListenerEndpointRegistry'; nested exception is org.apache.kafka.common.KafkaException: Failed to construct kafka consumer
>     at org.springframework.context.support.DefaultLifecycleProcessor.doStart(DefaultLifecycleProcessor.java:176) [spring-context-4.3.3.RELEASE.jar:4.3.3.RELEASE]
>     at org.springframework.context.support.DefaultLifecycleProcessor.access$200(DefaultLifecycleProcessor.java:51) [spring-context-4.3.3.RELEASE.jar:4.3.3.RELEASE]
>     at org.springframework.context.support.DefaultLifecycleProcessor$LifecycleGroup.start(DefaultLifecycleProcessor.java:346) [spring-context-4.3.3.RELEASE.jar:4.3.3.RELEASE]
>     at org.springframework.context.support.DefaultLifecycleProcessor.startBeans(DefaultLifecycleProcessor.java:149) [spring-context-4.3.3.RELEASE.jar:4.3.3.RELEASE]
>     ... 24 more
> Caused by: org.apache.kafka.common.KafkaException: java.lang.IllegalArgumentException: No serviceName defined in either JAAS or Kafka config
>     at org.apache.kafka.common.network.SaslChannelBuilder.configure(SaslChannelBuilder.java:98) [kafka-clients-0.11.0.0.jar:]
>     at org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:112) [kafka-clients-0.11.0.0.jar:]
>     at org.apache.kafka.common.network.ChannelBuilders.clientChannelBuilder(ChannelBuilders.java:58) [kafka-clients-0.11.0.0.jar:]
>     at org.apache.kafka.clients.ClientUtils.createChannelBuilder(ClientUtils.java:88) [kafka-clients-0.11.0.0.jar:]
>     at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:695) [kafka-clients-0.11.0.0.jar:]
>     ... 36 more
> Caused by: java.lang.IllegalArgumentException: No serviceName defined in either JAAS or Kafka config
>     at org.apache.kafka.common.security.kerberos.KerberosLogin.getServiceName(KerberosLogin.java:298) [kafka-clients-0.11.0.0.jar:]
>     at org.apache.kafka.common.security.kerberos.KerberosLogin.configure(KerberosLogin.java:87) [kafka-clients-0.11.0.0.jar:]
>     at org.apache.kafka.common.security.authenticator.LoginManager.<init>(LoginManager.java:49) [kafka-clients-0.11.0.0.jar:]
>     at org.apache.kafka.common.security.authenticator.LoginManager.acquireLoginManager(LoginManager.java:73) [kafka-clients-0.11.0.0.jar:]
>     at org.apache.kafka.common.network.SaslChannelBuilder.configure(SaslChannelBuilder.java:90) [kafka-clients-0.11.0.0.jar:]
>     ... 40 more
>
> Thanks,
> Sravanthi Gottam
> Medical Management Systems
> 7930 Clayton Rd | St Louis, MO 63117
> Ext: 8099453
> Sravanthi.gottam@centene.com
>
> CONFIDENTIALITY NOTICE: This communication contains information intended for the use of the individuals to whom it is addressed and may contain information that is privileged, confidential or exempt from other disclosure under applicable law. If you are not the intended recipient, you are notified that any disclosure, printing, copying, distribution or use of the contents is prohibited. If you have received this in error, please notify the sender immediately by telephone or by returning it by return mail and then permanently delete the communication from your system. Thank you.
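For readers hitting the same "No serviceName defined" error, a minimal consumer setup for SASL/GSSAPI that supplies both the JAAS section and the missing serviceName; the broker address, keytab path, principal, and topic are illustrative placeholders:

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class SaslConsumerSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9093"); // illustrative
            props.put("group.id", "example-group");
            props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("security.protocol", "SASL_PLAINTEXT");
            props.put("sasl.mechanism", "GSSAPI");
            // The setting the stack trace above is complaining about:
            props.put("sasl.kerberos.service.name", "kafka");
            // Equivalent of the KafkaClient JAAS section, passed as a client
            // config (supported since 0.10.2) instead of a separate JAAS file.
            props.put("sasl.jaas.config",
                "com.sun.security.auth.module.Krb5LoginModule required "
                    + "useKeyTab=true storeKey=true "
                    + "keyTab=\"/etc/security/keytabs/client.keytab\" " // illustrative
                    + "principal=\"client@EXAMPLE.COM\";");             // illustrative
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("test-topic"));
                ConsumerRecords<String, String> records = consumer.poll(1000L);
                System.out.println("Fetched " + records.count() + " records");
            }
        }
    }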
Re: [VOTE] [REMINDER] KIP-383 Pluggable interface for SSL Factory
Hi Rajini, Since you helped review the KIP if you don't mind can you vote on this KIP. Thanks, Harsha On Wed, Jan 9, 2019, at 8:05 AM, Harsha wrote: > HI All, > We are looking forward to this KIP. Appreciate if others can > take a look at the kip and > vote on this thread. > > Thanks > Harsha > > On Fri, Dec 21, 2018, at 4:41 AM, Damian Guy wrote: > > must be my gmail playing up. This appears to be the DISCUSS thread to me... > > e > > On Thu, 20 Dec 2018 at 18:42, Harsha wrote: > > > > > Damian, > > >This is the VOTE thread. There is a DISCUSS thread which > > > concluded in it. > > > > > > -Harsha > > > > > > > > > On Wed, Dec 19, 2018, at 5:04 AM, Pellerin, Clement wrote: > > > > I did that and nobody came. > > > > https://lists.apache.org/list.html?dev@kafka.apache.org:lte=1M:kip-383 > > > > I don't understand why this feature is not more popular. > > > > It's the solution to one Jira and a work-around for a handful more > > > > Jiras. > > > > > > > > -Original Message- > > > > From: Damian Guy [mailto:damian@gmail.com] > > > > Sent: Wednesday, December 19, 2018 7:38 AM > > > > To: dev > > > > Subject: Re: [VOTE] [REMINDER] KIP-383 Pluggable interface for SSL > > > Factory > > > > > > > > Hi Clement, > > > > > > > > You should start a separate thread for the vote, i.e., one with a > > > > subject > > > > of [VOTE] KIP-383 ... > > > > > > > > Looks like you haven't done this? > > >
[DISCUSS] KIP-405: Kafka Tiered Storage
Hi All, We are interested in adding tiered storage to Kafka. More details about motivation and design are in the KIP. We are working towards an initial POC. Any feedback or questions on this KIP are welcome. Thanks, Harsha
Re: [DISCUSS] KIP-405: Kafka Tiered Storage
Hi Eric, Thanks for your questions. Answers are in-line.

"The high-level design seems to indicate that all of the logic for when and how to copy log segments to remote storage lives in the RLM class. The default implementation is then HDFS specific with additional implementations being left to the community. This seems like it would require anyone implementing a new RLM to also re-implement the logic for when to ship data to remote storage."

RLM will be responsible for shipping log segments, and it will decide when a log segment is ready to be shipped over. Once log segments are identified as rolled over, RLM will delegate this responsibility to a pluggable remote storage implementation. Users who want to add their own implementation to enable other storages only need to implement the copy and read mechanisms, not re-implement RLM itself.

"Would it not be better for the Remote Log Manager implementation to be non-configurable, and instead have an interface for the remote storage layer? That way the "when" of the logic is consistent across all implementations and it's only a matter of "how," similar to how the Streams StateStores are managed."

It's possible that we can make RLM non-configurable. But for the initial release, and to keep backward compatibility, we want to make this configurable; any users who are not interested in having the log segments shipped to remote storage don't need to worry about this.

Hi Ryanne, Thanks for your questions.

"How could this be used to leverage fast key-value stores, e.g. Couchbase, which can serve individual records but maybe not entire segments? Or is the idea to only support writing and fetching entire segments? Would it make sense to support both?"

Log segments, once rolled over, are immutable objects, and we want to keep the current structure of log segments and their corresponding index files. It will be easier to copy the whole segment as it is, instead of re-reading each file and using a key/value store.

" - Instead of defining a new interface and/or mechanism to ETL segment files from brokers to cold storage, can we just leverage Kafka itself? In particular, we can already ETL records to HDFS via Kafka Connect, Gobblin etc -- we really just need a way for brokers to read these records back. I'm wondering whether the new API could be limited to the fetch, and then existing ETL pipelines could be more easily leveraged. For example, if you already have an ETL pipeline from Kafka to HDFS, you could leave that in place and just tell Kafka how to read these records/segments from cold storage when necessary."

This is pretty much what everyone does today, and it has the additional overhead of keeping those pipelines operating and monitored. What's proposed in the KIP is not ETL. It just looks at the logs that are written and rolled over and copies the files as they are. With a traditional ETL pipeline, each new topic needs to be onboarded (sure, we can do so via a wildcard or another mechanism) to ship its data into remote storage. Once the data lands somewhere like HDFS/Hive etc., users need to write another processing pipeline to re-process this data, similar to how they do it in their stream processing pipelines. Tiered storage is meant to get away from this and make it transparent to the user. They don't need to run another ETL process to ship the logs.

"I'm wondering if we could just add support for loading segments from remote URIs instead of from file, i.e. via plugins for s3://, hdfs:// etc. I suspect less broker logic would change in that case -- the broker wouldn't necessarily care if it reads from file:// or s3:// to load a given segment."

Yes, this is what we are discussing in the KIP. We are leaving the details of loading segments to the RLM read path instead of directly exposing this in the broker. This way we can keep the current Kafka code as it is, without changing the assumptions around the local disk. Let the RLM handle the remote storage part.

Thanks, Harsha

On Mon, Feb 4, 2019, at 12:54 PM, Ryanne Dolan wrote: > Harsha, Sriharsha, Suresh, a couple thoughts: > > - How could this be used to leverage fast key-value stores, e.g. Couchbase, > which can serve individual records but maybe not entire segments? Or is the > idea to only support writing and fetching entire segments? Would it make > sense to support both? > > - Instead of defining a new interface and/or mechanism to ETL segment files > from brokers to cold storage, can we just leverage Kafka itself? In > particular, we can already ETL records to HDFS via Kafka Connect, Gobblin > etc -- we really just need a way for brokers to read these records back. > I'm wondering whether the new API could be limited to the fetch, and then > existing ETL pipelines co
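To make the "copy and read mechanisms" concrete, one hypothetical shape for the storage plugin boundary sketched in this thread; the method names and types are illustrative guesses, not the KIP's API:

    import java.io.File;
    import java.io.IOException;
    import org.apache.kafka.common.TopicPartition;
    import org.apache.kafka.common.record.Records;

    // Hypothetical plugin boundary: the RLM decides *when* a rolled-over
    // segment ships; the plugin only knows *how* to copy, read back, and
    // delete whole immutable segments together with their index files.
    public interface RemoteSegmentStorage {

        /** Copy a rolled-over segment and its indexes; returns an id for reads. */
        String copySegment(TopicPartition tp, File segment, File offsetIndex,
                           File timeIndex) throws IOException;

        /** Serve a fetch from a remote segment starting at the given offset. */
        Records read(String remoteSegmentId, long startOffset, int maxBytes)
            throws IOException;

        /** Remove a remote segment once retention has expired. */
        void delete(String remoteSegmentId) throws IOException;
    }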
Re: [DISCUSS] KIP-405: Kafka Tiered Storage
Thanks Eno, Adam & Satish for you review and questions. I'll address these in KIP and update the thread here. Thanks, Harsha On Wed, Feb 6, 2019, at 7:09 AM, Satish Duggana wrote: > Thanks, Harsha for the KIP. It is a good start for tiered storage in > Kafka. I have a few comments/questions. > > It may be good to have a configuration to keep the number of local > segments instead of keeping only the active segment. This config can > be exposed at cluster and topic levels with default value as 1. In > some use cases, few consumers may lag over one segment, it will be > better to serve from local storage instead of remote storage. > > It may be better to keep “remote.log.storage.enable” and respective > configuration at topic level along with cluster level. It will be > helpful in environments where few topics are configured with > local-storage and other topics are configured with remote storage. > > Each topic-partition leader pushes its log segments with respective > index files to remote whenever active log rolls over, it updates the > remote log index file for the respective remote log segment. The > second option is to add offset index files also for each segment. It > can serve consumer fetch requests for old segments from local log > segment instead of serving directly from the remote log which may > cause high latencies. There can be different strategies in when the > remote segment is copied to a local segment. > > What is “remote.log.manager.scheduler.interval.ms” config about? > > How do followers sync RemoteLogSegmentIndex files? Do they request > from leader replica? This looks to be important as the failed over > leader should have RemoteLogSegmentIndex updated and ready to avoid > high latencies in serving old data stored in remote logs. > > Thanks, > Satish. > > On Tue, Feb 5, 2019 at 10:42 PM Ryanne Dolan wrote: > > > > Thanks Harsha, makes sense. > > > > Ryanne > > > > On Mon, Feb 4, 2019 at 5:53 PM Harsha Chintalapani wrote: > > > > > "I think you are saying that this enables additional (potentially cheaper) > > > storage options without *requiring* an existing ETL pipeline. “ > > > Yes. > > > > > > " But it's not really a replacement for the sort of pipelines people build > > > with Connect, Gobblin etc.” > > > > > > It is not. But also making an assumption that everyone runs these > > > pipelines for storing raw Kafka data into HDFS or S3 is also wrong > > > assumption. > > > The aim of this KIP is to provide tiered storage as whole package not > > > asking users to ship the data on their own using existing ETL, which means > > > running a consumer and maintaining those pipelines. > > > > > > " My point was that, if you are already offloading records in an ETL > > > pipeline, why do you need a new pipeline built into the broker to ship the > > > same data to the same place?” > > > > > > As you said its ETL pipeline, which means users of these pipelines are > > > reading the data from broker and transforming its state and storing it > > > somewhere. > > > The point of this KIP is store log segments as it is without changing > > > their structure so that we can use the existing offset mechanisms to look > > > it up when the consumer needs to read old data. When you do load it via > > > your existing pipelines you are reading the topic as a whole , which > > > doesn’t guarantee that you’ll produce this data back into HDFS in S3 in > > > the > > > same order and who is going to generate the Index files again. 
> > > > > > > > > "So you'd end up with one of 1)cold segments are only useful to Kafka; 2) > > > you have the same data written to HDFS/etc twice, once for Kafka and once > > > for everything else, in two separate formats” > > > > > > You are talking two different use cases. If someone is storing raw data > > > out of Kafka for long term access. > > > By storing the data as it is in HDFS though Kafka will solve this issue. > > > They do not need to run another pipe-line to ship these logs. > > > > > > If they are running pipelines to store in HDFS in a different format, > > > thats a different use case. May be they are transforming Kafka logs to ORC > > > so that they can query through Hive. Once you transform the log segment > > > it > > > does loose its ability to use the existing offset index. > > > Main objective here not to change the existing protocol and still be able
Re: [VOTE] KIP-412: Extend Admin API to support dynamic application log levels
+1 (binding). Thanks, Harsha On Tue, Feb 19, 2019, at 7:53 AM, Andrew Schofield wrote: > Thanks for the KIP. > > +1 (non-binding) > > On 18/02/2019, 12:48, "Stanislav Kozlovski" wrote: > > Hey everybody, I'm starting a VOTE thread for KIP-412. This feature should > significantly improve the flexibility and ease in debugging Kafka in run > time > > KIP-412 - > > https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcwiki.apache.org%2Fconfluence%2Fdisplay%2FKAFKA%2FKIP-412%253A%2BExtend%2BAdmin%2BAPI%2Bto%2Bsupport%2Bdynamic%2Bapplication%2Blog%2Blevels&data=02%7C01%7C%7C69bc63a9d7864e25ec3c08d69596eec4%7C84df9e7fe9f640afb435%7C1%7C0%7C636860872825557120&sdata=XAnMhy6EPC7JkB77NBBhLR%2FvE7XrTutuS5Rlt%2FDpwfU%3D&reserved=0 > > > -- > Best, > Stanislav > > >
Re: [VOTE] KIP-430 - Return Authorized Operations in Describe Responses
+1 (binding). Thanks, Harsha On Thu, Feb 21, 2019, at 2:49 AM, Satish Duggana wrote: > Thanks for the KIP, +1 (non-binding) > > ~Satish. > > On Thu, Feb 21, 2019 at 3:58 PM Rajini Sivaram > wrote: > > > > I would like to start vote on KIP-430 to optionally obtain authorized > > operations when describing resources: > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-430+-+Return+Authorized+Operations+in+Describe+Responses > > > > Thank you, > > > > Rajini >
Re: [DISCUSS] KIP-236 Interruptible Partition Reassignment
Thanks George. LGTM. Jun & Tom, Can you please take a look at the updated KIP. Thanks, Harsha On Wed, Feb 20, 2019, at 12:18 PM, George Li wrote: > Hi, > > After discussing with Tom, Harsha and I are picking up KIP-236 > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-236%3A+Interruptible+Partition+Reassignment>. > The work focused on safely/cleanly cancel / rollback pending reassignments > in a timely fashion. Pull Request #6296 > <https://github.com/apache/kafka/pull/6296> Still working on more > integration/system tests. > > Please review and provide feedbacks/suggestions. > > Thanks, > George > > > On Saturday, December 23, 2017, 0:51:13 GMT, Jun Rao > wrote: > > Hi, Tom, Thanks for the reply. 10. That's a good thought. Perhaps it's better to get rid of /admin/reassignment_requests too. The window when a controller is not available is small. So, we can just failed the admin client if the controller is not reachable after the timeout. 13. With the changes in 10, the old approach is handled through ZK callback and the new approach is through Kafka RPC. The ordering between the two is kind of arbitrary. Perhaps the ordering can just be based on the order that the reassignment is added to the controller request queue. From there, we can either do the overriding or the prevention. Jun On Fri, Dec 22, 2017 at 7:31 AM, Tom Bentley wrote: > Hi Jun, > > Thanks for responding, my replies are inline: > > 10. You explanation makes sense. My remaining concern is the additional ZK > > writes in the proposal. With the proposal, we will need to do following > > writes in ZK. > > > > a. write new assignment in /admin/reassignment_requests > > > > b. write new assignment and additional metadata in > > /admin/reassignments/$topic/$partition > > > > c. write old + new assignment in /brokers/topics/[topic] > > > > d. write new assignment in /brokers/topics/[topic] > > > > e. delete /admin/reassignments/$topic/$partition > > > > So, there are quite a few ZK writes. I am wondering if it's better to > > consolidate the info in /admin/reassignments/$topic/$partition into > > /brokers/topics/[topic]. > > For example, we can just add some new JSON fields in > > /brokers/topics/[topic] > > to remember the new assignment and potentially the original replica count > > when doing step c. Those fields with then be removed in step d. That way, > > we can get rid of step b and e, saving 2 ZK writes per partition. > > > > This seems like a great idea to me. > > It might also be possible to get rid of the /admin/reassignment_requests > subtree too. I've not yet published the ideas I have for the AdminClient > API for reassigning partitions, but given the existence of such an API, the > route to starting a reassignment would be the AdminClient, and not > zookeeper. In that case there is no need for /admin/reassignment_requests > at all. The only drawback that I can see is that while it's currently > possible to trigger a reassignment even during a controller > election/failover that would no longer be the case if all requests had to > go via the controller. > > > > 11. What you described sounds good. We could potentially optimize the > > dropped replicas a bit more. Suppose that assignment [0,1,2] is first > > changed to [1,2,3] and then to [2,3,4]. When initiating the second > > assignment, we may end up dropping replica 3 and only to restart it > again. > > In this case, we could only drop a replica if it's not going to be added > > back again. > > > > I had missed that, thank you! 
I will update the proposed algorithm to > prevent this. > > > > 13. Since this is a corner case, we can either prevent or allow > overriding > > with old/new mechanisms. To me, it seems that allowing is simpler to > > implement, the order in /admin/reassignment_requests determines the > > ordering the of override, whether that's initiated by the new way or the > > old way. > > > > That makes sense except for the corner case where: > > * There is no current controller and > * Writes to both the new and old znodes happen > > On election of the new controller, for those partitions with both a > reassignment_request and in /admin/reassign_partitions, we have to decide > which should win. You could use the modification time, though there are > some very unlikely scenarios where that doesn't work properly, for example > if both znodes have the same mtime, or the /admin/reassign_partitions was > updated, but the assignment of the partition wasn't changed, like this: > > 0. /admin/reassign
Re: [DISCUSS] KIP-435: Incremental Partition Reassignment
Hi Viktor, There is already KIP-236 for the same feature, and George made a PR for this as well. Let's consolidate these two discussions. If you have any cases that are not solved by KIP-236, can you please mention them in that thread? We can address them as part of KIP-236. Thanks, Harsha On Fri, Feb 22, 2019, at 5:44 AM, Viktor Somogyi-Vass wrote: > Hi Folks, > > I've created a KIP about an improvement of the reassignment algorithm we > have. It aims to enable partition-wise incremental reassignment. The > motivation for this is to avoid excess load that the current replication > algorithm implicitly carries as in that case there are points in the > algorithm where both the new and old replica set could be online and > replicating which puts double (or almost double) pressure on the brokers > which could cause problems. > Instead my proposal would slice this up into several steps where each step > is calculated based on the final target replicas and the current replica > assignment taking into account scenarios where brokers could be offline and > when there are not enough replicas to fulfil the min.insync.replica > requirement. > > The link to the KIP: > https://cwiki.apache.org/confluence/display/KAFKA/KIP-435%3A+Incremental+Partition+Reassignment > > I'd be happy to receive any feedback. > > An important note is that this KIP and another one, KIP-236 that is > about > interruptible reassignment ( > https://cwiki.apache.org/confluence/display/KAFKA/KIP-236%3A+Interruptible+Partition+Reassignment) > should be compatible. > > Thanks, > Viktor >
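As a rough illustration of the step-wise idea common to both KIPs, each step could move the assignment one replica at a time toward the target set. A toy sketch only; it assumes the replica added in a step is already in sync before the drop, and it ignores offline brokers and min.insync.replicas, which the real algorithm must handle:

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    public final class IncrementalStep {
        // Derive the next bounded step from the current and target replica
        // lists: add at most one new replica and drop at most one old one,
        // so the extra replication load per step stays constant.
        static List<Integer> nextStep(List<Integer> current, List<Integer> target) {
            List<Integer> next = new ArrayList<>(current);
            for (int broker : target) {           // add one missing target replica
                if (!next.contains(broker)) {
                    next.add(broker);
                    break;
                }
            }
            for (int broker : current) {          // drop one replica not in target
                if (!target.contains(broker) && next.size() > target.size()) {
                    next.remove(Integer.valueOf(broker));
                    break;
                }
            }
            return next;
        }

        public static void main(String[] args) {
            List<Integer> step = Arrays.asList(0, 1, 2);
            List<Integer> target = Arrays.asList(3, 4, 5);
            while (!step.equals(target)) {
                step = nextStep(step, target);
                System.out.println(step); // [1, 2, 3], then [2, 3, 4], then [3, 4, 5]
            }
        }
    }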
Re: [DISCUSS] KIP-433: Provide client API version to authorizer
I think the motivation of the KIP is to configure which APIs we want to allow for a broker. This is challenging for a hosted service where you have customers with different versions of clients. It's not just about down-conversion. Take transactions, for example: there is a case where we do not want to allow users to start using transactions, and there is no way to disable this right now; nor, as specified in the KIP, is there a way to lock down which client versions we support. The Authorizer's original purpose is to allow policies to be enforced for each of the Kafka APIs, specifically in the context of security. Extending it into a general-purpose gatekeeper might not be suitable, and as mentioned in the thread, every implementation of the Authorizer would need to be re-implemented to provide the same set of functionality. I think it's better to add an implementation which uses a broker's dynamic config, as mentioned in approach 1. Thanks, Harsha On Sat, Feb 23, 2019, at 6:21 AM, Ismael Juma wrote: > Thanks for the KIP. Have we considered the existing topic config that makes > it possible to disallow down conversions? That's the biggest downside in > allowing older clients. > > Ismael > > On Fri, Feb 22, 2019, 2:11 PM Ying Zheng wrote: > > > > >
Re: [DISCUSS] KIP-433: Provide client API version to authorizer
Hi Ying, I think the question is: can we add a module in the core which can take up the dynamic config and block certain APIs? This module will be called in each of the APIs, like the authorizer is today, to check if the API is supported for the client. Instead of throwing AuthorizationException like the authorizer does today, it can throw UnsupportedException. The benefits are that we keep the authorizer interface as-is, add flexibility based on dynamic configs without the need for categorizing broker APIs, and it will be easy to extend with additional options, like turning off certain features which might be of interest to the service providers. One drawback: it will introduce another call to check, instead of centralizing everything around the Authorizer. Thanks, Harsha On Mon, Feb 25, 2019, at 2:43 PM, Ying Zheng wrote: > If you guys don't like the extension of the authorizer interface, I will just > propose a single broker dynamic configuration: client.min.api.version, to > keep things simple. > > What do you think? > > On Mon, Feb 25, 2019 at 2:23 PM Ying Zheng wrote: > > > @Viktor Somogyi-Vass, @Harsha, It seems the biggest concern is the > > backward-compatibility to the existing authorizers. We can put the new > > method into a new trait / interface: > > trait AuthorizerEx extends Authorizer { > >def authorize(session: Session, operation: Operation, resource: Resource, > > apiVersion: Short): Boolean > > } > > > > When loading an authorizer class, the broker will check if the class > > implements the AuthorizerEx interface. If not, the broker will wrap the > > Authorizer object with an Adapter class, in which the authorize(... > > apiVersion) call is translated to the old authorize() call. That way, both > > old and new Authorizers are supported and can be treated as AuthorizerEx in > > the new broker code. > > > > As for the broker dynamic configuration approach, I'm not sure how to > > correctly categorize the 40+ broker APIs into a few categories. > > For example, describe is used by producer, consumer, and admin. Should it > > be controlled by producer.min.api.version or consumer.min.api.version? > > Should producer.min.api.version apply to transaction operations? > > > > > > On Mon, Feb 25, 2019 at 10:33 AM Harsha wrote: > > > >> I think the motivation of the KIP is to configure which APIs we want to > >> allow for a broker. > >> This is challenging for a hosted service where you have customers with > >> different versions of clients. > >> It's not just about down-conversion; take transactions, for example: there > >> is a case where we do not want to allow users to start using transactions, > >> and there is no way to disable this right now. As specified in the > >> KIP, we want a lock on which client versions we support. > >> The Authorizer's original purpose is to allow policies to be enforced for > >> each of the Kafka APIs, specifically in the context of security. > >> Extending this to a general-purpose gatekeeper might not be suitable, and > >> as mentioned in the thread, every implementation of the authorizer would need to > >> re-implement the same set of functionality. > >> I think it's better to add an implementation which will use a broker's > >> dynamic config as mentioned in approach 1. > >> > >> Thanks, > >> Harsha > >> > >> On Sat, Feb 23, 2019, at 6:21 AM, Ismael Juma wrote: > >> > Thanks for the KIP. Have we considered the existing topic config that > >> makes > >> > it possible to disallow down conversions? That's the biggest downside in > >> > allowing older clients.
> >> > > >> > Ismael > >> > > >> > On Fri, Feb 22, 2019, 2:11 PM Ying Zheng > >> wrote: > >> > > >> > > > >> > > > >> > > >> > > >
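For reference, the adapter Ying sketches above could look roughly like this (AuthorizerEx is the trait proposed in the quoted email; the Session/Operation/Resource types are assumed to be the existing kafka.security.auth ones, and this is only an illustration):

  import kafka.network.RequestChannel.Session
  import kafka.security.auth.{Authorizer, Operation, Resource}

  // Wrap a legacy Authorizer so the broker can treat it uniformly as an
  // AuthorizerEx; the apiVersion argument is simply ignored for old plugins.
  class AuthorizerAdapter(legacy: Authorizer) extends AuthorizerEx {
    override def authorize(session: Session, operation: Operation,
                           resource: Resource, apiVersion: Short): Boolean =
      legacy.authorize(session, operation, resource)
  }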
Re: [DISCUSS] KIP-433: Provide client API version to authorizer
Hi Colin, "> I think Ismael and Gwen here bring up a good point. The version of the > request is a technical detail that isn't really related to > authorization. There are a lot of other technical details like this > like the size of the request, the protocol it came in on, etc. None of > them are passed to the authorizer-- they all have configuration knobs > to control how we handle them. If we add this technical detail, > logically we'll have to start adding all the others, and the authorizer > API will get really bloated. It's better to keep it focused on > authorization, I think." Probably my previous email was not clear, but I am agreeing with Gwen's point. I am not in favor of extending the authorizer to support this. "> Another thing to consider is that if we add a new broker configuration > that lets us set a minimum client version which is allowed, that could > be useful to other users as well. On the other hand, most users are > not likely to write a custom authorizer to try to take advantage of > version information being passed to the authorizer. So, I think using > a configuration is clearly the better way to go here. Perhaps it can > be a KIP-226 dynamic configuration to make this easier to deploy?" Although a minimum client version might help to a certain extent, there are other cases, for example where we do not want users to start using transactions. My proposal in the previous thread was to introduce another module/interface, let's say "SupportedAPIs", which will take in dynamic configuration to check which APIs are allowed. It can throw UnsupportedException just like we are throwing AuthorizationException. Thanks, Harsha On Tue, Feb 26, 2019, at 10:04 AM, Colin McCabe wrote: > Hi Harsha, > > I think Ismael and Gwen here bring up a good point. The version of the > request is a technical detail that isn't really related to > authorization. There are a lot of other technical details like this > like the size of the request, the protocol it came in on, etc. None of > them are passed to the authorizer-- they all have configuration knobs > to control how we handle them. If we add this technical detail, > logically we'll have to start adding all the others, and the authorizer > API will get really bloated. It's better to keep it focused on > authorization, I think. > > Another thing to consider is that if we add a new broker configuration > that lets us set a minimum client version which is allowed, that could > be useful to other users as well. On the other hand, most users are > not likely to write a custom authorizer to try to take advantage of > version information being passed to the authorizer. So, I think using > a configuration is clearly the better way to go here. Perhaps it can > be a KIP-226 dynamic configuration to make this easier to deploy? > > cheers, > Colin > > > On Mon, Feb 25, 2019, at 15:43, Harsha wrote: > > Hi Ying, > > I think the question is: can we add a module in the core which > > can take up the dynamic config and block certain APIs? This > > module will be called in each of the APIs, like the authorizer is > > today, to check if the API is supported for the client. > > Instead of throwing AuthorizationException like the authorizer does > > today, it can throw UnsupportedException.
> > Benefits are, we are keeping the authorizer interface as is and adding > > the flexibility based on dynamic configs without the need for > > categorizing broker APIs and it will be easy to extend to do additional > > options, like turning off certain features which might be in interest > > to the service providers. > > One drawback, It will introduce another call to check instead of > > centralizing everything around Authorizer. > > > > Thanks, > > Harsha > > > > On Mon, Feb 25, 2019, at 2:43 PM, Ying Zheng wrote: > > > If you guys don't like the extension of authorizer interface, I will just > > > propose a single broker dynamic configuration: client.min.api.version, to > > > keep things simple. > > > > > > What do you think? > > > > > > On Mon, Feb 25, 2019 at 2:23 PM Ying Zheng wrote: > > > > > > > @Viktor Somogyi-Vass, @Harsha, It seems the biggest concern is the > > > > backward-compatibility to the existing authorizers. We can put the new > > > > method into a new trait / interface: > > > > trait AuthorizerEx extends Authorizer { > > > >def authorize(session: Session, operation: Operation, resource: > > > > Resource, > > >
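The module Harsha describes could be sketched roughly as follows (entirely hypothetical; the class name and config wiring are illustrative, not an agreed design):

  import org.apache.kafka.common.errors.UnsupportedVersionException
  import org.apache.kafka.common.protocol.ApiKeys

  // A per-API version gate driven by dynamic config, called alongside the
  // authorizer on every request; throws instead of returning false.
  class SupportedApis(initialFloors: Map[ApiKeys, Short]) {
    @volatile private var floors: Map[ApiKeys, Short] = initialFloors

    // dynamic-config hook: swap in new minimum versions without a restart
    def reconfigure(newFloors: Map[ApiKeys, Short]): Unit = floors = newFloors

    def check(apiKey: ApiKeys, version: Short): Unit = {
      val floor = floors.getOrElse(apiKey, 0.toShort)
      if (version < floor)
        throw new UnsupportedVersionException(
          s"$apiKey version $version is below the configured minimum $floor")
    }
  }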
Re: [DISCUSS] KIP-426: Persist Broker Id to Zookeeper
Thanks for the KIP Kan. I think the design will be simpler if we just deprecate storing broker.id in meta.properties and start storing it in zookeeper as you suggested. Thanks, Harsha On Tue, Feb 5, 2019, at 2:40 PM, Li Kan wrote: > My bad, forgot to put the link to the KIP: > https://cwiki.apache.org/confluence/display/KAFKA/KIP-426%3A+Persist+Broker+Id+to+Zookeeper > > On Tue, Feb 5, 2019 at 2:38 PM Li Kan wrote: > > > Hi, I have KIP-426, which is a small change on automatically determining > > broker id when starting up. I am new to Kafka so there are a bunch of > > design trade-offs that I might be missing or hard to decide, so I'd like to > > get some suggestions on it. I'd expect (and open) to modify (or even > > totally rewrite) the KIP based on suggestions. Thanks. > > > > -- > > Best, > > Kan > > > > > -- > Best, > Kan >
Re: [DISCUSS] KIP-426: Persist Broker Id to Zookeeper
Hi Colin, What we want is to preserve the broker.id so that we can do an offline rebuild of a broker. In our case, bringing up a failed node through online Kafka replication will put producer latencies at risk, given the new broker will keep all the other leaders busy with its replication requests. For an offline rebuild, we do not need to do a rebalance as long as we can recover the broker.id. Overall, irrespective of this use case, we still want the ability to retrieve a broker.id for an existing host. This will make it easier to swap in new hosts for failed hosts while keeping the existing hostname. Thanks, Harsha On Wed, Feb 27, 2019, at 11:53 AM, Colin McCabe wrote: > Hi Li, > > > The mechanism simplifies deployment because the same configuration can be > > used across all brokers, however, in a large system where disk failure is > > a norm, the meta file could often get lost, causing a new broker id being > > allocated. This is problematic because new broker id has no partition > > assigned to it so it can't do anything, while partitions assigned to the > > old one lose one replica > > If all of the disks have failed, then the partitions will lose their > replicas no matter what, right? If any of the disks is still around, > then there will be a meta file on the disk which contains the previous > broker ID. So I'm not sure that we need to change anything here. > > best, > Colin > > > On Tue, Feb 5, 2019, at 14:38, Li Kan wrote: > > Hi, I have KIP-426, which is a small change on automatically determining > > broker id when starting up. I am new to Kafka so there are a bunch of > > design trade-offs that I might be missing or hard to decide, so I'd like to > > get some suggestions on it. I'd expect (and open) to modify (or even > > totally rewrite) the KIP based on suggestions. Thanks. > > > > -- > > Best, > > Kan > > >
Re: [DISCUSS] KIP-433: Provide client API version to authorizer
Hi Colin, I overlooked the IDEMPOTENT_WRITE ACL. This along with client.min.version should solve the cases proposed in the KIP. Can we turn this KIP into adding a min.client.version config to the broker? It could be part of the dynamic config. Thanks, Harsha On Wed, Feb 27, 2019, at 12:17 PM, Colin McCabe wrote: > On Tue, Feb 26, 2019, at 16:33, Harsha wrote: > > Hi Colin, > > > > "> I think Ismael and Gwen here bring up a good point. The version of the > > > request is a technical detail that isn't really related to > > > authorization. There are a lot of other technical details like this > > > like the size of the request, the protocol it came in on, etc. None of > > > them are passed to the authorizer-- they all have configuration knobs > > > to control how we handle them. If we add this technical detail, > > > logically we'll have to start adding all the others, and the authorizer > > > API will get really bloated. It's better to keep it focused on > > > authorization, I think." > > > > Probably my previous email was not clear, but I am agreeing with Gwen's > > point. > > I am not in favor of extending the authorizer to support this. > > > > > > "> Another thing to consider is that if we add a new broker configuration > > > that lets us set a minimum client version which is allowed, that could > > > be useful to other users as well. On the other hand, most users are > > > not likely to write a custom authorizer to try to take advantage of > > > version information being passed to the authorizer. So, I think using > a > > > configuration is clearly the better way to go here. Perhaps it can > > > be a KIP-226 dynamic configuration to make this easier to deploy?" > > > > Although a minimum client version might help to a certain extent, there > > are other cases, for example where we do not want users to start using > > transactions. My proposal in the previous thread was to introduce another > > module/interface, let's say > > "SupportedAPIs", which will take in dynamic configuration to check which > > APIs are allowed. > > It can throw UnsupportedException just like we are throwing > > AuthorizationException. > > Hi Harsha, > > We can already prevent people from using transactions using ACLs, > right? That's what the IDEMPOTENT_WRITE ACL was added for. > > In general, I think users should be able to think of ACLs in terms of > "what can I do" rather than "how is it implemented." For example, > maybe some day we will replace FetchRequest with GetStuffRequest. But > users who have READ permission on a topic shouldn't have to change > anything. So I think the Authorizer interface should not be aware of > individual RPC types or message versions. > > best, > Colin > > > > > > > > Thanks, > > Harsha > > > > > > On Tue, Feb 26, 2019, at 10:04 AM, Colin McCabe wrote: > > > Hi Harsha, > > > > > > I think Ismael and Gwen here bring up a good point. The version of the > > > request is a technical detail that isn't really related to > > > authorization. There are a lot of other technical details like this > > > like the size of the request, the protocol it came in on, etc. None of > > > them are passed to the authorizer-- they all have configuration knobs > > > to control how we handle them. If we add this technical detail, > > > logically we'll have to start adding all the others, and the authorizer > > > API will get really bloated. It's better to keep it focused on > > > authorization, I think.
> > > > > > Another thing to consider is that if we add a new broker configuration > > > that lets us set a minimum client version which is allowed, that could > > > be useful to other users as well. On the other hand, most users are > > > not likely to write a custom authorizer to try to take advantage of > > > version information being passed to the authorizer. So, I think using > > > a configuration is clearly the better way to go here. Perhaps it can > > > be a KIP-226 dynamic configuration to make this easier to deploy? > > > > > > cheers, > > > Colin > > > > > > > > > On Mon, Feb 25, 2019, at 15:43, Harsha wrote: > > > > Hi Ying, > > > > I think the question is can we add a module in the core which > > > > can take up the
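For reference, the IDEMPOTENT_WRITE ACL Colin mentions (and the related TransactionalId write ACL) can be granted with the stock ACL tool; the principal and transactional id below are just examples:

  bin/kafka-acls.sh --authorizer-properties zookeeper.connect=localhost:2181 \
    --add --allow-principal User:app1 --operation IdempotentWrite --cluster
  bin/kafka-acls.sh --authorizer-properties zookeeper.connect=localhost:2181 \
    --add --allow-principal User:app1 --operation Write --transactional-id app1-txn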
Re: [DISCUSS] KIP-427: Add AtMinIsr topic partition category (new metric & TopicCommand option)
Hi Dong, I think AtMinIsr is still valuable to indicate the cluster is at a critical state and something needs to be done ASAP to restore it. To your example " let's say min_isr = 1 and replica_set_size = 3, it is > still possible that planned maintenance (e.g. one broker restart + > partition reassignment) can cause the ISR size to drop to 1. Since AtMinIsr can > also cause false positives (i.e. the fact that AtMinIsr > 0 does not > necessarily need attention from the user), " One broker restart shouldn't cause the ISR to drop to 1 from 3 unless 2 replicas are co-located on the same broker. This is still a valuable indicator to the admins that the partition assignment needs to be moved. In our case, we run 4 replicas for critical topics with min.isr = 2. URPs are not really a good indicator for taking immediate action if one of the replicas is down. If 2 replicas are down and we are at 2 alive replicas, it is a stop-everything situation to restore the cluster to a good state. Thanks, Harsha On Wed, Feb 27, 2019, at 11:17 PM, Dong Lin wrote: > Hey Kevin, > > Thanks for the update. > > The KIP suggests that AtMinIsr is better than UnderReplicatedPartition as > an indicator for alerting. However, in most cases where min_isr = > replica_set_size - 1, these two metrics are exactly the same, where planned > maintenance can easily cause a positive AtMinIsr value. In the other > scenario, for example let's say min_isr = 1 and replica_set_size = 3, it is > still possible that planned maintenance (e.g. one broker restart + > partition reassignment) can cause the ISR size to drop to 1. Since AtMinIsr can > also cause false positives (i.e. the fact that AtMinIsr > 0 does not > necessarily need attention from the user), I am not sure it is worth adding > this metric. > > In the Usage section, it is mentioned that the user needs to manually check > whether there is ongoing maintenance after AtMinIsr is triggered. Could you > explain how this is different from the current way where we use > UnderReplicatedPartition to trigger alerts? More specifically, can we just > replace AtMinIsr with UnderReplicatedPartition in the Usage section? > > Thanks, > Dong > > > On Tue, Feb 26, 2019 at 6:49 PM Kevin Lu wrote: > > > Hi Dong! > > > > Thanks for the feedback! > > > > You bring up a good point in that the AtMinIsr metric cannot be used to > > identify failure in the mentioned scenarios. I admit the motivation section > > placed too much emphasis on "identifying failure". > > > > I have modified the KIP to reflect the implementation, as the AtMinIsr > > metric is intended to serve as a warning: one more failure to a partition > > AtMinIsr will cause producers configured with acks=ALL to fail. It has an > > additional benefit when minIsr=1 as it will warn us that the entire > > partition is at risk of going offline, but that is more of a side effect > > that only applies in that scenario (minIsr=1). > > > > Regards, > > Kevin > > > > On Tue, Feb 26, 2019 at 5:11 PM Dong Lin wrote: > > > > > Hey Kevin, > > > > > > Thanks for the proposal! > > > > > > It seems that the proposed implementation does not match the motivation. > > > The motivation suggests that the operator wants to tell the planned > > > maintenance (e.g. broker restart) from unplanned failure (e.g. network > > > failure). But the use of the metric AtMinIsr does not really > > differentiate > > > between these causes of the reduced number of ISR. For example, an > > > unplanned failure can cause ISR to drop from 3 to 2 but it can still be > > > higher than the minIsr (say 1).
And a planned maintenance can cause ISR > > to > > > drop from 3 to 2, which trigger the AtMinIsr metric if minIsr=2. Can you > > > update the design doc to fix or explain this issue? > > > > > > Thanks, > > > Dong > > > > > > On Tue, Feb 12, 2019 at 9:02 AM Kevin Lu wrote: > > > > > > > Hi All, > > > > > > > > Getting the discussion thread started for KIP-427 in case anyone is > > free > > > > right now. > > > > > > > > I’d like to propose a new category of topic partitions *AtMinIsr* which > > > are > > > > partitions that only have the minimum number of in sync replicas left > > in > > > > the ISR set (as configured by min.insync.replicas). > > > > > > > > This would add two new metrics *ReplicaManager.AtMinIsrPartitionCount > > *& > > > > *Partition.AtMinIsr*, and a new TopicCommand option* > > > > --at-min-isr-partitions* to help in monitoring and alerting. > > > > > > > > KIP link: KIP-427: Add AtMinIsr topic partition category (new metric & > > > > TopicCommand option) > > > > < > > > > > > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103089398 > > > > > > > > > > > > > Please take a look and let me know what you think. > > > > > > > > Regards, > > > > Kevin > > > > > > > > > >
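The semantics being debated are small enough to state in code; a toy sketch of the proposed category (field names are illustrative):

  // A partition is "AtMinIsr" when its ISR has shrunk exactly to
  // min.insync.replicas, i.e. one more failure breaks acks=all producers;
  // it is under-replicated whenever the ISR is smaller than the replica set.
  final case class PartitionHealth(replicas: Int, isr: Int, minIsr: Int) {
    def underReplicated: Boolean = isr < replicas
    def atMinIsr: Boolean = isr == minIsr
  }

For Dong's example, PartitionHealth(replicas = 3, isr = 2, minIsr = 1) is under-replicated but not AtMinIsr, while PartitionHealth(replicas = 3, isr = 2, minIsr = 2) is both, which is exactly the min_isr = replica_set_size - 1 overlap he points out.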
Re: [DISCUSS] KIP-426: Persist Broker Id to Zookeeper
Hi, Cluster management tools are more generic and they are not aware of Kafka-specific configs like broker.id. Even if they are aware of broker.ids, they will be lost when a disk is lost. Irrespective of these use cases, let's look at the problem in isolation. 1. Disks are the most common failure case in Kafka clusters. 2. We are storing the auto-generated broker.id on disk; hence we lose this broker.id mapping when disks fail. 3. If we keep the previously generated broker.id mapping along with the host in ZooKeeper, it's easier to retrieve that mapping on a new host. This would reduce the reassignment step and allow us to just copy the data and start the new node with the previous broker.id, which is what the KIP is proposing. I want to understand what your concerns are in moving this mapping, which already exists on disk, to ZooKeeper. Thanks, Harsha On Fri, Mar 1, 2019, at 11:11 AM, Colin McCabe wrote: > On Wed, Feb 27, 2019, at 14:12, Harsha wrote: > > Hi Colin, > > What we want is to preserve the broker.id so that we > > can do an offline rebuild of a broker. In our case, bringing up > > a failed node through online Kafka replication will put producer > > latencies at risk, given the new broker will keep all the other leaders > > busy with its replication requests. For an offline rebuild, we do not > > need to do a rebalance as long as we can recover the broker.id. > > Overall, irrespective of this use case, we still want the > > ability to retrieve a broker.id for an existing host. This will make > > it easier to swap in new hosts for failed hosts while keeping the > > existing hostname. > > Thanks for the explanation. Shouldn't this be handled by the > cluster management tool, though? Kafka doesn't include a mechanism for > re-creating nodes that failed. That's up to kubernetes, or ansible, or > whatever cluster provisioning framework you have in place. This feels > like the same kind of thing: managing how the cluster is provisioned. > > best, > Colin > > > > > Thanks, > > Harsha > > On Wed, Feb 27, 2019, at 11:53 AM, Colin McCabe wrote: > > > Hi Li, > > > > > > > The mechanism simplifies deployment because the same configuration can > > > be > > > > used across all brokers, however, in a large system where disk failure > > > is > > > > a norm, the meta file could often get lost, causing a new broker id > > > being > > > > allocated. This is problematic because new broker id has no partition > > > > assigned to it so it can't do anything, while partitions assigned to > > > the > > > > old one lose one replica > > > > > > If all of the disks have failed, then the partitions will lose their > > > replicas no matter what, right? If any of the disks is still around, > > > then there will be a meta file on the disk which contains the previous > > > broker ID. So I'm not sure that we need to change anything here. > > > > > > best, > > > Colin > > > > > > > > > On Tue, Feb 5, 2019, at 14:38, Li Kan wrote: > > > > Hi, I have KIP-426, which is a small change on automatically determining > > > > broker id when starting up. I am new to Kafka so there are a bunch of > > > > design trade-offs that I might be missing or hard to decide, so I'd > > > > like to > > > > get some suggestions on it. I'd expect (and open) to modify (or even > > > > totally rewrite) the KIP based on suggestions. Thanks. > > > > > > > > -- > > > > Best, > > > > Kan > > > > > > > > > >
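The mapping being proposed is small; a hypothetical sketch of the lookup using the ZooKeeper client (the znode path, and the idea of keying by hostname, are illustrative, not the KIP's final design):

  import org.apache.zookeeper.{CreateMode, ZooDefs, ZooKeeper}

  // Reuse the broker.id previously recorded for this hostname, or persist a
  // freshly generated one so a rebuilt host can reclaim its old identity.
  def resolveBrokerId(zk: ZooKeeper, hostname: String, generateId: => Int): Int = {
    val path = s"/brokerid-mapping/$hostname"
    if (zk.exists(path, false) != null) {
      new String(zk.getData(path, false, null)).toInt // previous id survives disk loss
    } else {
      val id = generateId // auto-generation as today
      zk.create(path, id.toString.getBytes, ZooDefs.Ids.OPEN_ACL_UNSAFE,
                CreateMode.PERSISTENT)
      id
    }
  }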
Re: [DISCUSS] KIP-426: Persist Broker Id to Zookeeper
Hi Eno, A control plane needs to do this today because Kafka doesn't provide such a mapping. I am not sure why we want every control plane to figure this out, rather than letting this mapping, which exists today in Kafka at the node level on disk, live at a global level in ZooKeeper. If we implement this, any control plane will be much simpler, and the different environments will not all need to understand and re-implement this broker.id mapping. I don't understand the duplication concern either; which control plane are we talking about? Irrespective of which control plane a user ends up using, I want to understand the concerns about a broker.id-to-host mapping being available in ZooKeeper. The broker.id belongs to Kafka, not to the control plane. Thanks, Harsha On Sat, Mar 2, 2019, at 3:50 AM, Eno Thereska wrote: > Hi Harsha, Li Kan, > > What Colin mentioned is what I see in practice as well (at AWS and our > clusters). A control plane management tool decides the mapping > hostname-broker ID and can change it as it sees fit as brokers fail and new > ones are brought in. That control plane usually already has a database of > sorts that keeps track of existing broker IDs. So this work would duplicate > what that control plane already does. It could also lead to extra work if > that control plane decides to do something different from what the mapping > in Zookeeper has. > > At a minimum I'd like to see the motivation expanded and a description of > how the current cluster is managed that Li Kan has in mind. > > Thanks > Eno > > On Sat, Mar 2, 2019 at 1:43 AM Harsha wrote: > > > Hi, > > Cluster management tools are more generic and they are not aware of > > Kafka-specific configs like broker.id. > > Even if they are aware of broker.ids, they will be lost when a disk is > > lost. > > Irrespective of these use cases, let's look at the problem in > > isolation. > > 1. Disks are the most common failure case in Kafka clusters. > > 2. We are storing the auto-generated broker.id on disk; hence we lose this > > broker.id mapping when disks fail. > > 3. If we keep the previously generated broker.id mapping along with the host > > in ZooKeeper, it's easier to retrieve that mapping on a new host. This would > > reduce the reassignment step and allow us to just copy the data and start > > the new node with the previous broker.id, > > which is what the KIP is proposing. > > I want to understand what your concerns are in moving this mapping, which > > already exists on disk, to ZooKeeper. > > > > Thanks, > > Harsha > > > > On Fri, Mar 1, 2019, at 11:11 AM, Colin McCabe wrote: > > > On Wed, Feb 27, 2019, at 14:12, Harsha wrote: > > > > Hi Colin, > > > > What we want is to preserve the broker.id so that we > > > > can do an offline rebuild of a broker. In our case, bringing up > > > > a failed node through online Kafka replication will put producer > > > > latencies at risk, given the new broker will keep all the other leaders > > > > busy with its replication requests. For an offline rebuild, we do not > > > > need to do a rebalance as long as we can recover the broker.id. > > > > Overall, irrespective of this use case, we still want the > > > > ability to retrieve a broker.id for an existing host. This will make > > > > it easier to swap in new hosts for failed hosts while keeping the > > > > existing hostname. > > > > > > Thanks for the explanation. Shouldn't this be handled by the > > > cluster management tool, though? Kafka doesn't include a mechanism for > > > re-creating nodes that failed.
That's up to kubernetes, or ansible, or > > > whatever cluster provisioning framework you have in place. This feels > > > like the same kind of thing: managing how the cluster is provisioned. > > > > > > best, > > > Colin > > > > > > > > > > > Thanks, > > > > Harsha > > > > On Wed, Feb 27, 2019, at 11:53 AM, Colin McCabe wrote: > > > > > Hi Li, > > > > > > > > > > > The mechanism simplifies deployment because the same > > configuration can be > > > > > > used across all brokers, however, in a large system where disk > > failure is > > > > > > a norm, the meta file could often get lost, causing a new broker > > id being > > > > > > allocated. This is problematic because new broker id has no > > partition > > > > > > assigned to it so it can’t do anything, while partitions assigne
Re: [VOTE] KIP-436 Add a metric indicating start time
+1 (binding) Thanks, Harsha On Fri, Mar 8, 2019, at 2:55 AM, Dongjin Lee wrote: > +1 (non binding) > > 2 bindings, 3 non-bindings until now. (Colin, Manikumar / Satish, Mickael, > Dongjin) > > On Fri, Mar 8, 2019 at 7:44 PM Mickael Maison > wrote: > > > +1 (non binding) > > Thanks > > > > On Fri, Mar 8, 2019 at 6:39 AM Satish Duggana > > wrote: > > > > > > Thanks for the KIP, > > > +1 (non-binding) > > > > > > ~Satish. > > > > > > On Thu, Mar 7, 2019 at 11:58 PM Manikumar > > wrote: > > > > > > > +1 (binding). > > > > > > > > Thanks for the KIP. > > > > > > > > Thanks, > > > > Manikumar > > > > > > > > > > > > On Thu, Mar 7, 2019 at 11:52 PM Colin McCabe > > wrote: > > > > > > > > > +1 (binding). > > > > > > > > > > Thanks, Stanislav. > > > > > > > > > > best, > > > > > Colin > > > > > > > > > > On Tue, Mar 5, 2019, at 05:23, Stanislav Kozlovski wrote: > > > > > > Hey everybody, > > > > > > > > > > > > I'd like to start a vote thread about the lightweight KIP-436 > > > > > > KIP: KIP-436 > > > > > > < > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-436%3A+Add+a+metric+indicating+start+time > > > > > > > > > > > > JIRA: KAFKA-7992 <https://issues.apache.org/jira/browse/KAFKA-7992 > > > > > > > > > Pull Request: 6318 <https://github.com/apache/kafka/pull/6318> > > > > > > > > > > > > -- > > > > > > Best, > > > > > > Stanislav > > > > > > > > > > > > > > > > > > > > -- > *Dongjin Lee* > > *A hitchhiker in the mathematical world.* > *github: <http://goog_969573159/>github.com/dongjinleekr > <https://github.com/dongjinleekr>linkedin: kr.linkedin.com/in/dongjinleekr > <https://kr.linkedin.com/in/dongjinleekr>speakerdeck: speakerdeck.com/dongjin > <https://speakerdeck.com/dongjin>* >
Re: [VOTE] KIP-427: Add AtMinIsr topic partition category (new metric & TopicCommand option)
+1 (binding) -Harsha On Thu, Mar 7, 2019, at 6:48 PM, hacker win7 wrote: > +1 (non-binding) > > > On Mar 8, 2019, at 02:32, Stanislav Kozlovski > > wrote: > > > > Thanks for the KIP, Kevin! This change will be a good improvement to > > Kafka's observability story > > > > +1 (non-binding) > > > > On Thu, Mar 7, 2019 at 4:49 AM Vahid Hashemian > > wrote: > > > >> Thanks for the KIP Kevin. > >> > >> +1 (binding) > >> > >> --Vahid > >> > >> On Wed, Mar 6, 2019 at 8:39 PM Dongjin Lee wrote: > >> > >>> +1 (non-binding) > >>> > >>> On Wed, Mar 6, 2019, 3:14 AM Dong Lin wrote: > >>> > >>>> Hey Kevin, > >>>> > >>>> Thanks for the KIP! > >>>> > >>>> +1 (binding) > >>>> > >>>> Thanks, > >>>> Dong > >>>> > >>>> On Tue, Mar 5, 2019 at 9:38 AM Kevin Lu wrote: > >>>> > >>>>> Hi All, > >>>>> > >>>>> I would like to start the vote thread for KIP-427: Add AtMinIsr topic > >>>>> partition category (new metric & TopicCommand option). > >>>>> > >>>>> > >>>> > >>> > >> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=103089398 > >>>>> > >>>>> Thanks! > >>>>> > >>>>> Regards, > >>>>> Kevin > >>>>> > >>>> > >>> > >> > > > > > > -- > > Best, > > Stanislav > >
Re: [VOTE] 2.2.0 RC2
+1 (non-binding) - Downloaded artifacts, set up a 3-node cluster - Ran producer/consumer clients Thanks, Harsha On Thu, Mar 21, 2019, at 5:54 AM, Andrew Schofield wrote: > +1 (non-binding) > > - Downloaded the artifacts > - Ran Kafka Connect connectors > > Thanks, > Andrew Schofield > IBM Event Streams > > On 19/03/2019, 19:13, "Manikumar" wrote: > > +1 (non-binding) > > - Verified the artifacts, built from src, ran tests > - Verified the quickstart, ran producer/consumer performance tests. > > Thanks for running the release! > > Thanks, > Manikumar > > On Wed, Mar 20, 2019 at 12:19 AM David Arthur > wrote: > > > +1 > > > > Validated signatures, and ran through quick-start. > > > > Thanks! > > > > On Mon, Mar 18, 2019 at 4:00 AM Jakub Scholz > wrote: > > > > > +1 (non-binding). I used the staged binaries and ran some of my > tests > > > against them. All seems to look good to me. > > > > > > On Sat, Mar 9, 2019 at 11:56 PM Matthias J. Sax > > > > wrote: > > > > > > > Hello Kafka users, developers and client-developers, > > > > > > > > This is the third candidate for release of Apache Kafka 2.2.0. > > > > > > > > - Added SSL support for custom principal name > > > > - Allow SASL connections to periodically re-authenticate > > > > - Command line tool bin/kafka-topics.sh adds AdminClient > support > > > > - Improved consumer group management > > > >- default group.id is `null` instead of empty string > > > > - API improvement > > > >- Producer: introduce close(Duration) > > > >- AdminClient: introduce close(Duration) > > > >- Kafka Streams: new flatTransform() operator in Streams > DSL > > > >- KafkaStreams (and other classes) now implement > AutoCloseable to > > > > support try-with-resource > > > >- New Serdes and default method implementations > > > > - Kafka Streams exposed internal client.id via ThreadMetadata > > > > - Metric improvements: All `-min`, `-avg` and `-max` > metrics will now > > > > output `NaN` as default value > > > > Release notes for the 2.2.0 release: > https://home.apache.org/~mjsax/kafka-2.2.0-rc2/RELEASE_NOTES.html > > > > > > > > *** Please download, test, and vote by Thursday, March 14, > 9am PST.
> > > > > > > > Kafka's KEYS file containing PGP keys we use to sign the > release: > https://kafka.apache.org/KEYS > > > > > > > > * Release artifacts to be voted upon (source and binary): > https://home.apache.org/~mjsax/kafka-2.2.0-rc2/ > > > > > > > > * Maven artifacts to be voted upon: > https://repository.apache.org/content/groups/staging/org/apache/kafka/ > > > > > > > > * Javadoc: > https://home.apache.org/~mjsax/kafka-2.2.0-rc2/javadoc/ > > > > > > > > * Tag to be voted upon (off 2.2 branch) is the 2.2.0
Re: [DISCUSS] KIP-405: Kafka Tiered Storage
Hi All, Thanks for your initial feedback. We updated the KIP. Please take a look and let us know if you have any questions. https://cwiki.apache.org/confluence/display/KAFKA/KIP-405%3A+Kafka+Tiered+Storage Thanks, Harsha On Wed, Feb 6, 2019, at 10:30 AM, Harsha wrote: > Thanks Eno, Adam & Satish for your review and questions. I'll address > these in the KIP and update the thread here. > > Thanks, > Harsha > > On Wed, Feb 6, 2019, at 7:09 AM, Satish Duggana wrote: > > Thanks, Harsha for the KIP. It is a good start for tiered storage in > > Kafka. I have a few comments/questions. > > > > It may be good to have a configuration to keep the number of local > > segments instead of keeping only the active segment. This config can > > be exposed at cluster and topic levels with a default value of 1. In > > some use cases, a few consumers may lag by over one segment; it will be > > better to serve from local storage instead of remote storage. > > > > It may be better to keep “remote.log.storage.enable” and respective > > configuration at topic level along with cluster level. It will be > > helpful in environments where few topics are configured with > > local-storage and other topics are configured with remote storage. > > > > Each topic-partition leader pushes its log segments with respective > > index files to remote whenever the active log rolls over, and it updates the > > remote log index file for the respective remote log segment. The > > second option is to add offset index files also for each segment. It > > can serve consumer fetch requests for old segments from a local log > > segment instead of serving directly from the remote log, which may > > cause high latencies. There can be different strategies for when the > > remote segment is copied to a local segment. > > > > What is the “remote.log.manager.scheduler.interval.ms” config about? > > > > How do followers sync RemoteLogSegmentIndex files? Do they request > > from the leader replica? This looks to be important as the failed-over > > leader should have RemoteLogSegmentIndex updated and ready to avoid > > high latencies in serving old data stored in remote logs. > > > > Thanks, > > Satish. > > > > On Tue, Feb 5, 2019 at 10:42 PM Ryanne Dolan wrote: > > > > > > Thanks Harsha, makes sense. > > > > > > Ryanne > > > > > > On Mon, Feb 4, 2019 at 5:53 PM Harsha Chintalapani > > > wrote: > > > > > > > "I think you are saying that this enables additional (potentially > > > > cheaper) > > > > storage options without *requiring* an existing ETL pipeline. “ > > > > Yes. > > > > > > > > " But it's not really a replacement for the sort of pipelines people > > > > build > > > > with Connect, Gobblin etc.” > > > > > > > > It is not. But assuming that everyone runs these > > > > pipelines for storing raw Kafka data into HDFS or S3 is also a wrong > > > > assumption. > > > > The aim of this KIP is to provide tiered storage as a whole package, not > > > > asking users to ship the data on their own using existing ETL, which > > > > means > > > > running a consumer and maintaining those pipelines. > > > > > > > > " My point was that, if you are already offloading records in an ETL > > > > pipeline, why do you need a new pipeline built into the broker to ship > > > > the > > > > same data to the same place?” > > > > > > > > As you said, it's an ETL pipeline, which means users of these pipelines are > > > > reading the data from the broker and transforming its state and storing it > > > > somewhere.
> > > > The point of this KIP is to store log segments as they are without changing > > > > their structure so that we can use the existing offset mechanisms to > > > > look > > > > them up when the consumer needs to read old data. When you do load it via > > > > your existing pipelines you are reading the topic as a whole, which > > > > doesn’t guarantee that you’ll produce this data back into HDFS or S3 in > > > > the > > > > same order, and who is going to generate the index files again? > > > > > > > > > > > > "So you'd end up with one of 1) cold segments are only useful to Kafka; > > > > 2) > > > > you have the same data written to HDFS/etc twice, once for Kafka and > > > > once &
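To make the cluster-vs-topic scoping discussed above concrete, the configuration would look something like this (the names follow the KIP draft quoted above; the per-topic override is Satish's suggestion, not yet in the KIP):

  # broker-wide defaults in server.properties
  remote.log.storage.enable=true
  remote.log.manager.scheduler.interval.ms=30000
  # suggested per-topic override so some topics stay purely local, e.g.
  # kafka-topics ... --config remote.log.storage.enable=false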
Re: [DISCUSS] KIP-405: Kafka Tiered Storage
RemoteIndex: we are extending the current index mechanism, which finds an offset and its message, to find a file in remote storage for a given offset. This is an optimal way of finding which remote segment might be serving a given offset, compared to storing all of this data in an internal topic. "To add to Eric's question/confusion about where logic lives (RLM vs. RSM), I think it would be helpful to explicitly identify in the KIP that the RLM delegates to the RSM since the RSM is part of the public API and is the pluggable piece. For example, instead of saying "RLM will ship the log segment files that are older than a configurable time to remote storage" I think it would be better to say "RLM identifies log segment files that are older than a configurable time and delegates to the configured RSM to ship them to remote storage" (or something like that -- just make it clear that the RLM is delegating to the configured RSM)." Thanks. I agree with you. I'll update the KIP. Hi Ambud, Thanks for the comments. "1. Wouldn't implicit checking for old offsets in remote location if not found locally on the leader i.e. do we really need remote index files? Since the storage path for a given topic would presumably be constant across all the brokers, the remote topic-partition path could simply be checked to see if there are any segment file names that would meet the offset requirements for a Consumer Fetch Request. RSM implementations could optionally cache this information." By storing the remote index files locally, it will be faster for us to determine, for a requested offset, which file might contain the data. This will help us resolve the remote file quickly and return the response, instead of making a call to the remote tier for index lookup. Given index files are small, it won't be much of a hit to the storage space. "2. Would it make sense to create an internal compacted Kafka topic to publish & record remote segment information? This would enable the followers to get updates about new segments rather than running list() operations on remote storage to detect new segments which may be expensive." I think Ron is also alluding to this. We thought shipping remote index files to remote storage and letting the follower's RLM pick them up makes it easy to keep the current replication protocol without any changes. So we don't determine if a follower is in ISR or not based on another topic's replication. We will run small tests and determine if use of a topic is better for this. Thanks for the suggestion. "3. For RLM to scan local segment rotations are you thinking of leveraging java.nio.file.WatchService or simply running listFiles() on a periodic basis? Since WatchService implementation is heavily OS dependent it might create some complications around missing FS Events." Ideally we want to introduce file events like you suggested. For POC work we are using just listFiles(). Also, copying these files to remote storage can be slow; we will not delete the files from local disk until the segment is copied, and any requests for the data in these files will be served from local disk. So I don't think we need to be aggressive and optimize this copy-segment-to-remote path. Hi Viktor, Thanks for the comments. "I have a rather technical question to this. How do you plan to package this extension? Does this mean that Kafka will depend on HDFS? I think it'd be nice to somehow separate this off to a different package in the project so that it could be built and released separately from the main Kafka packages."
We would like all of this code to be part of Apache Kafka. In the early days of Kafka, there was an external module which contained Kafka-to-HDFS copy tools and dependencies. We would like RLM (class implementation) and RSM (interface) to be in core and, as you suggested, implementations of RSM could be in another package so that the dependencies of an RSM won't come into Kafka's classpath unless someone explicitly configures them. Thanks, Harsha On Mon, Apr 1, 2019, at 1:02 AM, Viktor Somogyi-Vass wrote: > Hey Harsha, > > I have a rather technical question to this. How do you plan to package this > extension? Does this mean that Kafka will depend on HDFS? > I think it'd be nice to somehow separate this off to a different package in > the project so that it could be built and released separately from the main > Kafka packages. > This decoupling would be useful when direct dependency on HDFS (or other > implementations) is not needed and would also encourage decoupling for > other storage implementations. > > Best, > Viktor > > On Mon, Apr 1, 2019 at 3:44 AM Ambud Sharma wrote: > > > Hi Harsha, > > > > Thank you for proposing this KIP. We are looking forward to this feature as > &g
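As a sketch of that split: RSM is the small pluggable surface that lives behind an interface in core, while RLM holds the scheduling logic that calls it. The method names below are illustrative, not the KIP's exact signatures:

  import java.io.File
  import org.apache.kafka.common.TopicPartition
  import org.apache.kafka.common.record.Records

  // Pluggable seam: core ships the trait; implementations (HDFS, S3, ...)
  // live in separate packages with their own dependencies.
  trait RemoteStorageManager {
    def copyLogSegment(tp: TopicPartition, segment: File): Unit
    def read(tp: TopicPartition, offset: Long, maxBytes: Int): Records
    def deleteLogSegment(tp: TopicPartition, baseOffset: Long): Unit
    def close(): Unit
  }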
Re: [DISCUSS] KIP-405: Kafka Tiered Storage
Hi Viktor, "Now, will the consumer be able to consume a remote segment if: - the remote segment is stored in the remote storage, BUT - the leader broker failed right after this AND - the follower which is to become a leader didn't scan yet for a new segment?" If I understand correctly: a local log segment was copied to remote storage, but the leader failed before writing the index files, and leadership changed to a follower. In this case we consider the log segment copy failed, and the newly elected leader will start copying the data from the last known offset in the remote storage. Consumers who are looking for an offset which might be in the failed-copy log segment will continue to read the data from local disk, since the local log segment will only be deleted after a successful copy of the log segment. "As a follow-up question, what are your experiences, does a failover in a broker cause bigger than usual churn in the consumers? (I'm thinking about the time required to rebuild remote index files.)" Rebuilding remote index files will only happen if the remote storage is missing all the copied index files. Fail-over will not trigger this rebuild. Hi Ryanne, "Harsha, can you comment on this alternative approach: instead of fetching directly from remote storage via a new API, implement something like paging, where segments are paged-in and out of cold storage based on access frequency/recency? For example, when a remote segment is accessed, it could be first fetched to disk and then read from there. I suppose this would require less code changes, or at least less API changes." Copying a whole log segment from remote storage is inefficient. When tiered storage is enabled, users might prefer hardware with smaller disks, and having to copy log segments to local disk again, especially when multiple consumers on multiple topics trigger this, might negatively affect the available local storage. What we proposed in the KIP doesn't affect the existing APIs, and we didn't call for any API changes. "And related to paging, does the proposal address what happens when a broker runs out of HDD space? Maybe we should have a way to configure a max number of segments or bytes stored on each broker, after which older or least-recently-used segments are kicked out, even if they aren't expired per the retention policy? Otherwise, I suppose tiered storage requires some babysitting to ensure that brokers don't run out of local storage, despite having access to potentially unbounded cold storage." Existing Kafka behavior will not change with the addition of tiered storage, and enabling it will not change existing behavior either. Just like today, it's up to the operator to make sure disk space is monitored and to take the necessary actions to mitigate it before it becomes a fatal failure for the broker. We don't stop users from configuring the retention period to infinite, and they can easily run out of space. These alternatives were not considered as they involve inefficient copying in and out of local disk, hence the reason we didn't add them to Alternatives Considered :). Thanks, Harsha On Wed, Apr 3, 2019, at 7:51 AM, Ryanne Dolan wrote: > Harsha, can you comment on this alternative approach: instead of fetching > directly from remote storage via a new API, implement something like > paging, where segments are paged-in and out of cold storage based on access > frequency/recency? For example, when a remote segment is accessed, it could > be first fetched to disk and then read from there. I suppose this would > require less code changes, or at least less API changes.
> > And related to paging, does the proposal address what happens when a broker > runs out of HDD space? Maybe we should have a way to configure a max number > of segments or bytes stored on each broker, after which older or > least-recently-used segments are kicked out, even if they aren't expired > per the retention policy? Otherwise, I suppose tiered storage requires some > babysitting to ensure that brokers don't run out of local storage, despite > having access to potentially unbounded cold storage. > > Just some things to add to Alternatives Considered :) > > Ryanne > > On Wed, Apr 3, 2019 at 8:21 AM Viktor Somogyi-Vass > wrote: > > > Hi Harsha, > > > > Thanks for the answer, makes sense. > > In the meantime one edge case popped up in my mind but first let me > > summarize what I understand if I interpret your KIP correctly. > > > > So basically whenever the leader RSM copies over a segment to the remote > > storage, the leader RLM will append an entry to its remote index files with > > the remote position. After this LogManager can delete the local segment. > > Parallel to this RLM followers are periodically scanning the remote storage > >
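The serving rule in Harsha's reply reduces to a simple invariant; a toy sketch (names illustrative):

  sealed trait ReadSource
  case object LocalSegment extends ReadSource
  case object RemoteStorage extends ReadSource

  // Local data wins: a local segment is deleted only after its remote copy
  // succeeds, so offsets at or above the local log start stay on disk and
  // only older offsets are resolved via the local copy of the remote index.
  def sourceFor(offset: Long, localLogStartOffset: Long): ReadSource =
    if (offset >= localLogStartOffset) LocalSegment else RemoteStorage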
Re: [VOTE] KIP-369 Alternative Partitioner to Support "Always Round-Robin" Selection
Looks like the KIP has passed with 3 binding votes: from Matthias, Bill Bejeck, and myself. You can do the full tally of the votes and send out a close-of-vote thread. Thanks, Harsha On Thu, Apr 4, 2019, at 12:24 PM, M. Manna wrote: > Hello, > > Trying to revive this thread again. Would anyone be interested in having > this KIP through? > > > Thanks, > > On Fri, 25 Jan 2019 at 16:44, M. Manna wrote: > > > Hello, > > > > I am trying to revive this thread. I only got 1 binding vote so far. > > > > Please feel free to revisit and comment here. > > > > Thanks, > > > > On Thu, 25 Oct 2018 at 00:15, M. Manna wrote: > > > >> Hey IJ, > >> > >> Thanks for your interest in the KIP. > >> > >> My point was simply that the round-robin should happen even if the key is > >> not null. As for the importance of key in our case, we treat the key as > >> metadata. Each key is composed of certain info which is parsed by our > >> consumer thread. We will then determine whether it's an actionable message > >> (e.g. process it), or a loopback (ignore it). You could argue, "Why not > >> append this metadata with the record and parse it there?". But that means > >> the following: > >> > >> 1) I'm always passing null key to achieve this - I would like to pass > >> Null/Not-Null/Other key i.e. flexibility > >> 2) Suppose the message size is 99 KB and the max message bytes allowed is > >> 100K. Now prefixing the metadata to the message results in the actual message > >> being 101K. This will fail at the producer level and cause a retry/log in > >> our DB for future pickup. > >> > >> To avoid all this, we are simply proposing this new partitioner class, > >> but all new Kafka releases will still have DefaultPartitioner as the default, > >> unless users change the properties file to use our new class. > >> > >> Regards, > >> > >> On Sun, 21 Oct 2018 at 04:05, Ismael Juma wrote: > >> > >>> Thanks for the KIP. Can you please elaborate on the need for the key in > >>> this case? The KIP simply states that the key is needed for metadata, but > >>> doesn't give any more details. > >>> > >>> Ismael > >>> > >>> On Tue, Sep 4, 2018 at 3:39 AM M. Manna wrote: > >>> > >>> > Hello, > >>> > > >>> > I have made necessary changes as per the original discussion thread, > >>> and > >>> > would like to put it for votes. > >>> > > >>> > Thank you very much for your suggestion and guidance so far. > >>> > > >>> > Regards, > >>> > > >>> > >> >
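For readers following the thread, the behavior being voted on is roughly the following (a minimal sketch; the class name is illustrative and the committed implementation may differ):

  import java.util.concurrent.atomic.AtomicInteger
  import org.apache.kafka.clients.producer.Partitioner
  import org.apache.kafka.common.Cluster

  // Round-robin over available partitions while ignoring the key entirely,
  // so the key can carry metadata without influencing placement.
  class AlwaysRoundRobinPartitioner extends Partitioner {
    private val counter = new AtomicInteger(0)

    override def partition(topic: String, key: AnyRef, keyBytes: Array[Byte],
                           value: AnyRef, valueBytes: Array[Byte],
                           cluster: Cluster): Int = {
      val available = cluster.availablePartitionsForTopic(topic)
      val candidates = if (available.isEmpty) cluster.partitionsForTopic(topic) else available
      val next = counter.getAndIncrement() & Int.MaxValue // stay non-negative on overflow
      candidates.get(next % candidates.size).partition()
    }

    override def close(): Unit = ()
    override def configure(configs: java.util.Map[String, _]): Unit = ()
  }

Producers would opt in via the partitioner.class producer property, which is what keeps DefaultPartitioner the default for everyone else.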
Re: [DISCUSS] KIP-405: Kafka Tiered Storage
Thanks, Ron. Updating the KIP; I'll add the answers here as well. 1) If the cold storage technology can be cross-region, is there a possibility for a disaster recovery Kafka cluster to share the messages in cold storage? My guess is the answer is no, and messages replicated to the D/R cluster have to be migrated to cold storage from there independently. (The same cross-region cold storage medium could be used, but every message would appear there twice). If I understand the question correctly, what you are asking is: with Kafka cluster A (active) shipping logs to remote storage that has cross-region replication, will another Kafka cluster B (passive) be able to use the copied logs in remote storage directly? For the initial version, my answer is no. We can handle this in subsequent changes after this one. 2) Can/should external (non-Kafka) tools have direct access to the messages in cold storage. I think this might have been addressed when someone asked about ACLs, and I believe the answer is "no" -- if some external tool needs to operate on that data then that external tool should read that data by acting as a Kafka consumer. Again, just asking to get the answer clearly documented in case it is unclear. The answer is no. All tools/clients must go through the broker APIs to access any data (local or remote). Only the Kafka broker user will have access to the remote storage logs, and security/ACLs will work the way they do today. Tools/clients going directly to the remote storage might help in terms of efficiency, but this requires protocol changes and some way of syncing Kafka's ACLs to the remote storage. Thanks, Harsha On Mon, Apr 8, 2019, at 8:48 AM, Ron Dagostino wrote: > Hi Harsha. A couple of questions. I think I know the answers, but it > would be good to see them explicitly documented. > > 1) If the cold storage technology can be cross-region, is there a > possibility for a disaster recovery Kafka cluster to share the messages in > cold storage? My guess is the answer is no, and messages replicated to the > D/R cluster have to be migrated to cold storage from there independently. > (The same cross-region cold storage medium could be used, but every message > would appear there twice). > > 2) Can/should external (non-Kafka) tools have direct access to the messages > in cold storage. I think this might have been addressed when someone asked > about ACLs, and I believe the answer is "no" -- if some external tool needs > to operate on that data then that external tool should read that data by > acting as a Kafka consumer. Again, just asking to get the answer clearly > documented in case it is unclear. > > Ron > > > On Thu, Apr 4, 2019 at 12:53 AM Harsha wrote: > > > Hi Viktor, > > > > > > "Now, will the consumer be able to consume a remote segment if: > > - the remote segment is stored in the remote storage, BUT > > - the leader broker failed right after this AND > > - the follower which is to become a leader didn't scan yet for a new > > segment?" > > > > If I understand correctly: a local log segment was copied to remote storage, > > but the leader failed before writing the index files, and leadership changed to a > > follower. In this case we consider the log segment copy failed, and the newly > > elected leader will start copying the data from the last known offset in > > the remote storage.
Consumers who are looking for an offset which might > > be in the failed-copy log segment will continue to read the data from > > local disk, since the local log segment will only be deleted after a > > successful copy of the log segment. > > > > "As a follow-up question, what are your experiences, does a failover in a > > broker cause bigger than usual churn in the consumers? (I'm thinking about > > the time required to rebuild remote index files.)" > > > > Rebuilding remote index files will only happen if the remote storage is > > missing all the copied index files. Fail-over will not trigger this > > rebuild. > > > > > > Hi Ryanne, > > > > "Harsha, can you comment on this alternative approach: instead of fetching > > directly from remote storage via a new API, implement something like > > paging, where segments are paged-in and out of cold storage based on access > > frequency/recency? For example, when a remote segment is accessed, it could > > be first fetched to disk and then read from there. I suppose this would > > require less code changes, or at least less API changes." > > > > Copying a whole log segment from remote storage is inefficient. When tiered storage > > is enabled, users might prefer hardware with smaller disks, and having to > > copy the log seg
Re: [VOTE] KIP-433: Block old clients on brokers
Hi, "Relying on min.version seems like a pretty clunky way to achieve the above > list. The challenge is that it's pretty difficult to do it in a way that > works for clients across languages. They each add support for new protocol > versions independently (it could even happen in a bug fix release). So, if > you tried to block Sarama in #2, you may block Java clients too." That's the intended effect, right? if you as the admin/operator configures the broker to have min.api.version to be 1.1 it should block java , sarama clients etc.. which are below the 1.1 protocol. As mentioned this is not just related to log.format upgrade problem but in general a forcing cause to get the users to upgrade their client version in a multi-tenant environment. "> For #3, it seems simplest to have a config that requires clients to support > a given message format version (or higher). For #2, it seems like you'd > want clients to advertise their versions. That would be useful for multiple > reasons." This kip offers the ability to block clients based on the protocol they support. This should be independent of the message format upgrade. Not all of the features or bugs are dependent on a message format and having a message format dependency to block clients means we have to upgrade to message.format and we cannot just say we've 1.1 brokers with 0.8.2 message format and now we want to block all 0.8.x clients. min.api.version helps at the cluster level to say that all users required to upgrade clients to the at minimum need to speak the min.api.version and not tie to message.format because not all cases one wants to upgrade the message format and block the old clients. To Gwen's point, I think we should also return in the error message that the broker only supports min.api.version and above. So that users can see a clear message and upgrade to a newer version. Thanks, Harsha On Fri, Apr 12, 2019, at 12:19 PM, Ismael Juma wrote: > Hi Ying, > > The actual reasons are important so that people can evaluate the KIP (and > vote). :) Thanks for providing a few more: > > (1) force users to check pointing in Kafka instead of zookeeper > (2) forbid an old go (sarama) client library which is known to have some > serious bugs > (3) force kafka 1.x clients with the ability to roll back if there's an > issue (unlike a message format upgrade) > > Relying on min.version seems like a pretty clunky way to achieve the above > list. The challenge is that it's pretty difficult to do it in a way that > works for clients across languages. They each add support for new protocol > versions independently (it could even happen in a bug fix release). So, if > you tried to block Sarama in #2, you may block Java clients too. > > For #3, it seems simplest to have a config that requires clients to support > a given message format version (or higher). For #2, it seems like you'd > want clients to advertise their versions. That would be useful for multiple > reasons. > > Ismael > > On Fri, Apr 12, 2019 at 8:42 PM Ying Zheng wrote: > > > Hi Ismael, > > > > Those are just examples. I think the administrators should be able to block > > certain client libraries for whatever reason. Some other possible reasons > > include, force users to check pointing in Kafka instead of zookeeper, > > forbid an old go (sarama) client library which is known to have some > > serious bugs. > > > > message.downconversion.enable does not solve our problems. We are now > > planning to upgrade to message format V3, and force users to upgrade to > > Kafka 1.x clients. 
With the proposed min.api.version setting, in case of > > there is anything wrong, we can roll back the setting. If we upgrade the > > file format, there is no way to rollback (Kafka doesn't support downgrading > > message format). > > > > On Thu, Apr 11, 2019 at 7:05 PM Ismael Juma wrote: > > > > > Hi Ying, > > > > > > It looks to me that all the examples given in the KIP can be handled with > > > the existing "message.downconversion.enable" config and by configuring > > the > > > message format to be the latest: > > > > > > 1. Kafka 8 / 9 / 10 consumer hangs when the message contains message > > header > > > > ( KAFKA-6739 - Down-conversion fails for records with headers > > RESOLVED ) > > > > 2. LZ4 is not correctly handled in Kafka 8 and Kafka 9 ( KAFKA-3160 - > > > > Kafka LZ4 framing code miscalculates header checksum RESOLVED ) > > > > 3. Performance penalty of converting message format from V3 to V1 or V2 > > > > for the old consumers (KIP-31 - Move to r
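For illustration, the kind of broker-side override being discussed could look as follows in server.properties. The property name follows the KIP; the value format (an api-version string like those used by inter.broker.protocol.version) is an assumption:

    # hypothetical KIP-433 sketch: reject clients that cannot speak >= 1.1 protocol versions
    min.api.version=1.1
    # unlike log.message.format.version, this setting can simply be removed to roll back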
Re: [VOTE] KIP-433: Block old clients on brokers
Hi Ismael, I meant to say blocking clients based on their API version https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/api/ApiVersion.scala#L48 But if I understand what you are saying, since each client release can support different versions for each of fetch, produce, offset commit etc., it's harder to block based on a single min.api.version setting across different clients. The idea I had in mind was to do this via ApiVersionsRequest: when a client makes an ApiVersionsRequest to the broker, in the response we return the min and max version supported for each API. When min.api.version is enabled on the broker, it returns the max version it supports for each of the requests in that release as the min versions to the clients. Example: a Kafka 1.1.1 broker with min.api.version set to https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/api/ApiVersion.scala#L79 (KAFKA_1_1_IV0); when a client makes an ApiVersionsRequest, then for example for the produce request https://github.com/apache/kafka/blob/1.1/clients/src/main/java/org/apache/kafka/common/requests/ProduceRequest.java#L112 instead of returning all of the supported versions the broker will return PRODUCE_REQUEST_V5 as the only supported version. Irrespective of the above approach, I understand your point still stands, which is that Sarama might not choose to implement all the higher-version protocols for the Kafka 1.1 release, might introduce a higher version of the produce request in a subsequent minor release, and it will then be harder for users to figure out which release of the Sarama client they can use. Ying, if you have a different approach which might address this issue please add. Thanks, Harsha On Fri, Apr 12, 2019, at 7:23 PM, Ismael Juma wrote: > Hi Harsha, > > There is no such thing as 1.1 protocol. I encourage you to describe an > example config that achieves what you are suggesting here. It's pretty > complicated because the versions are per API and each client evolves > independently. > > Ismael > > On Sat, Apr 13, 2019 at 4:09 AM Harsha wrote: > > > Hi, > > > > "Relying on min.version seems like a pretty clunky way to achieve the above > > > list. The challenge is that it's pretty difficult to do it in a way that > > > works for clients across languages. They each add support for new > > protocol > > > versions independently (it could even happen in a bug fix release). So, > > if > > > you tried to block Sarama in #2, you may block Java clients too." > > > > That's the intended effect, right? if you as the admin/operator > > configures the broker to have min.api.version to be 1.1 > > it should block java , sarama clients etc.. which are below the 1.1 > > protocol. As mentioned this is not just related to log.format upgrade > > problem but in general a forcing cause to get the users to upgrade their > > client version in a multi-tenant environment. > > > > "> For #3, it seems simplest to have a config that requires clients to > > support > > > a given message format version (or higher). For #2, it seems like you'd > > > want clients to advertise their versions. That would be useful for > > multiple > > > reasons." > > This kip offers the ability to block clients based on the protocol they > > support. This should be independent of the message format upgrade. Not all > > of the features or bugs are dependent on a message format and having a > > message format dependency to block clients means we have to upgrade to > > message.format and we cannot just say we've 1.1 brokers with 0.8.2 message > > format and now we want to block all 0.8.x clients. 
> > > > min.api.version helps at the cluster level to say that all users required > > to upgrade clients to the at minimum need to speak the min.api.version and > > not tie to message.format because not all cases one wants to upgrade the > > message format and block the old clients. > > > > > > To Gwen's point, I think we should also return in the error message that > > the broker only supports min.api.version and above. So that users can see a > > clear message and upgrade to a newer version. > > > > > > Thanks, > > Harsha > > > > > > On Fri, Apr 12, 2019, at 12:19 PM, Ismael Juma wrote: > > > Hi Ying, > > > > > > The actual reasons are important so that people can evaluate the KIP (and > > > vote). :) Thanks for providing a few more: > > > > > > (1) force users to check pointing in Kafka instead of zookeeper > > > (2) forbid an old go (sarama) client library which is known to have some > > > serious bugs > > >
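A rough sketch of the broker-side idea described above (illustrative Java, not actual Kafka code): when a per-API minimum is configured, the broker clamps the minimum version it advertises in the ApiVersionsResponse, so clients that only speak older versions find no usable version:

    import java.util.EnumMap;
    import java.util.Map;

    public class ApiVersionFilter {
        enum ApiKey { PRODUCE, FETCH, OFFSET_COMMIT }

        // Hypothetical per-API minimums derived from a setting like KAFKA_1_1_IV0.
        private final Map<ApiKey, Short> configuredMin = new EnumMap<>(ApiKey.class);

        ApiVersionFilter() {
            configuredMin.put(ApiKey.PRODUCE, (short) 5); // e.g. PRODUCE v5 for a 1.1 broker
        }

        // Minimum version to advertise in an ApiVersionsResponse for this API.
        short advertisedMin(ApiKey api, short brokerOldest) {
            return (short) Math.max(brokerOldest,
                    configuredMin.getOrDefault(api, brokerOldest));
        }

        public static void main(String[] args) {
            ApiVersionFilter f = new ApiVersionFilter();
            System.out.println("PRODUCE min advertised: "
                    + f.advertisedMin(ApiKey.PRODUCE, (short) 0)); // prints 5
        }
    }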
Re: [VOTE] KIP-433: Block old clients on brokers
Thanks Ying for updating the KIP. Hi Ismael, given that min.api.version allows admins/users to specify a min version for each request, this should address your concerns, right? Thanks, Harsha On Wed, Apr 17, 2019, at 2:29 PM, Ying Zheng wrote: > I have updated the config description in the KIP, made the example more > clear > > The proposed change allows setting different min versions for different > APIs, and the ApiVersionRequest change is already in the KIP. > > On Fri, Apr 12, 2019 at 8:22 PM Harsha wrote: > > > Hi Ismael, > > I meant to say blocking clients based on their API version > > https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/api/ApiVersion.scala#L48 > > But If I understand what you are saying, since each client release can > > support different versions for each of fetch, produce, offset commit etc.. > > and it's harder to block just based on single min.api.version setting > > across different clients. > > The idea I had in my mind was to do this via ApiVersionRequest, when a > > client makes api request to broker in response we return min and max > > version supported for each Api. When min.api.version enabled on broker, it > > returns the maxVersion it supports for each of the requests in that release > > as min versions to the clients. > > > > Example: > > Kafka 1.1.1 broker and min.api.version set to > > https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/api/ApiVersion.scala#L79 > > (KAFKA_1_1_IV0) and client makes a ApiVersionsRequest and in response for > > example produce request > > > > https://github.com/apache/kafka/blob/1.1/clients/src/main/java/org/apache/kafka/common/requests/ProduceRequest.java#L112 > > Instead of returning all of the supported versions it will return > > PRODUCE_REQUEST_V5 as the only supported version. > > > > Irrespective of the above approach I understand your point still stands > > which is sarama might not choose to implement all the higher version > > protocols for Kafka 1.1 release and they might introduce higher version of > > produce request in a subsequent minor release and it will be harder for > > users to figure out which release of sarama client they can use. > > > > > > Ying, if you have a different approach which might address this issue > > please add. > > > > > > Thanks, > > Harsha > > > > On Fri, Apr 12, 2019, at 7:23 PM, Ismael Juma wrote: > > > Hi Harsha, > > > > > > There is no such thing as 1.1 protocol. I encourage you to describe an > > > example config that achieves what you are suggesting here. It's pretty > > > complicated because the versions are per API and each client evolves > > > independently. > > > > > > Ismael > > > > > > On Sat, Apr 13, 2019 at 4:09 AM Harsha wrote: > > > > > > > Hi, > > > > > > > > "Relying on min.version seems like a pretty clunky way to achieve the > > above > > > > > list. The challenge is that it's pretty difficult to do it in a way > > that > > > > > works for clients across languages. They each add support for new > > > > protocol > > > > > versions independently (it could even happen in a bug fix release). > > So, > > > > if > > > > > you tried to block Sarama in #2, you may block Java clients too." > > > > > > > > That's the intended effect, right? if you as the admin/operator > > > > configures the broker to have min.api.version to be 1.1 > > > > it should block java , sarama clients etc.. which are below the 1.1 > > > > protocol. 
As mentioned this is not just related to log.format upgrade > > > > problem but in general a forcing cause to get the users to upgrade > > their > > > > client version in a multi-tenant environment. > > > > > > > > "> For #3, it seems simplest to have a config that requires clients to > > > > support > > > > > a given message format version (or higher). For #2, it seems like > > you'd > > > > > want clients to advertise their versions. That would be useful for > > > > multiple > > > > > reasons." > > > > This kip offers the ability to block clients based on the protocol they > > > > support. This should be independent of the message format upgrade. Not > > all > > > > of the features or bugs are dependent on a message format and having a > > >
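Following the per-API direction Ying describes, the configuration might take a shape like the following. The syntax here is purely hypothetical; the exact form would be defined by the updated KIP:

    # hypothetical per-API form of the setting
    min.api.version.produce=5
    min.api.version.fetch=7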
Re: [VOTE] KIP-433: Block old clients on brokers
Hi Gwen & Ismael, Do you have any feedback on with the proposed approach, min.api.version allowing users to specify versions for every request. Thanks, Harsha On Fri, Apr 19, 2019, at 10:24 AM, Harsha wrote: > Thanks Ying for updating the KIP. > Hi Ismael, > Given min.api.version allows admin/users to specifiy > min.version for each request this should address your concerns right? > > Thanks, > Harsha > > On Wed, Apr 17, 2019, at 2:29 PM, Ying Zheng wrote: > > I have updated the config description in the KIP, made the example more > > clear > > > > The proposed change allows setting different min versions for different > > APIs, and the ApiVersionRequest change is already in the KIP. > > > > On Fri, Apr 12, 2019 at 8:22 PM Harsha wrote: > > > > > Hi Ismael, > > > I meant to say blocking clients based on their API version > > > https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/api/ApiVersion.scala#L48 > > > But If I understand what you are saying, since each client release can > > > support different versions for each of fetch, produce, offset commit etc.. > > > and it's harder to block just based on single min.api.version setting > > > across different clients. > > > The idea I had in my mind was to do this via ApiVersionRequest, when a > > > client makes api request to broker in response we return min and max > > > version supported for each Api. When min.api.version enabled on broker, it > > > returns the maxVersion it supports for each of the requests in that > > > release > > > as min versions to the clients. > > > > > > Example: > > > Kafka 1.1.1 broker and min.api.verson set to > > > https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/api/ApiVersion.scala#L79 > > > (KAFKA_1_1_IV0) and client makes a ApiVersionsRequest and in response for > > > example produce request > > > > > > https://github.com/apache/kafka/blob/1.1/clients/src/main/java/org/apache/kafka/common/requests/ProduceRequest.java#L112 > > > Instead of returning all of the supported versions it will return > > > PRODUCE_REQUEST_V5 as the only supported version. > > > > > > Irrespective of the above approach I understand your point still stands > > > which is sarama might not choose to implement all the higher version > > > protocols for Kafka 1.1 release and they might introduce higher version of > > > produce request in a subsequent minor release and it will be harder for > > > users to figure out which release of sarama client they can use. > > > > > > > > > Ying, if you have a different apporach which might address this issue > > > please add. > > > > > > > > > Thanks, > > > Harsha > > > > > > On Fri, Apr 12, 2019, at 7:23 PM, Ismael Juma wrote: > > > > Hi Harsha, > > > > > > > > There is no such thing as 1.1 protocol. I encourage you to describe an > > > > example config that achieves what you are suggesting here. It's pretty > > > > complicated because the versions are per API and each client evolves > > > > independently. > > > > > > > > Ismael > > > > > > > > On Sat, Apr 13, 2019 at 4:09 AM Harsha wrote: > > > > > > > > > Hi, > > > > > > > > > > "Relying on min.version seems like a pretty clunky way to achieve the > > > above > > > > > > list. The challenge is that it's pretty difficult to do it in a way > > > that > > > > > > works for clients across languages. They each add support for new > > > > > protocol > > > > > > versions independently (it could even happen in a bug fix release). 
> > > So, > > > > > if > > > > > > you tried to block Sarama in #2, you may block Java clients too." > > > > > > > > > > That's the intended effect, right? if you as the admin/operator > > > > > configures the broker to have min.api.version to be 1.1 > > > > > it should block java , sarama clients etc.. which are below the 1.1 > > > > > protocol. As mentioned this is not just related to log.format upgrade > > > > > problem but in general a forcing cause to get the users to upgrade > > > their > > > > > client version in a multi-tenant environment. > > > > > > > &
Re: [VOTE] KIP-434: Dead replica fetcher and log cleaner metrics
Thanks for the KIP. LGTM, +1. -Harsha On Mon, Apr 29, 2019, at 8:14 AM, Viktor Somogyi-Vass wrote: > Hi Jason, > > I too agree this is more of a problem in older versions and therefore we > could backport it. Were you thinking of any specific versions? I guess the > 2.x and 1.x versions are definitely targets here but I was thinking that we > might not want to further. > > Viktor > > On Mon, Apr 29, 2019 at 12:55 AM Stanislav Kozlovski > wrote: > > > Thanks for the work done, Viktor! +1 (non-binding) > > > > I strongly agree with Jason that this monitoring-focused KIP is worth > > porting back to older versions. I am sure users will find it very useful > > > > Best, > > Stanislav > > > > On Fri, Apr 26, 2019 at 9:38 PM Jason Gustafson > > wrote: > > > > > Thanks, that works for me. +1 > > > > > > By the way, we don't normally port KIPs to older releases, but I wonder > > if > > > it's worth making an exception here. From recent experience, it tends to > > be > > > the older versions that are more prone to fetcher failures. Thoughts? > > > > > > -Jason > > > > > > On Fri, Apr 26, 2019 at 5:18 AM Viktor Somogyi-Vass < > > > viktorsomo...@gmail.com> > > > wrote: > > > > > > > Let me have a second thought, I'll just add the clientId instead to > > > follow > > > > the convention, so it'll change DeadFetcherThreadCount but with the > > > > clientId tag. > > > > > > > > On Fri, Apr 26, 2019 at 11:29 AM Viktor Somogyi-Vass < > > > > viktorsomo...@gmail.com> wrote: > > > > > > > > > Hi Jason, > > > > > > > > > > Yea I think it could make sense. In this case I would rename the > > > > > DeadFetcherThreadCount to DeadReplicaFetcherThreadCount and introduce > > > the > > > > > metric you're referring to as DeadLogDirFetcherThreadCount. > > > > > I'll update the KIP to reflect this. > > > > > > > > > > Viktor > > > > > > > > > > On Thu, Apr 25, 2019 at 8:07 PM Jason Gustafson > > > > > wrote: > > > > > > > > > >> Hi Viktor, > > > > >> > > > > >> This looks good. Just one question I had is whether we may as well > > > cover > > > > >> the log dir fetchers as well. > > > > >> > > > > >> Thanks, > > > > >> Jason > > > > >> > > > > >> > > > > >> On Thu, Apr 25, 2019 at 7:46 AM Viktor Somogyi-Vass < > > > > >> viktorsomo...@gmail.com> > > > > >> wrote: > > > > >> > > > > >> > Hi Folks, > > > > >> > > > > > >> > This thread sunk a bit but I'd like to bump it hoping to get some > > > > >> feedback > > > > >> > and/or votes. > > > > >> > > > > > >> > Thanks, > > > > >> > Viktor > > > > >> > > > > > >> > On Thu, Mar 28, 2019 at 8:47 PM Viktor Somogyi-Vass < > > > > >> > viktorsomo...@gmail.com> > > > > >> > wrote: > > > > >> > > > > > >> > > Sorry, the end of the message cut off. > > > > >> > > > > > > >> > > So I tried to be consistent with the convention in LogManager, > > > hence > > > > >> the > > > > >> > > hyphens and in AbstractFetcherManager, hence the camel case. It > > > > would > > > > >> be > > > > >> > > nice though to decide with one convention across the whole > > > project, > > > > >> > however > > > > >> > > it requires a major refactor (especially for the components that > > > > >> leverage > > > > >> > > metrics for monitoring). > > > > >> > > > > > > >> > > Thanks, > > > > >> > > Viktor > > > > >> > > > > > > >> > > On Thu, Mar 28, 2019 at 8:44 PM Viktor Somogyi-Vass < > > > > >> > > viktorsomo...@gmail.com> wrote: > > > > >> > > > > > > >> > >>
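The metric under discussion is essentially a gauge over fetcher threads that have died. A minimal sketch of the idea (not the actual AbstractFetcherManager code; the metric names follow the KIP draft):

    import java.util.concurrent.atomic.AtomicInteger;

    // Sketch of the DeadReplicaFetcherThreadCount / DeadLogDirFetcherThreadCount
    // idea, which would be tagged with clientId as discussed above.
    public class FetcherManagerHealth {
        private final AtomicInteger deadThreads = new AtomicInteger();

        // Called from a fetcher thread's failure path.
        public void onFetcherThreadDeath() {
            deadThreads.incrementAndGet();
        }

        // Exposed as a JMX gauge; alert when greater than zero.
        public int deadFetcherThreadCount() {
            return deadThreads.get();
        }
    }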
Re: [VOTE] KIP-429: Kafka Consumer Incremental Rebalance Protocol
+1 (binding). Thanks for the KIP; looking forward to this being available in consumers. Thanks, Harsha On Wed, May 22, 2019, at 12:24 AM, Liquan Pei wrote: > +1 (non-binding) > > On Tue, May 21, 2019 at 11:34 PM Boyang Chen wrote: > > > Thank you Guozhang for all the hard work. > > > > +1 (non-binding) > > > > > > From: Guozhang Wang > > Sent: Wednesday, May 22, 2019 1:32 AM > > To: dev > > Subject: [VOTE] KIP-429: Kafka Consumer Incremental Rebalance Protocol > > > > Hello folks, > > > > I'd like to start the voting for KIP-429 now, details can be found here: > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-429%3A+Kafka+Consumer+Incremental+Rebalance+Protocol#KIP-429:KafkaConsumerIncrementalRebalanceProtocol-RebalanceCallbackErrorHandling > > > > And the on-going PRs available for review: > > > > Part I: https://github.com/apache/kafka/pull/6528 > > Part II: https://github.com/apache/kafka/pull/6778 > > > > > > Thanks > > -- Guozhang > > > > > -- > Liquan Pei > Software Engineer, Confluent Inc >
Re: Possible implementation for KAFKA-560
Hi Carlos, This is a really useful feature and we would like to have it as well. I think high_watermark == log_start_offset is a good starting point to consider, but we may also have a case where the topic is empty while the clients producing to it are merely offline, so we might end up garbage collecting a topic which is still active. Having a configurable time period after which an empty topic can be deleted will help in this case. Also, we should check if there are any consumers still reading from the topics, etc. It will be good to have a KIP around this and add handling for the edge cases. Thanks, Harsha On Sun, Jun 23, 2019, at 9:40 PM, Carlos Manuel Duclos-Vergara wrote: > Hi, > Thanks for the answer. Looking at high water mark, then the logic would be > to flag the partitions that have > > high_watermark == log_start_offset > > In addition, I'm thinking that having the leader fulfill that criteria is > enough to flag a partition, maybe check the replicas only if requested by > the user. > > > fre. 21. jun. 2019, 23:35 skrev Colin McCabe : > > > I don't think this requires a change in the protocol. It seems like you > > should be able to use the high water mark to figure something out here? > > > > best, > > Colin > > > > > > On Fri, Jun 21, 2019, at 04:56, Carlos Manuel Duclos-Vergara wrote: > > > Hi, > > > > > > This is an ancient task, but I feel it is still current today (specially > > > since as somebody that deals with a Kafka cluster I know that this > > happens > > > more often than not). > > > > > > The task is about garbage collection of topics in a sort of automated > > way. > > > After some consideration I started a prototype implementation based on a > > > manual process: > > > > > > 1. Using the cli, I can use the --describe-topic to get a list of topics > > > that have size 0 > > > 2. Massage that list into something that can be then fed into the cli and > > > remove the topics that have size 0. > > > > > > The guiding principle here is the assumption that abandoned topics will > > > eventually have size 0, because all records will expire. This is not true > > > for all topics, but it covers a large portion of them and having > > something > > > like this would help admins to find "suspicious" topics at least. > > > > > > I started implementing this change and I realized that it would require a > > > change in the protocol, because the sizes are never sent over the wire. > > > Funny enough we collect the sizes of the log files, but we do not send > > them. > > > > > > I think this kind of changes will require a KIP, but I wanted to ask what > > > others think about this. > > > > > > The in-progress implementation of this can be found here: > > > > > https://github.com/carlosduclos/kafka/commit/0dffe5e131c3bd32b77f56b9be8eded89a96df54 > > > > > > Comments? > > > > > > -- > > > Carlos Manuel Duclos Vergara > > > Backend Software Developer > > >
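The high_watermark == log_start_offset check can already be approximated from the client side. A sketch using the consumer offset APIs (leader view only; a real implementation would still need the grace period and active-consumer checks mentioned above, and a broker at localhost:9092 is assumed):

    import java.util.*;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    public class EmptyTopicScan {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");
            props.put("value.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");
            try (KafkaConsumer<byte[], byte[]> c = new KafkaConsumer<>(props)) {
                List<TopicPartition> tps = new ArrayList<>();
                c.listTopics().forEach((topic, infos) ->
                    infos.forEach(pi -> tps.add(new TopicPartition(topic, pi.partition()))));
                Map<TopicPartition, Long> start = c.beginningOffsets(tps);
                Map<TopicPartition, Long> end = c.endOffsets(tps);
                // Fully-expired partitions: the log start has caught up with the end.
                tps.stream()
                   .filter(tp -> start.get(tp).equals(end.get(tp)))
                   .forEach(tp -> System.out.println("GC candidate: " + tp));
            }
        }
    }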
Re: [DISCUSS] KIP-486 Support for pluggable KeyStore and TrustStore
Hi Maulin, This is not required; we are already addressing this. One can write a KeyStoreProvider and TrustStoreProvider. Please look at this JIRA https://issues.apache.org/jira/browse/KAFKA-8191 . It allows users to add a custom security provider, in which you can write a KeyManagerFactory and TrustManagerFactory, add that to your JVM settings, and pass the factory name via the configs exposed in KAFKA-8191. These are standard Java APIs, so we can rely on them instead of adding custom APIs like the ones you are proposing in the KIP. -Harsha On Tue, Jul 16, 2019, at 1:51 PM, Maulin Vasavada wrote: > Bump! Can somebody please review this? >
Re: [DISCUSS] KIP-486 Support for pluggable KeyStore and TrustStore
You can look at the implementation here for an example https://github.com/spiffe/spiffe-example/blob/master/java-spiffe/spiffe-security-provider/src/main/java/spiffe/api/provider/SpiffeProvider.java On Tue, Jul 16, 2019, at 9:00 PM, Harsha wrote: > Hi Maulin, >This is not required. We are already addressing this. > One can write a KeyStoreProvider and TrustStoreProvider. Please look at > this JIRA https://issues.apache.org/jira/browse/KAFKA-8191 . It allows > users to add custom security provider in which you can write > KeyManagerFactory, TrustManagerFactory and add that your JVM settings > and pass that factory name via configs exposed in KAFKA-8191. These are > Java APIs and instead of adding custom apis like you are proposing in > the KIP. > > -Harsha > > On Tue, Jul 16, 2019, at 1:51 PM, Maulin Vasavada wrote: > > Bump! Can somebody please review this? > > >
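The general shape of such a provider, along the lines of the SpiffeProvider linked above, is a plain JCA Provider that registers factory SPIs; the class and algorithm names below are hypothetical placeholders:

    import java.security.Provider;
    import java.security.Security;

    public class MyProvider extends Provider {
        public MyProvider() {
            super("MyProvider", 1.0, "Custom key/trust manager factories");
            // Map algorithm names to (hypothetical) factory SPI implementations.
            put("KeyManagerFactory.MyAlg", "com.example.MyKeyManagerFactorySpi");
            put("TrustManagerFactory.MyAlg", "com.example.MyTrustManagerFactorySpi");
        }

        public static void main(String[] args) {
            Security.addProvider(new MyProvider());
            // Brokers/clients then select it via ssl.keymanager.algorithm=MyAlg etc.
        }
    }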
Re: [DISCUSS] KIP-492 Add java security providers in Kafka Security config
Thanks for the KIP Sandeep. LGTM. Mani & Rajini, can you please look at the KIP as well. Thanks, Harsha On Tue, Jul 16, 2019, at 2:54 PM, Sandeep Mopuri wrote: > Thanks for the suggestions, made changes accordingly. > > On Tue, Jul 16, 2019 at 9:27 AM Satish Duggana > wrote: > > > Hi Sandeep, > > Thanks for the KIP, I have few comments below. > > > > >>“To take advantage of these custom algorithms, we want to support java > > security provider parameter in security config. This param can be used by > > kafka brokers or kafka clients(when connecting to the kafka brokers). The > > security providers can also be used for configuring security algorithms in > > SASL based communication.” > > > > You may want to mention use case like > > spiffe.provider.SpiffeProvider[1] in streaming applications like > > Flink, Spark or Storm etc. > > > > >>"We add new config parameter in KafkaConfig named > > “security.provider.class”. The value of “security.provider” is expected to > > be a string representing the provider’s full classname. This provider class > > will be added to the JVM properties through Security.addProvider api. > > Security class can be used to programmatically add the provider classes to > > the JVM." > > > > It is good to have this property as a list of providers instead of a > > single property. This will allow configuring multiple providers if it > > is needed in the future without introducing hacky solutions like > > security.provider.class.name.x, where x is a sequence number. You can > > change the property name to “security.provider.class.names” and its > > value is a list of fully qualified provider class names separated by > > ‘,'. > > For example: > > > > security.provider.class.names=spiffe.provider.SpiffeProvider,com.foo.MyProvider > > > > Typo in existing properties section: > > “ssl.provider” instead of “ssl.providers”. > > > > Thanks, > > Satish. > > > > 1. https://github.com/spiffe/java-spiffe > > > > > > On Mon, Jul 15, 2019 at 11:41 AM Sandeep Mopuri wrote: > > > > > > Hello all, > > > > > > I'd like to start a discussion thread for KIP-492. > > > This KIP plans on introducing a new security config parameter for a > > custom > > > security providers. Please take a look and let me know what do you think. > > > > > > More information can be found here: > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-492%3A+Add+java+security+providers+in+Kafka+Security+config > > > -- > > > Thanks, > > > Sai Sandeep > > > > > -- > Thanks, > M.Sai Sandeep >
Re: Fwd: [DISCUSS] KIP-492 Add java security providers in Kafka Security config
Thanks for the details. Rajini, Can you please take a look and let us know if these address your concerns. Thanks, Harsha On Mon, Jul 22, 2019, at 9:36 AM, Sandeep Mopuri wrote: > Hi Rajini, > Thanks for raising the above questions. Please find the > replies below > > On Wed, Jul 17, 2019 at 2:49 AM Rajini Sivaram > wrote: > > > Hi Sandeep, > > > > Thanks for the KIP. A few questions below: > > > >1. Is the main use case for this KIP adding security providers for SSL? > >If so, wouldn't a more generic solution like KIP-383 work for this? > > >We’re trying to solve this for both SSL and SASL. KIP-383 allows the > creation of custom SSLFactory implementation, however adding the providers > for new security algorithms doesn’t involve any new implementation of > SSLFactory. Even after the KIP 383, people still are finding a need for > loading custom keymanager and trustmanager implementations (KIP 486) > >2. Presumably this config would also apply to clients. If so, have we > >thought through the implications of changing static JVM-wide security > >providers in the client applications? > > >Yes, this config will be applied to clients as well and the > responsibility to face the consequences of adding the security providers > need to be taken by the clients. In cases of resource manager running > streaming applications such as Yarn, Mesos etc.. each user needs to make > sure they are passing these JVM arguments. > >3. Since client applications can programmatically invoke the Java > >Security API anyway, isn't the system property described in `Rejected > >Alternatives` a reasonable solution for brokers? > > > This is true in a kafka only environment, but with an eco-system of > streaming applications like flink, spark etc which might produce to kafka, > it’s difficult to make changes to all the clients > >4. We have SASL login modules in Kafka that automatically add security > >providers for SASL mechanisms not supported by the JVM. We should > > describe > >the impact of this KIP on those and whether we would now require a > > config > >to enable these security providers > > In a single JVM, one can register multiple security.providers. By default > JVM itself provides multiple providers and these will not step over each > other. The only way to activate a provider is through their registered names > Example: > > $ cat /usr/lib/jvm/jdk-8-oracle-x64/jre/lib/security/java.security > ... > security.provider.1=sun.security.provider.Sun > security.provider.2=sun.security.rsa.SunRsaSign > security.provider.3=sun.security.ec.SunEC > security.provider.4=com.sun.net.ssl.internal.ssl.Provider > security.provider.5=com.sun.crypto.provider.SunJCE > security.provider.6=sun.security.jgss.SunProvider > security.provider.7=com.sun.security.sasl.Provider > security.provider.8=org.jcp.xml.dsig.internal.dom.XMLDSigRI > security.provider.9=sun.security.smartcardio.SunPCSC > ... > >A user of a provider will refer through their registered provider > > https://github.com/spiffe/spiffe-example/blob/master/java-spiffe/spiffe-security-provider/src/main/java/spiffe/api/provider/SpiffeProvider.java#L31 > >In the above example , we can register the SpiffeProvider and > multiple other providers into the JVM. When a client or a broker wants to > integrate with SpiffeProvider they need to add the config > ssl.keymanager.algorithm = "Spiffe" . Another client can refer to a > different provider or use a default one in the same JVM. > > > >5. 
We have been moving away from JVM-wide configs like the default JAAS > >config since they are hard to test reliably or update dynamically. The > >replacement config `sasl.jaas.config` doesn't insert a JVM-wide > >configuration. Have we investigated similar options for the specific > >scenario we are addressing here? > > >Yes, that is the case with jaas config, however in the case of > security providers, along with adding the security providers to JVM > properties, one also need to configure the provider algorithm. For example, > in the case of SSL configuration, besides adding the security provider to > the JVM, we need to configure the “ssl.trustmanager.algorithm” and > “ssl.keymanager.algorithm” inorder for the provider implementation to > apply. Different components can opt for different key and trustmanager > algorithms and can work independently simultaneously in the same JVM. This > case is different from the jaas config. > > > >6. Are we always going to insert new prov
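To make the ordering question concrete, this is the registration API being discussed; position is 1-based, and ordering decides which provider wins when more than one registers the same algorithm or mechanism name:

    import java.security.Provider;
    import java.security.Security;

    public class RegisterProvider {
        public static void main(String[] args) {
            Provider custom = new Provider("Custom", 1.0, "demo provider") { };
            // Insert at an explicit position (here, ahead of all built-ins);
            // Security.addProvider(p) would instead append to the end of the list.
            Security.insertProviderAt(custom, 1);
            for (Provider p : Security.getProviders()) {
                System.out.println(p.getName());
            }
        }
    }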
Re: [DISCUSS] KIP-405: Kafka Tiered Storage
Hi Habib, Yes. Our approach is to have retention as you see it today, i.e. delete the local log segments after the configured amount of time or size is reached. We will be shipping logs to remote storage such as HDFS or S3 as soon as a log segment is rotated in a topic-partition. This will not trigger any deletion of local segments. Thanks, Harsha On Thu, Jul 25, 2019, at 6:01 AM, Habib Nahas wrote: > Hi, > > Under the proposed definition of RemoteTier, would it be possible to > have an implementation that transfers older log segments to a slower > storage tier, but one that is still local? > Examples of slower local (ie mounted locally) tiers being HDDs vs SSDs, > or NFS volumes. > > Let me know if I'm missing an existing solution for this usecase. > Thanks, > > Habib > > > On 2019/04/09 05:04:17, Harsha wrote: > > Thanks, Ron. Updating the KIP; will add answers here as well. > > > > 1) If the cold storage technology can be cross-region, is there a > > possibility for a disaster recovery Kafka cluster to share the messages in > > cold storage? My guess is the answer is no, and messages replicated to the > > D/R cluster have to be migrated to cold storage from there independently. > > (The same cross-region cold storage medium could be used, but every message > > would appear there twice). > > > > If I understand the question correctly, what you are saying is: Kafka cluster A (active) ships logs to remote storage with cross-region replication; will another Kafka cluster B (passive) be able to use the remote storage copied logs directly? > > > > For the initial version my answer is No. We can handle this in subsequent changes after this one. > > > > 2) Can/should external (non-Kafka) tools have direct access to the messages > > in cold storage. I think this might have been addressed when someone asked > > about ACLs, and I believe the answer is "no" -- if some external tool needs > > to operate on that data then that external tool should read that data by > > acting as a Kafka consumer. Again, just asking to get the answer clearly > > documented in case it is unclear. > > > > The answer is No. All tools/clients must go through broker APIs to access any data (local or remote). > > Only the Kafka broker user will have access to remote storage logs, and Security/ACLs will work the way they do today. > > Tools/clients going directly to the remote storage might help in terms of efficiency, but this requires protocol changes and some way of syncing ACLs in Kafka to the remote storage. > > > > Thanks, > > Harsha > > > > On Mon, Apr 8, 2019, at 8:48 AM, Ron Dagostino wrote: > > > Hi Harsha. A couple of questions. I think I know the answers, but it > > > would be good to see them explicitly documented. > > > > > > 1) If the cold storage technology can be cross-region, is there a > > > possibility for a disaster recovery Kafka cluster to share the messages in > > > cold storage? My guess is the answer is no, and messages replicated to the > > > D/R cluster have to be migrated to cold storage from there independently. > > > (The same cross-region cold storage medium could be used, but every message > > > would appear there twice). > > > > > > 2) Can/should external (non-Kafka) tools have direct access to the messages > > > in cold storage. I think this might have been addressed when someone asked > > > about ACLs, and I believe the answer is "no" -- if some external tool needs > > > to operate on that data then that external tool should read that data by > > > acting as a Kafka consumer. Again, just asking to get the answer clearly > > > documented in case it is unclear. > > > > > > Ron > > > > > > On Thu, Apr 4, 2019 at 12:53 AM Harsha wrote: > > > > Hi Viktor, > > > > > > > > "Now, will the consumer be able to consume a remote segment if: > > > > - the remote segment is stored in the remote storage, BUT > > > > - the lea
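To make the "copy on roll, delete locally via retention" split concrete, the hook has roughly this shape. This is illustrative only, not the actual KIP-405 RemoteStorageManager API:

    import java.io.File;
    import java.io.IOException;

    public interface RemoteTier {
        // Invoked after a segment is rolled; copies it to HDFS/S3/etc.
        // Deletion of local segments still happens independently, driven by
        // the usual time/size retention settings.
        void copySegment(String topic, int partition, File segment) throws IOException;
    }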
Re: Fwd: [DISCUSS] KIP-492 Add java security providers in Kafka Security config
Thanks, Rajini. > 4) The main difference between SSL and SASL is that for SSL, you register a > provider with your own algorithm name and you specify your algorithm name > in a separate config. This algorithm name can be anything you choose. For > SASL, we register providers for standard SASL mechanism names. If this KIP > wants to address the SASL case, then basically you need to add a provider > ahead of the built-in provider for that standard mechanism name. See below: We can keep this to SSL only, given that for SASL users can configure a provider via the LoginModule, unless there is a use case where we benefit from having a provider config for SASL. On Thu, Jul 25, 2019, at 5:25 AM, Rajini Sivaram wrote: > Hi Sandeep/Harsha, > > I don't have any major concerns about this KIP since it solves a specific > issue and is a relatively minor change. I am unconvinced about the SASL > case, but it probably is better to add as a config that can be used with > SASL as well in future anyway. > > Just to complete the conversation above: > > 4) The main difference between SSL and SASL is that for SSL, you register a > provider with your own algorithm name and you specify your algorithm name > in a separate config. This algorithm name can be anything you choose. For > SASL, we register providers for standard SASL mechanism names. If this KIP > wants to address the SASL case, then basically you need to add a provider > ahead of the built-in provider for that standard mechanism name. See below: > > 6) JVM starts off with an ordered list of providers, the `java.security` > file you quote above shows the ordering. When you dynamically add > providers, you have the choice of adding it anywhere in that list. > Particularly with SASL where you may have multiple providers with the same > name, this ordering is significant. Security.insertProviderAt(Provider > provider, int position) allows you to choose the position. I think the KIP > intends to add provider at the beginning using Security.addProvider(Provider > provider), which is basically inserting at position 0. We should clarify > the behaviour in the KIP. > > We should also clarify how this config will co-exist with existing dynamic > updates of security providers in the SASL login modules in Kafka. Or > clarify that you can't override those. What we don't want is > non-deterministic behaviour which is the biggest issue with these static > configs. > > > On Wed, Jul 24, 2019 at 5:50 PM Harsha wrote: > > > Thanks for the details. > > Rajini, Can you please take a look and let us know if these address your > > concerns. > > > > Thanks, > > Harsha > > > > On Mon, Jul 22, 2019, at 9:36 AM, Sandeep Mopuri wrote: > > > Hi Rajini, > > > Thanks for raising the above questions. Please find the > > > replies below > > > > > > On Wed, Jul 17, 2019 at 2:49 AM Rajini Sivaram > > > wrote: > > > > > > > Hi Sandeep, > > > > > > > > Thanks for the KIP. A few questions below: > > > > > > > >1. Is the main use case for this KIP adding security providers for > > SSL? > > > >If so, wouldn't a more generic solution like KIP-383 work for this? > > > > > > >We’re trying to solve this for both SSL and SASL. KIP-383 allows > > the > > > creation of custom SSLFactory implementation, however adding the providers > > > for new security algorithms doesn’t involve any new implementation of > > > SSLFactory. Even after the KIP 383, people still are finding a need for > > > loading custom keymanager and trustmanager implementations (KIP 486) > > > > > >2. 
Presumably this config would also apply to clients. If so, have we > > > >thought through the implications of changing static JVM-wide > > security > > > >providers in the client applications? > > > > > > >Yes, this config will be applied to clients as well and the > > > responsibility to face the consequences of adding the security providers > > > need to be taken by the clients. In cases of resource manager running > > > streaming applications such as Yarn, Mesos etc.. each user needs to make > > > sure they are passing these JVM arguments. > > > > > >3. Since client applications can programmatically invoke the Java > > > >Security API anyway, isn't the system property described in > > `Rejected > > > >Alternatives` a reasonable solution for brokers? > > > > > > > This is true in a kafka only environmen
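Putting the pieces of this thread together, an illustrative configuration would be the following; the provider-list property name is the one proposed in the KIP, and the algorithm name comes from the Spiffe example above:

    security.provider.class.names=spiffe.provider.SpiffeProvider
    ssl.keymanager.algorithm=Spiffe
    ssl.trustmanager.algorithm=Spiffe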
Re: [DISCUSS] KIP-500: Replace ZooKeeper with a Self-Managed Metadata Quorum
Hi Colin, Looks like the KIP is missing the images; the links are broken. Thanks, Harsha On Thu, Aug 1, 2019, at 2:05 PM, Colin McCabe wrote: > Hi all, > > I've written a KIP about removing ZooKeeper from Kafka. Please take a > look and let me know what you think: > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-500%3A+Replace+ZooKeeper+with+a+Self-Managed+Metadata+Quorum > > cheers, > Colin >
Re: [DISCUSS] KIP-499 - Unify connection name flag for command line tool
+1 for the KIP. -Harsha On Thu, Aug 1, 2019, at 3:07 PM, Colin McCabe wrote: > On Wed, Jul 31, 2019, at 05:26, Mitchell wrote: > > Hi Jason, > > Thanks for looking at this! > > > > I wasn't exactly sure what to put in the compatibility section. I wrote > > the KIP thinking that we should probably mark the old arguments for > > deprecation for a release or two before actually removing them. I'm happy > > to change this either way if it's preferred. > > I think Jason was proposing deprecating (but NOT removing) the old > arguments in the next release. > > Thanks for tackling this. +1 for the KIP. > > best, > Colin > > > -mitch > > > > On Tue, Jul 30, 2019 at 11:55 PM Jason Gustafson wrote: > > > > > Hey Mitch, thanks for the KIP! This command line inconsistency frustrates > > > me almost every day. I'm definitely +1 on this. > > > > > > One minor nitpick. The compatibility section mentions there will be no > > > deprecations, but it sounds like we are planning on deprecating the old > > > arguments? > > > > > > Thanks, > > > Jason > > > > > > On Tue, Jul 30, 2019 at 5:25 PM Mitchell wrote: > > > > > > > Hello, > > > > I have written a proposal to add the command line argument > > > > `--bootstrap-server` to 5 of the existing command line tools that do not > > > > currently use `--broker-list` for passing cluster connection > > > > information. > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-499+-+Unify+connection+name+flag+for+command+line+tool > > > > > > > > Please take a look and let me know what you think. > > > > Thanks, > > > > -Mitch > > > > > > > > > >
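For context, the inconsistency being fixed looks like this today, with the unified flag accepted everywhere after the KIP (old flags deprecated but still working for a while):

    # today: some tools take --broker-list ...
    bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
    # ... while others already take --bootstrap-server
    bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test
    # after KIP-499, every tool accepts:
    bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic test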
Re: [DISCUSS] KIP-48 Support for delegation tokens as an authentication mechanism
Hi All, Can we have a KIP meeting around this? The KIP has been up for some time, and if there are any questions let's quickly hash out the details. Thanks, Harsha On Thu, May 19, 2016, at 08:40 AM, parth brahmbhatt wrote: > That is what the hadoop echo system uses so no good reason really. We > could > change it to whatever is the newest recommended standard is. > > Thanks > Parth > > On Thu, May 19, 2016 at 3:33 AM, Ismael Juma wrote: > > > Hi Parth, > > > > Thanks for the KIP. I only started reviewing this and may have additional > > questions later. The immediate question that came to mind is our choice of > > "DIGEST-MD5" even though it's marked as OBSOLETE in the IANA Registry of > > SASL mechanisms and the original RFC (2831) has been moved to Historic > > status: > > > > https://tools.ietf.org/html/rfc6331 > > http://www.iana.org/assignments/sasl-mechanisms/sasl-mechanisms.xhtml > > > > What is the reasoning behind that choice? > > > > Thanks, > > Ismael > > > > On Fri, May 13, 2016 at 11:29 PM, Gwen Shapira wrote: > > > > > Also comments inline :) > > > > > > > * I want to emphasize that even though delegation tokens are a Hadoop > > > > innovation, I feel very strongly about not adding dependency on Hadoop > > > > when implementing delegation tokens for Kafka. The KIP doesn't imply > > > > such dependency, but if you can clarify... > > > > > > > > > > > > *No hadoop dependency.* > > > > > > Yay! Just add this to the KIP so no one will read the KIP and panic > > > three weeks before the next release... > > > > > > > * Can we get delegation token at any time after authenticating? only > > > > immediately after? > > > > > > > > > > > > *As long as you are authenticated you can get delegation tokens. We > > need > > > to > > > > discuss if a client authenticated using delegation token, can also > > > acquire > > > > delegation token again or not. Also there is the question of do we > > allow > > > > anyone to acquire delegation token or we want specific ACLs (I think > > its > > > an > > > > overkill.)* > > > > > > I agree that ACLs is an overkill. > > > > > > I think we are debating two options: Either require Kerberos auth for > > > renewal or require non-owners to renew. > > > I *think* the latter is simpler (it basically require a "job master" > > > to take responsibility for the renewal, it will have its own identity > > > anyway and I think this is the correct design pattern anyway. For > > > storm, I'd expect Nimbus to coordinate renewals?), but it is hard to > > > debate simplicity without looking at the code changes required. If you > > > have a draft of how the "require Kerberos" will look in Kafka code, > > > I'll be happy to take a look. > > > > > > > * My understanding is that tokens will propagate via ZK but without > > > > additional changes to UpdateMetadata protocol, correct? Clients > > > > currently don't retry on SASL auth failure (IIRC), but since the > > > > tokens propagate between brokers asynch, we will need to retry a bit > > > > to avoid clients failing auth due to timing issues. > > > > > > > > *I am considering 2 alternatives right now. The current documented > > > approach > > > > is zookeeper based and it does not require any changes to > > UpdateMetadata > > > > protocol. An alternative approach can remove zookeeper dependency as > > well > > > > but we can discuss that in KIP discussion call.* > > > > > > Oooh! Sounds interesting. Do you want to ping Jun to arrange a call? 
> > > > > > > * I liked Ashish's suggestion of having just the controller issue the > > > > delegation tokens, to avoid syncing a shared secret. Not sure if we > > > > want to continue the discussion here or on the wiki. I think that we > > > > can decouple the problem of "token distribution" from "shared secret > > > > distribution" and use the controller as the only token generator to > > > > solve the second issue, while still using ZK async to distribute > > > > tokens. > > > > > > > > > > > > *As mentioned in the previous Email I am fine with that approach as > > l
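For readers coming to this thread later: the acquire/renew flow being debated here was eventually exposed through the Admin API (KIP-249). A sketch of that flow, shown only to make the token lifecycle concrete (the connection must be on an authenticated SASL/SSL listener; localhost:9093 is an assumption):

    import java.util.Properties;
    import org.apache.kafka.clients.admin.Admin;
    import org.apache.kafka.clients.admin.CreateDelegationTokenOptions;
    import org.apache.kafka.common.security.token.delegation.DelegationToken;

    public class TokenLifecycle {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9093");
            try (Admin admin = Admin.create(props)) {
                // Acquire a token as the currently authenticated principal.
                DelegationToken token = admin
                    .createDelegationToken(new CreateDelegationTokenOptions())
                    .delegationToken().get();
                // The owner (or a designated renewer) periodically extends it.
                long newExpiry = admin.renewDelegationToken(token.hmac())
                    .expiryTimestamp().get();
                System.out.println("token renewed until " + newExpiry);
            }
        }
    }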
Re: [VOTE] 0.10.0.0 RC6
+1. Ran a 3-node cluster with a few system tests on our side. Looks good. -Harsha On Thu, May 19, 2016, at 07:47 PM, Jun Rao wrote: > Thanks for running the release. +1 from me. Verified the quickstart. > > Jun > > On Tue, May 17, 2016 at 10:00 PM, Gwen Shapira wrote: > > > Hello Kafka users, developers and client-developers, > > > > This is the seventh (!) candidate for release of Apache Kafka > > 0.10.0.0. This is a major release that includes: (1) New message > > format including timestamps (2) client interceptor API (3) Kafka > > Streams. > > > > This RC was rolled out to fix an issue with our packaging that caused > > dependencies to leak in ways that broke our licensing, and an issue > > with protocol versions that broke upgrade for LinkedIn and others who > > may run from trunk. Thanks to Ewen, Ismael, Becket and Jun for the > > finding and fixing of issues. > > > > Release notes for the 0.10.0.0 release: > > http://home.apache.org/~gwenshap/0.10.0.0-rc6/RELEASE_NOTES.html > > > > Lets try to vote within the 72h release vote window and get this baby > > out already! > > > > *** Please download, test and vote by Friday, May 20, 23:59 PT > > > > Kafka's KEYS file containing PGP keys we use to sign the release: > > http://kafka.apache.org/KEYS > > > > * Release artifacts to be voted upon (source and binary): > > http://home.apache.org/~gwenshap/0.10.0.0-rc6/ > > > > * Maven artifacts to be voted upon: > > https://repository.apache.org/content/groups/staging/ > > > > * java-doc > > http://home.apache.org/~gwenshap/0.10.0.0-rc6/javadoc/ > > > > * tag to be voted upon (off 0.10.0 branch) is the 0.10.0.0 tag: > > > > https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=065899a3bc330618e420673acf9504d123b800f3 > > > > * Documentation: > > http://kafka.apache.org/0100/documentation.html > > > > * Protocol: > > http://kafka.apache.org/0100/protocol.html > > > > /** > > > > Thanks, > > > > Gwen > >
Re: KAFKA-3722 : Discussion about custom PrincipalBuilder and Authorizer configs
Mayuresh, Thanks for the write-up. With the principal builder, the idea is to reuse a single principal builder across all the security protocols where it's applicable; given that the principal builder has access to the transportLayer and authenticator, it should be able to figure out what type of transportLayer it is, construct the principal based on that, and handle all the security protocols we support. Your options 1, 2 & 4 seem to be doing the same thing, i.e. checking which security protocol a given transportLayer is using and building a principal accordingly -- correct me if I am wrong here. I like going with 4, as others stated on the PR, since passing the security_protocol makes it explicit to the method how the request needs to be handled. Even in the interest of having less config, I think option 4 seems better, even though it breaks the interface. Thanks, Harsha On Fri, May 20, 2016, at 05:00 PM, Mayuresh Gharat wrote: > Hi All, > > I came across an issue with plugging in a custom PrincipalBuilder class > using the config "principal.builder.class" along with a custom Authorizer > class using the config "authorizer.class.name". > > Consider the following scenario : > > For PlainText we don't supply any PrincipalBuilder. For SSL we want to > supply a PrincipalBuilder using the property "principal.builder.class". > > a) Now consider we have a broker running on these 2 ports and supply that > custom principalBuilder class using that config. > > b) The interbroker communication is using PlainText. I am using a single > broker cluster for testing. > > c) Now we issue a produce request on the SSL port of the broker. > > d) The controller tries to build a channel for plaintext with this broker > for the new topic instructions. > > e) PlainText tries to use the principal builder specified in the > "principal.builder.class" config which was meant only for SSL port since > the code path is same "ChannelBuilders.createPrincipalBuilder(configs)". > > f) In the custom principal Builder if we are trying to do some cert > checks > or down conversion of transportLayer to SSLTransportLayer so that we can > use its functionality we get error/exception at runtime. > > The basic idea is the PlainText channel should not be using the > PrincipalBuilder meant for other types of channels. > > Now there are few options/workarounds to avoid this : > > 1) Do instanceOf check in Authorizer.authorize() on TransportLayer > instance > passed in and do the correct handling. This is not intuitive and imposes > a > strict coding rule on the programmer. > > 2) TransportLayer should expose an API for telling the security protocol > type. This is not too intuitive either. > > 3) Add extra configs for Authorizer and PrincipalBuilder for each channel > type. This gives us a flexibility for the PrincipalBuilder and Authorizer > handle requests on different types of ports in a different way. > > 4) PrincipalBuilder.buildPrincipal() should take in extra parameter for > the > type of protocol and we should document this in javadoc to use it to > handle > the type of request. This is little better than 1) and 2) but again > imposes > a strict coding rule on the programmer. > > Just wanted to know what the community thinks about this and get any > suggestions/feedback . There's some discussion about this here : > https://github.com/apache/kafka/pull/1403 > > Thanks, > > Mayuresh
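A sketch of what a transport-aware builder (roughly options 1/2 in Mayuresh's list) could look like, written against the old pre-KIP-189 PrincipalBuilder interface; this is illustrative, not the fix that was merged:

    import java.security.Principal;
    import java.util.Map;
    import org.apache.kafka.common.KafkaException;
    import org.apache.kafka.common.network.Authenticator;
    import org.apache.kafka.common.network.TransportLayer;
    import org.apache.kafka.common.security.auth.KafkaPrincipal;
    import org.apache.kafka.common.security.auth.PrincipalBuilder;

    public class ProtocolAwarePrincipalBuilder implements PrincipalBuilder {
        @Override
        public void configure(Map<String, ?> configs) { }

        @Override
        public Principal buildPrincipal(TransportLayer transportLayer, Authenticator authenticator) {
            try {
                Principal peer = transportLayer.peerPrincipal();
                // PLAINTEXT channels report ANONYMOUS; only SSL channels carry a
                // certificate identity worth inspecting further.
                if (KafkaPrincipal.ANONYMOUS.getName().equals(peer.getName()))
                    return KafkaPrincipal.ANONYMOUS;
                return new KafkaPrincipal(KafkaPrincipal.USER_TYPE, peer.getName());
            } catch (Exception e) {
                throw new KafkaException("Failed to build principal", e);
            }
        }

        @Override
        public void close() { }
    }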
Re: Apache Kafka JIRA Worflow: Add Closed -> Reopen transition
Manikumar, Any reason for this? Previously the workflow was to open a new JIRA if a JIRA was closed. -Harsha On Fri, May 20, 2016, at 08:54 PM, Manikumar Reddy wrote: > Jun/Ismail, > > I requested Apache Infra to change JIRA workflow to add Closed -> > Reopen > transition. > https://issues.apache.org/jira/browse/INFRA-11857 > > Let me know, If any concerns > > Manikumar
Re: KAFKA-3722 : Discussion about custom PrincipalBuilder and Authorizer configs
Mayuresh & Ismael, Agree on not breaking interfaces on public API. +1 on option 2. Thanks, Harsha On Mon, May 23, 2016, at 10:30 AM, Mayuresh Gharat wrote: > Hi Harsha and Ismael, > > Option 2 sounds like a good idea if we want to make this quick fix I > think. > Option 4 might require a KIP as its public interface change. I can > resubmit > a patch for option 2 or create a KIP if necessary for option 4. > > From the previous conversation here, I think Ismael prefers option 2. > I don't have a strong opinion here since I understand its not easy to > make > public API changes but IMO, would go with option 4. > > Harsha what do you think on this? > > > Thanks, > > Mayuresh > > On Mon, May 23, 2016 at 5:45 AM, Ismael Juma wrote: > > > Hi Mayuresh and Harsha, > > > > If we were doing this from scratch, I would prefer option 4 too. However, > > users have their own custom principal builders now and option 2 with a > > suitably updated javadoc is the way to go in my opinion. > > > > Ismael > > > > On Sat, May 21, 2016 at 2:28 AM, Harsha wrote: > > > > > Mayuresh, > > > Thanks for the write up. With principal builder, > > > the idea is to reuse a single principal builder > > > across all the security protocols where its > > > applicable and given that principal builder has > > > access to transportLayer and authenticator it > > > should be able to figure out what type of > > > transportLayer it is and it should be able > > > construct the principal based on that and it should > > > handle all the security protocols that we support. > > > In your options 1,2 & 4 seems to be doing the same > > > thing i.e checking what security protocol that a > > > given transportLayer is and building a principal , > > > correct me if I am wrong here. I like going with 4 > > > as others stated on PR . As passing > > > security_protocol makes it more specific to the > > > method that its need to be handled . In the interest > > > of having less config I think option 4 seems to be > > > better even though it breaks the interface. > > > > > > Thanks, > > > Harsha > > > On Fri, May 20, 2016, at 05:00 PM, Mayuresh Gharat wrote: > > > > Hi All, > > > > > > > > I came across an issue with plugging in a custom PrincipalBuilder class > > > > using the config "principal.builder.class" along with a custom > > Authorizer > > > > class using the config "authorizer.class.name". > > > > > > > > Consider the following scenario : > > > > > > > > For PlainText we don't supply any PrincipalBuilder. For SSL we want to > > > > supply a PrincipalBuilder using the property "principal.builder.class". > > > > > > > > a) Now consider we have a broker running on these 2 ports and supply > > that > > > > custom principalBuilder class using that config. > > > > > > > > b) The interbroker communication is using PlainText. I am using a > > single > > > > broker cluster for testing. > > > > > > > > c) Now we issue a produce request on the SSL port of the broker. > > > > > > > > d) The controller tries to build a channel for plaintext with this > > broker > > > > for the new topic instructions. > > > > > > > > e) PlainText tries to use the principal builder specified in the > > > > "principal.builder.class" config which was meant only for SSL port > > since > > > > the code path is same > > "ChannelBuilders.createPrincipalBuilder(configs)". 
> > > > > > > > f) In the custom principal Builder if we are trying to do some cert > > > > checks > > > > or down conversion of transportLayer to SSLTransportLayer so that we > > can > > > > use its functionality we get error/exception at runtime. > > > > > > > > The basic idea is the PlainText channel should not be using the > > > > PrincipalBuilder meant for other types of cha
Re: [DISCUSS] KIP-48 Support for delegation tokens as an authentication mechanism
Jun & Ismael, Unfortunately I couldn't attend the KIP meeting when delegation tokens discussed. Appreciate if you can update the thread if you have any further questions. Thanks, Harsha On Tue, May 24, 2016, at 11:32 AM, Liquan Pei wrote: > It seems that the links to images in the KIP are broken. > > Liquan > > On Tue, May 24, 2016 at 9:33 AM, parth brahmbhatt < > brahmbhatt.pa...@gmail.com> wrote: > > > 110. What does getDelegationTokenAs mean? > > In the current proposal we only allow a user to get delegation token for > > the identity that it authenticated as using another mechanism, i.e. A user > > that authenticate using a keytab for principal us...@example.com will get > > delegation tokens for that user only. In future I think we will have to > > extend support such that we allow some set of users ( > > kafka-rest-u...@example.com, storm-nim...@example.com) to acquire > > delegation tokens on behalf of other users whose identity they have > > verified independently. Kafka brokers will have ACLs to control which > > users are allowed to impersonate other users and get tokens on behalf of > > them. Overall Impersonation is a whole different problem in my opinion and > > I think we can tackle it in separate KIP. > > > > 111. What's the typical rate of getting and renewing delegation tokens? > > Typically this should be very very low, 1 request per minute is a > > relatively high estimate. However it depends on the token expiration. I am > > less worried about the extra load it puts on controller vs the added > > complexity and the value it offers. > > > > Thanks > > Parth > > > > > > > > On Tue, May 24, 2016 at 7:30 AM, Ismael Juma wrote: > > > > > Thanks Rajini. It would probably require a separate KIP as it will > > > introduce user visible changes. We could also update KIP-48 to have this > > > information, but it seems cleaner to do it separately. We can discuss > > that > > > in the KIP call today. > > > > > > Ismael > > > > > > On Tue, May 24, 2016 at 3:19 PM, Rajini Sivaram < > > > rajinisiva...@googlemail.com> wrote: > > > > > > > Ismael, > > > > > > > > I have created a JIRA ( > > https://issues.apache.org/jira/browse/KAFKA-3751) > > > > for adding SCRAM as a SASL mechanism. Would that need another KIP? If > > > > KIP-48 will use this mechanism, can this just be a JIRA that gets > > > reviewed > > > > when the PR is ready? > > > > > > > > Thank you, > > > > > > > > Rajini > > > > > > > > On Tue, May 24, 2016 at 2:46 PM, Ismael Juma > > wrote: > > > > > > > > > Thanks Rajini, SCRAM seems like a good candidate. > > > > > > > > > > Gwen had independently mentioned this as a SASL mechanism that might > > be > > > > > useful for Kafka and I have been meaning to check it in more detail. > > > Good > > > > > to know that you are willing to contribute an implementation. Maybe > > we > > > > > should file a separate JIRA for this? > > > > > > > > > > Ismael > > > > > > > > > > On Tue, May 24, 2016 at 2:12 PM, Rajini Sivaram < > > > > > rajinisiva...@googlemail.com> wrote: > > > > > > > > > > > SCRAM (Salted Challenge Response Authentication Mechanism) is a > > > better > > > > > > mechanism than Digest-MD5. Java doesn't come with a built-in SCRAM > > > > > > SaslServer or SaslClient, but I will be happy to add support in > > Kafka > > > > > since > > > > > > it would be a useful mechanism to support anyway. > > > > > > https://tools.ietf.org/html/rfc7677 describes the protocol for > > > > > > SCRAM-SHA-256. 
> > > > > > > > > > > > On Tue, May 24, 2016 at 2:37 AM, Jun Rao wrote: > > > > > > > > > > > > > Parth, > > > > > > > > > > > > > > Thanks for the explanation. A couple of more questions. > > > > > > > > > > > > > > 110. What does getDelegationTokenAs mean? > > > > > > > > > > > > > > 111. What's the typical rate of getting and renewing delegation > > > > tokens?
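For concreteness, the SCRAM-SHA-256 credential derivation described in RFC 5802/7677 maps directly onto standard JDK primitives. Below is a minimal sketch using only javax.crypto — illustrative, not Kafka code; the salt, password, and iteration count are made-up values:

    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;
    import javax.crypto.Mac;
    import javax.crypto.SecretKeyFactory;
    import javax.crypto.spec.PBEKeySpec;
    import javax.crypto.spec.SecretKeySpec;

    public class ScramSha256Sketch {

        // SaltedPassword = Hi(password, salt, i); Hi is PBKDF2 with HMAC-SHA-256.
        static byte[] saltedPassword(char[] password, byte[] salt, int iterations) throws Exception {
            SecretKeyFactory f = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256");
            return f.generateSecret(new PBEKeySpec(password, salt, iterations, 256)).getEncoded();
        }

        // ClientKey = HMAC(SaltedPassword, "Client Key"); StoredKey = H(ClientKey).
        // The server persists only StoredKey, never the password itself.
        static byte[] storedKey(byte[] saltedPassword) throws Exception {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(saltedPassword, "HmacSHA256"));
            byte[] clientKey = mac.doFinal("Client Key".getBytes(StandardCharsets.UTF_8));
            return MessageDigest.getInstance("SHA-256").digest(clientKey);
        }

        public static void main(String[] args) throws Exception {
            byte[] salt = "example-salt".getBytes(StandardCharsets.UTF_8); // illustrative
            byte[] sp = saltedPassword("secret".toCharArray(), salt, 4096);
            System.out.println(storedKey(sp).length); // 32 bytes for SHA-256
        }
    }

Because only salted, hashed verifiers need to be stored server-side, SCRAM avoids the weak-digest and plaintext-storage problems that make Digest-MD5 unattractive.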
Re: [VOTE] KIP-55: Secure quotas for authenticated users
Rajini, How do sub-quotas work in the case of authenticated users? Where are we maintaining the relation between users and their client ids? Can you add an example of the ZK data under /users? Thanks, Harsha On Mon, Jun 13, 2016, at 05:01 AM, Rajini Sivaram wrote: > I have updated KIP-55 to reflect the changes from the discussions in the > voting thread ( > https://www.mail-archive.com/dev@kafka.apache.org/msg51610.html). > > Jun/Gwen, > > Existing client-id quotas will be used as default client-id quotas for > users when no user quotas are configured - i.e., default user quota is > unlimited and no user-specific quota override is specified. This enables > user rate limits to be configured for ANONYMOUS if required in a cluster > that has both PLAINTEXT and SSL/SASL. By default, without any user rate > limits set, rate limits for client-ids will apply, retaining the current > client-id quota configuration for single-user clusters. > > Zookeeper will have two paths /clients with client-id quotas that apply > only when user quota is unlimited similar to now. And /users which > persists > user quotas for any user including ANONYMOUS. > > Comments and feedback are appreciated. > > Regards, > > Rajini > > > On Wed, Jun 8, 2016 at 9:00 PM, Rajini Sivaram > > wrote: > > > Jun, > > > > Oops, sorry, I hadn't realized that the last note was on the discuss > > thread. Thank you for pointing it out. I have sent another note for voting. > > > > > > On Wed, Jun 8, 2016 at 4:30 PM, Jun Rao wrote: > > > >> Rajini, > >> > >> Perhaps it will be clearer if you start the voting in a new thread (with > >> VOTE in the subject). > >> > >> Thanks, > >> > >> Jun > >> > >> On Tue, Jun 7, 2016 at 1:55 PM, Rajini Sivaram < > >> rajinisiva...@googlemail.com > >> > wrote: > >> > >> > I would like to initiate the vote for KIP-55. > >> > > >> > The KIP details are here: KIP-55: Secure quotas for authenticated users > >> > < > >> > > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-55%3A+Secure+Quotas+for+Authenticated+Users > >> > > > >> > . > >> > > >> > The JIRA KAFKA-3492 <https://issues.apache.org/jira/browse/KAFKA-3492 > >> > >has > >> > a draft PR here: https://github.com/apache/kafka/pull/1256. > >> > > >> > Thank you... > >> > > >> > > >> > Regards, > >> > > >> > Rajini > >> > > >> > > > > > > > > -- > > Regards, > > > > Rajini > > > > > > -- > Regards, > > Rajini
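To make the /users question concrete, here is a hypothetical sketch of writing a per-user override znode, mirroring the JSON shape used for client-id quotas under /clients. The path, property names, and byte rates are illustrative assumptions, not the layout the KIP finally specifies:

    import java.nio.charset.StandardCharsets;
    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;

    public class UserQuotaZkSketch {
        public static void main(String[] args) throws Exception {
            ZooKeeper zk = new ZooKeeper("localhost:2181", 10000, event -> { });
            // Hypothetical per-user quota override for user "alice",
            // following the {"version":1,"config":{...}} shape of /clients.
            byte[] quota = ("{\"version\":1,\"config\":{"
                    + "\"producer_byte_rate\":\"1048576\","
                    + "\"consumer_byte_rate\":\"2097152\"}}")
                    .getBytes(StandardCharsets.UTF_8);
            zk.create("/users/alice", quota, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
            zk.close();
        }
    }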
Re: [VOTE] KIP-62: Allow consumer to send heartbeats from a background thread
+1 (binding) Thanks, Harsha On Thu, Jun 16, 2016, at 05:46 PM, Henry Cai wrote: > +1 > > On Thu, Jun 16, 2016 at 3:46 PM, Ismael Juma wrote: > > > +1 (binding) > > > > On Fri, Jun 17, 2016 at 12:44 AM, Guozhang Wang > > wrote: > > > > > +1. > > > > > > On Thu, Jun 16, 2016 at 11:44 AM, Jason Gustafson > > > wrote: > > > > > > > Hi All, > > > > > > > > I'd like to open the vote for KIP-62. This proposal attempts to address > > > one > > > > of the recurring usability problems that users of the new consumer have > > > > faced with as little impact as possible. You can read the full details > > > > here: > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-62%3A+Allow+consumer+to+send+heartbeats+from+a+background+thread > > > > . > > > > > > > > After some discussion on this list, I think we were in agreement that > > > this > > > > change addresses a major part of the problem and we've left the door > > open > > > > for further improvements, such as adding a heartbeat() API or a > > > separately > > > > configured rebalance timeout. Thanks in advance to everyone who helped > > > > review the proposal. > > > > > > > > -Jason > > > > > > > > > > > > > > > > -- > > > -- Guozhang > > > > >
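For context on what this vote enables: once heartbeats come from a background thread, session liveness and record-processing time can be bounded separately. A sketch of the resulting consumer configuration, using the max.poll.interval.ms property KIP-62 introduces; the values shown are illustrative:

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class Kip62ConsumerSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("group.id", "demo-group");
            // Liveness is proven by background heartbeats, bounded by the session timeout...
            props.put("session.timeout.ms", "10000");
            // ...while time spent processing between poll() calls gets its own, larger bound.
            props.put("max.poll.interval.ms", "300000");
            KafkaConsumer<String, String> consumer = new KafkaConsumer<>(
                    props, new StringDeserializer(), new StringDeserializer());
            consumer.close();
        }
    }

A slow batch now trips only the poll-interval bound instead of evicting the consumer for missed heartbeats.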
Re: [VOTE] KIP-4 Create Topics Schema
+1 (binding) Thanks, Harsha On Thu, Jun 16, 2016, at 04:15 PM, Guozhang Wang wrote: > +1. > > On Thu, Jun 16, 2016 at 3:47 PM, Ismael Juma wrote: > > > +1 (binding) > > > > On Thu, Jun 16, 2016 at 11:50 PM, Grant Henke wrote: > > > > > I would like to initiate the voting process for the "KIP-4 Create Topics > > > Schema changes". This is not a vote for all of KIP-4, but specifically > > for > > > the create topics changes. I have included the exact changes below for > > > clarity: > > > > > > > > Create Topics Request (KAFKA-2945 > > > > <https://issues.apache.org/jira/browse/KAFKA-2945>) > > > > > > > > CreateTopics Request (Version: 0) => [create_topic_requests] timeout > > > > create_topic_requests => topic num_partitions replication_factor > > > [replica_assignment] [configs] > > > > topic => STRING > > > > num_partitions => INT32 > > > > replication_factor => INT16 > > > > replica_assignment => partition_id [replicas] > > > > partition_id => INT32 > > > > replicas => INT32 > > > > configs => config_key config_value > > > > config_key => STRING > > > > config_value => STRING > > > > timeout => INT32 > > > > > > > > CreateTopicsRequest is a batch request to initiate topic creation with > > > > either predefined or automatic replica assignment and optionally topic > > > > configuration. > > > > > > > > Request semantics: > > > > > > > >1. Must be sent to the controller broker > > > >2. If there are multiple instructions for the same topic in one > > > >request an InvalidRequestException will be logged on the broker and > > > the > > > >client will be disconnected. > > > > - This is because the list of topics is modeled server side as a > > > > map with TopicName as the key > > > >3. The principal must be authorized to the "Create" Operation on the > > > >"Cluster" resource to create topics. > > > > - Unauthorized requests will receive a > > > ClusterAuthorizationException > > > >4. > > > > > > > >Only one from ReplicaAssignment or (num_partitions + > > > replication_factor > > > >), can be defined in one instruction. > > > >- If both parameters are specified an InvalidRequestException will > > be > > > > logged on the broker and the client will be disconnected. > > > > - In the case ReplicaAssignment is defined number of partitions > > and > > > > replicas will be calculated from the supplied replica_assignment. > > > > - In the case of defined (num_partitions + replication_factor) > > > > replica assignment will be automatically generated by the server. > > > > - One or the other must be defined. The existing broker side auto > > > > create defaults will not be used > > > > (default.replication.factor, num.partitions). The client > > > implementation can > > > > have defaults for these options when generating the messages. > > > > - The first replica in [replicas] is assumed to be the preferred > > > > leader. This matches current behavior elsewhere. > > > >5. Setting a timeout > 0 will allow the request to block until the > > > >topic metadata is "complete" on the controller node. > > > > - Complete means the local topic metadata cache been completely > > > > populated and all partitions have leaders > > > > - The topic metadata is updated when the controller sends out > > > > update metadata requests to the brokers > > > > - If a timeout error occurs, the topic could still be created > > > > successfully at a later time. Its up to the client to query for > > > the state > > > > at that point. > > > >6. 
Setting a timeout <= 0 will validate arguments and trigger the > > > >create topics and return immediately. > > > > - This is essentially the fully asynchronous mode we have in the > > > > Zookeeper tools today. > > > > - The error code in the response will either contain an arg
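For comparison with the wire schema above, this is roughly how topic creation surfaces through the AdminClient API that later built on KIP-4's protocol work (the client shipped in a subsequent release; topic name and sizing here are illustrative):

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.NewTopic;

    public class CreateTopicSketch {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            try (AdminClient admin = AdminClient.create(props)) {
                // Per the request semantics: either partitions + replication factor
                // or an explicit replica assignment may be given, never both.
                NewTopic topic = new NewTopic("demo-topic", 3, (short) 2);
                // Blocks until creation completes or fails, like timeout > 0 above.
                admin.createTopics(Collections.singletonList(topic)).all().get();
            }
        }
    }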
Re: [VOTE] KIP-4 Create Topics Schema
+1 (binding) -Harsha On Mon, Jun 20, 2016, at 11:33 AM, Ismael Juma wrote: > +1 (binding) > > On Mon, Jun 20, 2016 at 8:27 PM, Dana Powers > wrote: > > > +1 -- thanks for the update > > > > On Mon, Jun 20, 2016 at 10:49 AM, Grant Henke wrote: > > > I have update the patch and wiki based on the feedback in the discussion > > > thread. The only change is that instead of logging and disconnecting in > > the > > > case of invalid messages (duplicate topics or both arguments) we now > > return > > > and InvalidRequest error back to the client for that topic. > > > > > > I would like to restart the vote now including that change. If you have > > > already voted, please revote in this thread. > > > > > > Thank you, > > > Grant > > > > > > On Sun, Jun 19, 2016 at 8:57 PM, Ewen Cheslack-Postava < > > e...@confluent.io> > > > wrote: > > > > > >> Don't necessarily want to add noise here, but I'm -1 based on the > > >> disconnect part. See discussion in other thread. (I'm +1 otherwise, and > > >> happy to have my vote applied assuming we clean up that one issue.) > > >> > > >> -Ewen > > >> > > >> On Thu, Jun 16, 2016 at 6:05 PM, Harsha wrote: > > >> > > >> > +1 (binding) > > >> > Thanks, > > >> > Harsha > > >> > > > >> > On Thu, Jun 16, 2016, at 04:15 PM, Guozhang Wang wrote: > > >> > > +1. > > >> > > > > >> > > On Thu, Jun 16, 2016 at 3:47 PM, Ismael Juma > > >> wrote: > > >> > > > > >> > > > +1 (binding) > > >> > > > > > >> > > > On Thu, Jun 16, 2016 at 11:50 PM, Grant Henke < > > ghe...@cloudera.com> > > >> > wrote: > > >> > > > > > >> > > > > I would like to initiate the voting process for the "KIP-4 > > Create > > >> > Topics > > >> > > > > Schema changes". This is not a vote for all of KIP-4, but > > >> > specifically > > >> > > > for > > >> > > > > the create topics changes. I have included the exact changes > > below > > >> > for > > >> > > > > clarity: > > >> > > > > > > > >> > > > > > Create Topics Request (KAFKA-2945 > > >> > > > > > <https://issues.apache.org/jira/browse/KAFKA-2945>) > > >> > > > > > > > >> > > > > > CreateTopics Request (Version: 0) => [create_topic_requests] > > >> > timeout > > >> > > > > > create_topic_requests => topic num_partitions > > >> replication_factor > > >> > > > > [replica_assignment] [configs] > > >> > > > > > topic => STRING > > >> > > > > > num_partitions => INT32 > > >> > > > > > replication_factor => INT16 > > >> > > > > > replica_assignment => partition_id [replicas] > > >> > > > > > partition_id => INT32 > > >> > > > > > replicas => INT32 > > >> > > > > > configs => config_key config_value > > >> > > > > > config_key => STRING > > >> > > > > > config_value => STRING > > >> > > > > > timeout => INT32 > > >> > > > > > > > >> > > > > > CreateTopicsRequest is a batch request to initiate topic > > creation > > >> > with > > >> > > > > > either predefined or automatic replica assignment and > > optionally > > >> > topic > > >> > > > > > configuration. > > >> > > > > > > > >> > > > > > Request semantics: > > >> > > > > > > > >> > > > > >1. Must be sent to the controller broker > > >> > > > > >2. If there are multiple instructions for the same topic in > > >> one > > >> > > > > >request an InvalidRequestException will be logged on the > > >> broker > > >> > and > > >> > > > > the
Re: [DISCUSS] KIP-11- Authorization design for kafka security
Yes, in the case of Kerberos we will use the super ACL, and this will be equivalent to the Kafka broker’s principal name. But with SSL, since two-way auth is not mandatory, the only option if we want to enforce an authorizer in the SSL case is to force two-way auth. Again, this can be an issue on the client side: say a producer doesn’t want to provide client auth and just needs wire encryption; in this case there won’t be any identity, and we won’t be able to enforce an authorizer since the client will be anonymous. -- Harsha On March 31, 2015 at 10:29:33 AM, Don Bosco Durai (bo...@apache.org) wrote: >Related interesting question: Since a broker is a consumer (of lead replicas), how do we handle the broker level of permissions? Do we hardcode a broker-principal name and automatically authorize brokers to do anything? Or is there a cleaner way? I feel, in Kerberos environment, “kafka” keytab would be the ideal solution. And “kafka” principal will need to be white listed. SSL certificate is another option, but it would be painful to set it up. IP whitelisting is another low impact, but less secure option. Bosco On 3/31/15, 10:20 AM, "Gwen Shapira" wrote: >Related interesting question: >Since a broker is a consumer (of lead replicas), how do we handle the >broker level of permissions? Do we hardcode a broker-principal name >and automatically authorize brokers to do anything? Or is there a >cleaner way? > > >On Tue, Mar 31, 2015 at 10:17 AM, Don Bosco Durai >wrote: >>>21. Operation: What about other types of requests not covered in the >>>list, >> such as committing and fetching offsets, list topics, fetching consumer >> metadata, heartbeat, join group, etc? >> >> Would “CONFIGURE”, “DESCRIBE”, etc take care of this? Or should we add >> high level grouping like “ADMIN”, “OPERATIONS/MANAGEMENT” to cover >>related >> permissions? >> >> Bosco >> >> >> >> On 3/31/15, 9:21 AM, "Jun Rao" wrote: >> >>>Thanks for the writeup. A few more comments. >>> >>>20. I agree that it would be better to do this after KIP-4 (admin >>>commands) >>>is done. With KIP-4, all admin operations will be sent as requests to >>>the >>>brokers instead of accessing ZK directly. This will make authorization >>>easier. >>> >>>21. Operation: What about other types of requests not covered in the >>>list, >>>such as committing and fetching offsets, list topics, fetching consumer >>>metadata, heartbeat, join group, etc? >>> >>>22. TopicConfigCache: We will need such a cache in KIP-4 as well. It >>>would >>>be useful to make sure that the implementation can be reused. >>> >>>23. Authorizer: >>>23.1 Do cluster level operations go through authorize() too? If so, what >>>will be the resource? >>>23.2 I assume that the authorize() check will be called on every >>>request. >>>So, we will have to make sure that the check is cheap. >>> >>>24. The acl json string in the config: Should we version this so that we >>>can evolve it in the future (e.g., adding group support)? >>> >>>Jun >>> >>>On Sun, Mar 29, 2015 at 3:56 PM, Parth Brahmbhatt < >>>pbrahmbh...@hortonworks.com> wrote: >>> >>>> Hi Gwen, >>>> >>>> Thanks a lot for taking the time to review this. I have tried to >>>>address >>>> all your questions below. >>>> >>>> Thanks >>>> Parth >>>> On 3/28/15, 8:08 PM, "Gwen Shapira" >>> gshap...@cloudera.com>> wrote: >>>> >>>> Preparing for Tuesday meeting, I went over the KIP :) >>>> >>>> First, Parth did an amazing job, the KIP is fantastic - detailed and >>>> readable. Thank you!
>>>> >>>> Second, I have a long list of questions :) No objections, just some >>>> things I'm unclear on and random minor comments. In general, I like >>>> the design, I just feel I'm missing parts of the picture. >>>> >>>> 1. "Yes, Create topic will have an optional acls, the output of >>>> describe will display owner and acls and alter topic will allow to >>>> modify the acls.” - will be nice to see what the CLI will look like. >>>> >>>> * I will modify the KIP but I was going to add “—acl >>>>” >>>> t
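A sketch of the broker-side settings implied by the exchange above, shown as Java properties for illustration. The config names (listeners, ssl.client.auth, authorizer.class.name, super.users) are the ones that eventually shipped with Kafka's security work; the values are assumptions:

    import java.util.Properties;

    public class SecureBrokerConfigSketch {
        // Broker overrides for enforcing authorization over SSL.
        public static Properties brokerOverrides() {
            Properties p = new Properties();
            p.put("listeners", "SSL://0.0.0.0:9093");
            // Force two-way TLS so every SSL client presents a certificate
            // and therefore has a principal the authorizer can act on.
            p.put("ssl.client.auth", "required");
            // Pluggable authorizer implementation.
            p.put("authorizer.class.name", "kafka.security.auth.SimpleAclAuthorizer");
            // Whitelist the broker's own principal so replication traffic
            // bypasses ACL checks (the "super ACL" discussed above).
            p.put("super.users", "User:kafka");
            return p;
        }
    }

Clients that want wire encryption only (ssl.client.auth=none or requested) remain anonymous, which is exactly the trade-off described above.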
Re: kafka system tests
Geoffrey, One thing I want to consider in re-writing the system tests is the ability to run them on other OSes, not just Linux. If we can support both Linux and Windows, that would be great. I am not sure how much of that work will fall on the framework side versus the system tests themselves, but when evaluating frameworks can we consider Windows as another option too? Thanks, Harsha On Wed, Mar 25, 2015, at 01:02 PM, Geoffrey Anderson wrote: > Hi Gwen, > > Sorry about that, the ducttape repository was not yet public, but now it > is. > > Cheers, > Geoff > > On Wed, Mar 25, 2015 at 12:08 PM, Gwen Shapira > wrote: > > > Thanks for summarizing! I think we are all feeling the pain here and want > > to make life easier moving forward. > > > > The framework discussion is particularly interesting - unfortunately, the > > link to ducttape is broken at the moment. > > > > Gwen > > > > On Wed, Mar 25, 2015 at 11:46 AM, Geoffrey Anderson > > wrote: > > > > > Hi dev list, > > > > > > I've been discussing the current state of system tests with Jun and > > others, > > > and have summarized goals moving forward at: > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/System+Test+Improvements > > > > > > Feedback is welcome! > > > > > > Thanks, > > > Geoff > > > > >
Re: [DISCUSS] New partitioning for better load balancing
Gianmarco, I am coming from the Storm community. I think PKG is very interesting, and we can provide an implementation of Partitioner for PKG. Can you open a JIRA for this? -- Harsha Sent with Airmail On April 3, 2015 at 4:49:15 AM, Gianmarco De Francisci Morales (g...@apache.org) wrote: Hi, We have recently studied the problem of load balancing in distributed stream processing systems such as Samza [1]. In particular, we focused on what happens when the key distribution of the stream is skewed when using key grouping. We developed a new stream partitioning scheme (which we call Partial Key Grouping). It achieves better load balancing than hashing while being more scalable than round robin in terms of memory. In the paper we show a number of mining algorithms that are easy to implement with partial key grouping, and whose performance can benefit from it. We think that it might also be useful for a larger class of algorithms. PKG has already been integrated in Storm [2], and I would like to be able to use it in Samza as well. As far as I understand, Kafka producers are the ones that decide how to partition the stream (or Kafka topic). Even after doing a bit of reading, I am still not sure if I should be writing this email here or on the Samza dev list. Anyway, my first guess is Kafka. I do not have experience with Kafka, however partial key grouping is very easy to implement: it requires just a few lines of code in Java when implemented as a custom grouping in Storm [3]. I believe it should be very easy to integrate. For all these reasons, I believe it will be a nice addition to Kafka/Samza. If the community thinks it's a good idea, I will be happy to offer support in the porting. References: [1] https://melmeric.files.wordpress.com/2014/11/the-power-of-both-choices-practical-load-balancing-for-distributed-stream-processing-engines.pdf [2] https://issues.apache.org/jira/browse/STORM-632 [3] https://github.com/gdfm/partial-key-grouping -- Gianmarco
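The core of PKG really is a few lines, as noted above: hash each key twice and route each message to the less-loaded of its two candidate partitions. A minimal, hypothetical sketch (not a Kafka or Storm API; the second hash is a stand-in for a genuinely independent hash function such as murmur with a different seed):

    import java.nio.charset.StandardCharsets;
    import java.util.Arrays;
    import java.util.concurrent.atomic.AtomicLongArray;

    public class PartialKeyGrouping {
        // Local count of messages sent to each partition so far.
        private final AtomicLongArray counts;

        public PartialKeyGrouping(int numPartitions) {
            this.counts = new AtomicLongArray(numPartitions);
        }

        public int partition(byte[] key) {
            int n = counts.length();
            int h = Arrays.hashCode(key);
            int p1 = (h & 0x7fffffff) % n;
            // Stand-in second hash: a real implementation would use an independent function.
            int p2 = ((h * 0x9e3779b9 + 0x85ebca6b) & 0x7fffffff) % n;
            // Route to whichever candidate has received fewer messages.
            int chosen = counts.get(p1) <= counts.get(p2) ? p1 : p2;
            counts.incrementAndGet(chosen);
            return chosen;
        }

        public static void main(String[] args) {
            PartialKeyGrouping pkg = new PartialKeyGrouping(8);
            System.out.println(pkg.partition("hot-key".getBytes(StandardCharsets.UTF_8)));
        }
    }

Because each key is split across at most two partitions, downstream aggregation needs only one extra merge step, while load balance approaches that of round robin ("the power of both choices" in [1]).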
Re: [KIP-DISCUSSION] KIP-22 Expose a Partitioner interface in the new producer
Thanks Jay & Gianmarco for the comments. I picked option A: if the user sends a partition id then it will be applied, and the partitioner.class method will only be called if the partition id is null. Please take a look at the updated KIP here https://cwiki.apache.org/confluence/display/KAFKA/KIP-+22+-+Expose+a+Partitioner+interface+in+the+new+producer . Let me know if you see anything missing. Thanks, Harsha On Fri, Apr 24, 2015, at 02:15 AM, Gianmarco De Francisci Morales wrote: > Hi, > > > Here are the questions I think we should consider: > > 1. Do we need this at all given that we have the partition argument in > > ProducerRecord which gives full control? I think we do need it because this > > is a way to plug in a different partitioning strategy at run time and do it > > in a fairly transparent way. > > > > Yes, we need it if we want to support different partitioning strategies > inside Kafka rather than requiring the user to code them externally. > > > > 3. Do we need to add the value? I suspect people will have uses for > > computing something off a few fields in the value to choose the partition. > > This would be useful in cases where the key was being used for log > > compaction purposes and did not contain the full information for computing > > the partition. > > > > I am not entirely sure about this. I guess that most partitioners should > not use it. > I think it makes it easier to reason about the system if the partitioner > only works on the key. > However, if the value (and its serialization) are already available, there > is not much harm in passing them along. > > > > 4. This interface doesn't include either an init() or close() method. It > > should implement Closable and Configurable, right? > > > > Right now the only application I can think of to have an init() and > close() > is to read some state information (e.g., load information) that is > published on some external distributed storage (e.g., zookeeper) by the > brokers. > It might be useful also for reconfiguration and state migration. > > I think it's not a very common use case right now, but if the added > complexity is not too much it might be worth to have support for these > methods. > > > > > 5. What happens if the user both sets the partition id in the > > ProducerRecord and sets a partitioner? Does the partition id just get > > passed in to the partitioner (as sort of implied in this interface?). This > > is a bit weird since if you pass in the partition id you kind of expect it > > to get used, right? Or is it the case that if you specify a partition the > > partitioner isn't used at all (in which case no point in including > > partition in the Partitioner api). > > > > > The user should be able to override the partitioner on a per-record basis > by explicitly setting the partition id. > I don't think it makes sense for the partitioners to take "hints" on the > partition. > > I would even go the extra step, and have a default logic that accepts > both > key and partition id (current interface) and calls partition() only if > the > partition id is not set. The partition() method does *not* take the > partition ID as input (only key-value). > > > Cheers, > -- > Gianmarco > > > > > Cheers, > > > > -Jay > > > > On Thu, Apr 23, 2015 at 6:55 AM, Sriharsha Chintalapani > > wrote: > > > > > Hi, > > > Here is the KIP for adding a partitioner interface for producer.
> > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-+22+-+Expose+a+Partitioner+interface+in+the+new+producer > > > There is one open question about how interface should look like. Please > > > take a look and let me know if you prefer one way or the other. > > > > > > Thanks, > > > Harsha > > > > > > > >
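To make option A concrete, here is an illustrative custom partitioner against the interface shape that eventually shipped as org.apache.kafka.clients.producer.Partitioner; the class name and hashing scheme are placeholders:

    import java.util.Arrays;
    import java.util.Map;
    import org.apache.kafka.clients.producer.Partitioner;
    import org.apache.kafka.common.Cluster;

    public class ModuloPartitioner implements Partitioner {
        @Override
        public void configure(Map<String, ?> configs) {
            // Custom settings arrive here via the producer's own properties.
        }

        // Invoked only when the ProducerRecord carries no explicit partition id,
        // matching option A above.
        @Override
        public int partition(String topic, Object key, byte[] keyBytes,
                             Object value, byte[] valueBytes, Cluster cluster) {
            int numPartitions = cluster.partitionsForTopic(topic).size();
            if (keyBytes == null) return 0; // placeholder policy for unkeyed records
            return (Arrays.hashCode(keyBytes) & 0x7fffffff) % numPartitions;
        }

        @Override
        public void close() { }
    }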
Re: [KIP-DISCUSSION] KIP-22 Expose a Partitioner interface in the new producer
Thanks for the review, Joel. I agree we don't need an init method; we can use configure. I'll update the KIP. -Harsha On Wed, May 6, 2015, at 04:45 PM, Joel Koshy wrote: > +1 with a minor comment: do we need an init method given it extends > Configurable? > > Also, can you move this wiki out of drafts and add it to the table in > https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals? > > Thanks, > > Joel > > On Wed, May 06, 2015 at 07:46:46AM -0700, Sriharsha Chintalapani wrote: > > Thanks Jay. I removed partitioner.metadata from KIP. I’ll send an updated > > patch. > > > > -- > > Harsha > > Sent with Airmail > > > > On May 5, 2015 at 6:31:47 AM, Sriharsha Chintalapani (harsh...@fastmail.fm) > > wrote: > > > > Thanks for the comments everyone. > > Hi Jay, > > I do have a question regarding configurable interface on how to pass a > > Map properties. I couldn’t find any other classes using it. JMX > > reporter overrides it but doesn’t implement it. So with configurable > > partitioner how can a user pass in partitioner configuration since its > > getting instantiated within the producer. > > > > Thanks, > > Harsha > > > > > > On May 4, 2015 at 10:36:45 AM, Jay Kreps (jay.kr...@gmail.com) wrote: > > > > Hey Harsha, > > > > That proposal sounds good. One minor thing--I don't think we need to have > > the partitioner.metadata property. Our reason for using string properties > > is exactly to make config extensible at runtime. So a given partitioner can > > add whatever properties make sense using the configure() api it defines. > > > > -Jay > > > > On Sun, May 3, 2015 at 5:57 PM, Harsha wrote: > > > > > Thanks Jay & Gianmarco for the comments. I picked option A: if the user > > > sends a partition id then it will be applied, and the partitioner.class method > > > will only be called if the partition id is null. > > > Please take a look at the updated KIP here > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-+22+-+Expose+a+Partitioner+interface+in+the+new+producer > > > . Let me know if you see anything missing. > > > > > > Thanks, > > > Harsha > > > > > > On Fri, Apr 24, 2015, at 02:15 AM, Gianmarco De Francisci Morales wrote: > > > > Hi, > > > > > > > > > > > > Here are the questions I think we should consider: > > > > > 1. Do we need this at all given that we have the partition argument in > > > > > ProducerRecord which gives full control? I think we do need it because > > > this > > > > > is a way to plug in a different partitioning strategy at run time and > > > do it > > > > > in a fairly transparent way. > > > > > > > > > > > > > Yes, we need it if we want to support different partitioning strategies > > > > inside Kafka rather than requiring the user to code them externally. > > > > > > > > > > > > > 3. Do we need to add the value? I suspect people will have uses for > > > > > computing something off a few fields in the value to choose the > > > partition. > > > > > This would be useful in cases where the key was being used for log > > > > > compaction purposes and did not contain the full information for > > > computing > > > > > the partition. > > > > > > > > > > > > > I am not entirely sure about this. I guess that most partitioners should > > > > not use it. > > > > I think it makes it easier to reason about the system if the partitioner > > > > only works on the key. > > > > However, if the value (and its serialization) are already available, > > > > there > > > > is not much harm in passing them along. > > > > > > > > > > > > > 4.
This interface doesn't include either an init() or close() method. > > > It > > > > > should implement Closable and Configurable, right? > > > > > > > > > > > > > Right now the only application I can think of to have an init() and > > > > close() > > > > is to read some state information (e.g., load information) that is > > > > published on some external distributed storage (e.g., zookeeper) by the > > > > brokers. > > > > It might be useful also for reconfiguration and state migration. > &g
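On the configuration question above: the producer hands its full property map to Partitioner.configure(), so a partitioner can pick up its own settings from the same properties used to build the producer. A sketch; the partitioner class and the example.partitioner.seed key are hypothetical:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class ProducerWithCustomPartitioner {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            // Hypothetical partitioner class (see the earlier sketch).
            props.put("partitioner.class", "com.example.ModuloPartitioner");
            // Hypothetical custom key: unknown to the producer itself, but passed
            // through to Partitioner.configure(Map) when the partitioner is created.
            props.put("example.partitioner.seed", "42");
            KafkaProducer<String, String> producer = new KafkaProducer<>(
                    props, new StringSerializer(), new StringSerializer());
            producer.send(new ProducerRecord<>("demo", "key", "value"));
            producer.close();
        }
    }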
Re: [Vote] KIP-11 Authorization design for kafka security
+1 non-binding On Fri, May 15, 2015 at 9:18 AM -0700, "Parth Brahmbhatt" wrote: Hi, Opening the voting thread for KIP-11. Link to the KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-11+-+Authorization+Interface Link to Jira: https://issues.apache.org/jira/browse/KAFKA-1688 Thanks Parth
Re: ProducerFailureHandlingTest.testCannotSendToInternalTopic is failing
I don't see any failures in tests with the latest trunk or 0.8.2. I ran them a few times in a loop. -Harsha On Sat, Jan 17, 2015, at 08:38 AM, Manikumar Reddy wrote: > ProducerFailureHandlingTest.testCannotSendToInternalTopic is failing on > both 0.8.2 and trunk. > > Error on 0.8.2: > kafka.api.ProducerFailureHandlingTest > testCannotSendToInternalTopic > FAILED > java.util.concurrent.ExecutionException: > org.apache.kafka.common.errors.TimeoutException: Failed to update > metadata > after 3000 ms. > at > org.apache.kafka.clients.producer.KafkaProducer$FutureFailure.<init>(KafkaProducer.java:437) > at > org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:352) > at > org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:248) > at > kafka.api.ProducerFailureHandlingTest.testCannotSendToInternalTopic(ProducerFailureHandlingTest.scala:309) > > Caused by: > org.apache.kafka.common.errors.TimeoutException: Failed to update > metadata after 3000 ms. > > > Error on Trunk: > kafka.api.test.ProducerFailureHandlingTest > > testCannotSendToInternalTopic > FAILED > java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:69) > at org.junit.Assert.assertTrue(Assert.java:32) > at org.junit.Assert.assertTrue(Assert.java:41) > at > kafka.api.test.ProducerFailureHandlingTest.testCannotSendToInternalTopic(ProducerFailureHandlingTest.scala:312)