Jenkins build is back to normal : kafka-trunk-jdk10 #106

2018-05-15 Thread Apache Jenkins Server
See 




Re: [VOTE] KIP-244: Add Record Header support to Kafka Streams

2018-05-15 Thread Jorge Esteban Quilcate Otoya
@Guozhang added. Thanks!

On Tue, 15 May 2018 at 05:50, Matthias J. Sax ()
wrote:

> +1 (binding)
>
> Thanks a lot for the KIP!
>
> -Matthias
>
> On 5/14/18 10:17 AM, Guozhang Wang wrote:
> > +1 from me
> >
> > One more comment on the wiki: while reviewing the PR I realized that in `
> > MockProcessorContext.java
> > <
> https://github.com/apache/kafka/pull/4955/files#diff-d5440e7338f775230019a86e6bcacccb
> >`
> > we are also adding one additional API plus modifying the existing
> > `setRecordMetadata` API. Since this class is part of the public
> test-utils
> > package we should claim it in the wiki as well.
> >
> >
> > Guozhang
> >
> > On Mon, May 14, 2018 at 8:43 AM, Ted Yu  wrote:
> >
> >> +1
> >>
> >> On Mon, May 14, 2018 at 8:31 AM, Jorge Esteban Quilcate Otoya <
> >> quilcate.jo...@gmail.com> wrote:
> >>
> >>> Hi everyone,
> >>>
> >>> I would like to start a vote on KIP-244: Add Record Header support to
> >> Kafka
> >>> Streams
> >>>
> >>> KIP wiki page:
> >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >>> 244%3A+Add+Record+Header+support+to+Kafka+Streams+Processor+API
> >>>
> >>> The discussion thread is here:
> >>> http://mail-archives.apache.org/mod_mbox/kafka-dev/201805.
> >>> mbox/%3CCAC3UcJvrgcBfe6%3DiW6%2BuTWsLB%2B4CsHgRmDx9TvCzJQrWvfg7_w%
> >>> 40mail.gmail.com%3E
> >>>
> >>> Cheers,
> >>> Jorge.
> >>>
> >>
> >
> >
> >
>
>


Re: [VOTE] KIP-277 - Fine Grained ACL for CreateTopics API

2018-05-15 Thread Edoardo Comar
Hi,
bumping the thread as the current vote count for this KIP is
2 binding +1
5 non-binding +1

thanks, Edo

On 8 May 2018 at 16:14, Edoardo Comar  wrote:
> Hi,
> bumping the thread as the current vote count for this KIP is
> 2 binding +1
> 5 non-binding +1
>
> so still missing a binding vote please
>
> thanks,
> Edo
>
>
> On 30 April 2018 at 12:49, Manikumar  wrote:
>>
>> +1 (non-binding)
>>
>> Thanks
>>
>> On Thu, Apr 26, 2018 at 9:59 PM, Colin McCabe  wrote:
>>
>> > +1 (non-binding)
>> >
>> > best,
>> > Colin
>> >
>> >
>> > On Wed, Apr 25, 2018, at 02:45, Edoardo Comar wrote:
>> > > Hi,
>> > >
>> > > The discuss thread on KIP-277 (
>> > > https://www.mail-archive.com/dev@kafka.apache.org/msg86540.html )
>> > > seems to have been fruitful and concerns have been addressed, please
>> > allow
>> > > me start a vote on it:
>> > >
>> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> > 277+-+Fine+Grained+ACL+for+CreateTopics+API
>> > >
>> > > I will update the small PR to the latest KIP semantics if the vote
>> > passes
>> > > (as I hope :-).
>> > >
>> > > cheers
>> > > Edo
>> > > --
>> > >
>> > > Edoardo Comar
>> > >
>> > > IBM Message Hub
>> > >
>> > > IBM UK Ltd, Hursley Park, SO21 2JN
>> > > Unless stated otherwise above:
>> > > IBM United Kingdom Limited - Registered in England and Wales with
>> > > number
>> > > 741598.
>> > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
>> > 3AU
>> >
>
>
>
>
> --
> "When the people fear their government, there is tyranny; when the
> government fears the people, there is liberty." [Thomas Jefferson]
>



-- 
"When the people fear their government, there is tyranny; when the
government fears the people, there is liberty." [Thomas Jefferson]


Jenkins build is back to normal : kafka-trunk-jdk8 #2645

2018-05-15 Thread Apache Jenkins Server
See 




Re: [DISCUSS] Apache Kafka 2.0.0 Release Plan

2018-05-15 Thread Rajini Sivaram
Hi all,

This is just a reminder that the KIP freeze for the 2.0.0 release is in a week's
time and we still have a lot of KIPs in progress. KIPs that are currently
being discussed should start the voting process soon so that voting can complete
by the 22nd of May. Please participate in discussions and votes to enable these
to be added to the release (or postponed if required).

Voting is in progress for the following KIPs:


   - KIP-235:
   
https://cwiki.apache.org/confluence/display/KAFKA/KIP-235%3A+Add+DNS+alias+support+for+secured+connection
   - KIP-244:
   
https://cwiki.apache.org/confluence/display/KAFKA/KIP-244%3A+Add+Record+Header+support+to+Kafka+Streams+Processor+API
   - KIP-248:
   
https://cwiki.apache.org/confluence/display/KAFKA/KIP-248+-+Create+New+ConfigCommand+That+Uses+The+New+AdminClient
   - KIP-255:
   https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75968876
   - KIP-275:
   https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75977607
   - KIP-277:
   
https://cwiki.apache.org/confluence/display/KAFKA/KIP-277+-+Fine+Grained+ACL+for+CreateTopics+API
   - KIP-278:
   
https://cwiki.apache.org/confluence/display/KAFKA/KIP-278+-+Add+version+option+to+Kafka%27s+commands
   - KIP-282:
   
https://cwiki.apache.org/confluence/display/KAFKA/KIP-282%3A+Add+the+listener+name+to+the+authentication+context
   - KIP-283:
   
https://cwiki.apache.org/confluence/display/KAFKA/KIP-283%3A+Efficient+Memory+Usage+for+Down-Conversion

Regards,

Rajini

On Thu, May 10, 2018 at 9:31 AM, Rajini Sivaram 
wrote:

> Hi all,
>
> A reminder that KIP freeze for 2.0.0 is May 22. I have updated the release
> page with all the approved KIPs:
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=80448820
>
> We have many KIPs still under discussion and/or ready for voting. Please
> participate in discussions and votes to enable these to be added to the
> release (or postponed if required). Voting needs to be complete by May 22
> for the KIP to be added to the release.
>
> Voting is in progress for these KIPs:
>
>
>- KIP-206: https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>206%3A+Add+support+for+UUID+serialization+and+deserialization
>
> 
>- KIP-235: https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>235%3A+Add+DNS+alias+support+for+secured+connection
>
> 
>- KIP-248: https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>248+-+Create+New+ConfigCommand+That+Uses+The+New+AdminClient#KIP-248-
>CreateNewConfigCommandThatUsesTheNewAdminClient-DescribeQuotas
>
> 
>- KIP-255: https://cwiki.apache.org/confluence/pages/viewpage.
>action?pageId=75968876
>- KIP-266: https://cwiki.apache.org/confluence/pages/viewpage.
>action?pageId=75974886
>- KIP-275: https://cwiki.apache.org/confluence/pages/viewpage.
>action?pageId=75977607
>- KIP-277: https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>277+-+Fine+Grained+ACL+for+CreateTopics+API
>
> 
>- KIP-281: https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>281%3A+ConsumerPerformance%3A+Increase+Polling+Loop+Timeout+
>and+Make+It+Reachable+by+the+End+User
>
> 
>- KIP-282: https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>282%3A+Add+the+listener+name+to+the+authentication+context
>
> 
>- KIP-283: https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>283%3A+Efficient+Memory+Usage+for+Down-Conversion
>
> 
>- KIP-294: https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>294+-+Enable+TLS+hostname+verification+by+default
>
> 
>
>
> Thanks,
>
> Rajini
>
> On Wed, Apr 25, 2018 at 11:18 AM, Rajini Sivaram 
> wrote:
>
>> I would like to volunteer to be the release manager for our next
>> time-based feature release (v2.0.0). See https://cwiki.apache.org/
>> confluence/display/KAFKA/Time+Based+Release+Plan
>> 
>> if you missed previous communication on time-based releas

Kafka Elasticsearch Connector

2018-05-15 Thread Raj, Gokul (External)
Hi Team,
I am new to this technology. I need to connect Kafka to Elasticsearch on Windows (to
consume the data from Kafka into Elasticsearch). I have tried several approaches but
have not been able to get it working. Is there a way to do this on Windows, or could
you point me to the right place? Thanks in advance, and I am glad to be
a part of this forum.
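
For reference, a minimal sketch of a standalone Kafka Connect sink configuration using
Confluent's Elasticsearch connector (this assumes the kafka-connect-elasticsearch plugin
is installed; the topic name and URL below are placeholders):

name=elasticsearch-sink
connector.class=io.confluent.connect.elasticsearch.ElasticsearchSinkConnector
tasks.max=1
topics=my-topic
connection.url=http://localhost:9200
type.name=kafka-connect
key.ignore=true

On Windows this can be started with bin\windows\connect-standalone.bat together with a
worker properties file, the same way connect-standalone.sh is used on Linux.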

Thanks & Regards
Gokul Raj S



Re: [DISCUSS] KIP-302 - Enable Kafka clients to use all DNS resolved IP addresses

2018-05-15 Thread Rajini Sivaram
Hi Edo,

I agree that KIP-235 and KIP-302 address different scenarios. And I agree
that each one is not sufficient in itself to address both the scenarios.
But I also think that they conflict and hence they need to be looked at
together and perhaps use a single config.

As an example:

If I run:

for (InetAddress address : InetAddress.getAllByName("www.apache.org")) {
    System.out.printf("HostName %s canonicalHostName %s IP %s\n",
            address.getHostName(), address.getCanonicalHostName(),
            address.getHostAddress());
}

I get:

HostName www.apache.org canonicalHostName tlp-eu-west.apache.org IP
195.154.151.36
HostName www.apache.org canonicalHostName 40.79.78.1 IP 40.79.78.1
HostName www.apache.org canonicalHostName themis.apache.org IP
140.211.11.105
HostName www.apache.org canonicalHostName 2001:bc8:2142:300:0:0:0:0 IP
2001:bc8:2142:300:0:0:0:0


If www.apache.org is used as a bootstrap address, KIP-302 would connect to
(www.apache.org/195.154.151.36 and www.apache.org/140.211.11.105) while
KIP-235 would connect to (tlp-eu-west.apache.org/195.154.151.36 and
themis.apache.org/140.211.11.105). This is a significant difference not
just for Kerberos, but for any secure environment where hostname is
verified to prevent man-in-the-middle attacks. In your case, I presume you
would have SSL certificates with the equivalent of www.apache.org on both
the load balancers. In Jonathan's case, I presume he has Kerberos
principals for the equivalent of tlp-eu-west.apache.org and
themis.apache.org. We would want to support both scenarios regardless of
the security protocol, just need to come up with configuration options that
don't conflict.


On Mon, May 14, 2018 at 5:24 PM, Edoardo Comar  wrote:

> Thanks Rajini
>
> I still don't see the overlap between the two KIPS
>
> KIP-235 allows an expansion of hostnames on the bootstrap list.
>
> KIP-302 allows alternative IPs to be used to attempt a connection
> (either at bootstrap and when processing the MetadataResponse) to a
> given hostname.
>
> A use case would be that of active/passive LB fronting the brokers.
>
> Arguably, if Java honored the DNS-set TTL, and the TTL was low and on
> subsequent requests, the order of IPs returned by the
> InetAddress.getAllByName() was random, we may not need such an
> enhancement.
> In practice, a Java client can get stuck on a "bad" IP forever if it
> only relies on the first IP returned.
>
> HTH,
> Edo
>
> On 14 May 2018 at 16:23, Rajini Sivaram  wrote:
> > Hi Edo,
> >
> > Thanks for the KIP. I think it will be good to include a diagram to make
> it
> > easier to distinguish this scenario from that of KIP-235 without reading
> > the PR.
> >
> > It may be worth considering if KIP-235 and this KIP could use a single
> > config name with different values instead of two boolean configs:
> >
> > bootstrap.reverse.dns.lookup = true/false
> > enable.all.dns.ips = true/false
> >
> > Not all values of (bootstrap.reverse.dns.lookup, enable.all.dns.ips) seem
> > to make sense. And not all scenarios are handled. Even if we use multiple
> > configs, it seems to me that we may want to name them differently.
> >
> > The possible combinations are:
> >
> > 1) Bootstrap
> >
> > a) No lookup
> > b) Use all DNS entries with host name
> > c) Use all DNS entries with canonical host name
> >
> > 2) Advertised listeners
> >
> > a) No lookup
> > b) Use all DNS entries with host name
> > c) Use all DNS entries with canonical host name
> >
> > The combinations that are enabled by the two boolean configs (
> > bootstrap.reverse.dns.lookup, enable.all.dns.ips)  are:
> >
> >- (false, false) => (1a, 2a)
> >- (true, false) => (1c, 2a)
> >- (false, true) => (1b, 2b)
> >- (true, true) => (??, 2b)
> >
> > It will be good if we can clearly identify which combinations we want to
> > support and the scenarios where they may be useful. Perhaps (1a, 2a),
> (1c,
> > 2a), (1b, 2b) and (1c, 2c) are useful?
> >
> >
> > On Mon, May 14, 2018 at 2:58 PM, Skrzypek, Jonathan <
> > jonathan.skrzy...@gs.com> wrote:
> >
> >> Ah, apologies didn't see there was already a decent amount of discussion
> >> on this in the PR.
> >>
> >> This kind of sounds related to the environment you're running to me.
> >> What is the rationale behind using the advertised listeners to do your
> >> load balancing advertisement rather than a top level alias that has
> >> everything ?
> >>
> >> It sounds like in your case there is a mismatch between
> bootstrap.servers
> >> and advertised.listeners, and you want advertised.listeners to take
> >> precedence and have the client iterate over what is returned by the
> broker.
> >> So the extra parameter doesn't only have to do with DNS but it's also
> >> appending from the broker, maybe the parameter name should reflect this
> ?
> >>
> >> Jonathan Skrzypek
> >>
> >>
> >> -Original Message-
> >> From: Skrzypek, Jonathan [Tech]
> >> Sent: 14 May 2018 14:46
> >> To: dev@kafka.apache.org
> >> Subject: RE: [DISCUSS] KIP-302 - Enable

Re: [DISCUSS] KIP-290: Support for wildcard suffixed ACLs

2018-05-15 Thread Colin McCabe
Hi Piyush,

I think AclBinding should operate the same way as AclBindingFilter.

So you should be able to do something like this:
> AclBindingFilter filter = new AclBindingFilter(new
> ResourceFilter(ResourceType.GROUP, "foo*"));
> AclBinding binding = new AclBinding(new Resource(ResourceType.GROUP, "foo*"));
> assertTrue(filter.matches(binding));

Thinking about this more, it's starting to feel really messy to create new 
"pattern" constructors for Resource and ResourceFilter.  I don't think people 
will be able to figure this out.  Maybe we should just have a limited 
compatibility break here, where it is now required to escape weird consumer 
group names when creating ACLs for them.

To future-proof this, we should reserve a bunch of characters at once, like *, 
@, $, %, ^, &, +, [, ], etc.  If these characters appear in a resource name, it 
should be an error, unless they are escaped with a backslash.  That way, we can 
use them in the future.  We should create a Resource.escapeName function which 
adds the correct escape characters to resource names (so it would translate 
foo* into foo\*, foo+bar into foo\+bar, and so on).
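
To make that concrete, here is a rough sketch of what such a helper could look like
(the method name and the exact reserved-character set are only this proposal, not an
existing Kafka API):

public final class ResourceNames {
    // Characters reserved for future pattern syntax, per the proposal above.
    private static final String RESERVED = "*@$%^&+[]";

    public static String escapeName(String name) {
        StringBuilder sb = new StringBuilder(name.length());
        for (char c : name.toCharArray()) {
            if (RESERVED.indexOf(c) >= 0 || c == '\\') {
                sb.append('\\');
            }
            sb.append(c);
        }
        // e.g. "foo*" becomes "foo\*" and "foo+bar" becomes "foo\+bar"
        return sb.toString();
    }
}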

best,
Colin


On Mon, May 14, 2018, at 17:08, Piyush Vijay wrote:
> Colin,
> 
> createAcls take a AclBinding, however, instead of AclBindingFilter. What
> are your thoughts here?
> 
> public abstract DescribeAclsResult describeAcls(AclBindingFilter
> filter, DescribeAclsOptions options);
> 
> public abstract CreateAclsResult createAcls(Collection<AclBinding>
> acls, CreateAclsOptions options);
> 
> public abstract DeleteAclsResult
> deleteAcls(Collection<AclBindingFilter> filters, DeleteAclsOptions
> options);
> 
> 
> Thanks
> 
> Piyush Vijay
> 
> On Mon, May 14, 2018 at 9:26 AM, Andy Coates  wrote:
> 
> > +1
> >
> > On 11 May 2018 at 17:14, Colin McCabe  wrote:
> >
> > > Hi Andy,
> > >
> > > I see what you mean.  I guess my thought here is that if the fields are
> > > private, we can change it later if we need to.
> > >
> > > I definitely agree that we should use the scheme you describe for sending
> > > ACLs over the wire (just the string + version number)
> > >
> > > cheers,
> > > Colin
> > >
> > >
> > > On Fri, May 11, 2018, at 09:39, Andy Coates wrote:
> > > > i think I'm agreeing with you. I was merely suggesting that having an
> > > > additional field that controls how the current field is interpreted is
> > > more
> > > > flexible / extensible in the future than using a 'union' style
> > approach,
> > > > where only one of several possible fields should be populated. But
> > it's a
> > > > minor thing.
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On 10 May 2018 at 09:29, Colin McCabe  wrote:
> > > >
> > > > > Hi Andy,
> > > > >
> > > > > The issue that I was trying to solve here is the Java API.  Right
> > now,
> > > > > someone can write "new ResourceFilter(ResourceType.TRANSACTIONAL_ID,
> > > > > "foo*") and have a ResourceFilter that applies to a Transactional ID
> > > named
> > > > > "foo*".  This has to continue to work, or else we have broken
> > > compatibility.
> > > > >
> > > > > I was proposing that there would be something like a new function
> > like
> > > > > ResourceFilter.fromPattern(ResourceType.TRANSACTIONAL_ID, "foo*")
> > > which
> > > > > would create a ResourceFilter that applied to transactional IDs
> > > starting
> > > > > with "foo", rather than transactional IDs named "foo*" specifically.
> > > > >
> > > > > I don't think it's important whether the Java class has an integer,
> > an
> > > > > enum, or two string fields.  The important thing is that there's a
> > new
> > > > > static function, or new constructor overload, etc. that works for
> > > patterns
> > > > > rather than literal strings.
> > > > >
> > > > > On Thu, May 10, 2018, at 03:30, Andy Coates wrote:
> > > > > > Rather than having name and pattern fields on the ResourceFilter,
> > > where
> > > > > > it’s only valid for one to be set, and we want to restrict the
> > > character
> > > > > > set in case future enhancements need them, we could instead add a
> > new
> > > > > > integer ‘nameType’ field, and use constants to indicate how the
> > name
> > > > > > field should be interpreted, e.g. 0 = literal, 1 = wildcard. This
> > > would
> > > > > > be extendable, e.g we can later add 2 = regex, or what ever, and
> > > > > > wouldn’t require any escaping.
> > > > >
> > > > > This is very user-unfriendly, though.  Users don't want to have to
> > > > > explicitly supply a version number when using the API, which is what
> > > this
> > > > > would force them to do.  I don't think users are going to want to
> > > memorize
> > > > > that version 4 supprted "+", whereas version 3 only supported
> > "[0-9]",
> > > or
> > > > > whatever.
> > > > >
> > > > > Just as an example, do you remember which versions of FetchRequest
> > > added
> > > > > which features?  I don't.  I always have to look at the code to
> > > remember.
> > > > >
> > > > > Also, escaping is still required any time you overload a character to
> > > mea

Re: [DISCUSS] KIP-285: Connect Rest Extension Plugin

2018-05-15 Thread Magesh Nandakumar
Randall- I think I have addressed all the comments. Let me know if we can
take this to Vote.

Thanks
Magesh

On Tue, May 8, 2018 at 10:12 PM, Magesh Nandakumar 
wrote:

> Hi All,
>
> I have updated the KIP to reflect changes based on the PR
> https://github.com/apache/kafka/pull/4931. Its mostly has minor changes
> to the interfaces and includes details on packages for the interfaces and
> the classes. Let me know your thoughts.
>
> Thanks
> Magesh
>
> On Fri, Apr 27, 2018 at 12:03 PM, Randall Hauch  wrote:
>
>> Great work, Magesh. I like the overall approach a lot, so I left some
>> pretty nuanced comments about specific details.
>>
>> Best regards,
>>
>> Randall
>>
>> On Wed, Apr 25, 2018 at 3:03 PM, Magesh Nandakumar 
>> wrote:
>>
>> > Thanks Randall for your thoughts. I have created a replica of the
>> required
>> > entities in the draft implementation. If you can take a look at the PR
>> and
>> > let me know your thoughts, I will update the KIP to reflect the same
>> >
>> > https://github.com/apache/kafka/pull/4931
>> >
>> > On Tue, Apr 24, 2018 at 11:44 AM, Randall Hauch 
>> wrote:
>> >
>> > > Magesh, I think our last emails cross in mid-stream.
>> > >
>> > > We definitely want to put the new public interfaces/classes in the API
>> > > module, and implementation in the runtime module. Yes, this will
>> affect
>> > the
>> > > design, since for example we don't want to expose runtime types to the
>> > API,
>> > > and we want to prevent breaking changes. We don't really want to move
>> the
>> > > REST entities if we don't have to, since that may break projects that
>> are
>> > > extending the runtime module -- even though the runtime module is not
>> a
>> > > public API we still want to _try_ to change things.
>> > >
>> > > Do you want to try to create a prototype to see what kind of impact
>> and
>> > > choices we'll have to make?
>> > >
>> > > Best regards,
>> > >
>> > > Randall
>> > >
>> > > On Tue, Apr 24, 2018 at 12:48 PM, Randall Hauch 
>> > wrote:
>> > >
>> > > > Thanks for updating the KIP, Magesh. You've resolved all of my
>> > concerns,
>> > > > though I have one more: we should specify the package names for all
>> new
>> > > > interfaces/classes.
>> > > >
>> > > > I'm looking forward to more feedback from others.
>> > > >
>> > > > Best regards,
>> > > >
>> > > > Randall
>> > > >
>> > > > On Fri, Apr 20, 2018 at 12:17 AM, Magesh Nandakumar <
>> > > mage...@confluent.io>
>> > > > wrote:
>> > > >
>> > > >> Hi All,
>> > > >>
>> > > >> I have updated the KIP with following changes
>> > > >>
>> > > >>1. Expanded the Motivation section
>> > > >>2. Included details about the interface in the public interface
>> > > section
>> > > >>3. Modified the config name to rest.extension.classes
>> > > >>4. Modified the ConnectRestExtension to include Configurable
>> > instead
>> > > of
>> > > >>ResourceConfig
>> > > >>5. Modified the "Rest Extension Integration with Connect" in
>> > > "Proposed
>> > > >>Approach" to include a new Custom implementation for
>> Configurable
>> > > >>6. Provided examples for the Java Service provider mechanism
>> > > >>7. Included a reference implementation in scope
>> > > >>
>> > > >> Kindly let me know your thoughts on the updates.
>> > > >>
>> > > >> Thanks
>> > > >> Magesh
>> > > >>
>> > > >> On Thu, Apr 19, 2018 at 10:39 AM, Magesh Nandakumar <
>> > > mage...@confluent.io
>> > > >> >
>> > > >> wrote:
>> > > >>
>> > > >> > Hi Randall,
>> > > >> >
>> > > >> > Thanks for your feedback. I also would like to go with
>> > > >> > rest.extension.classes`.
>> > > >> >
>> > > >> > For exposing Configurable, my original intention was just to
>> expose
>> > > that
>> > > >> > to the extension because that's all one needs to register JAX RS
>> > > >> resources.
>> > > >> > The fact that we use Jersey shouldn't even be exposed in the
>> > > interface.
>> > > >> > Hence it doesn't affect the public API by any means.
>> > > >> >
>> > > >> > I will update the KIP and let everyone know.
>> > > >> >
>> > > >> > Thanks
>> > > >> > Magesh
>> > > >> >
>> > > >> > On Thu, Apr 19, 2018 at 9:51 AM, Randall Hauch > >
>> > > >> wrote:
>> > > >> >
>> > > >> >> On Thu, Apr 12, 2018 at 10:59 AM, Magesh Nandakumar <
>> > > >> mage...@confluent.io
>> > > >> >> >
>> > > >> >> wrote:
>> > > >> >>
>> > > >> >> > Hi Randall,
>> > > >> >> >
>> > > >> >> > Thanks a lot for your feedback.
>> > > >> >> >
>> > > >> >> > I will update the KIP to reflect your comments in (1), (2),
>> (7)
>> > and
>> > > >> (8).
>> > > >> >> >
>> > > >> >>
>> > > >> >> Looking forward to these.
>> > > >> >>
>> > > >> >>
>> > > >> >> >
>> > > >> >> > For comment # (3) , while it would be great to automatically
>> > > >> configure
>> > > >> >> the
>> > > >> >> > Rest Extensions, I would prefer that to be specified
>> explicitly.
>> > > Lets
>> > > >> >> > assume a connector archive includes a implementation for the
>> > > >> >> RestExtension
>> > > >> >> > to do authentication using some header. We don

use other encoding

2018-05-15 Thread wuy...@boco.com.cn
Hello!
I want to use Kafka to produce some messages in GBK encoding.

What should I do?
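
For reference, a minimal sketch of one way to do this: encode the text with the GBK
charset yourself and send the bytes with the ByteArraySerializer (the topic name and
bootstrap servers below are placeholders):

import java.nio.charset.Charset;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class GbkProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", ByteArraySerializer.class.getName());

        try (KafkaProducer<String, byte[]> producer = new KafkaProducer<>(props)) {
            // Encode the message text as GBK bytes before handing it to the producer.
            byte[] gbkValue = "some message".getBytes(Charset.forName("GBK"));
            producer.send(new ProducerRecord<>("my-topic", "my-key", gbkValue));
        }
    }
}

A consumer would then decode the value bytes with the same GBK charset.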



wuy...@boco.com.cn 
From: 吴芋努


Re: [VOTE] KIP-255: OAuth Authentication via SASL/OAUTHBEARER

2018-05-15 Thread Jun Rao
Hi, Ron,

Thanks for the reply. I understood your answers to #2 and #3.

For #1, will the server map all clients' principal name to the value
associated with "sub" claim? How do we support mapping different clients to
different principal names?

Jun

On Mon, May 14, 2018 at 7:02 PM, Ron Dagostino  wrote:

> Hi Jun.  Thanks for the +1 vote.
>
> Regarding the first question about token claims, yes, you have it correct
> about translating the OAuth token to a principle name via a JAAS module
> option in the default unsecured case.  Specifically, the OAuth SASL Server
> implementation is responsible for setting the authorization ID, and it gets
> the authorization ID from the OAuthBearerToken's principalName() method.
> The listener.name.sasl_ssl.oauthbearer.sasl.server.callback.handler.class
> is responsible for handling an instance of OAuthBearerValidatorCallback to
> accept a token compact serialization from the client and return an instance
> of OAuthBearerToken (assuming the compact serialization validates), and in
> the default unsecured case the builtin unsecured validator callback handler
> defines the OAuthBearerToken.principalName() method to return the 'sub'
> claim value by default (with the actual claim it uses being configurable
> via the unsecuredValidatorPrincipalClaimName JAAS module option).  So that
> is how we translate from a token to a principal name in the default
> unsecured (out-of-the-box) case.
>
> For production use cases, the implementation associated with
> listener.name.sasl_ssl.oauthbearer.sasl.server.callback.handler.class can
> do whatever it wants.  As an example, I have written a class that wraps a
> com.nimbusds.jwt.SignedJWT instance (see
> https://connect2id.com/products/nimbus-jose-jwt/) and presents it as an
> OAuthBearerToken:
>
> public class NimbusSignedJwtOAuthBearerToken implements OAuthBearerToken {
>     private final SignedJWT signedJwt;
>     private final String principalName;
>     private final Set<String> scope;
>     private final Long startTimeMs;
>     private final long lifetimeMs;
>
>     /**
>      * Constructor
>      *
>      * @param signedJwt
>      *            the mandatory signed JWT
>      * @param principalClaimName
>      *            the mandatory claim name identifying the claim from which the
>      *            principal name will be extracted. The claim must exist and must
>      *            be a String.
>      * @param optionalScopeClaimName
>      *            the optional claim name identifying the claim from which any
>      *            scope will be extracted. If specified and the claim exists then
>      *            the value must be either a String or a String List.
>      * @throws ParseException
>      *             if the principal claim does not exist or is not a String; the
>      *             scope claim is neither a String nor a String List; the 'exp'
>      *             claim does not exist or is not a number; the 'iat' claim exists
>      *             but is not a number; or the 'nbf' claim exists and is not a
>      *             number.
>      */
>     public NimbusSignedJwtOAuthBearerToken(SignedJWT signedJwt, String principalClaimName,
>             String optionalScopeClaimName) throws ParseException {
>         // etc...
>     }
>
> The callback handler runs the following code if the digital signature
> validates:
>
> callback.token(new NimbusSignedJwtOAuthBearerToken(signedJwt, "sub",
> null));
>
> I hope that answers the first question.  If not let me know what I
> missed/misunderstood and I'll be glad to try to address it.
>
> Regarding the second question, the classes OAuthBearerTokenCallback and
> OAuthBearerValidatorCallback implement the Callback interface -- they are
> the callbacks that the AuthenticateCallbackHandler implementations need to
> handle.  Specifically, unless the unsecured functionality is what is
> desired, the two configuration values [listener.name.sasl_ssl.oauthbearer.
> ]sasl.login.callback.handler.class and
> listener.name.sasl_ssl.oauthbearer.sasl.server.callback.handler.class
> define the callback handlers that need to handle OAuthBearerTokenCallback
> and OAuthBearerValidatorCallback, respectively.
>
> Regarding the third question, yes, I see your point that the way the spec
> is worded could be taken to imply that the error code is a single
> character: "A single ASCII..." (
> https://tools.ietf.org/html/rfc6749#section-5.2).  However, it is not a
> single character.  See the end of that section 5.2 for an example that
> shows "error":"invalid_request" as the response.
>
> Thanks again for the +1 vote, Jun, and please do let me know if I can cover
> anything else.
>
> Ron
>
>
> On Mon, May 14, 2018 at 7:10 PM, Jun Rao  wrote:
>
> > Hi, Ron,
> >
> > Thanks for the KIP. +1 from me. Just a few minor comments below.
> >
> > 1. It seems that we can translate an OAuth token to a principle name
> > through the claim name configured in JASS. However, it's not clear to me

Re: [VOTE] KIP-244: Add Record Header support to Kafka Streams

2018-05-15 Thread Damian Guy
Thanks. +1 (binding)

On Tue, 15 May 2018 at 01:04 Jorge Esteban Quilcate Otoya <
quilcate.jo...@gmail.com> wrote:

> @Guozhang added. Thanks!
>
> El mar., 15 may. 2018 a las 5:50, Matthias J. Sax ( >)
> escribió:
>
> > +1 (binding)
> >
> > Thanks a lot for the KIP!
> >
> > -Matthias
> >
> > On 5/14/18 10:17 AM, Guozhang Wang wrote:
> > > +1 from me
> > >
> > > One more comment on the wiki: while reviewing the PR I realized that
> in `
> > > MockProcessorContext.java
> > > <
> >
> https://github.com/apache/kafka/pull/4955/files#diff-d5440e7338f775230019a86e6bcacccb
> > >`
> > > we are also adding one additional API plus modifying the existing
> > > `setRecordMetadata` API. Since this class is part of the public
> > test-utils
> > > package we should claim it in the wiki as well.
> > >
> > >
> > > Guozhang
> > >
> > > On Mon, May 14, 2018 at 8:43 AM, Ted Yu  wrote:
> > >
> > >> +1
> > >>
> > >> On Mon, May 14, 2018 at 8:31 AM, Jorge Esteban Quilcate Otoya <
> > >> quilcate.jo...@gmail.com> wrote:
> > >>
> > >>> Hi everyone,
> > >>>
> > >>> I would like to start a vote on KIP-244: Add Record Header support to
> > >> Kafka
> > >>> Streams
> > >>>
> > >>> KIP wiki page:
> > >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > >>> 244%3A+Add+Record+Header+support+to+Kafka+Streams+Processor+API
> > >>>
> > >>> The discussion thread is here:
> > >>> http://mail-archives.apache.org/mod_mbox/kafka-dev/201805.
> > >>> mbox/%3CCAC3UcJvrgcBfe6%3DiW6%2BuTWsLB%2B4CsHgRmDx9TvCzJQrWvfg7_w%
> > >>> 40mail.gmail.com%3E
> > >>>
> > >>> Cheers,
> > >>> Jorge.
> > >>>
> > >>
> > >
> > >
> > >
> >
> >
>


Re: [VOTE] KIP-244: Add Record Header support to Kafka Streams

2018-05-15 Thread Bill Bejeck
Thanks for the KIP!

+1

-Bill

On Tue, May 15, 2018 at 1:47 PM, Damian Guy  wrote:

> Thanks. +1 (binding)
>
> On Tue, 15 May 2018 at 01:04 Jorge Esteban Quilcate Otoya <
> quilcate.jo...@gmail.com> wrote:
>
> > @Guozhang added. Thanks!
> >
> > El mar., 15 may. 2018 a las 5:50, Matthias J. Sax (<
> matth...@confluent.io
> > >)
> > escribió:
> >
> > > +1 (binding)
> > >
> > > Thanks a lot for the KIP!
> > >
> > > -Matthias
> > >
> > > On 5/14/18 10:17 AM, Guozhang Wang wrote:
> > > > +1 from me
> > > >
> > > > One more comment on the wiki: while reviewing the PR I realized that
> > in `
> > > > MockProcessorContext.java
> > > > <
> > >
> > https://github.com/apache/kafka/pull/4955/files#diff-
> d5440e7338f775230019a86e6bcacccb
> > > >`
> > > > we are also adding one additional API plus modifying the existing
> > > > `setRecordMetadata` API. Since this class is part of the public
> > > test-utils
> > > > package we should claim it in the wiki as well.
> > > >
> > > >
> > > > Guozhang
> > > >
> > > > On Mon, May 14, 2018 at 8:43 AM, Ted Yu  wrote:
> > > >
> > > >> +1
> > > >>
> > > >> On Mon, May 14, 2018 at 8:31 AM, Jorge Esteban Quilcate Otoya <
> > > >> quilcate.jo...@gmail.com> wrote:
> > > >>
> > > >>> Hi everyone,
> > > >>>
> > > >>> I would like to start a vote on KIP-244: Add Record Header support
> to
> > > >> Kafka
> > > >>> Streams
> > > >>>
> > > >>> KIP wiki page:
> > > >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > >>> 244%3A+Add+Record+Header+support+to+Kafka+Streams+Processor+API
> > > >>>
> > > >>> The discussion thread is here:
> > > >>> http://mail-archives.apache.org/mod_mbox/kafka-dev/201805.
> > > >>> mbox/%3CCAC3UcJvrgcBfe6%3DiW6%2BuTWsLB%2B4CsHgRmDx9TvCzJQrWvfg7_w%
> > > >>> 40mail.gmail.com%3E
> > > >>>
> > > >>> Cheers,
> > > >>> Jorge.
> > > >>>
> > > >>
> > > >
> > > >
> > > >
> > >
> > >
> >
>


Re: [VOTE] KIP-283: Efficient Memory Usage for Down-Conversion

2018-05-15 Thread Ismael Juma
Thanks for the KIP Dhruvil, this is a welcome improvement! My understanding
is that you have done some work to validate that the change has the desired
effect, it would be good to include that information in the "Testing
Strategy" section.

+1 (binding)

Ismael

On Wed, May 2, 2018 at 9:27 AM Dhruvil Shah  wrote:

> Hi all,
>
> I would like to start the vote on KIP-283: Efficient Memory Usage for
> Down-Conversion.
>
> For reference, the link to the KIP is here:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-283%3A+Efficient+Memory+Usage+for+Down-Conversion
>
> and the discussion thread is here:
> https://www.mail-archive.com/dev@kafka.apache.org/msg86799.html
>
> Thanks,
> Dhruvil
>


Re: [VOTE] KIP-282: Add the listener name to the authentication context

2018-05-15 Thread Ismael Juma
Thanks for the KIP, +1 (binding).

Ismael

On Wed, Apr 25, 2018 at 1:52 AM Mickael Maison 
wrote:

> Hi,
>
> There has been no objections in the DISCUSS thread so I'd like to
> start a vote on KIP-282:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-282%3A+Add+the+listener+name+to+the+authentication+context
>
> Thanks
>


Re: [DISCUSS] KIP-290: Support for wildcard suffixed ACLs

2018-05-15 Thread Piyush Vijay
Hi Colin,

Escaping at this level is making sense to me but let me think more and get
back to you.

But should we not just get rid of one of AclBinding or AclBindingFilter
then? Is there a reason to keep both given that AclBindingFilter and
AclBinding look like exact copies of each other after this change? This will be a
one-time breaking change in APIs marked as "Evolving", but makes sense in
the long term? Am I missing something here?



Piyush Vijay

On Tue, May 15, 2018 at 9:01 AM, Colin McCabe  wrote:

> Hi Piyush,
>
> I think AclBinding should operate the same way as AclBindingFilter.
>
> So you should be able to do something like this:
> > AclBindingFilter filter = new AclBindingFiler(new
> ResourceFilter(ResourceType.GROUP, "foo*"))
> > AclBinding binding = new AclBinding(new Resource(ResourceType.GROUP,
> "foo*"))
> > assertTrue(filter.matches(binding));
>
> Thinking about this more, it's starting to feel really messy to create new
> "pattern" constructors for Resource and ResourceFilter.  I don't think
> people will be able to figure this out.  Maybe we should just have a
> limited compatibility break here, where it is now required to escape weird
> consumer group names when creating ACLs for them.
>
> To future-proof this, we should reserve a bunch of characters at once,
> like *, @, $, %, ^, &, +, [, ], etc.  If these characters appear in a
> resource name, it should be an error, unless they are escaped with a
> backslash.  That way, we can use them in the future.  We should create a
> Resource.escapeName function which adds the correct escape characters to
> resource names (so it would translate foo* into foo\*, foo+bar into
> foo\+bar, etc. etc.
>
> best,
> Colin
>
>
> On Mon, May 14, 2018, at 17:08, Piyush Vijay wrote:
> > Colin,
> >
> > createAcls take a AclBinding, however, instead of AclBindingFilter. What
> > are your thoughts here?
> >
> > public abstract DescribeAclsResult describeAcls(AclBindingFilter
> > filter, DescribeAclsOptions options);
> >
> > public abstract CreateAclsResult createAcls(Collection
> > acls, CreateAclsOptions options);
> >
> > public abstract DeleteAclsResult
> > deleteAcls(Collection filters, DeleteAclsOptions
> > options);
> >
> >
> > Thanks
> >
> > Piyush Vijay
> >
> > On Mon, May 14, 2018 at 9:26 AM, Andy Coates  wrote:
> >
> > > +1
> > >
> > > On 11 May 2018 at 17:14, Colin McCabe  wrote:
> > >
> > > > Hi Andy,
> > > >
> > > > I see what you mean.  I guess my thought here is that if the fields
> are
> > > > private, we can change it later if we need to.
> > > >
> > > > I definitely agree that we should use the scheme you describe for
> sending
> > > > ACLs over the wire (just the string + version number)
> > > >
> > > > cheers,
> > > > Colin
> > > >
> > > >
> > > > On Fri, May 11, 2018, at 09:39, Andy Coates wrote:
> > > > > i think I'm agreeing with you. I was merely suggesting that having
> an
> > > > > additional field that controls how the current field is
> interpreted is
> > > > more
> > > > > flexible / extensible in the future than using a 'union' style
> > > approach,
> > > > > where only one of several possible fields should be populated. But
> > > it's a
> > > > > minor thing.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On 10 May 2018 at 09:29, Colin McCabe  wrote:
> > > > >
> > > > > > Hi Andy,
> > > > > >
> > > > > > The issue that I was trying to solve here is the Java API.  Right
> > > now,
> > > > > > someone can write "new ResourceFilter(ResourceType.
> TRANSACTIONAL_ID,
> > > > > > "foo*") and have a ResourceFilter that applies to a
> Transactional ID
> > > > named
> > > > > > "foo*".  This has to continue to work, or else we have broken
> > > > compatibility.
> > > > > >
> > > > > > I was proposing that there would be something like a new function
> > > like
> > > > > > ResourceFilter.fromPattern(ResourceType.TRANSACTIONAL_ID,
> "foo*")
> > > > which
> > > > > > would create a ResourceFilter that applied to transactional IDs
> > > > starting
> > > > > > with "foo", rather than transactional IDs named "foo*"
> specifically.
> > > > > >
> > > > > > I don't think it's important whether the Java class has an
> integer,
> > > an
> > > > > > enum, or two string fields.  The important thing is that there's
> a
> > > new
> > > > > > static function, or new constructor overload, etc. that works for
> > > > patterns
> > > > > > rather than literal strings.
> > > > > >
> > > > > > On Thu, May 10, 2018, at 03:30, Andy Coates wrote:
> > > > > > > Rather than having name and pattern fields on the
> ResourceFilter,
> > > > where
> > > > > > > it’s only valid for one to be set, and we want to restrict the
> > > > character
> > > > > > > set in case future enhancements need them, we could instead
> add a
> > > new
> > > > > > > integer ‘nameType’ field, and use constants to indicate how the
> > > name
> > > > > > > field should be interpreted, e.g. 0 = literal, 1 = wildcard.
> This
> > > > would
> > > > > > > be extendable, e.g we can l

Re: [DISCUSS] KIP-290: Support for wildcard suffixed ACLs

2018-05-15 Thread Rajini Sivaram
Hi Piyush,

It is possible to configure PrincipalBuilder for SASL (
https://cwiki.apache.org/confluence/display/KAFKA/KIP-189%3A+Improve+principal+builder+interface+and+add+support+for+SASL).
If that satisfies your requirements, perhaps we can move wildcarded
principals out of this KIP and focus on wildcarded resources?
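
For reference, a rough sketch of the kind of custom principal builder that KIP-189
enables, plugged in on the broker via principal.builder.class (the mapping logic below
is purely illustrative):

import org.apache.kafka.common.security.auth.AuthenticationContext;
import org.apache.kafka.common.security.auth.KafkaPrincipal;
import org.apache.kafka.common.security.auth.KafkaPrincipalBuilder;
import org.apache.kafka.common.security.auth.SaslAuthenticationContext;

public class TeamPrincipalBuilder implements KafkaPrincipalBuilder {
    @Override
    public KafkaPrincipal build(AuthenticationContext context) {
        if (context instanceof SaslAuthenticationContext) {
            String authorizationId =
                ((SaslAuthenticationContext) context).server().getAuthorizationID();
            // Illustrative mapping: collapse e.g. "team-producer-1" onto the "team" principal.
            return new KafkaPrincipal(KafkaPrincipal.USER_TYPE, authorizationId.split("-")[0]);
        }
        return KafkaPrincipal.ANONYMOUS;
    }
}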


On Tue, May 15, 2018 at 7:15 PM, Piyush Vijay 
wrote:

> Hi Colin,
>
> Escaping at this level is making sense to me but let me think more and get
> back to you.
>
> But should we not just get rid of one of AclBinding or AclBindingFilter
> then? Is there a reason to keep both given that AclBindingFilter and
> AclBinding look exact copy of each other after this change? This will be a
> one-time breaking change in APIs marked as "Evolving", but makes sense in
> the long term? Am I missing something here?
>
>
>
> Piyush Vijay
>
> On Tue, May 15, 2018 at 9:01 AM, Colin McCabe  wrote:
>
> > Hi Piyush,
> >
> > I think AclBinding should operate the same way as AclBindingFilter.
> >
> > So you should be able to do something like this:
> > > AclBindingFilter filter = new AclBindingFiler(new
> > ResourceFilter(ResourceType.GROUP, "foo*"))
> > > AclBinding binding = new AclBinding(new Resource(ResourceType.GROUP,
> > "foo*"))
> > > assertTrue(filter.matches(binding));
> >
> > Thinking about this more, it's starting to feel really messy to create
> new
> > "pattern" constructors for Resource and ResourceFilter.  I don't think
> > people will be able to figure this out.  Maybe we should just have a
> > limited compatibility break here, where it is now required to escape
> weird
> > consumer group names when creating ACLs for them.
> >
> > To future-proof this, we should reserve a bunch of characters at once,
> > like *, @, $, %, ^, &, +, [, ], etc.  If these characters appear in a
> > resource name, it should be an error, unless they are escaped with a
> > backslash.  That way, we can use them in the future.  We should create a
> > Resource.escapeName function which adds the correct escape characters to
> > resource names (so it would translate foo* into foo\*, foo+bar into
> > foo\+bar, etc. etc.
> >
> > best,
> > Colin
> >
> >
> > On Mon, May 14, 2018, at 17:08, Piyush Vijay wrote:
> > > Colin,
> > >
> > > createAcls take a AclBinding, however, instead of AclBindingFilter.
> What
> > > are your thoughts here?
> > >
> > > public abstract DescribeAclsResult describeAcls(AclBindingFilter
> > > filter, DescribeAclsOptions options);
> > >
> > > public abstract CreateAclsResult createAcls(Collection
> > > acls, CreateAclsOptions options);
> > >
> > > public abstract DeleteAclsResult
> > > deleteAcls(Collection filters, DeleteAclsOptions
> > > options);
> > >
> > >
> > > Thanks
> > >
> > > Piyush Vijay
> > >
> > > On Mon, May 14, 2018 at 9:26 AM, Andy Coates 
> wrote:
> > >
> > > > +1
> > > >
> > > > On 11 May 2018 at 17:14, Colin McCabe  wrote:
> > > >
> > > > > Hi Andy,
> > > > >
> > > > > I see what you mean.  I guess my thought here is that if the fields
> > are
> > > > > private, we can change it later if we need to.
> > > > >
> > > > > I definitely agree that we should use the scheme you describe for
> > sending
> > > > > ACLs over the wire (just the string + version number)
> > > > >
> > > > > cheers,
> > > > > Colin
> > > > >
> > > > >
> > > > > On Fri, May 11, 2018, at 09:39, Andy Coates wrote:
> > > > > > i think I'm agreeing with you. I was merely suggesting that
> having
> > an
> > > > > > additional field that controls how the current field is
> > interpreted is
> > > > > more
> > > > > > flexible / extensible in the future than using a 'union' style
> > > > approach,
> > > > > > where only one of several possible fields should be populated.
> But
> > > > it's a
> > > > > > minor thing.
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > On 10 May 2018 at 09:29, Colin McCabe 
> wrote:
> > > > > >
> > > > > > > Hi Andy,
> > > > > > >
> > > > > > > The issue that I was trying to solve here is the Java API.
> Right
> > > > now,
> > > > > > > someone can write "new ResourceFilter(ResourceType.
> > TRANSACTIONAL_ID,
> > > > > > > "foo*") and have a ResourceFilter that applies to a
> > Transactional ID
> > > > > named
> > > > > > > "foo*".  This has to continue to work, or else we have broken
> > > > > compatibility.
> > > > > > >
> > > > > > > I was proposing that there would be something like a new
> function
> > > > like
> > > > > > > ResourceFilter.fromPattern(ResourceType.TRANSACTIONAL_ID,
> > "foo*")
> > > > > which
> > > > > > > would create a ResourceFilter that applied to transactional IDs
> > > > > starting
> > > > > > > with "foo", rather than transactional IDs named "foo*"
> > specifically.
> > > > > > >
> > > > > > > I don't think it's important whether the Java class has an
> > integer,
> > > > an
> > > > > > > enum, or two string fields.  The important thing is that
> there's
> > a
> > > > new
> > > > > > > static function, or new constructor overload, e

Re: [DISCUSS] KIP-303: Add Dynamic Routing in Streams Sink

2018-05-15 Thread John Roesler
Thanks for the KIP, Guozhang.

It looks good overall to me; I just have one question:
* Why do we bound the generics of KVMapper in KStream to be superclass-of-K
and superclass-of-V instead of exactly K and V, as in Topology? I might be
thinking about it wrong, but that seems backwards to me. If anything,
bounding to be a subclass-of-K or subclass-of-V would seem right to me.

One extra thought: I agree that KVMapper is an
applicable type for extracting the topic name, but I wonder what the value
of reusing the KVMapper for this purpose is. Would defining a new class,
say TopicNameExtractor, provide the same functionality while being a
bit more self-documenting?
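
Something along these lines, as a sketch only (the name and generics here are just to
illustrate the suggestion, they are not part of the KIP yet):

public interface TopicNameExtractor<K, V> {
    // Returns the name of the topic the record with this key and value should be routed to.
    String extract(K key, V value);
}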

Thanks,
-John

On Tue, May 15, 2018 at 12:32 AM, Guozhang Wang  wrote:

> Hello folks,
>
> I'd like to start a discussion on adding dynamic routing functionality in
> Streams sink node. I.e. users do not need to specify the topic name at
> compilation time but can dynamically determine which topic to send to based
> on each record's key value pairs. Please find a KIP here:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 303%3A+Add+Dynamic+Routing+in+Streams+Sink
>
> Any feedbacks are highly appreciated.
>
> Thanks!
>
> -- Guozhang
>


Re: [DISCUSS] KIP-295: Add Streams Configuration Allowing for Optional Topology Optimization

2018-05-15 Thread John Roesler
Hi Bill,

Thanks for the KIP. Now that we're using strings describing the "set of
optimizations", such as "none" and "all", should we change the config name
to just "topology.optimizations"?

The "enable." feels like a holdover from the boolean-valued config.

Thanks,
-John

On Tue, May 8, 2018 at 9:13 PM, Bill Bejeck  wrote:

> Thanks, Guozhang and Matthias for the comments.
>
> I was thinking of changing the config type back to a String and enforcing
> the values to be "true" or "false", but "none" or "all" is just as good.
>
> Since those values seem to work, I'll update the KIP accordingly.
>
> Thanks,
> Bill
>
> On Tue, May 8, 2018 at 9:38 PM, Matthias J. Sax 
> wrote:
>
> > Sounds good to me.
> >
> > On 5/8/18 5:47 PM, Guozhang Wang wrote:
> > > Thanks Matthias.
> > >
> > > I was also thinking about whether in the future we'd want to enable
> > > optimizations at different levels that may or may not impact
> > compatibility.
> > > That's why I asked if we have thought about "allowing part of the
> > > optimizations in the future".
> > >
> > > With that in mind, I'd change my preference and take string typed
> config.
> > > Even if we ended up with no finer grained optimizations in the future
> we
> > > can still have the string typed parameter with only two allowed values,
> > > like what we did for EOS. But I think in 2.0 allowing any not-null
> string
> > > values as enabled is still a bit odd, so how about we make two string
> > > values, like `none` (default value) and `all`?
> > >
> > >
> > > Guozhang
> > >
> > >
> > > On Tue, May 8, 2018 at 3:35 PM, Matthias J. Sax  >
> > > wrote:
> > >
> > >> One thought I want to bring up about switching optimization on/off:
> > >>
> > >> While for the initial release, a boolean flag seems to be sufficient,
> I
> > >> could imagine that we apply different and potentially
> > >> upgrade-incompatible optimizations in future releases. Thus, to me it
> > >> would make sense to use a String type, to indicate what optimizations
> > >> are possible based on the release. For example, in next release we
> > >> accept `null` for disabled and "2.0". If there are any incompatible
> > >> changes, people can stay with "2.0" optimizations level when upgrading
> > >> to "2.1" while new applications can use "2.1" optimization level. Old
> > >> applications would need to do an offline upgrade to get "2.1"
> > >> optimizations.
> > >>
> > >> I agree with Bill, that switching individual optimizations on/off is
> too
> > >> fine grained and hard to maintain. However, for compatibility, it
> might
> > >> make sense, to have certain "levels of optimizations" (based on the
> > >> release) that allow users to stay with on an older level for upgrade
> > >> purpose. Using the release numbers to encode those "levels" is easy to
> > >> understand for users and should minimize the mental effort to get the
> > >> config right.
> > >>
> > >> Let me know what you think about this.
> > >>
> > >>
> > >> -Matthias
> > >>
> > >> On 5/8/18 3:08 PM, Ted Yu wrote:
> > >>> Bill:That makes sense.
> > >>> Using boolean should suffice.
> > >>>  Original message From: Bill Bejeck <
> bbej...@gmail.com
> > >
> > >> Date: 5/8/18  2:48 PM  (GMT-08:00) To: dev@kafka.apache.org Subject:
> > Re:
> > >> [DISCUSS] KIP-295: Add Streams Configuration Allowing for Optional
> > Topology
> > >> Optimization
> > >>> Thanks for the comments Guozhang and Ted.
> > >>>
> > >>> Guozhang:
> > >>>
> > >>>   1) I'll update the KIP in the "Compatibility, Deprecation and
> > Migration
> > >>> Plan" with the expected impact of turning on optimization. But at
> this
> > >>> point, I have not identified a migration plan that doesn't involve
> > having
> > >>> to stop all instances and restart.
> > >>>
> > >>>   2) Setting the type to String was just so we could have the default
> > of
> > >>> null, indicating run no optimizations. As for partially enabling
> > >>> optimizations, I'm not sure I had that in mind, at least at this
> point.
> > >>>  To me having the topology optimized should be an "all or nothing"
> > >>> proposition.  For now, I'll change the type to boolean (with a
> default
> > >>> value of false) to better reflect the intent of the configuration.
> > >>>
> > >>> Ted, thanks again for the comments.
> > >>>
> > >>> The intent of the new configuration, as I mentioned above, is whether
> > to
> > >>> turn optimization on or off in aggregate.  The main reason for having
> > the
> > >>> configuration is for backward compatibility.  As optimization may
> > result
> > >> in
> > >>> a slightly different topology from the original one, we need to allow
> > >> users
> > >>> to turn it off until they are ready for migrating to the new
> topology.
> > >>>
> > >>> I don't think having to select each optimization is a feasible
> > >> solution.  I
> > >>> say this for few reasons:
> > >>>
> > >>>1. Maintenance will be an issue.  While the initial target is
> only a
> > >>>small number of optimizations, bu

[jira] [Created] (KAFKA-6905) Document that Processor objects can be reused

2018-05-15 Thread David Glasser (JIRA)
David Glasser created KAFKA-6905:


 Summary: Document that Processor objects can be reused
 Key: KAFKA-6905
 URL: https://issues.apache.org/jira/browse/KAFKA-6905
 Project: Kafka
  Issue Type: Improvement
  Components: documentation, streams
Reporter: David Glasser


We learned the hard way that Kafka Streams will reuse Processor objects by 
calling init() on them after they've been close()d.  This caused a bug in our 
application as we assumed we didn't have to reset all of our Processor's state 
to a proper starting state on init().
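
For example, a processor that resets its per-instance state inside init() (a sketch, not
code from our application) stays correct under this reuse:

import org.apache.kafka.streams.processor.AbstractProcessor;
import org.apache.kafka.streams.processor.ProcessorContext;

public class CountingProcessor extends AbstractProcessor<String, String> {
    private long count;

    @Override
    public void init(ProcessorContext context) {
        super.init(context);
        // Reset state here rather than only in the constructor, because the
        // framework may call init() again on this same object after close().
        this.count = 0;
    }

    @Override
    public void process(String key, String value) {
        count++;
        context().forward(key, count);
    }
}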

As far as I can tell, this is completely undocumented. The fact that we provide 
Processors to Kafka Streams via a ProcessorSupplier factory rather than just by 
passing in a Processor object made it seem likely that in fact Streams was 
creating Processors from scratch each time it needed a new one.

The developer guide
(https://docs.confluent.io/current/streams/developer-guide/processor-api.html)
doesn't even allude to the existence of the close() method, let alone the idea
that init() may be called after close().

The Javadocs for Processor.init says: "The framework ensures this is called 
once per processor when the topology that contains it is initialized."  I 
personally interpreted that as meaning that it only is ever called once!  I can 
see that you could interpret it otherwise, but it's definitely unclear.

I can send a PR but first want to confirm that this is a doc problem and not a 
bug!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] KIP-298: Error Handling in Connect

2018-05-15 Thread Magesh Nandakumar
Hi Arjun,

I think this a great KIP and would be a great addition to have in connect.
Had a couple of minor questions:

1. What would be the value in logging the connector config using
errors.log.include.configs
for every message?
2. Not being picky on format here but it might be clearer if the behavior
is called out for each stage separately and what the connector developers
need to do ( may be a tabular format). Also, I think all retriable
exception when talking to Broker are never propagated to the Connect
Framework since the producer is configured to try indefinitely
3. If a message fails in serialization, would the raw bytes be available to
the dlq or the error log
4. Its not necessary to mention in KIP, but it might be better to separate
the error records to a separate log file as part of the default log4j
properties
5. If we disable message logging, would there be any other metadata
available like offset that helps reference the record?
6. If I need an error handler for all my connectors, would I have to set it up
for each of them? I would think most people might want the behavior applied
to all the connectors.

Let me know your thoughts :).
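
For reference, a sketch of the kind of per-connector error-handling configuration the
KIP proposes (property names are taken from the KIP draft and may still change):

errors.retry.timeout=30000
errors.tolerance=all
errors.log.enable=true
errors.log.include.messages=true
errors.deadletterqueue.topic.name=my-connector-errors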

Thanks
Magesh

On Tue, May 8, 2018 at 11:59 PM, Arjun Satish 
wrote:

> All,
>
> I'd like to start a discussion on adding ways to handle and report record
> processing errors in Connect. Please find a KIP here:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 298%3A+Error+Handling+in+Connect
>
> Any feedback will be highly appreciated.
>
> Thanks very much,
> Arjun
>


Re: [DISCUSS] KIP-295: Add Streams Configuration Allowing for Optional Topology Optimization

2018-05-15 Thread Bill Bejeck
Thanks for the comments, John.  I've updated the config name on the KIP..

-Bill


On Tue, May 15, 2018 at 2:22 PM, John Roesler  wrote:

> Hi Bill,
>
> Thanks for the KIP. Now that we're using strings describing the "set of
> optimizations", such as "none" and "all", should we change the config name
> to just "topology.optimizations"?
>
> The "enable." feels like a holdover from the boolean-valued config.
>
> Thanks,
> -John
>
> On Tue, May 8, 2018 at 9:13 PM, Bill Bejeck  wrote:
>
> > Thanks, Guozhang and Matthias for the comments.
> >
> > I was thinking of changing the config type back to a String and enforcing
> > the values to be "true" or "false", but "none" or "all" is just as good.
> >
> > Since those values seem to work, I'll update the KIP accordingly.
> >
> > Thanks,
> > Bill
> >
> > On Tue, May 8, 2018 at 9:38 PM, Matthias J. Sax 
> > wrote:
> >
> > > Sounds good to me.
> > >
> > > On 5/8/18 5:47 PM, Guozhang Wang wrote:
> > > > Thanks Matthias.
> > > >
> > > > I was also thinking about whether in the future we'd want to enable
> > > > optimizations at different levels that may or may not impact
> > > compatibility.
> > > > That's why I asked if we have thought about "allowing part of the
> > > > optimizations in the future".
> > > >
> > > > With that in mind, I'd change my preference and take string typed
> > config.
> > > > Even if we ended up with no finer grained optimizations in the future
> > we
> > > > can still have the string typed parameter with only two allowed
> values,
> > > > like what we did for EOS. But I think in 2.0 allowing any not-null
> > string
> > > > values as enabled is still a bit odd, so how about we make two string
> > > > values, like `none` (default value) and `all`?
> > > >
> > > >
> > > > Guozhang
> > > >
> > > >
> > > > On Tue, May 8, 2018 at 3:35 PM, Matthias J. Sax <
> matth...@confluent.io
> > >
> > > > wrote:
> > > >
> > > >> One thought I want to bring up about switching optimization on/off:
> > > >>
> > > >> While for the initial release, a boolean flag seems to be
> sufficient,
> > I
> > > >> could imagine that we apply different and potentially
> > > >> upgrade-incompatible optimizations in future releases. Thus, to me
> it
> > > >> would make sense to use a String type, to indicate what
> optimizations
> > > >> are possible based on the release. For example, in next release we
> > > >> accept `null` for disabled and "2.0". If there are any incompatible
> > > >> changes, people can stay with "2.0" optimizations level when
> upgrading
> > > >> to "2.1" while new applications can use "2.1" optimization level.
> Old
> > > >> applications would need to do an offline upgrade to get "2.1"
> > > >> optimizations.
> > > >>
> > > >> I agree with Bill, that switching individual optimizations on/off is
> > too
> > > >> fine grained and hard to maintain. However, for compatibility, it
> > might
> > > >> make sense, to have certain "levels of optimizations" (based on the
> > > >> release) that allow users to stay with on an older level for upgrade
> > > >> purpose. Using the release numbers to encode those "levels" is easy
> to
> > > >> understand for users and should minimize the mental effort to get
> the
> > > >> config right.
> > > >>
> > > >> Let me know what you think about this.
> > > >>
> > > >>
> > > >> -Matthias
> > > >>
> > > >> On 5/8/18 3:08 PM, Ted Yu wrote:
> > > >>> Bill:That makes sense.
> > > >>> Using boolean should suffice.
> > > >>>  Original message From: Bill Bejeck <
> > bbej...@gmail.com
> > > >
> > > >> Date: 5/8/18  2:48 PM  (GMT-08:00) To: dev@kafka.apache.org
> Subject:
> > > Re:
> > > >> [DISCUSS] KIP-295: Add Streams Configuration Allowing for Optional
> > > Topology
> > > >> Optimization
> > > >>> Thanks for the comments Guozhang and Ted.
> > > >>>
> > > >>> Guozhang:
> > > >>>
> > > >>>   1) I'll update the KIP in the "Compatibility, Deprecation and
> > > Migration
> > > >>> Plan" with the expected impact of turning on optimization. But at
> > this
> > > >>> point, I have not identified a migration plan that doesn't involve
> > > having
> > > >>> to stop all instances and restart.
> > > >>>
> > > >>>   2) Setting the type to String was just so we could have the
> default
> > > of
> > > >>> null, indicating run no optimizations. As for partially enabling
> > > >>> optimizations, I'm not sure I had that in mind, at least at this
> > point.
> > > >>>  To me having the topology optimized should be an "all or nothing"
> > > >>> proposition.  For now, I'll change the type to boolean (with a
> > default
> > > >>> value of false) to better reflect the intent of the configuration.
> > > >>>
> > > >>> Ted, thanks again for the comments.
> > > >>>
> > > >>> The intent of the new configuration, as I mentioned above, is
> whether
> > > to
> > > >>> turn optimization on or off in aggregate.  The main reason for
> having
> > > the
> > > >>> configuration is for backward compatibility.  As optimization may
> > > result
> > > >> in
> >

Re: [DISCUSS] KIP-295: Add Streams Configuration Allowing for Optional Topology Optimization

2018-05-15 Thread Bill Bejeck
At this point, I think the overall discussion on the new config is
completed and I'll start a voting thread.
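
For reference, enabling the new config from an application would look roughly
like the sketch below; the key name and the "none"/"all" values are the ones
discussed in this thread and may still change before the KIP is final.

import java.util.Properties;

import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class OptimizedTopologyExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "kip-295-sketch");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Placeholder key pending the final KIP-295 name ("topology.optimizations"
        // was also floated in this thread); "none" is the default, "all" enables
        // every optimization the library can apply.
        props.put("topology.optimization", "all");

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("input-topic").to("output-topic");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
    }
}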

Thanks,
Bill



On Tue, May 15, 2018 at 4:37 PM, Bill Bejeck  wrote:

> Thanks for the comments, John.  I've updated the config name on the KIP..
>
> -Bill
>
>
> On Tue, May 15, 2018 at 2:22 PM, John Roesler  wrote:
>
>> Hi Bill,
>>
>> Thanks for the KIP. Now that we're using strings describing the "set of
>> optimizations", such as "none" and "all", should we change the config name
>> to just "topology.optimizations"?
>>
>> The "enable." feels like a holdover from the boolean-valued config.
>>
>> Thanks,
>> -John
>>
>> On Tue, May 8, 2018 at 9:13 PM, Bill Bejeck  wrote:
>>
>> > Thanks, Guozhang and Matthias for the comments.
>> >
>> > I was thinking of changing the config type back to a String and
>> enforcing
>> > the values to be "true" or "false", but "none" or "all" is just as good.
>> >
>> > Since those values seem to work, I'll update the KIP accordingly.
>> >
>> > Thanks,
>> > Bill
>> >
>> > On Tue, May 8, 2018 at 9:38 PM, Matthias J. Sax 
>> > wrote:
>> >
>> > > Sounds good to me.
>> > >
>> > > On 5/8/18 5:47 PM, Guozhang Wang wrote:
>> > > > Thanks Matthias.
>> > > >
>> > > > I was also thinking about whether in the future we'd want to enable
>> > > > optimizations at different levels that may or may not impact
>> > > compatibility.
>> > > > That's why I asked if we have thought about "allowing part of the
>> > > > optimizations in the future".
>> > > >
>> > > > With that in mind, I'd change my preference and take string typed
>> > config.
>> > > > Even if we ended up with no finer grained optimizations in the
>> future
>> > we
>> > > > can still have the string typed parameter with only two allowed
>> values,
>> > > > like what we did for EOS. But I think in 2.0 allowing any not-null
>> > string
>> > > > values as enabled is still a bit odd, so how about we make two
>> string
>> > > > values, like `none` (default value) and `all`?
>> > > >
>> > > >
>> > > > Guozhang
>> > > >
>> > > >
>> > > > On Tue, May 8, 2018 at 3:35 PM, Matthias J. Sax <
>> matth...@confluent.io
>> > >
>> > > > wrote:
>> > > >
>> > > >> One thought I want to bring up about switching optimization on/off:
>> > > >>
>> > > >> While for the initial release, a boolean flag seems to be
>> sufficient,
>> > I
>> > > >> could imagine that we apply different and potentially
>> > > >> upgrade-incompatible optimizations in future releases. Thus, to me
>> it
>> > > >> would make sense to use a String type, to indicate what
>> optimizations
>> > > >> are possible based on the release. For example, in next release we
>> > > >> accept `null` for disabled and "2.0". If there are any incompatible
>> > > >> changes, people can stay with "2.0" optimizations level when
>> upgrading
>> > > >> to "2.1" while new applications can use "2.1" optimization level.
>> Old
>> > > >> applications would need to do an offline upgrade to get "2.1"
>> > > >> optimizations.
>> > > >>
>> > > >> I agree with Bill, that switching individual optimizations on/off
>> is
>> > too
>> > > >> fine grained and hard to maintain. However, for compatibility, it
>> > might
>> > > >> make sense, to have certain "levels of optimizations" (based on the
>> > > >> release) that allow users to stay with on an older level for
>> upgrade
>> > > >> purpose. Using the release numbers to encode those "levels" is
>> easy to
>> > > >> understand for users and should minimize the mental effort to get
>> the
>> > > >> config right.
>> > > >>
>> > > >> Let me know what you think about this.
>> > > >>
>> > > >>
>> > > >> -Matthias
>> > > >>
>> > > >> On 5/8/18 3:08 PM, Ted Yu wrote:
>> > > >>> Bill:That makes sense.
>> > > >>> Using boolean should suffice.
>> > > >>>  Original message From: Bill Bejeck <
>> > bbej...@gmail.com
>> > > >
>> > > >> Date: 5/8/18  2:48 PM  (GMT-08:00) To: dev@kafka.apache.org
>> Subject:
>> > > Re:
>> > > >> [DISCUSS] KIP-295: Add Streams Configuration Allowing for Optional
>> > > Topology
>> > > >> Optimization
>> > > >>> Thanks for the comments Guozhang and Ted.
>> > > >>>
>> > > >>> Guozhang:
>> > > >>>
>> > > >>>   1) I'll update the KIP in the "Compatibility, Deprecation and
>> > > Migration
>> > > >>> Plan" with the expected impact of turning on optimization. But at
>> > this
>> > > >>> point, I have not identified a migration plan that doesn't involve
>> > > having
>> > > >>> to stop all instances and restart.
>> > > >>>
>> > > >>>   2) Setting the type to String was just so we could have the
>> default
>> > > of
>> > > >>> null, indicating run no optimizations. As for partially enabling
>> > > >>> optimizations, I'm not sure I had that in mind, at least at this
>> > point.
>> > > >>>  To me having the topology optimized should be an "all or nothing"
>> > > >>> proposition.  For now, I'll change the type to boolean (with a
>> > default
>> > > >>> value of false) to better reflect the intent of the configuration.
>> > > >>>
>> > > >>>

[VOTE] KIP-295: Add Streams Configuration Allowing for Optional Topology Optimization

2018-05-15 Thread Bill Bejeck
Hi all,

I'd like to start a vote on KIP-295: Add Streams Configuration Allowing for
Optional Topology Optimization.

KIP wiki page:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-295%3A+Add+Streams+Configuration+Allowing+for+Optional+Topology+Optimization

Discussion thread:
https://www.mail-archive.com/dev@kafka.apache.org/msg87593.html

Thanks,
Bill


Re: [VOTE] KIP-295: Add Streams Configuration Allowing for Optional Topology Optimization

2018-05-15 Thread Matthias J. Sax
+1 (binding)


On 5/15/18 1:45 PM, Bill Bejeck wrote:
> Hi all,
> 
> I'd like to start a vote on KIP-295: Add Streams Configuration Allowing for
> Optional Topology Optimization.
> 
> KIP wiki page:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-295%3A+Add+Streams+Configuration+Allowing+for+Optional+Topology+Optimization
> 
> Discussion thread:
> https://www.mail-archive.com/dev@kafka.apache.org/msg87593.html
> 
> Thanks,
> Bill
> 



signature.asc
Description: OpenPGP digital signature


Re: [EXTERNAL] Kafka Connect: New Kafka Source Connector

2018-05-15 Thread McCaig, Rhys
Hi Team,

Would someone be able to provide me with Confluence permissions so that I can 
write a KIP for the code below?

User: https://cwiki.apache.org/confluence/display/~mccaig

Cheers,
Rhys

On May 11, 2018, at 4:45 PM, McCaig, Rhys 
mailto:rhys_mcc...@comcast.com>> wrote:

Hi there,

Over at Comcast we just open sourced a Kafka source connector for Kafka 
Connect. (https://github.com/Comcast/MirrorTool-for-Kafka-Connect) We’ve used 
this as an alternative to MirrorMaker on a couple of projects.
While discussing open sourcing the project, we realized that the functionality 
is similar to a connector that was suggested in the original Kafka Connect KIP 
(https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=58851767#KIP-26-AddKafkaConnectframeworkfordataimport/export-MirrorMaker).

Given this - we were wondering if there would be interest from the Kafka 
community in adopting the connector into the main Kafka codebase. We’d be more 
than happy to donate the code and help get it integrated.

Cheers,
Rhys McCaig



[jira] [Resolved] (KAFKA-6896) add producer metrics exporting in KafkaStreams.java

2018-05-15 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-6896.
--
Resolution: Fixed

> add producer metrics exporting in KafkaStreams.java
> ---
>
> Key: KAFKA-6896
> URL: https://issues.apache.org/jira/browse/KAFKA-6896
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Boyang Chen
>Assignee: Boyang Chen
>Priority: Major
>
> We would like to also export the producer metrics from {{StreamThread}}, just 
> like the consumer metrics, so that we can gain more visibility into the stream 
> application. The approach is to pass the {{threadProducer}} into the 
> StreamThread so that we can export its metrics dynamically.
> Note that this is a pure internal change that doesn't require a KIP, and in 
> the future we also want to export admin client metrics. A followup KIP for 
> admin client will be created once this is merged.
> Pull request here: https://github.com/apache/kafka/pull/4998



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] KIP-295: Add Streams Configuration Allowing for Optional Topology Optimization

2018-05-15 Thread Guozhang Wang
+1 (binding).

On Tue, May 15, 2018 at 2:16 PM, Matthias J. Sax 
wrote:

> +1 (binding)
>
>
> On 5/15/18 1:45 PM, Bill Bejeck wrote:
> > Hi all,
> >
> > I'd like to start a vote on KIP-295: Add Streams Configuration Allowing
> for
> > Optional Topology Optimization.
> >
> > KIP wiki page:
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 295%3A+Add+Streams+Configuration+Allowing+for+
> Optional+Topology+Optimization
> >
> > Discussion thread:
> > https://www.mail-archive.com/dev@kafka.apache.org/msg87593.html
> >
> > Thanks,
> > Bill
> >
>
>


-- 
-- Guozhang


Re: [DISCUSS] KIP-301 Schema Inferencing in JsonConverters

2018-05-15 Thread Allen Tang
I've gone through several iterations of back-and-forth with @rhauch on the
PR and on Confluent's Slack community. The current thinking is that assuming
an empty array is a String array is not necessarily the best option, nor is
assuming that all null values in a JSON node are Strings.

We might be able to account for these potentially false
assumptions/inferences by introducing new task properties (with a
value.converter prefix) that either explicitly define overrides for
specific JSON field keys, or give Kafka Connect users the option to
provide a full immutable schema they know is correct for the topics impacted
by the Sink Connector.
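
To make the idea concrete, a purely hypothetical connector config could look
like the following -- none of these override keys exist today, they only
sketch the shape of the proposal:

value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=false
# Hypothetical, illustrative-only keys: pin the inferred type of specific
# JSON fields instead of guessing from the first value seen.
value.converter.schema.inference.enabled=true
value.converter.schema.inference.field.overrides=tags:array<string>,lastLogin:string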

What do you think?

- Allen


On Mon, May 14, 2018 at 2:58 PM, Allen Tang  wrote:

> Hi,
>
> I just opened a KIP to add Schema Inferencing in JsonConverters for Kafka 
> Connect.
>
> The details of the proposal can be found here: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-301%3A+Schema+Inferencing+for+JsonConverter
>
> Also, I have created a -
>
> 1.) JIRA ticket: https://issues.apache.org/jira/browse/KAFKA-6895
>
> 2.) Provisional PR with initial discussion: 
> https://github.com/apache/kafka/pull/5001
>
> Looking forward to the community's feedback! Cheers!
>
> -Allen
>
>


Re: [DISCUSS] KIP-303: Add Dynamic Routing in Streams Sink

2018-05-15 Thread Matthias J. Sax
Thanks for the KIP.

+1

@John: compare
https://cwiki.apache.org/confluence/display/KAFKA/KIP-100+-+Relax+Type+constraints+in+Kafka+Streams+API
about the generics


-Matthias

On 5/15/18 11:19 AM, John Roesler wrote:
> Thanks for the KIP, Guozhang.
> 
> It looks good overall to me; I just have one question:
> * Why do we bound the generics of KVMapper in KStream to be superclass-of-K
> and superclass-of-V instead of exactly K and V, as in Topology? I might be
> thinking about it wrong, but that seems backwards to me. If anything,
> bounding to be a subclass-of-K or subclass-of-V would seem right to me.
> 
> One extra thought: I agree that KVMapper is an
> applicable type for extracting the topic name, but I wonder what the value
> of reusing the KVMapper for this purpose is. Would defining a new class,
> say TopicNameExtractor, provide the same functionality while being a
> bit more self-documenting?
> 
> Thanks,
> -John
> 
> On Tue, May 15, 2018 at 12:32 AM, Guozhang Wang  wrote:
> 
>> Hello folks,
>>
>> I'd like to start a discussion on adding dynamic routing functionality in
>> Streams sink node. I.e. users do not need to specify the topic name at
>> compilation time but can dynamically determine which topic to send to based
>> on each record's key value pairs. Please find a KIP here:
>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> 303%3A+Add+Dynamic+Routing+in+Streams+Sink
>>
>> Any feedbacks are highly appreciated.
>>
>> Thanks!
>>
>> -- Guozhang
>>
> 



signature.asc
Description: OpenPGP digital signature


Re: [DISCUSS] KIP-303: Add Dynamic Routing in Streams Sink

2018-05-15 Thread Guozhang Wang
Hello John:

* As for the type superclass, it is mainly for allowing super class serdes.
More details can be found here:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-100+-+Relax+Type+constraints+in+Kafka+Streams+API

* I may have a slight preference for reusing existing classes, but I think most
of my rationales are quite subjective. Personally I do not find `self
documenting` worth a new class, but I can be convinced if people have other
motivations for doing it :)


Guozhang


On Tue, May 15, 2018 at 11:19 AM, John Roesler  wrote:

> Thanks for the KIP, Guozhang.
>
> It looks good overall to me; I just have one question:
> * Why do we bound the generics of KVMapper in KStream to be superclass-of-K
> and superclass-of-V instead of exactly K and V, as in Topology? I might be
> thinking about it wrong, but that seems backwards to me. If anything,
> bounding to be a subclass-of-K or subclass-of-V would seem right to me.
>
> One extra thought: I agree that KVMapper is an
> applicable type for extracting the topic name, but I wonder what the value
> of reusing the KVMapper for this purpose is. Would defining a new class,
> say TopicNameExtractor, provide the same functionality while being a
> bit more self-documenting?
>
> Thanks,
> -John
>
> On Tue, May 15, 2018 at 12:32 AM, Guozhang Wang 
> wrote:
>
> > Hello folks,
> >
> > I'd like to start a discussion on adding dynamic routing functionality in
> > Streams sink node. I.e. users do not need to specify the topic name at
> > compilation time but can dynamically determine which topic to send to
> based
> > on each record's key value pairs. Please find a KIP here:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 303%3A+Add+Dynamic+Routing+in+Streams+Sink
> >
> > Any feedbacks are highly appreciated.
> >
> > Thanks!
> >
> > -- Guozhang
> >
>



-- 
-- Guozhang


Re: [VOTE] KIP-295: Add Streams Configuration Allowing for Optional Topology Optimization

2018-05-15 Thread Ted Yu
+1
 Original message From: Guozhang Wang  
Date: 5/15/18  2:34 PM  (GMT-08:00) To: dev@kafka.apache.org Subject: Re: 
[VOTE] KIP-295: Add Streams Configuration Allowing for Optional Topology 
Optimization 
+1 (binding).

On Tue, May 15, 2018 at 2:16 PM, Matthias J. Sax 
wrote:

> +1 (binding)
>
>
> On 5/15/18 1:45 PM, Bill Bejeck wrote:
> > Hi all,
> >
> > I'd like to start a vote on KIP-295: Add Streams Configuration Allowing
> for
> > Optional Topology Optimization.
> >
> > KIP wiki page:
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 295%3A+Add+Streams+Configuration+Allowing+for+
> Optional+Topology+Optimization
> >
> > Discussion thread:
> > https://www.mail-archive.com/dev@kafka.apache.org/msg87593.html
> >
> > Thanks,
> > Bill
> >
>
>


-- 
-- Guozhang


[jira] [Created] (KAFKA-6906) Kafka Streams does not commit transactions if data is produced via wall-clock punctuation

2018-05-15 Thread Matthias J. Sax (JIRA)
Matthias J. Sax created KAFKA-6906:
--

 Summary: Kafka Streams does not commit transactions if data is 
produced via wall-clock punctuation
 Key: KAFKA-6906
 URL: https://issues.apache.org/jira/browse/KAFKA-6906
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 1.1.0
Reporter: Matthias J. Sax


Committing in Kafka Streams happens in regular intervals. However, committing 
only happens if new input records got processed since the last commit (via 
setting flag `commitOffsetNeeded` within `StreamTask#process()`)

However, data could also be emitted via wall-clock based punctuation calls. 
Especially if EOS is enabled, this is an issue (maybe also for non-EOS) because 
the current running transaction is not committed and thus might time out 
leading to a fatal error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] KIP-297: Externalizing Secrets for Connect Configurations

2018-05-15 Thread Colin McCabe
Hi Robert,

Thanks for posting this.  In the past we've been kind of reluctant to add more 
complexity to configuration.  I think Connect does have a clear need for this 
kind of functionality, though.  As you mention, Connect integrates with 
external systems, which are very likely to have passwords stored in Vault, 
KeyWhiz or some other external system.

The KIP says that "Vault is very popular and has been described as 'the current 
gold standard in secret management and provisioning'."  I think this might be a 
bit too much detail -- we don't really need to pick favorites, right? :)

I think we should make configuration consistent between the broker and Connect. 
 If people can use constructs like 
jdbc.config.key="${vault:jdbc.user}${vault:jdbc.password}" in Connect, they'll 
want to do it on the broker too, in a consistent way.

If I understand correctly, ConfigProvider represents an external configuration 
source, such as VaultConfigProvider, KeyWhizConfigProvider, etc.

I think we should make the substitution part of the generic configuration code, 
rather than specific to individual ConfigProviders.  We don't really want it to 
work differently for Vault vs. KeyWhiz vs. AWS secrets, etc. etc.

We should also spell out exactly how substitution works.  For example, is 
substitution limited to 1 level deep?  In other words, If I have foo="${bar}" 
and bar=${baz}, probably foo should just be set equal to "${baz}" rather than 
chasing more than one level of indirection.

We should also spell out how this interacts with KIP-226 configurations.  I 
would suggest that KIP-226 variables not be subjected to substitution.  The 
reason is because in theory substitution could lead to different results on 
different brokers, since the different brokers may not have the same 
ConfigProviders configured.  Also, having substitutions in the KIP-226 
configuration makes it more difficult for the admin to understand what the 
centrally managed configuration is.

It seems the main goal is the ability to load a batch of key/value pairs from 
the ConfigProvider, and the ability to subscribe to notifications about changes 
to certain parameters.  Maybe a good generic interface would be like this:

> public interface ConfigProvider extends Closeable {
>     // batched get is potentially more efficient
>     Map<String, String> get(Collection<String> keys);
>
>     // The ConfigProvider is responsible for making this callback whenever the key changes.
>     // Some ConfigProviders may want to have a background thread with a configurable update interval.
>     void subscribe(String key, ConfigurationChangeCallback callback);
>
>     // Inverse of subscribe
>     void unsubscribe(String key);
>
>     // Close all subscriptions and clean up all resources
>     void close();
> }
>
> interface ConfigurationChangeCallback {
>     void onChange(String key, String value);
> }
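
For concreteness, a minimal in-memory sketch of that interface could look like
the code below; a real provider would talk to Vault, KeyWhiz, etc. rather than
a HashMap, and the extra put() method exists only to drive the callback:

import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

public class InMemoryConfigProvider implements ConfigProvider {
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, ConfigurationChangeCallback> subscriptions = new HashMap<>();

    @Override
    public Map<String, String> get(Collection<String> keys) {
        // batched lookup against the in-memory "secret store"
        Map<String, String> result = new HashMap<>();
        for (String key : keys) {
            String value = values.get(key);
            if (value != null) {
                result.put(key, value);
            }
        }
        return result;
    }

    @Override
    public void subscribe(String key, ConfigurationChangeCallback callback) {
        subscriptions.put(key, callback);
    }

    @Override
    public void unsubscribe(String key) {
        subscriptions.remove(key);
    }

    // Not part of the proposed interface: update a value and notify the
    // subscriber, if any (a real provider would do this from a refresh thread).
    public void put(String key, String value) {
        values.put(key, value);
        ConfigurationChangeCallback callback = subscriptions.get(key);
        if (callback != null) {
            callback.onChange(key, value);
        }
    }

    @Override
    public void close() {
        subscriptions.clear();
        values.clear();
    }
}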

With regard to ConfigTransformer: do we need to include all this code in the 
KIP?  Seems like an implementation detail.

> Other connectors such as the S3 connector are tightly coupled with a 
> particular secret manager, and may
> wish to handle rotation on their own.  

Is there a way to avoid this coupling?  Seems like some users might want to use 
their own secret manager here.

best,
Colin


On Wed, May 9, 2018, at 16:32, Robert Yokota wrote:
> Hi Magesh,
> 
> I updated the KIP with a link to a PR for a working prototype.  The
> prototype does not yet use the Connect plugin machinery for class loader
> isolation, but should give you an idea of what the final implementation
> will look like.  Here is the link:
> https://github.com/apache/kafka/pull/4990/files.
> 
> I also added an example of a FileConfigProvider to the KIP.
> 
> Thanks,
> Robert
> 
> On Wed, May 9, 2018 at 10:04 AM, Robert Yokota  wrote:
> 
> > Hi Magesh,
> >
> > Thanks for the feedback!
> >
> > I will put together a PR to demonstrate what the implementation might look
> > like, as well as a reference FileConfigProvider.
> >
> > 1.  The delayMs for a (potentially) scheduled reload is determined by the
> > ConfigProvider.  For example, a (hypothetical) VaultConfigProvider, upon
> > contacting Vault for a particular secret, might also obtain a lease
> > duration indicating that the secret expires in 1 hour. The
> > VaultConfigProvider could then call scheduleConfigReload with delayMs set
> > to 360ms (1 hour).  This would cause the Connector to restart in an
> > hour, forcing it to reload the configs and re-resolve all indirect
> > references.
> >
> > 2. Yes, the start() methods in SourceTask and SinkTask would get the
> > configs with all the indirect references resolved.   Those config() methods
> > are for Connectors that want to get the latest configs (with all indirect
> > references re-resolved) at some time after start().  For example, if a Task
> > encountered some security exception because a secret expired, it could call
> > config() to get the config with the latest values.  This

Re: [DISCUSS] KIP-297: Externalizing Secrets for Connect Configurations

2018-05-15 Thread Robert Yokota
Hi Colin,

Thanks for the feedback!


> The KIP says that "Vault is very popular and has been described as 'the
> current gold standard in secret management and provisioning'."  I think
> this might be a bit too much detail -- we don't really need to pick
> favorites, right? :)

I've removed this line :)


> I think we should make the substitution part of the generic configuration
> code, rather than specific to individual ConfigProviders.  We don't really
> want it to work differently for Vault vs. KeyWhiz vs.
> AWS secrets, etc. etc.

Yes, the ConfigProviders merely serve up key-value pairs.  A helper class
like ConfigTransformer would perform the variable substitutions if desired.


> We should also spell out exactly how substitution works.

By one-level of indirection I just meant a simple replacement of variables
(which are the indirect references).  So if you have foo=${bar} and
bar=${baz} and your file contains bar=hello, baz=world, then the final
result would be foo=hello and bar=world.  I've added this example to the
KIP.

You can see this as the DEFAULT_PATTERN in the ConfigTransformer.  The
ConfigTransformer only provides one level of indirection.
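
As a rough illustration of that single pass (not the actual ConfigTransformer
code), the replacement could be as simple as:

import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OneLevelSubstitution {
    // Same shape as the pattern discussed above: ${provider:key}
    private static final Pattern VARIABLE = Pattern.compile("\\$\\{([^:}]+):([^}]+)\\}");

    // One pass only: ${provider:key} is replaced by the provider's value for that
    // key, and whatever that value itself contains is not resolved again.
    public static String transform(String raw, Map<String, Map<String, String>> providers) {
        Matcher matcher = VARIABLE.matcher(raw);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            Map<String, String> data = providers.get(matcher.group(1));
            String value = data == null ? null : data.get(matcher.group(2));
            // Leave unresolved references untouched instead of failing.
            String replacement = value == null ? matcher.group(0) : value;
            matcher.appendReplacement(result, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(result);
        return result.toString();
    }
}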


> We should also spell out how this interacts with KIP-226 configurations.

Yes, I mention at the beginning that KIP-226 could use the ConfigProvider
but not the ConfigTransformer.


> Maybe a good generic interface would be like this:

I've added the subscription APIs but would like to keep the other APIs as I
will need them for integration with Vault.  With Vault I obtain the lease
duration at the time the key is obtained, so at that time I would want to
use the lease duration to schedule a configuration reload in the future.
This is similar to how the integration between Vault and the Spring
Framework works.   Also, the lease duration would be included in the
metadata map vs. the data map.  Finally, I need an additional "path" or
"bucket" parameter, which is used by Vault to indicate which set of
key-values are to be retrieved.
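
On the lease-duration point, the scheduling side could be as simple as the
sketch below; the reload hook itself would be whatever mechanism ends up
restarting the Connector:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class LeaseBasedReload {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    // After fetching a secret whose lease lasts leaseDurationMs, re-resolve the
    // configuration (and restart the Connector) slightly before the lease expires.
    public void scheduleConfigReload(long leaseDurationMs, Runnable reloadConfigs) {
        long safetyMarginMs = TimeUnit.SECONDS.toMillis(30);
        long delayMs = Math.max(0L, leaseDurationMs - safetyMarginMs);
        scheduler.schedule(reloadConfigs, delayMs, TimeUnit.MILLISECONDS);
    }
}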


> With regard to ConfigTransformer: do we need to include all this code in
> the KIP?  Seems like an implementation detail.

I use the ConfigTransformer to show how the pattern ${provider:key} is
defined and how the substitution only involves one level of indirection.
If you feel it does not add anything to the text, I can remove it.


> Is there a way to avoid this coupling?

I'd have to look into it and get back to you.  However, I assume that the
answer is not relevant for this KIP :)


Thanks,
Robert





On Tue, May 15, 2018 at 4:04 PM, Colin McCabe  wrote:

> Hi Robert,
>
> Thanks for posting this.  In the past we've been kind of reluctant to add
> more complexity to configuration.  I think Connect does have a clear need
> for this kind of functionality, though.  As you mention, Connect integrates
> with external systems, which are very likely to have passwords stored in
> Vault, KeyWhiz or some other external system.
>
> The KIP says that "Vault is very popular and has been described as 'the
> current gold standard in secret management and provisioning'."  I think
> this might be a bit too much detail -- we don't really need to pick
> favorites, right? :)
>
> I think we should make configuration consistent between the broker and
> Connect.  If people can use constructs like 
> jdbc.config.key="${vault:jdbc.user}${vault:jdbc.password}"
> in Connect, they'll want to do it on the broker too, in a consistent way.
>
> If I understand correctly, ConfigProvider represents an external
> configuration source, such as VaultConfigProvider, KeyWhizConfigProvider,
> etc.
>
> I think we should make the substitution part of the generic configuration
> code, rather than specific to individual ConfigProviders.  We don't really
> want it to work differently for Vault vs. KeyWhiz vs. AWS secrets, etc. etc.
>
> We should also spell out exactly how substitution works.  For example, is
> substitution limited to 1 level deep?  In other words, If I have
> foo="${bar}" and bar=${baz}, probably foo should just be set equal to
> "${baz}" rather than chasing more than one level of indirection.
>
> We should also spell out how this interacts with KIP-226 configurations.
> I would suggest that KIP-226 variables not be subjected to substitution.
> The reason is because in theory substitution could lead to different
> results on different brokers, since the different brokers may not have the
> same ConfigProviders configured.  Also, having substitutions in the KIP-226
> configuration makes it more difficult for the admin to understand what the
> centrally managed configuration is.
>
> It seems the main goal is the ability to load a batch of key/value pairs
> from the ConfigProvider, and the ability to subscribe to notifications
> about changes to certain parameters.  Maybe a good generic interface would
> be like this:
>
>  > public interface ConfigProvider extends Closeable {
> >  // batched get is potentially more efficient
>  > Map get(Collection keys

Build failed in Jenkins: kafka-trunk-jdk10 #107

2018-05-15 Thread Apache Jenkins Server
See 


Changes:

[wangguoz] KAFKA-6896: Add producer metrics exporting in KafkaStreams (#4998)

--
[...truncated 1.50 MB...]
kafka.admin.ResetConsumerGroupOffsetTest > testResetOffsetsToEarliest STARTED

kafka.admin.ResetConsumerGroupOffsetTest > testResetOffsetsToEarliest PASSED

kafka.admin.ResetConsumerGroupOffsetTest > testResetOffsetsExportImportPlan 
STARTED

kafka.admin.ResetConsumerGroupOffsetTest > testResetOffsetsExportImportPlan 
PASSED

kafka.admin.ResetConsumerGroupOffsetTest > testResetOffsetsToSpecificOffset 
STARTED

kafka.admin.ResetConsumerGroupOffsetTest > testResetOffsetsToSpecificOffset 
PASSED

kafka.admin.ResetConsumerGroupOffsetTest > testResetOffsetsShiftPlus STARTED

kafka.admin.ResetConsumerGroupOffsetTest > testResetOffsetsShiftPlus PASSED

kafka.admin.ResetConsumerGroupOffsetTest > testResetOffsetsToLatest STARTED

kafka.admin.ResetConsumerGroupOffsetTest > testResetOffsetsToLatest PASSED

kafka.admin.ResetConsumerGroupOffsetTest > 
testResetOffsetsNewConsumerExistingTopic STARTED

kafka.admin.ResetConsumerGroupOffsetTest > 
testResetOffsetsNewConsumerExistingTopic PASSED

kafka.admin.ResetConsumerGroupOffsetTest > 
testResetOffsetsShiftByLowerThanEarliest STARTED

kafka.admin.ResetConsumerGroupOffsetTest > 
testResetOffsetsShiftByLowerThanEarliest PASSED

kafka.admin.ResetConsumerGroupOffsetTest > testResetOffsetsByDuration STARTED

kafka.admin.ResetConsumerGroupOffsetTest > testResetOffsetsByDuration PASSED

kafka.admin.ResetConsumerGroupOffsetTest > testResetOffsetsToLocalDateTime 
STARTED

kafka.admin.ResetConsumerGroupOffsetTest > testResetOffsetsToLocalDateTime 
PASSED

kafka.admin.ResetConsumerGroupOffsetTest > 
testResetOffsetsToEarliestOnTopicsAndPartitions STARTED

kafka.admin.ResetConsumerGroupOffsetTest > 
testResetOffsetsToEarliestOnTopicsAndPartitions PASSED

kafka.admin.ResetConsumerGroupOffsetTest > testResetOffsetsToEarliestOnTopics 
STARTED

kafka.admin.ResetConsumerGroupOffsetTest > testResetOffsetsToEarliestOnTopics 
PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > shouldFailIfBlankArg STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > shouldFailIfBlankArg PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowVerifyWithoutReassignmentOption STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowVerifyWithoutReassignmentOption PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowGenerateWithoutBrokersOption STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowGenerateWithoutBrokersOption PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowTopicsOptionWithVerify STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowTopicsOptionWithVerify PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowGenerateWithThrottleOption STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowGenerateWithThrottleOption PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > shouldFailIfNoArgs STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > shouldFailIfNoArgs PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowExecuteWithoutReassignmentOption STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowExecuteWithoutReassignmentOption PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowBrokersListWithVerifyOption STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowBrokersListWithVerifyOption PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldCorrectlyParseValidMinimumExecuteOptions STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldCorrectlyParseValidMinimumExecuteOptions PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowGenerateWithReassignmentOption STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowGenerateWithReassignmentOption PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldCorrectlyParseValidMinimumGenerateOptions STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldCorrectlyParseValidMinimumGenerateOptions PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowGenerateWithoutBrokersAndTopicsOptions STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowGenerateWithoutBrokersAndTopicsOptions PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowThrottleWithVerifyOption STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowThrottleWithVerifyOption PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > shouldUseDefaultsIfEnabled 
STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > shouldUseDefaultsIfEnabled 
PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldAllowThrottleOptionOnExecute STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldAllo

Re: [DISCUSS] KIP-298: Error Handling in Connect

2018-05-15 Thread Arjun Satish
Matt,

Thanks so much for your comments. Really appreciate it!

1. Good point about the acronym. I can use deadletterqueue instead of dlq
(using all lowercase to be consistent with the other configs in Kafka).
What do you think?

2. Could you please tell us what errors caused these tasks to fail? Were
they because of external system failures? And if so, could they be
implemented in the Connector itself? Or using retries with backoffs?

3. I like this idea. But did not include it here since it might be a
stretch. One thing to note is that ConnectExceptions can be thrown from a
variety of places in a connector. I think it should be OK for the Connector
to throw RetriableException or something that extends it for the operation
to be retried. By changing this behavior, a lot of existing connectors
would have to be updated so that they don't rewrite messages into this
sink. For example, a sink connector might write some data into the external
system partially, and then fail with a ConnectException. Since the
framework has no way of knowing what was written and what was not, a retry
here might cause the same data to be written again into the sink.
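
To make points 1 and 2 above concrete, a purely illustrative connector config
might look like the snippet below; every errors.* key name here is a
placeholder until the KIP settles the final names:

name=orders-sink-with-error-handling
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
topics=orders

# placeholder names, per the naming discussion above
errors.retries.limit=10
errors.retries.delay.max.ms=60000
errors.deadletterqueue.topic.name=orders-dlq
errors.log.enable=true
errors.log.include.configs=true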

Best,


On Mon, May 14, 2018 at 12:46 PM, Matt Farmer  wrote:

> Hi Arjun,
>
> I'm following this very closely as better error handling in Connect is a
> high priority
> for MailChimp's Data Systems team.
>
> A few thoughts (in no particular order):
>
> For the dead letter queue configuration, could we use deadLetterQueue
> instead of
> dlq? Acronyms are notoriously hard to keep straight in everyone's head and
> unless
> there's a compelling reason it would be nice to use the characters and be
> explicit.
>
> Have you considered any behavior that would periodically attempt to restart
> failed
> tasks after a certain amount of time? To get around our issues internally
> we've
> deployed a tool that monitors for failed tasks and restarts the task by
> hitting the
> REST API after the failure. Such a config would allow us to get rid of this
> tool.
>
> Have you considered a config setting to allow-list additional classes as
> retryable? In the situation we ran into, we were getting ConnectExceptions
> that
> were intermittent due to an unrelated service. With such a setting we could
> have
> deployed a config that temporarily whitelisted that Exception as
> retry-worthy
> and continued attempting to make progress while the other team worked
> on mitigating the problem.
>
> Thanks for the KIP!
>
> On Wed, May 9, 2018 at 2:59 AM, Arjun Satish 
> wrote:
>
> > All,
> >
> > I'd like to start a discussion on adding ways to handle and report record
> > processing errors in Connect. Please find a KIP here:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 298%3A+Error+Handling+in+Connect
> >
> > Any feedback will be highly appreciated.
> >
> > Thanks very much,
> > Arjun
> >
>


Re: [DISCUSS] KIP-303: Add Dynamic Routing in Streams Sink

2018-05-15 Thread Bill Bejeck
Thanks for the KIP Guozhang, it's a +1 for me.

As for re-using the KeyValueMapper for choosing the topic, I am on the
fence: a more explicitly named class would be clearer, but I'm not sure
it's worth a new class that will primarily perform the same actions as the
KeyValueMapper.

Thanks,
Bill

On Tue, May 15, 2018 at 5:52 PM, Guozhang Wang  wrote:

> Hello John:
>
> * As for the type superclass, it is mainly for allowing super class serdes.
> More details can be found here:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 100+-+Relax+Type+constraints+in+Kafka+Streams+API
>
> * I may have slight preference on reusing existing classes but I think most
> of my rationales are quite subjective. Personally I do not find `self
> documenting` worth a new class but I can be convinced if people have other
> motivations doing it :)
>
>
> Guozhang
>
>
> On Tue, May 15, 2018 at 11:19 AM, John Roesler  wrote:
>
> > Thanks for the KIP, Guozhang.
> >
> > It looks good overall to me; I just have one question:
> > * Why do we bound the generics of KVMapper in KStream to be
> superclass-of-K
> > and superclass-of-V instead of exactly K and V, as in Topology? I might
> be
> > thinking about it wrong, but that seems backwards to me. If anything,
> > bounding to be a subclass-of-K or subclass-of-V would seem right to me.
> >
> > One extra thought: I agree that KVMapper is an
> > applicable type for extracting the topic name, but I wonder what the
> value
> > of reusing the KVMapper for this purpose is. Would defining a new class,
> > say TopicNameExtractor, provide the same functionality while being a
> > bit more self-documenting?
> >
> > Thanks,
> > -John
> >
> > On Tue, May 15, 2018 at 12:32 AM, Guozhang Wang 
> > wrote:
> >
> > > Hello folks,
> > >
> > > I'd like to start a discussion on adding dynamic routing functionality
> in
> > > Streams sink node. I.e. users do not need to specify the topic name at
> > > compilation time but can dynamically determine which topic to send to
> > based
> > > on each record's key value pairs. Please find a KIP here:
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 303%3A+Add+Dynamic+Routing+in+Streams+Sink
> > >
> > > Any feedbacks are highly appreciated.
> > >
> > > Thanks!
> > >
> > > -- Guozhang
> > >
> >
>
>
>
> --
> -- Guozhang
>


Re: [DISCUSS] KIP-297: Externalizing Secrets for Connect Configurations

2018-05-15 Thread Ron Dagostino
Hi Robert.  Regarding your comment "use the lease duration to schedule a
configuration reload in the future", you might be interested in the code
that does refresh for OAuth Bearer Tokens in KIP-255; specifically, the
class
org.apache.kafka.common.security.oauthbearer.internal.expiring.ExpiringCredentialRefreshingLogin.
The class performs JAAS logins/relogins based on the expiration time of a
retrieved expiring credential.  The implementation of that class is
inspired by the code that currently does refresh for Kerberos tickets but
is more reusable.  I don't know if you will leverage JAAS for defining how
to go get a credential (you could since you have to provide credentials to
authenticate to the remote systems anyway), but regardless, that class
should be useful at least in some minimal sense if not more than that.  See
https://github.com/apache/kafka/pull/4994.

Ron


On Tue, May 15, 2018 at 8:01 PM, Robert Yokota  wrote:

> Hi Colin,
>
> Thanks for the feedback!
>
>
> > The KIP says that "Vault is very popular and has been described as 'the
> current gold standard in secret management and provisioning'."  I think
> this might be a bit too much detail -- we don't really need to
> > favorites, right? :)
>
> I've removed this line :)
>
>
> > I think we should make the substitution part of the generic configuration
> code, rather than specific to individual ConfigProviders.  We don't really
> want it to work differently for Vault vs. KeyWhiz vs.
> > AWS secrets, etc. etc.
>
> Yes, the ConfigProviders merely serve up key-value pairs.  A helper class
> like ConfigTransformer would perform the variable substitutions if desired.
>
>
> > We should also spell out exactly how substitution works.
>
> By one-level of indirection I just meant a simple replacement of variables
> (which are the indirect references).  So if you have foo=${bar} and
> bar=${baz} and your file contains bar=hello, baz=world, then the final
> result would be foo=hello and bar=world.  I've added this example to the
> KIP.
>
> You can see this as the DEFAULT_PATTERN in the ConfigTransformer.  The
> ConfigTransformer only provides one level of indirection.
>
>
> > We should also spell out how this interacts with KIP-226 configurations.
>
> Yes, I mention at the beginning that KIP-226 could use the ConfigProvider
> but not the ConfigTransformer.
>
>
> > Maybe a good generic interface would be like this:
>
> I've added the subscription APIs but would like to keep the other APIs as I
> will need them for integration with Vault.  With Vault I obtain the lease
> duration at the time the key is obtained, so at that time I would want to
> use the lease duration to schedule a configuration reload in the future.
> This is similar to how the integration between Vault and the Spring
> Framework works.   Also, the lease duration would be included in the
> metadata map vs. the data map.  Finally, I need an additional "path" or
> "bucket" parameter, which is used by Vault to indicate which set of
> key-values are to be retrieved.
>
>
> > With regard to ConfigTransformer: do we need to include all this code in
> the KIP?  Seems like an implementation detail.
>
> I use the ConfigTransformer to show how the pattern ${provider:key} is
> defined and how the substitution only involves one level of indirection.
> If you feel it does not add anything to the text, I can remove it.
>
>
> > Is there a way to avoid this coupling?
>
> I'd have to look into it and get back to you.  However, I assume that the
> answer is not relevant for this KIP :)
>
>
> Thanks,
> Robert
>
>
>
>
>
> On Tue, May 15, 2018 at 4:04 PM, Colin McCabe  wrote:
>
> > Hi Robert,
> >
> > Thanks for posting this.  In the past we've been kind of reluctant to add
> > more complexity to configuration.  I think Connect does have a clear need
> > for this kind of functionality, though.  As you mention, Connect
> integrates
> > with external systems, which are very likely to have passwords stored in
> > Vault, KeyWhiz or some other external system.
> >
> > The KIP says that "Vault is very popular and has been described as 'the
> > current gold standard in secret management and provisioning'."  I think
> > this might be a bit too much detail -- we don't really need to pick
> > favorites, right? :)
> >
> > I think we should make configuration consistent between the broker and
> > Connect.  If people can use constructs like
> jdbc.config.key="${vault:jdbc.user}${vault:jdbc.password}"
> > in Connect, they'll want to do it on the broker too, in a consistent way.
> >
> > If I understand correctly, ConfigProvider represents an external
> > configuration source, such as VaultConfigProvider, KeyWhizConfigProvider,
> > etc.
> >
> > I think we should make the substitution part of the generic configuration
> > code, rather than specific to individual ConfigProviders.  We don't
> really
> > want it to work differently for Vault vs. KeyWhiz vs. AWS secrets, etc.
> etc.
> >
> > We should also spell out exactly how s

Re: [DISCUSS] KIP-297: Externalizing Secrets for Connect Configurations

2018-05-15 Thread Robert Yokota
Thanks, Ron!  I will take a look.

Regards,
Robert

On Tue, May 15, 2018 at 5:59 PM, Ron Dagostino  wrote:

> Hi Robert.  Regarding your comment "use the lease duration to schedule a
> configuration reload in the future", you might be interested in the code
> that does refresh for OAuth Bearer Tokens in KIP-255; specifically, the
> class
> org.apache.kafka.common.security.oauthbearer.internal.expiring.
> ExpiringCredentialRefreshingLogin.
> The class performs JAAS logins/relogins based on the expiration time of a
> retrieved expiring credential.  The implementation of that class is
> inspired by the code that currently does refresh for Kerberos tickets but
> is more reusable.  I don't know if you will leverage JAAS for defining how
> to go get a credential (you could since you have to provide credentials to
> authenticate to the remote systems anyway), but regardless, that class
> should be useful at least in some minimal sense if not more than that.  See
> https://github.com/apache/kafka/pull/4994.
>
> Ron
>
> Ron
>
> On Tue, May 15, 2018 at 8:01 PM, Robert Yokota  wrote:
>
> > Hi Colin,
> >
> > Thanks for the feedback!
> >
> >
> > > The KIP says that "Vault is very popular and has been described as 'the
> > current gold standard in secret management and provisioning'."  I think
> > this might be a bit too much detail -- we don't really need to
> > > favorites, right? :)
> >
> > I've removed this line :)
> >
> >
> > > I think we should make the substitution part of the generic
> configuration
> > code, rather than specific to individual ConfigProviders.  We don't
> really
> > want it to work differently for Vault vs. KeyWhiz vs.
> > > AWS secrets, etc. etc.
> >
> > Yes, the ConfigProviders merely serve up key-value pairs.  A helper class
> > like ConfigTransformer would perform the variable substitutions if
> desired.
> >
> >
> > > We should also spell out exactly how substitution works.
> >
> > By one-level of indirection I just meant a simple replacement of
> variables
> > (which are the indirect references).  So if you have foo=${bar} and
> > bar=${baz} and your file contains bar=hello, baz=world, then the final
> > result would be foo=hello and bar=world.  I've added this example to the
> > KIP.
> >
> > You can see this as the DEFAULT_PATTERN in the ConfigTransformer.  The
> > ConfigTransformer only provides one level of indirection.
> >
> >
> > > We should also spell out how this interacts with KIP-226
> configurations.
> >
> > Yes, I mention at the beginning that KIP-226 could use the ConfigProvider
> > but not the ConfigTransformer.
> >
> >
> > > Maybe a good generic interface would be like this:
> >
> > I've added the subscription APIs but would like to keep the other APIs
> as I
> > will need them for integration with Vault.  With Vault I obtain the lease
> > duration at the time the key is obtained, so at that time I would want to
> > use the lease duration to schedule a configuration reload in the future.
> > This is similar to how the integration between Vault and the Spring
> > Framework works.   Also, the lease duration would be included in the
> > metadata map vs. the data map.  Finally, I need an additional "path" or
> > "bucket" parameter, which is used by Vault to indicate which set of
> > key-values are to be retrieved.
> >
> >
> > > With regard to ConfigTransformer: do we need to include all this code
> in
> > the KIP?  Seems like an implementation detail.
> >
> > I use the ConfigTransformer to show how the pattern ${provider:key} is
> > defined and how the substitution only involves one level of indirection.
> > If you feel it does not add anything to the text, I can remove it.
> >
> >
> > > Is there a way to avoid this coupling?
> >
> > I'd have to look into it and get back to you.  However, I assume that the
> > answer is not relevant for this KIP :)
> >
> >
> > Thanks,
> > Robert
> >
> >
> >
> >
> >
> > On Tue, May 15, 2018 at 4:04 PM, Colin McCabe 
> wrote:
> >
> > > Hi Robert,
> > >
> > > Thanks for posting this.  In the past we've been kind of reluctant to
> add
> > > more complexity to configuration.  I think Connect does have a clear
> need
> > > for this kind of functionality, though.  As you mention, Connect
> > integrates
> > > with external systems, which are very likely to have passwords stored
> in
> > > Vault, KeyWhiz or some other external system.
> > >
> > > The KIP says that "Vault is very popular and has been described as 'the
> > > current gold standard in secret management and provisioning'."  I think
> > > this might be a bit too much detail -- we don't really need to pick
> > > favorites, right? :)
> > >
> > > I think we should make configuration consistent between the broker and
> > > Connect.  If people can use constructs like
> > jdbc.config.key="${vault:jdbc.user}${vault:jdbc.password}"
> > > in Connect, they'll want to do it on the broker too, in a consistent
> way.
> > >
> > > If I understand correctly, ConfigProvider represents an external
> > > configurat

Re: [VOTE] KIP-255: OAuth Authentication via SASL/OAUTHBEARER

2018-05-15 Thread Ron Dagostino
Hi Jun.  I think you are getting at the fact that OAuth 2 is a flexible
framework that allows different installations to do things differently.  It
is true that the principal name in Kafka could come from any claim in the
token.  Most of the time it would come from the 'sub' claim, but it could
certainly come from another claim, or it could be only indirectly based on
a claim value (maybe certain text would be trimmed or prefixed/suffixed).
The point, which I think you are getting at, is that because the framework
is flexible we need to accommodate that flexibility.  The callback handler
class defined by the listener.name.sasl_ssl.oauthbearer.sasl.server.
callback.handler.class configuration value gives us the required
flexibility.  As an example, I have an implementation that leverages a
popular open source JOSE library to parse the compact serialization,
retrieve the public key if it has not yet been retrieved, verify the
digital signature, and map the 'sub' claim to the OAuthBearerToken's
principal name (which becomes the SASL authorization ID, which becomes the
Kafka principal name).  I could just as easily have mapped a different
claim to the OAuthBearerToken's principal name, manipulated the 'sub' claim
value in some way, etc.  I write the callback handler code, so I have complete
flexibility to do whatever my OAuth 2 installation requires me to do.
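
As a small illustration of the unsecured case described above, assuming a
listener named sasl_ssl (only the option name comes from this KIP; the claim
name is just an example):

# Take the Kafka principal from the "preferred_username" claim instead of the
# default "sub" claim.
listener.name.sasl_ssl.oauthbearer.sasl.jaas.config=\
    org.apache.kafka.common.security.oauthbearer.OAuthBearerLoginModule required \
    unsecuredValidatorPrincipalClaimName="preferred_username";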

Ron

On Tue, May 15, 2018 at 1:39 PM, Jun Rao  wrote:

> Hi, Ron,
>
> Thanks for the reply. I understood your answers to #2 and #3.
>
> For #1, will the server map all clients' principal name to the value
> associated with "sub" claim? How do we support mapping different clients to
> different principal names?
>
> Jun
>
> On Mon, May 14, 2018 at 7:02 PM, Ron Dagostino  wrote:
>
> > Hi Jun.  Thanks for the +1 vote.
> >
> > Regarding the first question about token claims, yes, you have it correct
> > about translating the OAuth token to a principal name via a JAAS module
> > option in the default unsecured case.  Specifically, the OAuth SASL
> Server
> > implementation is responsible for setting the authorization ID, and it
> gets
> > the authorization ID from the OAuthBearerToken's principalName() method.
> > The listener.name.sasl_ssl.oauthbearer.sasl.server.
> callback.handler.class
> > is responsible for handling an instance of OAuthBearerValidatorCallback
> to
> > accept a token compact serialization from the client and return an
> instance
> > of OAuthBearerToken (assuming the compact serialization validates), and
> in
> > the default unsecured case the builtin unsecured validator callback
> handler
> > defines the OAuthBearerToken.principalName() method to return the 'sub'
> > claim value by default (with the actual claim it uses being configurable
> > via the unsecuredValidatorPrincipalClaimName JAAS module option).  So
> that
> > is how we translate from a token to a principal name in the default
> > unsecured (out-of-the-box) case.
> >
> > For production use cases, the implementation associated with
> > listener.name.sasl_ssl.oauthbearer.sasl.server.callback.handler.class
> can
> > do whatever it wants.  As an example, I have written a class that wraps a
> > com.nimbusds.jwt.SignedJWT instance (see
> > https://connect2id.com/products/nimbus-jose-jwt/) and presents it as an
> > OAuthBearerToken:
> >
> > public class NimbusSignedJwtOAuthBearerToken implements
> OAuthBearerToken {
> > private final SignedJWT signedJwt;
> > private final String principalName;
> > private final Set<String> scope;
> > private final Long startTimeMs;
> > private final long lifetimeMs;
> >
> > /**
> >  * Constructor
> >  *
> >  * @param signedJwt
> >  *the mandatory signed JWT
> >  * @param principalClaimName
> >  *the mandatory claim name identifying the claim from
> which
> > the
> >  *principal name will be extracted. The claim must exist
> > and must be
> >  *a String.
> >  * @param optionalScopeClaimName
> >  *the optional claim name identifying the claim from
> which
> > any scope
> >  *will be extracted. If specified and the claim exists
> then
> > the
> >  *value must be either a String or a String List.
> >  * @throws ParseException
> >  * if the principal claim does not exist or is not a
> > String; the
> >  * scope claim is neither a String nor a String List; the
> > 'exp'
> >  * claim does not exist or is not a number; the 'iat'
> claim
> > exists
> >  * but is not a number; or the 'nbf' claim exists and is
> > not a
> >  * number.
> >  */
> > public NimbusSignedJwtOAuthBearerToken(SignedJWT signedJwt, String
> > principalClaimName,
> > String optionalScopeClaimName) throws ParseException {
> > // etc...
> > }
> >
> > The callback handler runs the following code if the digital signature
> > validates:
> >

Re: [DISCUSS] KIP-298: Error Handling in Connect

2018-05-15 Thread Arjun Satish
Magesh,

Thanks for the feedback! Really appreciate your comments.

1. I updated the KIP to state that only the configs of the failed operation
will be emitted. Thank you!

The purpose of bundling the configs of the failed operation along with the
error context is to have a single place to find everything relevant to the
failure. This way, we only need to look at the error logs to find the most
common pieces of the failure puzzle: the operation, the config and the input
record. Ideally, a programmer should be able to take these pieces and
reproduce the error locally.

2. Added a table to describe this in the KIP.

3. Raw bytes will be base64 encoded before being logged. Updated the KIP to
state this. Thank you!

4. I'll add an example log4j config to show we can take logs from a class
and redirect it to a different location. Made a note in the PR for this.

5. When we talk about logging messages, this could mean instances of
SinkRecords or SourceRecords. When we disable logging of messages, these
records would be replaced by a "null". If you think it makes sense, instead
of completely dropping the object, we could drop only the key and value
objects from ConnectRecord? That way some context will still be retained.

6. Yes, for now I think it is good to have explicit config in Connectors
which dictates the error handling behavior. If this becomes an
inconvenience, we can think of having a cluster global default, or better
defaults in the configs.
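
To sketch what I mean for point 4 (the package name for the new error-handling
classes is an assumption until they are final):

log4j.appender.connectErrors=org.apache.log4j.RollingFileAppender
log4j.appender.connectErrors.File=${kafka.logs.dir}/connect-errors.log
log4j.appender.connectErrors.MaxFileSize=10MB
log4j.appender.connectErrors.MaxBackupIndex=5
log4j.appender.connectErrors.layout=org.apache.log4j.PatternLayout
log4j.appender.connectErrors.layout.ConversionPattern=[%d] %p %m (%c)%n

# assumed package; adjust to wherever the error-handling classes land
log4j.logger.org.apache.kafka.connect.runtime.errors=INFO, connectErrors
log4j.additivity.org.apache.kafka.connect.runtime.errors=false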

Best,


On Tue, May 15, 2018 at 1:07 PM, Magesh Nandakumar 
wrote:

> Hi Arjun,
>
> I think this a great KIP and would be a great addition to have in connect.
> Had a couple of minor questions:
>
> 1. What would be the value in logging the connector config using
> errors.log.include.configs
> for every message?
> 2. Not being picky on format here but it might be clearer if the behavior
> is called out for each stage separately and what the connector developers
> need to do ( may be a tabular format). Also, I think all retriable
> exception when talking to Broker are never propagated to the Connect
> Framework since the producer is configured to try indefinitely
> 3. If a message fails in serialization, would the raw bytes be available to
> the dlq or the error log
> 4. Its not necessary to mention in KIP, but it might be better to separate
> the error records to a separate log file as part of the default log4j
> properties
> 5. If we disable message logging, would there be any other metadata
> available like offset that helps reference the record?
> 6. If I need error handler for all my connectors, would I have to set it up
> for each of them? I would think most people might want the behavior applied
> to all the connectors.
>
> Let me know your thoughts :).
>
> Thanks
> Magesh
>
> On Tue, May 8, 2018 at 11:59 PM, Arjun Satish 
> wrote:
>
> > All,
> >
> > I'd like to start a discussion on adding ways to handle and report record
> > processing errors in Connect. Please find a KIP here:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 298%3A+Error+Handling+in+Connect
> >
> > Any feedback will be highly appreciated.
> >
> > Thanks very much,
> > Arjun
> >
>


Re: [DISCUSS] KIP-303: Add Dynamic Routing in Streams Sink

2018-05-15 Thread Matthias J. Sax
Just my 2 cents:

I am fine with `KeyValueMapper` (+1 for code reuse) -- the JavaDocs
will explain what the `KeyValueMapper` is supposed to do, i.e., extract
and return the sink topic name from the key-value pair.
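
For reference, the KeyValueMapper form would read roughly like this from the
DSL -- the to() overload shown here is what KIP-303 proposes, so treat it as
illustrative rather than existing API:

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;

public class DynamicSinkSketch {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();
        // assumes the default serdes are configured as String serdes
        KStream<String, String> events = builder.stream("events");

        // Proposed overload: the mapper returns the sink topic name per record
        // instead of the topic being fixed when the topology is built.
        events.to((key, value) -> value.startsWith("alert:") ? "alerts" : "events-archive",
                Produced.with(Serdes.String(), Serdes.String()));
    }
}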

A side remark though: do we think that accessing key/value is
sufficient? Or should we provide access to the full metadata? We could
also do this with KIP-159 of course -- but this would come earliest in
2.1. As an alternative we could add a `TopicNameExtractor` to expose the
whole record context. The advantage would be, that we don't need to
change it via KIP-159 later again. WDYT?

-Matthias

On 5/15/18 5:57 PM, Bill Bejeck wrote:
> Thanks for the KIP Guozhang, it's a +1 for me.
> 
> As for re-using the KeyValueMapper for choosing the topic, I am on the
> fence: a more explicitly named class would be clearer, but I'm not sure
> it's worth a new class that will primarily perform the same actions as the
> KeyValueMapper.
> 
> Thanks,
> Bill
> 
> On Tue, May 15, 2018 at 5:52 PM, Guozhang Wang  wrote:
> 
>> Hello John:
>>
>> * As for the type superclass, it is mainly for allowing super class serdes.
>> More details can be found here:
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> 100+-+Relax+Type+constraints+in+Kafka+Streams+API
>>
>> * I may have a slight preference for reusing existing classes, but I think
>> most of my rationale is quite subjective. Personally I do not find `self
>> documenting` worth a new class, but I can be convinced if people have other
>> motivations for doing it :)
>>
>>
>> Guozhang
>>
>>
>> On Tue, May 15, 2018 at 11:19 AM, John Roesler  wrote:
>>
>>> Thanks for the KIP, Guozhang.
>>>
>>> It looks good overall to me; I just have one question:
>>> * Why do we bound the generics of KVMapper in KStream to be superclass-of-K
>>> and superclass-of-V instead of exactly K and V, as in Topology? I might be
>>> thinking about it wrong, but that seems backwards to me. If anything,
>>> bounding to be a subclass-of-K or subclass-of-V would seem right to me.
>>>
>>> One extra thought: I agree that KVMapper is an applicable type for
>>> extracting the topic name, but I wonder what the value of reusing the
>>> KVMapper for this purpose is. Would defining a new class, say
>>> TopicNameExtractor, provide the same functionality while being a bit more
>>> self-documenting?
>>>
>>> Thanks,
>>> -John
>>>
>>> On Tue, May 15, 2018 at 12:32 AM, Guozhang Wang 
>>> wrote:
>>>
 Hello folks,

 I'd like to start a discussion on adding dynamic routing functionality in
 Streams sink node. I.e. users do not need to specify the topic name at
 compilation time but can dynamically determine which topic to send to based
 on each record's key value pairs. Please find a KIP here:

 https://cwiki.apache.org/confluence/display/KAFKA/KIP-
 303%3A+Add+Dynamic+Routing+in+Streams+Sink

 Any feedback is highly appreciated.

 Thanks!

 -- Guozhang

>>>
>>
>>
>>
>> --
>> -- Guozhang
>>
> 



signature.asc
Description: OpenPGP digital signature


Build failed in Jenkins: kafka-trunk-jdk8 #2647

2018-05-15 Thread Apache Jenkins Server
See 


Changes:

[github] MINOR: doc change for deprecate removal (#5006)

--
[...truncated 423.06 KB...]

kafka.admin.ResetConsumerGroupOffsetTest > testResetOffsetsShiftPlus STARTED

kafka.admin.ResetConsumerGroupOffsetTest > testResetOffsetsShiftPlus PASSED

kafka.admin.ResetConsumerGroupOffsetTest > testResetOffsetsToLatest STARTED

kafka.admin.ResetConsumerGroupOffsetTest > testResetOffsetsToLatest PASSED

kafka.admin.ResetConsumerGroupOffsetTest > 
testResetOffsetsNewConsumerExistingTopic STARTED

kafka.admin.ResetConsumerGroupOffsetTest > 
testResetOffsetsNewConsumerExistingTopic PASSED

kafka.admin.ResetConsumerGroupOffsetTest > 
testResetOffsetsShiftByLowerThanEarliest STARTED

kafka.admin.ResetConsumerGroupOffsetTest > 
testResetOffsetsShiftByLowerThanEarliest PASSED

kafka.admin.ResetConsumerGroupOffsetTest > testResetOffsetsByDuration STARTED

kafka.admin.ResetConsumerGroupOffsetTest > testResetOffsetsByDuration PASSED

kafka.admin.ResetConsumerGroupOffsetTest > testResetOffsetsToLocalDateTime 
STARTED

kafka.admin.ResetConsumerGroupOffsetTest > testResetOffsetsToLocalDateTime 
PASSED

kafka.admin.ResetConsumerGroupOffsetTest > 
testResetOffsetsToEarliestOnTopicsAndPartitions STARTED

kafka.admin.ResetConsumerGroupOffsetTest > 
testResetOffsetsToEarliestOnTopicsAndPartitions PASSED

kafka.admin.ResetConsumerGroupOffsetTest > testResetOffsetsToEarliestOnTopics 
STARTED

kafka.admin.ResetConsumerGroupOffsetTest > testResetOffsetsToEarliestOnTopics 
PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > shouldFailIfBlankArg STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > shouldFailIfBlankArg PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowVerifyWithoutReassignmentOption STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowVerifyWithoutReassignmentOption PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowGenerateWithoutBrokersOption STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowGenerateWithoutBrokersOption PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowTopicsOptionWithVerify STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowTopicsOptionWithVerify PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowGenerateWithThrottleOption STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowGenerateWithThrottleOption PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > shouldFailIfNoArgs STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > shouldFailIfNoArgs PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowExecuteWithoutReassignmentOption STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowExecuteWithoutReassignmentOption PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowBrokersListWithVerifyOption STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowBrokersListWithVerifyOption PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldCorrectlyParseValidMinimumExecuteOptions STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldCorrectlyParseValidMinimumExecuteOptions PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowGenerateWithReassignmentOption STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowGenerateWithReassignmentOption PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldCorrectlyParseValidMinimumGenerateOptions STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldCorrectlyParseValidMinimumGenerateOptions PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowGenerateWithoutBrokersAndTopicsOptions STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowGenerateWithoutBrokersAndTopicsOptions PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowThrottleWithVerifyOption STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowThrottleWithVerifyOption PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > shouldUseDefaultsIfEnabled 
STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > shouldUseDefaultsIfEnabled 
PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldAllowThrottleOptionOnExecute STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldAllowThrottleOptionOnExecute PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowExecuteWithBrokers STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowExecuteWithBrokers PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowExecuteWithTopicsOption STARTED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldNotAllowExecuteWithTopicsOption PASSED

kafka.admin.ReassignPartitionsCommandArgsTest > 
shouldCorrectlyParseValidMinimumVerifyOptions STARTED

kafka.ad

Re: [DISCUSS] KIP-298: Error Handling in Connect

2018-05-15 Thread Arjun Satish
Magesh,

Just to add to your point about retriable exceptions: the producer can
throw retriable exceptions, which we are handling here:

https://github.com/apache/kafka/blob/trunk/connect/
runtime/src/main/java/org/apache/kafka/connect/runtime/
WorkerSourceTask.java#L275

BTW, exceptions like TimeoutException (which extends RetriableException)
are bubbled back to the application and need to be handled as per the
application's requirements.
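
A minimal, hypothetical sketch of that application-side handling (this is not
Connect framework code, and the method name is made up):

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.errors.RetriableException;

    static void sendWithHandling(KafkaProducer<byte[], byte[]> producer,
                                 ProducerRecord<byte[], byte[]> record) {
        producer.send(record, (metadata, exception) -> {
            if (exception == null) {
                return; // delivered successfully
            }
            if (exception instanceof RetriableException) {
                // e.g. TimeoutException: the producer's own retries are exhausted,
                // so the application decides whether to re-send, log, or fail.
            } else {
                // Non-retriable: fail fast or hand off to an error handler.
            }
        });
    }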

Best,

On Tue, May 15, 2018 at 8:30 PM, Arjun Satish 
wrote:

> Magesh,
>
> Thanks for the feedback! Really appreciate your comments.
>
> 1. I updated the KIP to state that only the configs of the failed
> operation will be emitted. Thank you!
>
> The purpose of bundling the configs of the failed operation along with the
> error context is to have a single place to find everything relevant to the
> failure. This way, we can only look at the error logs to find the most
> common pieces to "failure" puzzles: the operation, the config and the input
> record. Ideally, a programmer should be able to take these pieces and
> reproduce the error locally.
>
> 2. Added a table to describe this in the KIP.
>
> 3. Raw bytes will be base64 encoded before being logged. Updated the KIP
> to state this. Thank you!
>
> 4. I'll add an example log4j config to show we can take logs from a class
> and redirect it to a different location. Made a note in the PR for this.
>
> 5. When we talk about logging messages, this could mean instances of
> SinkRecords or SourceRecords. When we disable logging of messages, these
> records would be replaced by a "null". If you think it makes sense, instead
> of completely dropping the object, we could drop only the key and value
> objects from ConnectRecord? That way some context will still be retained.
>
> 6. Yes, for now I think it is good to have explicit config in Connectors
> which dictates the error handling behavior. If this becomes an
> inconvenience, we can think of having a cluster global default, or better
> defaults in the configs.
>
> Best,
>
>
> On Tue, May 15, 2018 at 1:07 PM, Magesh Nandakumar 
> wrote:
>
>> Hi Arjun,
>>
>> I think this is a great KIP and would be a great addition to have in Connect.
>> Had a couple of minor questions:
>>
>> 1. What would be the value in logging the connector config using
>> errors.log.include.configs
>> for every message?
>> 2. Not being picky on format here, but it might be clearer if the behavior
>> is called out for each stage separately, along with what the connector
>> developers need to do (maybe a tabular format). Also, I think retriable
>> exceptions when talking to the broker are never propagated to the Connect
>> framework, since the producer is configured to retry indefinitely.
>> 3. If a message fails in serialization, would the raw bytes be available
>> to the DLQ or the error log?
>> 4. It's not necessary to mention in the KIP, but it might be better to
>> separate the error records into a separate log file as part of the default
>> log4j properties.
>> 5. If we disable message logging, would there be any other metadata
>> available, like the offset, that helps reference the record?
>> 6. If I need the error handler for all my connectors, would I have to set
>> it up for each of them? I would think most people might want the behavior
>> applied to all the connectors.
>>
>> Let me know your thoughts :).
>>
>> Thanks
>> Magesh
>>
>> On Tue, May 8, 2018 at 11:59 PM, Arjun Satish 
>> wrote:
>>
>> > All,
>> >
>> > I'd like to start a discussion on adding ways to handle and report
>> record
>> > processing errors in Connect. Please find a KIP here:
>> >
>> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> > 298%3A+Error+Handling+in+Connect
>> >
>> > Any feedback will be highly appreciated.
>> >
>> > Thanks very much,
>> > Arjun
>> >
>>
>
>


Jenkins build is back to normal : kafka-trunk-jdk10 #108

2018-05-15 Thread Apache Jenkins Server
See