Re: At Least Once semantics for Kafka Streams

2017-02-03 Thread Mahendra Kariya
Thanks a lot for this Matthias. I have a follow up question. There is a COMMIT_INTERVAL_MS_CONFIG config for streams. This confuses things a little bit. If the value of this config is set to, say 100 ms, does it mean that the offset will be committed after 100 ms? If yes, then how does at least on

Re: Running cluster of stream processing application

2017-02-03 Thread Sachin Mittal
Hello All, I am revisiting this topic as now I am actually configuring a partitioned topic and would like multiple threads of my streams application running on different instances to process this partitioned topic in parallel. So I have once source topic partitioned into 40 partitions. The message

Re: Running cluster of stream processing application

2017-02-03 Thread Damian Guy
Hi Sachin, On 3 February 2017 at 09:07, Sachin Mittal wrote: > > 1. Now what I wanted to know is that for separate machines running same > instance of the streams application, my application.id would be same > right. > That is correct. > If yes then how does kafka cluser know which partition

Re: At Least Once semantics for Kafka Streams

2017-02-03 Thread Damian Guy
Hi Mahendra, The commit is done on the same thread as the processing, so only offsets that have been fully processed by the topology will be committed. Thanks, Damian On Fri, 3 Feb 2017 at 08:40 Mahendra Kariya wrote: > Thanks a lot for this Matthias. > > I have a follow up question. There is

Re: At Least Once semantics for Kafka Streams

2017-02-03 Thread Jan Filipiak
Hey, with a little more effort you can try to make your stream application idempotent. Maybe giving you the same results. Say you want to aggregate a KStream by some key. Instead of keeping the aggregate, you keep a Set of raw values and then do the aggregate calculations with a map(). This

[DISCUSS] KIP-118: Drop Support for Java 7 in Kafka 0.11

2017-02-03 Thread Ismael Juma
Hi all, I have posted a KIP for dropping support for Java 7 in Kafka 0.11: https://cwiki.apache.org/confluence/display/KAFKA/KIP-118%3A+Drop+Support+for+Java+7+in+Kafka+0.11 Most people were supportive when we last discussed the topic[1], but there were a few concerns. I believe the following sh

Re: Running cluster of stream processing application

2017-02-03 Thread Sachin Mittal
> Any reason why you don't just let streams create the changelog topic? Yes you should partition it the same as the source topic. Only reason is that I need to use my max.message.bytes and in version 0.10.0.1 configuring the same to state store supplier is not supported. But I understood that numb

Re: At Least Once semantics for Kafka Streams

2017-02-03 Thread Mahendra Kariya
Thanks Damian for this info. On Fri, Feb 3, 2017 at 3:29 PM, Damian Guy wrote: > The commit is done on the same thread as the processing, so only offsets > that have been fully processed by the topology will be committed. > I am still not clear about why do we need the COMMIT_INTERVAL_MS_CONFI

KIP-119: Drop Support for Scala 2.10 in Kafka 0.11

2017-02-03 Thread Ismael Juma
Hi all, I have posted a KIP for dropping support for Scala 2.10 in Kafka 0.11: https://cwiki.apache.org/confluence/display/KAFKA/KIP-119%3A+Drop+Support+for+Scala+2.10+in+Kafka+0.11 Please take a look. Your feedback is appreciated. Thanks, Ismael

Re: [DISCUSS] KIP-118: Drop Support for Java 7 in Kafka 0.11

2017-02-03 Thread Damian Guy
Thanks Ismael. Makes sense to me. On Fri, 3 Feb 2017 at 10:39 Ismael Juma wrote: > Hi all, > > I have posted a KIP for dropping support for Java 7 in Kafka 0.11: > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-118%3A+Drop+Support+for+Java+7+in+Kafka+0.11 > > Most people were supporti

Re: At Least Once semantics for Kafka Streams

2017-02-03 Thread Damian Guy
It controls the minimum frequency at which the offsets are committed. The StreamThread runs in a loop that looks something like this: while(true) records = consumer.poll(..) for each record task = findTask(record) task.process(..) end maybeCommit() end This is grossly simp

Re: [DISCUSS] KIP-118: Drop Support for Java 7 in Kafka 0.11

2017-02-03 Thread Grant Henke
Looks good to me. Thanks for handling the KIP. On Fri, Feb 3, 2017 at 8:49 AM, Damian Guy wrote: > Thanks Ismael. Makes sense to me. > > On Fri, 3 Feb 2017 at 10:39 Ismael Juma wrote: > > > Hi all, > > > > I have posted a KIP for dropping support for Java 7 in Kafka 0.11: > > > > > > https://cw

Re: Kafka Streams delivery semantics and state store

2017-02-03 Thread Krzysztof Lesniewski, Nexiot AG
Thank you Eno for the information on KIP-98. Making downstream topic and state store's changelog writes atomic would do simplify the problem. I did not dive into the design, so I am not able to tell if it would bring other implications, but as KIP-98 is so far under discussion, I have to settle

Re: KIP-119: Drop Support for Scala 2.10 in Kafka 0.11

2017-02-03 Thread Grant Henke
Thanks for proposing this Ismael. This makes sense to me. In this KIP and the java KIP you state: A reasonable policy is to support the 2 most recently released versions so > that we can strike a good balance between supporting older versions, > maintainability and taking advantage of language an

Re: KIP-119: Drop Support for Scala 2.10 in Kafka 0.11

2017-02-03 Thread Ismael Juma
Hi Grant, That's an interesting point. It would be good to hear what others' think of making that the official policy instead of starting a discussion/vote each time. If there is consensus, I am happy to revise the KIPs. Otherwise, we keep them as they are and discuss/vote on this instance only.

Re: At Least Once semantics for Kafka Streams

2017-02-03 Thread Mahendra Kariya
Ah OK! Thanks a lot for this clarification. it will only commit the offsets if the value of COMMIT_INTERVAL_MS_CONFIG > has > passed. >

Re: [DISCUSS] KIP-118: Drop Support for Java 7 in Kafka 0.11

2017-02-03 Thread Roger Hoover
This is great. Thanks, Ismael. On Fri, Feb 3, 2017 at 7:35 AM, Grant Henke wrote: > Looks good to me. Thanks for handling the KIP. > > On Fri, Feb 3, 2017 at 8:49 AM, Damian Guy wrote: > > > Thanks Ismael. Makes sense to me. > > > > On Fri, 3 Feb 2017 at 10:39 Ismael Juma wrote: > > > > > Hi

Re: [DISCUSS] KIP-118: Drop Support for Java 7 in Kafka 0.11

2017-02-03 Thread Eno Thereska
Makes sense. Eno > On 3 Feb 2017, at 10:38, Ismael Juma wrote: > > Hi all, > > I have posted a KIP for dropping support for Java 7 in Kafka 0.11: > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-118%3A+Drop+Support+for+Java+7+in+Kafka+0.11 > > Most people were supportive when we las

Re: Kafka Streams delivery semantics and state store

2017-02-03 Thread Matthias J. Sax
Answers inline. -Matthias On 2/3/17 7:37 AM, Krzysztof Lesniewski, Nexiot AG wrote: > Thank you Eno for the information on KIP-98. Making downstream topic and > state store's changelog writes atomic would do simplify the problem. I > did not dive into the design, so I am not able to tell if it w

Re: [DISCUSS] KIP-118: Drop Support for Java 7 in Kafka 0.11

2017-02-03 Thread Guozhang Wang
LGTM too. On Fri, Feb 3, 2017 at 10:39 AM, Eno Thereska wrote: > Makes sense. > > Eno > > > On 3 Feb 2017, at 10:38, Ismael Juma wrote: > > > > Hi all, > > > > I have posted a KIP for dropping support for Java 7 in Kafka 0.11: > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP- > 118

Re: [DISCUSS] KIP-118: Drop Support for Java 7 in Kafka 0.11

2017-02-03 Thread Apurva Mehta
Thanks Ismael, this makes sense. On Fri, Feb 3, 2017 at 11:50 AM, Guozhang Wang wrote: > LGTM too. > > On Fri, Feb 3, 2017 at 10:39 AM, Eno Thereska > wrote: > > > Makes sense. > > > > Eno > > > > > On 3 Feb 2017, at 10:38, Ismael Juma wrote: > > > > > > Hi all, > > > > > > I have posted a KIP

Re: [DISCUSS] KIP-118: Drop Support for Java 7 in Kafka 0.11

2017-02-03 Thread Dong Lin
Thanks for the KIP. LGTM as well. On Fri, Feb 3, 2017 at 2:05 PM, Apurva Mehta wrote: > Thanks Ismael, this makes sense. > > On Fri, Feb 3, 2017 at 11:50 AM, Guozhang Wang wrote: > > > LGTM too. > > > > On Fri, Feb 3, 2017 at 10:39 AM, Eno Thereska > > wrote: > > > > > Makes sense. > > > > > >

[DISCUSS] KIP-120: Cleanup Kafka Streams builder API

2017-02-03 Thread Matthias J. Sax
Hi All, I did prepare a KIP to do some cleanup some of Kafka's Streaming API. Please have a look here: https://cwiki.apache.org/confluence/display/KAFKA/KIP-120%3A+Cleanup+Kafka+Streams+builder+API Looking forward to your feedback! -Matthias signature.asc Description: OpenPGP digital signat

Re: Running cluster of stream processing application

2017-02-03 Thread Guozhang Wang
The re-discover new consumer member within the group is part of the Consumer Rebalance protocol that Streams simply relies on. More details can be found here: https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Client-side+Assignment+Proposal A one sentence summary is that the new consumer wi

Re: [DISCUSS] KIP-120: Cleanup Kafka Streams builder API

2017-02-03 Thread radai
why is it so important to make those classes final ? On Fri, Feb 3, 2017 at 3:33 PM, Matthias J. Sax wrote: > Hi All, > > I did prepare a KIP to do some cleanup some of Kafka's Streaming API. > > Please have a look here: > https://cwiki.apache.org/confluence/display/KAFKA/KIP- > 120%3A+Cleanup+K