Hi Bill,

So we ended up applying the fix for KAFKA-7144 onto Kafka 1.1.1 and now all works fine.
Thanks for the insight.

Greets,
Bart

On Tue, Oct 9, 2018 at 4:49 PM Bill Bejeck <b...@confluent.io> wrote:

> Hi Bart,
>
> Sounds good. Let me know how it goes.
>
> -Bill
>
> On Tue, Oct 9, 2018 at 5:08 AM Bart Vercammen <b...@cloutrix.com> wrote:
>
> > Hi Bill,
> >
> > Thanks for the reply.
> > We had a look at the patch for v and will try it out on Kafka 1.1.1.
> > Currently a full upgrade to 2.0.x is not yet an option.
> >
> > In the meantime I have some unit tests that reproduce this problem, so the
> > backport to v1.1.1 can easily be verified.
> >
> > Greets,
> > Bart
> >
> > On Tue, Oct 9, 2018 at 12:27 AM Bill Bejeck <b...@confluent.io> wrote:
> >
> > > Hi Bart,
> > >
> > > This is a known issue discovered in version 1.1 -
> > > https://issues.apache.org/jira/browse/KAFKA-7144
> > >
> > > This issue has been fixed in Kafka Streams 2.0. Any chance you can upgrade
> > > to 2.0?
> > >
> > > Thanks,
> > > Bill
> > >
> > > On Mon, Oct 8, 2018 at 2:46 PM Bart Vercammen <b...@cloutrix.com> wrote:
> > >
> > > > Thanks John,
> > > >
> > > > I'll see what I can do regarding the logs ...
> > > > As a side note, our Kafka cluster is running version v1.1.1 in v0.10.2.1
> > > > log format configuration (due to another issue: KAFKA-6000).
> > > > But, as said, I'll try to come up with some detailed logs, or a scenario to
> > > > reproduce this.
> > > >
> > > > Greets,
> > > > Bart
> > > >
> > > > On Mon, Oct 8, 2018 at 8:37 PM John Roesler <j...@confluent.io> wrote:
> > > >
> > > > > Hi Bart,
> > > > >
> > > > > I suspected it might not be feasible to just dump your production logs onto
> > > > > the internet.
> > > > >
> > > > > A repro would be even better, but I bet it wouldn't show up when you try
> > > > > and reproduce it. Good luck!
> > > > >
> > > > > If the repro doesn't turn out, maybe you could just extract the assignment
> > > > > lines from your logs?
> > > > >
> > > > > Thanks,
> > > > > -John
> > > > >
> > > > > On Mon, Oct 8, 2018 at 1:24 PM Bart Vercammen <b...@cloutrix.com> wrote:
> > > > >
> > > > > > Hi John,
> > > > > >
> > > > > > Zipping up some logs from our running Kafka cluster is going to be a bit
> > > > > > difficult.
> > > > > > What I can do is try to reproduce this off-line and capture the logs from
> > > > > > there.
> > > > > >
> > > > > > We also had a look in the PartitionAssignor source code (for 1.1.1) and
> > > > > > indeed this behaviour is a bit weird,
> > > > > > as from the source code I'd expect equally divided partitions.
> > > > > >
> > > > > > Anyway, hopefully I'll be able to reproduce this issue with some simple
> > > > > > unit-test like code.
> > > > > > I'll post the results when I have more info.
> > > > > >
> > > > > > Greets,
> > > > > > Bart
> > > > > >
> > > > > > On Mon, Oct 8, 2018 at 7:36 PM John Roesler <j...@confluent.io> wrote:
> > > > > >
> > > > > > > Hi Bart,
> > > > > > >
> > > > > > > This sounds a bit surprising. Is there any chance you can zip up some
> > > > > > > logs so we can see the assignment protocol on the nodes?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > -John
> > > > > > >
> > > > > > > On Mon, Oct 8, 2018 at 4:32 AM Bart Vercammen <b...@cloutrix.com> wrote:
> > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > I recently moved some KafkaStreams applications from v0.10.2.1 to v1.1.1
> > > > > > > > and now I notice a weird behaviour in the partition assignment.
> > > > > > > > When starting 4 instances of my Kafka Streams application (on v1.1.1) I see
> > > > > > > > that 17 of the 20 partitions (of a source topic) are assigned to 1 instance
> > > > > > > > of the application while the other 3 instances only get 1 partition
> > > > > > > > assigned.
> > > > > > > > (Previously (on v0.10.2.1) they all got 5 partitions.)
> > > > > > > >
> > > > > > > > Is this expected behaviour, as I read that quite some improvements were
> > > > > > > > done in the partition assignment strategy for Kafka Streams applications?
> > > > > > > > If yes, how can I make it so that the partitions are equally divided again
> > > > > > > > across all running applications? It's a bit weird in my opinion, as this
> > > > > > > > makes scaling the application very hard.
> > > > > > > >
> > > > > > > > Also, when initially starting with 1 instance of the application, and
> > > > > > > > gradually scaling up, the new instances only get 1 partition assigned ...
> > > > > > > >
> > > > > > > > All my Streams applications use default configuration (more or less),
> > > > > > > > running 1 stream-thread.
> > > > > > > >
> > > > > > > > Any suggestions / enlightenments on this?
> > > > > > > > Greets,
> > > > > > > > Bart
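
For readers following the numbers in the original report, here is a minimal sketch of the arithmetic behind "equally divided" versus the observed 17/1/1/1 split. It is not Kafka Streams' actual assignor logic. It uses a plain round-robin as a stand-in, and the instance names (`app-1` to `app-4`) and the helper names `round_robin_assign` and `skew` are hypothetical, chosen only for illustration.

```python
def round_robin_assign(num_partitions, instances):
    """Hand out partition ids to instances one at a time (illustrative only,
    NOT the Kafka Streams partition assignor)."""
    assignment = {name: [] for name in instances}
    for partition in range(num_partitions):
        owner = instances[partition % len(instances)]
        assignment[owner].append(partition)
    return assignment


def skew(assignment):
    """Difference between the largest and smallest per-instance partition count."""
    counts = [len(partitions) for partitions in assignment.values()]
    return max(counts) - min(counts)


# Hypothetical instance names; the thread only says "4 instances".
instances = ["app-1", "app-2", "app-3", "app-4"]

# What an even split of 20 partitions over 4 instances looks like.
balanced = round_robin_assign(20, instances)

# The distribution reported in the thread on v1.1.1: 17 / 1 / 1 / 1.
observed = {
    "app-1": list(range(17)),
    "app-2": [17],
    "app-3": [18],
    "app-4": [19],
}

print({name: len(parts) for name, parts in balanced.items()})
print("balanced skew:", skew(balanced))
print("observed skew:", skew(observed))
```

A check like `skew` could be applied to the assignment lines extracted from the logs, as John suggested, to confirm whether a backported fix restores the even split.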