> We've been able to get the crucial factors that cause this behavior down to
> a particular combination
What do you mean by this -- that you only see this when all four of those
operators are at play? Or do you see it with any of them?

I guess the first thing to narrow down is whether it's actually rebalancing or
just restoring within this time (the REBALANCING state is somewhat
misleadingly named). If this is a completely new app then it's probably not
restoring, but if this app had already been running and building up state
before hitting this issue then that's probably the reason. It's not at all
uncommon for restoration to take more than 30 seconds.

If it really is rebalancing this entire time, then you need to look into the
logs to figure out why. I don't see anything obviously wrong with your
particular application, and even if there were, it should never result in
endless rebalances like this. How many instances of the application are you
running?

Cheers,
Sophie

On Thu, Oct 15, 2020 at 10:01 PM Alex Jablonski <ajablon...@thoughtworks.com>
wrote:

> Hey there!
>
> My team and I have run across a bit of a jam in our application where,
> given a particular setup, our Kafka Streams application never seems to
> start successfully, instead just getting stuck in the REBALANCING state.
> We've been able to get the crucial factors that cause this behavior down to
> a particular combination of (1) grouping, (2) windowing, (3) aggregating,
> and (4) foreign-key joining, with some of those steps specifying Serdes
> besides the default.
>
> It's probably more useful to see a minimal example, so there's one here
> <https://github.com/ajablonski/streams-issue-demo/blob/master/build.gradle>.
> The underlying Kafka Streams version is 2.5.1. The first test should show
> the application eventually transition to running state, but it doesn't
> within the 30 second timeout I've set.
> Interestingly, getting rid of the
> 'Grouped.with' argument to the 'groupBy' function and the
> 'Materialized.with' in 'aggregate' in the 'StreamsConfiguration' lets the
> application transition to "RUNNING", though without the correct Serdes
> that's not too valuable.
>
> There might be a cleaner way to organize the particular flow in the toy
> example, but is there something fundamentally wrong with the approach laid
> out in that application that would cause Streams to be stuck in
> REBALANCING? I'd appreciate any advice folks could give!
>
> Thanks!
> Alex Jablonski