Hi Matthias, First, I will answer to your last question.
The main reason to have both TaskState#assignment and TaskState#consumedOffsetsByPartition is that tasks have no consumed offsets until at least one message is consumed for each partition even if previous offsets exist for the consumer group. So yes this methods are redundant as it only diverge at application startup. About the use case, currently we are developping for a customer a little framework based on KafkaStreams which transform/denormalize data before ingesting into hadoop. We have a cluster of workers (SpringBoot) which instantiate KStreams topologies dynamicaly based on dataflow configurations. Each configuration describes a topic to consume and how to process messages (this looks like NiFi processors API). Our architecture is inspired from KafkaConnect. We have topics for configs and states which are consumed by each workers (actually we have reused some internals classes to the connect API). Now, we would like to develop UIs to visualize topics and partitions consumed by our worker applications. Also, I think it would be nice to be able, in the futur, to develop web UIs similar to Spark but for KafkaStreams to visualize DAGs...so maybe this KIP is just a first step. Thanks, 2017-03-01 22:52 GMT+01:00 Matthias J. Sax <matth...@confluent.io>: > Thanks for the KIP. > > I am wondering a little bit, why you need to expose this information. > Can you describe some use cases? > > Would it be worth to unify this new API with KafkaStreams#state() to get > the overall state of an application without the need to call two > different methods? Not sure how this unified API might look like though. > > > One minor comment about the API: TaskState#assignment seems to be > redundant. It should be the same as > TaskState#consumedOffsetsByPartition.keySet() > > Or do I miss something? > > > -Matthias > > On 3/1/17 5:19 AM, Florian Hussonnois wrote: > > Hi Eno, > > > > Yes, but the state() method only returns the global state of the > > KafkaStream application (ie: CREATED, RUNNING, REBALANCING, > > PENDING_SHUTDOWN, NOT_RUNNING). > > > > An alternative to this KIP would be to change this method to return more > > information instead of adding a new method. > > > > 2017-03-01 13:46 GMT+01:00 Eno Thereska <eno.there...@gmail.com>: > > > >> Thanks Florian, > >> > >> Have you had a chance to look at the new state methods in 0.10.2, e.g., > >> KafkaStreams.state()? > >> > >> Thanks > >> Eno > >>> On 1 Mar 2017, at 11:54, Florian Hussonnois <fhussonn...@gmail.com> > >> wrote: > >>> > >>> Hi all, > >>> > >>> I have just created KIP-130 to add a new method to the KafkaStreams API > >> in > >>> order to expose the states of threads and active tasks. > >>> > >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP+ > >> 130%3A+Expose+states+of+active+tasks+to+KafkaStreams+public+API > >>> > >>> > >>> Thanks, > >>> > >>> -- > >>> Florian HUSSONNOIS > >> > >> > > > > > > -- Florian HUSSONNOIS