Your observation is correct. Kafka Streams creates a task per partition. As you have a shared state store over two operator, the tasks of both input streams need to be merged to ensure co-partitioning.
Thus, task0 reads topic1 partition0 and topic2 partion0, and all other task[123] only topic1 partition[123] (as there are no more partitions for the second topic. Using the same number of partitions should be the better solution. If you introduce a second consumer, KS does not know anything about this consumer, and thus, cannot control when this consumer commits offsets. And as you don't know when KS's internal consumer commits offsets, you cannot align your own offset commits to the KS commits. Long story short, this would be a problem for fail-over scenarios and might result in data loss. -Matthias On 9/1/17 5:42 AM, hugues.deslan...@hardis.fr wrote: > Hi, > > I'd like to have your comments on the problem I met while testing my app > with kafka streams (0.10.2.1) > Roughly, my stream app has 2 input topics : > . the first one has 4 partitions (main data) > . the second one has only one partition and receives messages from time to > time > > At first, I supposed I had 2 sub topologies > A : With the first topic, I build a state store using process() and I also > have punctuate activated . > B : The second topic is used to trigger an analysis using the state store > data with process() > (both processes use the same kafka topic as sink) > > During tests I realised the content of the state store viewed by this > process B is only based on data received on partition 0 fo the first topic > I finally understood the link between those 2 sub-topologies forced the > system to see it as one unique topology and have only one task by > partition reading the first and second topic; am I right ? > > I imagined 2 options to solve this issue > option 1 : replace topology B by a consumer on second topic that will > trigger a query on statestore > option 2 : have 4 partitions for topic 2 and write the same message in the > 4 partitions > > I tested both but not sure which one is better ... > Do you have any other suggestions or comments > Thanks in advance. > > Hugues >
signature.asc
Description: OpenPGP digital signature