Hi David, on startup of the second application instance, the KTable is effectively partitioned into two distinct partial KTables, each holding the key-valus pairs for their corresponding assigned partitions.
Thus, your "lookups" on each instance, can only access the key-value pairs for the set of keys assigned to each instance. There is no replication of the whole KTable to both instances happening. We are aware, that a global view over the whole KTable (ie, all local KTable partitions over all running applications instances) is a nice feature. There is already a KIP in place and we hope to release this feature, soon: Have look here for QA KIP-67 https://cwiki.apache.org/confluence/display/KAFKA/KIP-67%3A+Queryable+state+for+Kafka+Streams -Matthias On 08/02/2016 05:47 PM, David Garcia wrote: > Hello, I’ve googled around for this, but haven’t had any luck. Based upon > this: http://docs.confluent.io/3.0.0/streams/architecture.html#state KTables > are local to instances. An instance will process one or more partitions from > one or more topics. How does Kstreams/Ktables handle the following situation? > > A single application instance is processing 4 partitions from a topic. The > application is using a Ktable. Each event triggers lookups in the KTable. > Now, a new application instance is started. This triggers a rebalancing of > the partitions. 2 partitions originally processed by the first instance > migrate to the new instance. What happens with the KTable? Is the entire > table “migrated” also? This would be nice because lookups (in the first > instance) triggered by particular events should be identical to lookups (in > the second instance) triggered by those same events. > > -David >
signature.asc
Description: OpenPGP digital signature