[ https://issues.apache.org/jira/browse/KAFKA-6555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16367822#comment-16367822 ]
Ashish Surana commented on KAFKA-6555:
--------------------------------------

First, let's look at when a task can enter the restoration state:
* When a node hosting active task A1 goes down: one of the replicas of A1 should ideally become active. Since the task starts on a replica, restoration should be quick if the replica wasn't lagging much.
* When a new node is added: active task A1 from another node gets assigned to this node. The state is restored from scratch on this node, so restoration will take much longer.

So I think there are scenarios where the restoring store has the latest state, and other scenarios where a replica has the latest state. Ideally we want to serve the query from whichever store (main or replica) has the latest state at that point in time, and that could be the active or a replica. If A1 is active and R1, R2 and R3 are replicas, then pick latest(A1, R1, R2, R3). This latest function seems to be complicated.

Please let me know if this makes sense or if you think otherwise.

> Making state store queryable during restoration
> -----------------------------------------------
>
>                 Key: KAFKA-6555
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6555
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Ashish Surana
>            Assignee: Ashish Surana
>            Priority: Major
>
> State stores in Kafka Streams are currently only queryable when the
> StreamTask is in the RUNNING state. The idea is to make them queryable even
> in the RESTORATION (PARTITION_ASSIGNED) state, as the time spent on
> restoration can be huge, and making the data inaccessible during this time
> could mean downtime that is not acceptable for many applications.
> When the active partition goes down, one of the following occurs:
> # One of the standby replica partitions gets promoted to active: the replica
> task has to restore the remaining state from the changelog topic before it
> can become RUNNING. The time taken for this depends on how far the replica
> is lagging behind. During this restoration time the state store for that
> partition is currently not queryable, resulting in downtime for the
> partition. We can make the state store partition queryable for the data
> already present in the state store.
> # When there is no replica or standby task, the active task will be started
> on one of the existing nodes. That node has to build the entire state from
> the changelog topic, which can take a lot of time depending on how big the
> changelog topic is, and keeping the state store not queryable during this
> time means downtime for the partition.
> This is a very important improvement, as it could improve the availability
> of microservices developed using Kafka Streams.
> I am working on a patch for this change. Any feedback or comments are
> welcome.
>
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
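
To make the latest(A1, R1, R2, R3) idea from the comment above concrete, here is a minimal sketch in Java. It is not Kafka Streams code: `LatestStorePicker`, `StoreCopy` and `restoredOffset` are hypothetical names, and the sketch assumes each copy of the store can report the last changelog offset it has applied. Gathering those offsets across nodes is the part the comment calls complicated, and it is not shown here.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of Kafka Streams.
class LatestStorePicker {

    // One copy of a state store partition: the active task (e.g. "A1")
    // or a standby replica (e.g. "R1").
    static final class StoreCopy {
        final String name;
        // Assumption: the last changelog offset applied to this copy is known.
        final long restoredOffset;

        StoreCopy(String name, long restoredOffset) {
            this.name = name;
            this.restoredOffset = restoredOffset;
        }
    }

    // Returns the copy that has applied the highest changelog offset,
    // or an empty Optional when there are no copies to choose from.
    static Optional<StoreCopy> latest(List<StoreCopy> copies) {
        return copies.stream()
                .max(Comparator.comparingLong((StoreCopy c) -> c.restoredOffset));
    }
}
```

With a freshly assigned A1 at offset 120 and replicas R1, R2 and R3 at offsets 900, 950 and 400, `latest` would pick R2, so a query during restoration would be routed to that replica instead of the active task.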