Sophie Blee-Goldman created KAFKA-10198:
-------------------------------------------

             Summary: Dirty tasks may be recycled instead of closed
                 Key: KAFKA-10198
                 URL: https://issues.apache.org/jira/browse/KAFKA-10198
             Project: Kafka
          Issue Type: Bug
          Components: streams
            Reporter: Sophie Blee-Goldman
            Assignee: Sophie Blee-Goldman
             Fix For: 2.6.0


We recently added a guard to `Task#closeClean` to make sure we don't 
accidentally clean-close a dirty task, but we forgot to also add this check to 
`Task#closeAndRecycleState`. This meant an otherwise dirty task could be closed 
clean and recycled into a new task when it should have just been closed.

This manifested as an NPE in our test application. Specifically, task 1_0 was 
active on StreamThread-2 but reassigned as a standby. During handleRevocation 
we hit a TaskMigratedException while flushing the tasks and bailed on trying to 
flush and commit the remainder. This left task 1_0 with dirty keys in the 
suppression buffer and the `commitNeeded` flag still set to true.

During handleAssignment, we should have closed all the tasks with pending state 
as dirty (i.e., any task with commitNeeded = true). Since we don't know about the 
TaskMigratedException we hit during handleRevocation, we rely on the guard in 
`Task#closeClean` to throw an exception and force the task to be closed dirty.

Unfortunately, we left this guard out of `closeAndRecycleState`, which meant 
task 1_0 was able to slip through without being closed dirty. Once 
reinitialized as a standby task, we eventually tried to commit it. The 
suppression buffer of course tried to flush its remaining dirty keys from its 
previous life as an active task. But since it's now a standby task, it should 
not be sending anything to the changelog and has a null RecordCollector. We 
tried to access it, and hit the NPE.
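
The failure mode above can be illustrated with a minimal sketch. It is not the 
actual Kafka Streams code: `SketchBuffer` and `SketchCollector` are hypothetical 
stand-ins for the suppression buffer and the RecordCollector, and the sketch 
assumes only what is described above (dirty keys left over from the active 
task, and a null collector once the task is a standby).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the component that writes to the changelog.
final class SketchCollector {
    final List<String> sent = new ArrayList<>();

    void send(String key, String value) {
        sent.add(key + "=" + value);
    }
}

// Hypothetical stand-in for a suppression buffer holding unflushed keys.
final class SketchBuffer {
    private final Map<String, String> dirtyKeys = new LinkedHashMap<>();
    private SketchCollector collector; // assumed to be null for a standby task

    void put(String key, String value) {
        dirtyKeys.put(key, value);
    }

    void setCollector(SketchCollector collector) {
        this.collector = collector;
    }

    int dirtyCount() {
        return dirtyKeys.size();
    }

    // Sends every dirty key to the collector, then clears the buffer.
    void flush() {
        for (Map.Entry<String, String> entry : dirtyKeys.entrySet()) {
            // Throws NullPointerException if the collector is null and
            // dirty keys remain from the task's time as an active task.
            collector.send(entry.getKey(), entry.getValue());
        }
        dirtyKeys.clear();
    }
}
```

In this sketch, a buffer that still holds a key when its collector is set to 
null throws on the next `flush()`, which mirrors the NPE we saw on commit.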

 

The fix is simple: we just need to add the guard from `closeClean` to 
`closeAndRecycleState` as well.
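
As a rough sketch of the shape of that fix (not the actual patch), the guard 
can be shared by both clean-close paths. `SketchTask`, `ensureClean`, and 
`recycleOrCloseDirty` are hypothetical names, and the choice of 
`IllegalStateException` is an assumption for illustration; only `closeClean`, 
`closeAndRecycleState`, and `commitNeeded` come from the description above.

```java
// Hypothetical, simplified model of a task; not the real Kafka Streams Task.
final class SketchTask {
    private boolean commitNeeded;
    private boolean closed;

    void markDirty() {
        commitNeeded = true;
    }

    boolean isClosed() {
        return closed;
    }

    // Shared guard: refuse any clean-close path while uncommitted state remains.
    // The exception type is an assumption made for this sketch.
    private void ensureClean(String operation) {
        if (commitNeeded) {
            throw new IllegalStateException(
                "Cannot " + operation + " a task with uncommitted state; close it dirty instead");
        }
    }

    void closeClean() {
        ensureClean("closeClean");
        closed = true;
    }

    void closeAndRecycleState() {
        ensureClean("closeAndRecycleState"); // the check that was missing
        closed = true;
    }

    // Discards pending state instead of committing it.
    void closeDirty() {
        commitNeeded = false;
        closed = true;
    }
}

final class RecycleSketch {
    // Returns true if the task was recycled cleanly, false if the guard
    // fired and the task was closed dirty instead.
    static boolean recycleOrCloseDirty(SketchTask task) {
        try {
            task.closeAndRecycleState();
            return true;
        } catch (IllegalStateException e) {
            task.closeDirty();
            return false;
        }
    }
}
```

With the guard in place, a task like 1_0 that still has `commitNeeded` set 
fails the recycle attempt and falls through to the dirty-close path, rather 
than being reinitialized as a standby with stale buffer contents.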



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
