Ao Li created KAFKA-17379:
-----------------------------

             Summary: KafkaStreams: Unexpected state transition from ERROR to 
PENDING_SHUTDOWN
                 Key: KAFKA-17379
                 URL: https://issues.apache.org/jira/browse/KAFKA-17379
             Project: Kafka
          Issue Type: Bug
          Components: streams
            Reporter: Ao Li

I saw a failing test: `KafkaStreamsTest::shouldNotAddThreadWhenError`

{code}
Stream-client test-client: Unexpected state transition from ERROR to 
PENDING_SHUTDOWN
java.lang.IllegalStateException: Stream-client test-client: Unexpected state 
transition from ERROR to PENDING_SHUTDOWN
        at org.apache.kafka.streams.KafkaStreams.setState(KafkaStreams.java:344)
        at org.apache.kafka.streams.KafkaStreams.close(KafkaStreams.java:1558)
        at org.apache.kafka.streams.KafkaStreams.close(KafkaStreams.java:1456)
        at 
org.apache.kafka.streams.KafkaStreamsTest.shouldNotAddThreadWhenError(KafkaStreamsTest.java:708)
        at java.base/java.lang.reflect.Method.invoke(Method.java:580)
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
        at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
{code}

You may use the following branch to reproduce the failure. 
https://github.com/aoli-al/kafka/tree/KAFKA-218

The root cause of the failure is that the close function in KafkaStreams is not 
atomic. If a state is changed while closing, the failure occurs: 

{code}
       // state is PENDINGERROR
        if (state.hasCompletedShutdown()) {
            log.info("Streams client is already in the terminal {} state, all 
resources are closed and the client has stopped.", state);
            return true;
        }
       // state is ERROR
        if (state.isShuttingDown()) {
            log.info("Streams client is in {}, all resources are being closed 
and the client will be stopped.", state);
            if (state == State.PENDING_ERROR && waitOnState(State.ERROR, 
timeoutMs)) {
                log.info("Streams client stopped to ERROR completely");
                return true;
            } else if (state == State.PENDING_SHUTDOWN && 
waitOnState(State.NOT_RUNNING, timeoutMs)) {
                log.info("Streams client stopped to NOT_RUNNING completely");
                return true;
            } else {
                log.warn("Streams client cannot transition to {} completely 
within the timeout",
                         state == State.PENDING_SHUTDOWN ? State.NOT_RUNNING : 
State.ERROR);
                return false;
            }
        }

{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to