[ https://issues.apache.org/jira/browse/KAFKA-14533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17680403#comment-17680403 ]
A. Sophie Blee-Goldman commented on KAFKA-14533:
------------------------------------------------

I'll give it a few more days before re-enabling both parameters, but so far I've got a few runs in with only the `false` parameter enabled and this seems to have fixed the flakiness. Can't really envision why the state updater would/could cause the listOffsets request to fail in the way shown above, but it really does appear to be something about enabling this that so badly broke the SmokeTestDriverIntegrationTest.

Definitely need to look into this before we consider publicly releasing the state updater feature.

cc [~cadonna] [~lbrutschy] [~guozhang]

> Flaky Test SmokeTestDriverIntegrationTest.shouldWorkWithRebalance
> -----------------------------------------------------------------
>
>                 Key: KAFKA-14533
>                 URL: https://issues.apache.org/jira/browse/KAFKA-14533
>             Project: Kafka
>          Issue Type: Test
>          Components: streams, unit tests
>            Reporter: Greg Harris
>            Priority: Major
>              Labels: flaky-test
>
> The SmokeTestDriverIntegrationTest appears to be flaky, failing in recent runs:
> ```
> https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/1444/tests/
>     java.util.concurrent.TimeoutException: shouldWorkWithRebalance(boolean) timed out after 600 seconds
> https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/1443/tests/
>     java.util.concurrent.TimeoutException: shouldWorkWithRebalance(boolean) timed out after 600 seconds
> https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/1441/tests/
>     java.util.concurrent.TimeoutException: shouldWorkWithRebalance(boolean) timed out after 600 seconds
> https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/1440/tests/
>     java.util.concurrent.TimeoutException: shouldWorkWithRebalance(boolean) timed out after 600 seconds
> https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/1438/tests/
>     java.util.concurrent.TimeoutException: shouldWorkWithRebalance(boolean) timed out after 600 seconds
> https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/1434/tests/
>     java.util.concurrent.TimeoutException: shouldWorkWithRebalance(boolean) timed out after 600 seconds
> ```
> The stacktrace appears to be:
> ```
> java.util.concurrent.TimeoutException: shouldWorkWithRebalance(boolean) timed out after 600 seconds
>     at org.junit.jupiter.engine.extension.TimeoutExceptionFactory.create(TimeoutExceptionFactory.java:29)
>     at org.junit.jupiter.engine.extension.SameThreadTimeoutInvocation.proceed(SameThreadTimeoutInvocation.java:58)
>     at org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
>     ...
>     Suppressed: java.lang.InterruptedException: sleep interrupted
>         at java.lang.Thread.sleep(Native Method)
>         at org.apache.kafka.streams.integration.SmokeTestDriverIntegrationTest.shouldWorkWithRebalance(SmokeTestDriverIntegrationTest.java:151)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727)
>         at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
>         at org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
>         at org.junit.jupiter.engine.extension.SameThreadTimeoutInvocation.proceed(SameThreadTimeoutInvocation.java:45)
>         ... 134 more
> ```
> The test appears to be timing out waiting for the SmokeTestClient to complete
> its asynchronous close, and taking significantly longer to do so (600s
> instead of 60s) than a typical local test execution time.
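
For illustration only, the kind of wait described in the report (blocking until an asynchronous close completes, but giving up after a bound) might be sketched as below. This is not the code in SmokeTestDriverIntegrationTest or SmokeTestClient. The class `BoundedCloseWait`, the helper `awaitClose`, and the two stand-in futures are hypothetical names introduced here; only standard `java.util.concurrent` calls are used.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedCloseWait {

    // Hypothetical helper: waits for an asynchronous close to finish,
    // but no longer than `timeout`. Returns true if the close completed
    // in time and false if the wait timed out.
    public static boolean awaitClose(CompletableFuture<Void> closeFuture, Duration timeout)
            throws InterruptedException, ExecutionException {
        try {
            closeFuture.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a client whose close finishes promptly.
        CompletableFuture<Void> fastClose = CompletableFuture.runAsync(() -> { });
        // Stand-in for a client whose close never finishes.
        CompletableFuture<Void> stuckClose = new CompletableFuture<>();

        System.out.println("fast close finished: " + awaitClose(fastClose, Duration.ofSeconds(5)));
        System.out.println("stuck close finished: " + awaitClose(stuckClose, Duration.ofMillis(200)));
    }
}
```

A wait bounded well below the 600-second JUnit timeout would let a test fail with its own diagnostic message instead of being interrupted by the framework, which may make a hang like the one reported easier to localize.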
-- This message was sent by Atlassian Jira (v8.20.10#820010)