showuon commented on code in PR #15133:
URL: https://github.com/apache/kafka/pull/15133#discussion_r1463172726

##########
core/src/test/scala/unit/kafka/utils/TestUtils.scala:
##########
@@ -1396,11 +1396,13 @@ object TestUtils extends Logging {
   // Note: Call this method in the test itself, rather than the @AfterEach method.
   // Because of the assert, if assertNoNonDaemonThreads fails, nothing after would be executed.
   def assertNoNonDaemonThreads(threadNamePrefix: String): Unit = {
-    val nonDaemonThreads = Thread.getAllStackTraces.keySet.asScala.filter { t =>
-      !t.isDaemon && t.isAlive && t.getName.startsWith(threadNamePrefix)
-    }
-    val threadCount = nonDaemonThreads.size
-    assertEquals(0, threadCount, s"Found unexpected $threadCount NonDaemon threads=${nonDaemonThreads.map(t => t.getName).mkString(", ")}")
+    var nonDemonThreads: mutable.Set[Thread] = mutable.Set.empty[Thread]
+    waitUntilTrue(() => {
+      nonDemonThreads = Thread.getAllStackTraces.keySet.asScala.filter { t =>
+        !t.isDaemon && t.isAlive && t.getName.startsWith(threadNamePrefix)
+      }
+      0 == nonDemonThreads.size
+    }, s"Found unexpected ${nonDemonThreads.size} NonDaemon threads=${nonDemonThreads.map(t => t.getName).mkString(", ")}", 1000)

Review Comment:
   cc @divijvaidya, I found that the [CI](https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-15133/9/testReport/junit/kafka.server/ReplicaManagerTest/Build___JDK_11_and_Scala_2_13___testSuccessfulBuildRemoteLogAuxStateMetrics__/) is sometimes too sensitive to the non-daemon threads check, because some shutdowns happen asynchronously. You can check the failed results [here](https://ci-builds.apache.org/job/Kafka/job/kafka-pr/job/PR-15133/9/).
   Basically, if some resource were really left unclosed, all the following tests should also fail (I verified this in my local env). But in the CI results, only 2 `ReplicaManagerTest` tests fail, and only on JDK 11. So I'm going to verify this by using `waitUntilTrue`, to give the threads some time to shut down.
   
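   To illustrate the idea outside of Kafka's test utilities, here is a minimal, self-contained sketch of "poll the thread check until it passes or a timeout expires". The names `ThreadCheckSketch`, `waitUntil`, and `nonDaemonThreadNames` are hypothetical and only stand in for the real `TestUtils.waitUntilTrue` and `assertNoNonDaemonThreads`; the 50 ms pause and the `sketch-worker` thread name are assumptions as well.

```scala
import scala.jdk.CollectionConverters._

object ThreadCheckSketch {

  // Hypothetical helper (not Kafka's TestUtils.waitUntilTrue):
  // re-evaluates `condition` until it holds or `timeoutMs` has elapsed.
  // Returns the last observed value of the condition.
  def waitUntil(condition: () => Boolean, timeoutMs: Long, pauseMs: Long = 50L): Boolean = {
    val deadline = System.nanoTime() + timeoutMs * 1000000L
    var ok = condition()
    while (!ok && System.nanoTime() < deadline) {
      Thread.sleep(pauseMs)
      ok = condition()
    }
    ok
  }

  // Names of live, non-daemon threads whose name starts with `prefix`.
  def nonDaemonThreadNames(prefix: String): Set[String] =
    Thread.getAllStackTraces.keySet.asScala
      .filter(t => !t.isDaemon && t.isAlive && t.getName.startsWith(prefix))
      .map(_.getName)
      .toSet

  def main(args: Array[String]): Unit = {
    // Simulates a resource whose shutdown completes asynchronously.
    val worker = new Thread(() => Thread.sleep(200), "sketch-worker-1")
    worker.setDaemon(false)
    worker.start()

    // A one-shot check right here would likely still see the thread;
    // polling gives it up to 1 second to finish.
    val cleared = waitUntil(() => nonDaemonThreadNames("sketch-worker").isEmpty, timeoutMs = 1000L)
    println(s"cleared=$cleared, remaining=${nonDaemonThreadNames("sketch-worker")}")
  }
}
```

   With a one-shot assertion, a thread that is still winding down fails the test; with polling, only a thread that outlives the timeout does.
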
   I also set the wait time to 1 second because, if resources really are leaked, the total wait time will be the product of `waitTime` and the number of subsequent failing tests.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org