gharris1727 commented on pull request #9040: URL: https://github.com/apache/kafka/pull/9040#issuecomment-661296268
@kkonstantine In my investigation for this fix, I noticed that the long join was happening in the first worker to join the group, as it was creating the `connect-offsets` topic. The broker logs indicated that the topic was created in a timely manner, and that it was visible to the other workers that joined afterwards. The first worker remained waiting for the topic creation result after the other two workers had been started, causing the test to fail.

I could only pick out one suspicious thing about the create topic operation on the broker, as I am not very familiar with broker logs. For the successful create topic operations, these unblocked messages appeared:

```
[2020-07-05 09:47:32,423] DEBUG Request key TopicKey(connect-status) unblocked 1 topic operations (kafka.server.DelayedOperationPurgatory)
[2020-07-05 09:47:32,423] DEBUG [Admin Manager on Broker 1]: Request key connect-status unblocked 1 topic requests. (kafka.server.AdminManager)
```

These did not appear for the excessively long create topic operation. Reading the log message literally, it's possible that the operation is either never entering the DelayedOperationPurgatory, or never released from it, and thus the operation times out on the client side without the worker finding out that the request was filled. I think this is benign in our case, and a retry will be able to recover the test by discovering the topic has already been created.

[Logs for that run with the long join](http://confluent-kafka-2-6-system-test-results.s3-us-west-2.amazonaws.com/2020-07-05--001.1593942687--confluentinc--2.6--926929cad/ConnectDistributedTest/test_pause_state_persistent/connect_protocol%3Dcompatible/689.tgz)
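The recovery path described above (the create request times out on the client even though the topic exists on the broker, and a retry discovers it) can be sketched roughly as follows. This is an illustration only, not the code in this PR or the Connect worker: `FakeAdmin`, `TopicExistsError`, and `ensure_topic` are hypothetical names, and the fake admin merely imitates a create call whose response never reaches the caller.

```python
class TopicExistsError(Exception):
    """Hypothetical error raised when creating a topic that already exists."""


class FakeAdmin:
    """Hypothetical stand-in for an admin client; not a real Kafka API.

    The first create_topic call registers the topic but then times out,
    imitating a create response that never reaches the caller.
    """

    def __init__(self):
        self.topics = set()
        self.create_calls = 0

    def create_topic(self, name):
        self.create_calls += 1
        if name in self.topics:
            raise TopicExistsError(name)
        self.topics.add(name)
        if self.create_calls == 1:
            raise TimeoutError(f"timed out waiting for creation of {name}")

    def list_topics(self):
        return set(self.topics)


def ensure_topic(admin, name, attempts=3):
    """Return once the topic is known to exist, retrying after a timeout."""
    for _ in range(attempts):
        try:
            admin.create_topic(name)
            return "created"
        except TopicExistsError:
            return "already exists"
        except TimeoutError:
            # The create may have succeeded on the broker even though the
            # response was lost, so look for the topic before retrying.
            if name in admin.list_topics():
                return "found after timeout"
    raise TimeoutError(f"could not confirm topic {name} after {attempts} attempts")


admin = FakeAdmin()
result = ensure_topic(admin, "connect-offsets")
```

In this sketch the caller treats "topic already visible" as success, which is why the timeout would be benign as long as something retries or re-checks.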