FrankYang0529 opened a new pull request, #19262:
URL: https://github.com/apache/kafka/pull/19262

   If broker 1 does not receive a heartbeat promptly and is fenced after the 
topic is created, it cannot be in the ISR. The session timeout is 300 ms. The 
following logs are from 
https://develocity.apache.org/s/tjs4dzxiphmwc/tests/task/:metadata:test/details/org.apache.kafka.controller.QuorumControllerTest/testMinIsrUpdateWithElr()/1/output:
   
   ```
   [2025-03-18 07:14:45,121] DEBUG [QuorumController id=0] Processed 
processBrokerHeartbeat(1474681863) in 140186 us 
(org.apache.kafka.controller.QuorumController:542) <-- heartbeat for broker 1
   ...
   [2025-03-18 07:14:45,147] DEBUG [QuorumController id=0] Processed 
processBrokerHeartbeat(399642596) in 10723 us 
(org.apache.kafka.controller.QuorumController:542) <-- heartbeat for broker 2
   ...
   [2025-03-18 07:14:45,172] DEBUG [QuorumController id=0] Processed 
processBrokerHeartbeat(168471181) in 13236 us 
(org.apache.kafka.controller.QuorumController:542) <-- heartbeat for broker 3
   ...
   [2025-03-18 07:14:45,288] INFO [QuorumController id=0] CreateTopics 
result(s): CreatableTopic(name='foo', ...) 
(org.apache.kafka.metalog.LocalLogManager$SharedLogData:258) <-- topic creation
   ...
   [2025-03-18 07:14:45,455] INFO [QuorumController id=0] Fencing broker 1 at 
epoch 6 because its session has timed out. 
(org.apache.kafka.controller.ReplicationControlManager:1693) <-- broker 1 
session timeout
   ```
   
   At 07:14:45,121, a heartbeat for broker 1 is processed and the broker is 
active. However, no further heartbeat for it arrives before 07:14:45,421 
(the 300 ms session timeout after the last heartbeat), so it is fenced at 
07:14:45,455.
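   
   The deadline arithmetic can be checked with a small standalone sketch. 
`SessionDeadline`, `deadline`, and `expiredBy` are hypothetical names used only 
for illustration and are not part of Kafka. The sketch also assumes the log 
timestamps approximate the controller's own clock, which may not hold exactly.
   
   ```java
   import java.time.Duration;
   import java.time.LocalTime;
   import java.time.format.DateTimeFormatter;

   // Hypothetical helper, not Kafka code: recomputes the session deadline
   // from the timestamps shown in the logs.
   public class SessionDeadline {
       // Log timestamps put a comma before the milliseconds, e.g. 07:14:45,121.
       private static final DateTimeFormatter LOG_TIME =
           DateTimeFormatter.ofPattern("HH:mm:ss,SSS");

       /** Time after which the session expires if no new heartbeat arrives. */
       static LocalTime deadline(String lastHeartbeat, Duration sessionTimeout) {
           return LocalTime.parse(lastHeartbeat, LOG_TIME).plus(sessionTimeout);
       }

       /** True if {@code now} is past the deadline derived from the last heartbeat. */
       static boolean expiredBy(String lastHeartbeat, Duration sessionTimeout, String now) {
           return LocalTime.parse(now, LOG_TIME)
               .isAfter(deadline(lastHeartbeat, sessionTimeout));
       }

       public static void main(String[] args) {
           Duration timeout = Duration.ofMillis(300);
           // Last heartbeat for broker 1 was processed at 07:14:45,121.
           System.out.println(deadline("07:14:45,121", timeout).format(LOG_TIME));
           // The fencing log line appears at 07:14:45,455.
           System.out.println(expiredBy("07:14:45,121", timeout, "07:14:45,455"));
       }
   }
   ```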
   
   The failure can be reproduced by adding `Thread.sleep(300)` just after 
`active.createTopics` [0]. To fix the root cause, use a separate thread to send 
heartbeat requests, so broker 1 has no chance of being fenced.
   
   [0] 
https://github.com/apache/kafka/blob/1ded681684e771b16aa98ae751f39b9816345a83/metadata/src/test/java/org/apache/kafka/controller/QuorumControllerTest.java#L663-L665
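   
   As a rough illustration of the idea (not the code in this PR), a background 
heartbeat sender could look like the sketch below. `BackgroundHeartbeats` and 
the `sendHeartbeat` callback are hypothetical. In the real test the callback 
would presumably wrap whatever call sends a broker heartbeat to the active 
controller, and the period would need to be well below the 300 ms session 
timeout.
   
   ```java
   import java.util.List;
   import java.util.concurrent.Executors;
   import java.util.concurrent.ScheduledExecutorService;
   import java.util.concurrent.TimeUnit;
   import java.util.function.IntConsumer;

   // Hypothetical helper, not Kafka code: sends heartbeats for a set of
   // brokers on a dedicated thread until closed.
   public class BackgroundHeartbeats implements AutoCloseable {
       private final ScheduledExecutorService scheduler =
           Executors.newSingleThreadScheduledExecutor();

       BackgroundHeartbeats(List<Integer> brokerIds, IntConsumer sendHeartbeat, long periodMs) {
           scheduler.scheduleAtFixedRate(() -> {
               for (int brokerId : brokerIds) {
                   try {
                       sendHeartbeat.accept(brokerId);
                   } catch (RuntimeException e) {
                       // An uncaught exception would cancel every later run,
                       // so ignore it here and try again on the next tick.
                   }
               }
           }, 0, periodMs, TimeUnit.MILLISECONDS);
       }

       @Override
       public void close() {
           scheduler.shutdownNow();
           try {
               scheduler.awaitTermination(1, TimeUnit.SECONDS);
           } catch (InterruptedException e) {
               Thread.currentThread().interrupt();
           }
       }

       public static void main(String[] args) throws Exception {
           // Stand-in for a real heartbeat call: just log the broker id.
           IntConsumer fakeHeartbeat = id -> System.out.println("heartbeat for broker " + id);
           try (BackgroundHeartbeats hb =
                    new BackgroundHeartbeats(List.of(1, 2, 3), fakeHeartbeat, 50)) {
               // The test body (topic creation, assertions) would run here while
               // heartbeats keep flowing on the other thread.
               Thread.sleep(300);
           }
       }
   }
   ```
   
   Because the heartbeats no longer depend on the test thread's progress, a slow 
step such as topic creation should not by itself let a session lapse.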

