Kris20030907 opened a new issue, #9900:
URL: https://github.com/apache/rocketmq/issues/9900

   ### Before Creating the Enhancement Request
   
   - [x] I have confirmed that this should be classified as an enhancement 
rather than a bug/feature.
   
   
   ### Summary
   
   In large RocketMQ clusters with many broker nodes (e.g., 177 brokers in our 
production environment), consumer startup is significantly delayed because the 
client sends heartbeats to the brokers serially.
   
   ### Motivation
   
   Currently, RocketMQ clients send heartbeats to broker nodes serially, so 
startup time grows linearly with the broker count. In our production 
environment with 177 brokers:
   
   - **Current startup time**: >15 seconds (measured from consumer start to 
first message consumption)
   - **Impact on containerized deployments**: Causes Kubernetes 
readiness/liveness probe failures, leading to multiple restarts during 
deployment
   - **Business impact**: Significantly slows down service deployment and 
rollback processes
   
   While heartbeat V2 reduces data size, it doesn't address the fundamental 
serial execution bottleneck.
   
   ### Describe the Solution You'd Like
   
   Introduce concurrent heartbeat sending with the following design:
   
   1. **New configuration parameters**:
      - `enableConcurrentHeartbeat`: Boolean flag to enable/disable concurrent 
mode (default: false for backward compatibility)
      - `concurrentHeartbeatThreadPoolSize`: Thread pool size for concurrent 
heartbeats (default: the number of available CPU cores)
   
   2. **Implementation approach**:
      - Create a fixed thread pool for heartbeat sending when concurrent mode 
is enabled
      - Submit heartbeat tasks to all brokers in parallel
      - Use `CountDownLatch` to wait for all tasks to complete
      - Maintain the same error handling and logging as the serial 
implementation
   
   3. **Performance target**: Reduce consumer startup time from >15 seconds to 
<1 second for 177-broker clusters
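
The implementation approach above can be sketched roughly as follows. This is a minimal illustration, not the actual RocketMQ client code: the class `ConcurrentHeartbeatSketch`, the method `sendToAll`, the `Consumer<String>` standing in for the real per-broker heartbeat call, the fake broker addresses, and the 20 ms simulated latency are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical sketch; none of these names come from the RocketMQ code base.
class ConcurrentHeartbeatSketch {

    // Submits one heartbeat task per broker address to the given pool and waits
    // on a CountDownLatch until all tasks finish or the timeout expires.
    // Returns the number of heartbeats that completed without an exception.
    static int sendToAll(List<String> brokerAddrs, ExecutorService pool,
                         Consumer<String> sendHeartbeat, long timeoutMillis) {
        CountDownLatch latch = new CountDownLatch(brokerAddrs.size());
        AtomicInteger succeeded = new AtomicInteger();
        for (String addr : brokerAddrs) {
            pool.execute(() -> {
                try {
                    sendHeartbeat.accept(addr);
                    succeeded.incrementAndGet();
                } catch (RuntimeException e) {
                    // Per-broker handling as in a serial loop: log and move on.
                    System.err.println("heartbeat to " + addr + " failed: " + e.getMessage());
                } finally {
                    latch.countDown(); // always count down, even on failure
                }
            });
        }
        try {
            latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        return succeeded.get();
    }

    public static void main(String[] args) {
        List<String> brokers = new ArrayList<>();
        for (int i = 0; i < 177; i++) {
            brokers.add("broker-" + i + ":10911"); // made-up addresses
        }
        // Stand-in for the real network call: sleep to simulate a round trip.
        Consumer<String> fakeSend = addr -> {
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        // Assumed default from the proposal: one thread per available core.
        int poolSize = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        long start = System.nanoTime();
        int ok = sendToAll(brokers, pool, fakeSend, 5_000);
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        pool.shutdown();
        System.out.println(ok + " heartbeats sent in " + elapsedMs + " ms with " + poolSize + " threads");
    }
}
```

With blocking sends of similar latency, the wall time is roughly ceil(brokers / poolSize) round trips, so whether the <1 second target is reachable depends on the pool size relative to the per-broker latency, not only on enabling the flag.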
   
   ### Describe Alternatives You've Considered
   
   1. **Batch heartbeats**: Sending one heartbeat that covers multiple brokers 
would require protocol changes and broker-side support
   
   ### Additional Context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
