ozw999 opened a new issue #3138:
URL: https://github.com/apache/rocketmq/issues/3138
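**Background**
I am using RocketMQ 4.4.0. In a two-master, two-slave cluster, I modified the execution frequency of the scheduled threads on the broker and namesrv, as well as the heartbeat timeout, so that when one master and one slave go offline, namesrv removes the failed brokers within a short time (about 10s).
**Problems encountered when the consumer rebalances**
1. Occasionally, DefaultMQPushConsumerImpl.doRebalance() is not executed at the waitInterval (20s); the interval between two consecutive log messages is >90s. I added log output and confirmed that the if condition always holds. Before the failover, and from a few minutes after it, the rebalance interval is normal.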
【RebalanceService】
```
@Override
public void run() {
    log.info(this.getServiceName() + " service started");

    while (!this.isStopped()) {
        this.waitForRunning(waitInterval);
        this.mqClientFactory.doRebalance();
    }

    log.info(this.getServiceName() + " service end");
}
```
【DefaultMQPushConsumerImpl】
```
@Override
public void doRebalance() {
    if (!this.pause) {
        this.rebalanceImpl.doRebalance(this.isConsumeOrderly());
    }
}
```
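Because the RebalanceService loop waits and then calls doRebalance() on the same thread, the gap between two rebalance log lines is roughly waitInterval plus however long the previous doRebalance() call took. The sketch below only illustrates that arithmetic; it is not RocketMQ code. `RebalanceLoopSketch`, `measureGaps`, and `sleepQuietly` are hypothetical names, and the sleeps stand in for waitForRunning(waitInterval) and for a doRebalance() call that blocks. Whether doRebalance() actually blocked during the failover is an assumption that the report above does not confirm.

```java
import java.util.ArrayList;
import java.util.List;

public class RebalanceLoopSketch {

    // Sleeps without forcing callers to handle the checked exception.
    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while sleeping", e);
        }
    }

    // Runs a wait-then-work loop and returns the gaps (in ms) between
    // consecutive starts of the work step.
    static List<Long> measureGaps(long waitMillis, long workMillis, int iterations) {
        List<Long> gaps = new ArrayList<>();
        Long previousStart = null;
        for (int i = 0; i < iterations; i++) {
            sleepQuietly(waitMillis);           // stands in for waitForRunning(waitInterval)
            long start = System.nanoTime();
            if (previousStart != null) {
                gaps.add((start - previousStart) / 1_000_000);
            }
            previousStart = start;
            sleepQuietly(workMillis);           // stands in for a doRebalance() that blocks
        }
        return gaps;
    }

    public static void main(String[] args) {
        // With no blocking work, gaps stay close to the wait time.
        System.out.println("fast work: " + measureGaps(50, 0, 4));
        // With blocking work, every gap grows by the time spent working.
        System.out.println("slow work: " + measureGaps(50, 200, 4));
    }
}
```

If this is what happens, logging a timestamp before and after the doRebalance() call (not only once per iteration) would show whether the extra 70s is spent inside the call or inside the wait.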
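2. Occasionally, the scheduled tasks on MQClientInstance's single-thread pool are not executed at the configured interval; for example, the interval between two log messages from updateTopicRouteInfoFromNameServer is >60s. As with problem 1, everything is normal before and after the failover, but the thread does not run while the failover is in progress: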
【MQClientInstance】
```
private final ScheduledExecutorService scheduledExecutorService =
    Executors.newSingleThreadScheduledExecutor(new ThreadFactory() {
        @Override
        public Thread newThread(Runnable r) {
            return new Thread(r, "MQClientFactoryScheduledThread");
        }
    });
```
```
this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {
    @Override
    public void run() {
        try {
            MQClientInstance.this.updateTopicRouteInfoFromNameServer();
        } catch (Exception e) {
            log.error("ScheduledTask updateTopicRouteInfoFromNameServer exception", e);
        }
    }
}, 10, this.clientConfig.getPollNameServerInterval(), TimeUnit.MILLISECONDS);
```
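One general property of a single-thread ScheduledExecutorService may be relevant here: every task scheduled on it shares the one worker thread, so while any task is blocked, no other periodic task can start. The sketch below demonstrates that behavior using only the JDK; it is not RocketMQ code and does not establish the cause of the delay reported above. `SingleThreadSchedulerSketch`, `observeGaps`, and `sleepQuietly` are hypothetical names, and the blocking task is an assumed stand-in for something like a network call to a broker that has just gone down.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SingleThreadSchedulerSketch {

    // Sleeps without forcing callers to handle the checked exception.
    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while sleeping", e);
        }
    }

    // Schedules a fast periodic task and one blocking task on the same
    // single-thread scheduler, then returns the gaps (in ms) between
    // consecutive runs of the fast task.
    static List<Long> observeGaps(long periodMillis, long blockMillis, long observeMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        List<Long> runTimes = new CopyOnWriteArrayList<>();

        // Fast periodic task: only records when it ran.
        scheduler.scheduleAtFixedRate(
            () -> runTimes.add(System.nanoTime()), 0, periodMillis, TimeUnit.MILLISECONDS);

        // Blocking one-shot task: occupies the only worker thread.
        scheduler.schedule(
            () -> sleepQuietly(blockMillis), periodMillis * 2, TimeUnit.MILLISECONDS);

        sleepQuietly(observeMillis);
        scheduler.shutdownNow();

        List<Long> snapshot = new ArrayList<>(runTimes);
        List<Long> gaps = new ArrayList<>();
        for (int i = 1; i < snapshot.size(); i++) {
            gaps.add((snapshot.get(i) - snapshot.get(i - 1)) / 1_000_000);
        }
        return gaps;
    }

    public static void main(String[] args) {
        // One gap should be close to the blocking time instead of the period.
        System.out.println(observeGaps(20, 300, 600));
    }
}
```

A thread dump of MQClientFactoryScheduledThread taken while the failover is in progress would show whether that thread is stuck inside another scheduled task at that moment.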