If the cluster is busy, the broker will have a lot of data to catch up on
once it comes back online. What type of underlying storage do you have? Are
you using SSDs?

5k messages/sec at an average size of 3 KB is 15,000 KB/sec (about 14.6
MB/sec). So if your broker is down for 10 minutes, roughly 8.6 GB of data has
to be replicated to it when it returns; the exact amount will depend on the
replication factor.
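
The arithmetic above can be sketched in a few lines of Python. This is only a
back-of-the-envelope estimate: the function name and the assumption that
replicas are spread evenly across brokers (so each broker holds
replication_factor / broker_count of the data) are mine, not something Kafka
reports.

```python
# Rough estimate of the backlog a restarted broker has to fetch.
# Message rate, message size and downtime are the figures from this thread.
# ASSUMPTION: replicas are spread evenly, so one broker holds a share of
# replication_factor / broker_count of all produced data (capped at 1).

KIB = 1024


def catch_up_backlog_bytes(msgs_per_sec, avg_msg_bytes, downtime_sec,
                           replication_factor=3, broker_count=3):
    """Bytes the restarted broker missed while it was down."""
    produced = msgs_per_sec * avg_msg_bytes * downtime_sec
    share = min(1.0, replication_factor / broker_count)
    return produced * share


rate_mib = 5000 * 3 * KIB / KIB**2
backlog = catch_up_backlog_bytes(5000, 3 * KIB, 10 * 60)
backlog_5 = catch_up_backlog_bytes(5000, 3 * KIB, 10 * 60, broker_count=5)

print(f"ingest rate: {rate_mib:.1f} MiB/s")
print(f"backlog, 3 brokers: {backlog / KIB**3:.1f} GiB")
print(f"backlog, 5 brokers: {backlog_5 / KIB**3:.1f} GiB")
```

With 3 brokers and a replication factor of 3 the restarted broker holds a
replica of every partition, so it has to fetch everything that was produced
during the outage. That is in the same range as the network-in you measured
on the restarted broker (2.2 GB/min for about 4 minutes, i.e. close to 9 GB).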




On Wed, Sep 16, 2020 at 3:09 PM Miroslav Tsvetanov <tsvetanov...@gmail.com>
wrote:

> Hello All,
>
> We are running Kafka in production with 3 brokers and Kafka version 2.1.1.
> We have noticed that when a Kafka broker has been stopped for more than 10
> minutes and is then started again, we see a performance degradation of
> around 90% for up to 4 minutes after start-up.
> During this period (of around 4 minutes) we observe CPU usage dropping from
> 22% to 2% on all of the brokers. Also, the broker that has just been
> started has network-out of 7 MB/min and network-in of 2.2 GB/min, while
> the rest of the brokers have network-out of 1.1 GB/min and network-in of 55
> MB/min.
> We assume that this is because the broker that was stopped for more than
> 10 minutes must catch up on the messages that were processed while it was
> down.
> The performance degradation persists until all 3 brokers are in sync (we
> have min.insync.replicas=2 and a replication factor of 3).
>
> It is worth mentioning that we have ~5k messages/sec with an average
> size of ~3 KB.
>
> We also tried increasing the number of broker nodes to 5 (rebalanced) and
> running with the settings from
> https://kafka.apache.org/081/documentation.html#prodconfig and still see
> ~35% performance degradation.
>
> Thanks in advance.
>
> Best regards,
> Miroslav
>


-- 
Thanks & Regards
Ashutosh Singh
08151945559
