We are using the EC2 EBS volume type "Throughput Optimized HDD (st1)" from AWS:
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-volume-types.html
with 3 brokers and replication factor 3.
No data is lost. We simply accept only about 10% of the messages sent during
this time period; the rest are delayed, and some of them can reach the timeout.
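
As a rough sanity check on the figures quoted in this thread, here is a minimal
Python sketch of the catch-up arithmetic. It assumes that the restarted broker
must fetch everything produced during the outage (replication factor 3 on 3
brokers, so it holds a replica of every partition), that the observed
network-in rate is sustained, and that new traffic during catch-up is
negligible. The function and constant names are illustrative only.

```python
# Back-of-the-envelope estimate of replica catch-up after a broker outage.
# All inputs are figures quoted in this thread; nothing is measured here.

MSG_PER_SEC = 5_000       # ~5k messages/sec
AVG_MSG_KB = 3            # ~3 KB average message size
OUTAGE_MIN = 10           # broker stopped for ~10 minutes
FETCH_GB_PER_MIN = 2.2    # network-in observed on the restarted broker


def backlog_gb(msg_per_sec, avg_msg_kb, outage_min):
    """Data produced while the broker was down, in GB (1 GB = 1024 * 1024 KB)."""
    total_kb = msg_per_sec * avg_msg_kb * outage_min * 60
    return total_kb / (1024 * 1024)


def catch_up_minutes(backlog, fetch_gb_per_min):
    """Minutes needed to fetch the backlog at a constant fetch rate."""
    return backlog / fetch_gb_per_min


backlog = backlog_gb(MSG_PER_SEC, AVG_MSG_KB, OUTAGE_MIN)
minutes = catch_up_minutes(backlog, FETCH_GB_PER_MIN)
print(f"backlog: {backlog:.1f} GB, catch-up: {minutes:.1f} min")
# prints "backlog: 8.6 GB, catch-up: 3.9 min"
```

Under these assumptions the estimate is close to the "up to 4 minutes" of
degradation we observe, which is consistent with the catch-up explanation.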

On Wed, Sep 16, 2020 at 1:30 PM Ashutosh singh <getas...@gmail.com> wrote:

> If the cluster is busy then it will have a lot of data to rebalance once the
> broker comes back online. What type is your underlying storage? Are you
> using SSD?
>
> 5k messages/sec at an average size of 3 KB is 15,000 KB/sec (14.6 MB/sec).
> So if your broker is down for 10 minutes, then approx. 8 GB of data needs to
> rebalance, and again it will depend on the replication factor.
>
>
>
>
> On Wed, Sep 16, 2020 at 3:09 PM Miroslav Tsvetanov <tsvetanov...@gmail.com>
> wrote:
>
> > Hello All,
> >
> > We are running Kafka in production with 3 brokers and Kafka version 2.1.1.
> > We have noticed that when a Kafka broker has been stopped for more than 10
> > minutes and we start it again, we face a performance degradation of
> > around 90% for up to 4 minutes after start-up.
> > During this period (of around 4 minutes) we observe CPU usage dropping
> > from 22% to 2% on all of the brokers. Also, the broker which has just been
> > started has network-out of 7 MB/min and network-in of 2.2 GB/min, while
> > the rest of the brokers have network-out of 1.1 GB/min and network-in of
> > 55 MB/min.
> > We assume that this is because the broker that was stopped for more than
> > 10 minutes must catch up with the messages that were processed while it
> > was stopped.
> > The performance degradation persists until all 3 brokers are in sync (we
> > have min.insync.replicas=2 and a replication factor of 3).
> >
> > It is worth mentioning that we have ~5k messages/sec with an average
> > size of ~3 KB.
> >
> > We also tried increasing the number of broker nodes to 5 (rebalanced) and
> > running with the settings from
> > https://kafka.apache.org/081/documentation.html#prodconfig and still see
> > ~35% performance degradation.
> >
> > Thanks in advance.
> >
> > Best regards,
> > Miroslav
> >
>
>
> --
> Thanx & Regard
> Ashutosh Singh
> 08151945559
>
