@Suman, thanks for confirming. I will dig more then. The instances are
dedicated to running Kafka, and so is the mounted volume.

@Seva, thanks for the insight. I guess if nothing works, then we will move
from st1 to gp2 volumes.
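
For anyone following along, here is a rough way to check the random vs
sequential theory: divide write bytes by write ops over the same window. If ops
jump while throughput stays flat, the average write got smaller, which hints at
many small scattered writes. This is only a sketch. The sample figures below
are made up for illustration (only the 10K and 37K ops/min come from this
thread), and I am assuming the inputs would come from something like the
CloudWatch VolumeWriteBytes and VolumeWriteOps metrics summed over one minute.

```python
def avg_write_size_kib(write_bytes: float, write_ops: float) -> float:
    """Average size of a single write, in KiB, over one sampling window."""
    if write_ops <= 0:
        return 0.0  # no writes in the window, so there is nothing to average
    return write_bytes / write_ops / 1024.0


def ops_per_second(ops_per_minute: float) -> float:
    """Convert a per-minute op count to a per-second rate."""
    return ops_per_minute / 60.0


# Hypothetical one-minute samples: same bytes written, different op counts.
# The byte total is an assumption chosen only to make the effect visible.
bytes_per_minute = 600 * 1024 * 1024
normal = avg_write_size_kib(bytes_per_minute, 10_000)  # typical ops/min
spike = avg_write_size_kib(bytes_per_minute, 37_000)   # ops/min when throttled

if __name__ == "__main__":
    print(f"normal: {normal:.1f} KiB per write")
    print(f"spike:  {spike:.1f} KiB per write")
    print(f"spike rate: {ops_per_second(37_000):.0f} write ops/sec")
```

If the average write size during a spike comes out much smaller than in normal
operation, that would support the idea that the disk is seeing mostly small
random writes rather than large sequential ones.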

On Tue, Apr 7, 2020 at 12:28 AM Suman B N <sumannew...@gmail.com> wrote:

> We have used st1 volumes and we never saw any issue.
> Yes, we are using m-series. Even t-series worked for us :D
>
> During those spikes, do you observe any background operations going on?
> Check server logs, controller logs.
>
> On Tue, Apr 7, 2020 at 12:49 PM Seva Feldman <sev...@ironsrc.com> wrote:
>
> > ST1 EBS is fit only for sequential writes and reads. Once you have many
> > partitions on EBS it will be mostly random.
> > Interesting to monitor random vs sequential...
> >
> > We tested Kafka on ST1 with 1xx partitions on each EBS and it was
> > constantly lagging.
> >
> > BR
> >
> > On Tue, Apr 7, 2020 at 10:06 AM Soumyajit Sahu <soumyajit.s...@gmail.com>
> > wrote:
> >
> > > Our typical IOPS stays at ~10K write ops/min, but it goes to 37K write
> > > ops/min (which is where AWS throttles).
> > > The spike in write ops isn't accompanied by any spike in write
> > > throughput or produce requests (except for the first few minutes of
> > > catch up). The write ops spike stays up (persistently for an hour or
> > > two) until we stop the broker ec2 instance for about 30 mins and then
> > > start it back.
> > >
> > > @Liam, no, we are not using log compaction except for a few consumer
> > > offset topics and config topic (for Kafka Connect), and schema
> > > registry store.
> > >
> > > @Suman, are you using m5 or r5 instances? Recently, we migrated from
> > > r5 to m5, and I wonder if that has a hand in this.
> > >
> > > We have about 1000 partitions residing on each disk, but I don't think
> > > that matters as most of the time the brokers run flawlessly (even
> > > during peak traffic hours).
> > >
> > > Thanks!
> > >
> > > On Mon, Apr 6, 2020 at 11:39 PM Suman B N <sumannew...@gmail.com>
> > > wrote:
> > >
> > > > We too have a similar setup but we never observed any such spikes.
> > > >
> > > > Are you sure your disk IOPS is good enough? Check whether it is
> > > > being throttled.
> > > >
> > > > After a broker restarts, there might be more traffic as well because
> > > > of followers trying to catch up with the leader.
> > > >
> > > > -Suman
> > > >
> > > > On Tue, Apr 7, 2020 at 11:59 AM Soumyajit Sahu
> > > > <soumyajit.s...@gmail.com> wrote:
> > > >
> > > > > We are running Kafka on AWS EC2 instances (m5.2xlarge) with a
> > > > > mounted EBS st1 volume (one on each machine).
> > > > > Occasionally, we have noticed that the write ops/second goes
> > > > > through the roof and we get throttled by AWS while the data
> > > > > throughput hasn't changed much. As far as our observation goes, it
> > > > > usually happens after a broker restart.
> > > > >
> > > > > Has anyone else come across this behavior?
> > > > >
> > > > > Thanks!
> > > > >
> > > >
> > > >
> > > > --
> > > > *Suman*
> > > > *OlaCabs*
> > > >
> > >
> >
> >
> > --
> > Seva Feldman
> > VP R&D Mobile Delivery
> > ironSource <http://www.ironsrc.com/>
> >
> > email sev...@ironsrc.com
> > mobile +972544346089
> >
> > ironSource HQ - 121 Derech Menachem Begin st. Tel Aviv
> >
>
>
> --
> *Suman*
> *OlaCabs*
>
