Re: Kafka performance when it comes to throughput

Israel Ekpo Fri, 07 Jan 2022 16:06:34 -0800

Marisa,

I have kicked off the video series on performance optimization for the
Kafka setup.


I will be working on the various configurations for latency, throughput,
availability and durability.

https://youtu.be/aPlbG349cXg

The first ones will cover latency and throughput, which is what you are
interested in; I will work on the demos for availability and durability
later.

This will be done in KRaft and legacy (ZooKeeper) mode with sample datasets
of 1,000, 100,000, and 1,000,000 messages end-to-end, with variations in the
number of producers, consumers and partitions.
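
A minimal sketch of how that test matrix could be enumerated. Only the two
modes and the three dataset sizes come from the plan above; the producer,
consumer and partition counts are placeholder assumptions, since the thread
does not state them.

```python
from itertools import product

# Modes and dataset sizes as described in the plan above.
MODES = ["kraft", "legacy"]
MESSAGE_COUNTS = [1_000, 100_000, 1_000_000]

# Hypothetical variations; the actual counts are not stated in this thread.
PRODUCERS = [1, 2]
CONSUMERS = [1, 4]
PARTITIONS = [1, 4]


def benchmark_matrix():
    """Return one dict per combination of mode, dataset size and client counts."""
    return [
        {
            "mode": mode,
            "messages": messages,
            "producers": producers,
            "consumers": consumers,
            "partitions": partitions,
        }
        for mode, messages, producers, consumers, partitions in product(
            MODES, MESSAGE_COUNTS, PRODUCERS, CONSUMERS, PARTITIONS
        )
    ]


runs = benchmark_matrix()
print(len(runs), "benchmark runs to execute")
```

Listing the runs up front makes it easier to report every result against the
same set of configurations.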

I am looking forward to this.

Israel Ekpo
Lead Instructor, IzzyAcademy.com
https://www.youtube.com/c/izzyacademy
https://izzyacademy.com/


On Thu, Jan 6, 2022 at 10:17 PM Marisa Queen <[email protected]>
wrote:

> Wow, that's awesome! I wasn't expecting that. I truly appreciate your help
> and professionalism.
>
> > Let me find some time soon and I will do a video on that scenario
> > optimized primarily for low latency and throughput. I will also compare
> > how this performs when adjusted for durability and high availability.
>
> Take your time! That will be tremendously helpful. I was going to try to do
> that myself, but I'm sure that you have better expertise to tune the knobs
> for a more realistic and professional benchmark.
>
> I'm curious to see the numbers. Perhaps you can even start with the
> simplest of all setups: 1 producer, 1 topic, 1 partition, 1 consumer. 10
> million messages flowing. What messages-per-second number do you get? Then
> move to 1 producer, 1 topic, 4 partitions, 1 consumer. Did it get better or
> worse with the addition of multiple partitions?
>
> Thanks again, Israel.
>
> Cheers,
>
> M. Queen
>
>
> On Thu, Jan 6, 2022 at 11:52 PM Israel Ekpo <[email protected]> wrote:
>
> > Thanks for your response Marisa.
> >
> > This has been a very interesting discussion and I appreciate it.
> >
> > It is a bit of a challenge in the sense that I wish I had a demo ready to
> > go with a similar use case and expectations, to easily explain what I have
> > been trying to convey.
> >
> > I am always ready for a challenge like this, and to fix this I would like
> > to do a demo soon with the 10 million message scenario you mentioned in
> > your first message, tracking the time end to end while capturing other
> > metrics.
> >
> > Let me find some time soon and I will do a video on that scenario
> > optimized primarily for low latency and throughput. I will also compare
> > how this performs when adjusted for durability and high availability.
> >
> > I wish I had this demo ready before now; it would have clarified a lot of
> > what I have been trying to explain regarding tuning the knobs for
> > latency, throughput, high availability and durability. By "achieving what
> > you are willing to pay for" I was only suggesting that a highly performant
> > system can be expensive at times, and I apologize if the tone came out
> > wrong.
> >
> > I am grateful that you brought this up; it will give the community
> > something to reference in the future if similar questions come up
> > regarding benchmarks.
> >
> > Thanks for bringing up the question. Please reach out with any further
> > questions or feedback you may have.
> >
> > Thanks again
> >
> > Sincerely
> > Israel Ekpo
> >
> > On Thu, Jan 6, 2022 at 9:18 PM Marisa Queen <[email protected]>
> > wrote:
> >
> > > Hi Israel,
> > >
> > > > You can achieve any performance benchmark you are willing to pay for.
> > >
> > > Thanks for your email. Allow me to respectfully disagree. I believe that
> > > some systems are better than others when it comes to performance. The
> > > idea that I can just take a slow system, multiply by 1 million, and then
> > > I have a super fast system, is at the least misleading. Assuming the same
> > > hardware for everyone, some languages are faster than others. Some
> > > algorithms are faster than others. Some architectures are more efficient
> > > than others. Some protocols are faster than others.
> > >
> > > Take a binary search vs a linear search for example. Binary search is of
> > > course much faster and more efficient than linear search (for large
> > > lists), but according to your rationale this is not a problem. Just buy
> > > enough machines to do linear search in parallel and you can boast 1
> > > million searches per second. What an amazing search system you are
> > > deploying! It can do 1 million searches per second, that's more than
> > > enough for any system.
> > >
> > > 7 TRILLION messages per day for Kafka/LinkedIn sounds amazing when just
> > > thrown on the table. Using your example, a transportation company can
> > > transport 5 packages per day using one of its bicycles. Is the
> > > architecture of this company efficient? Fast? According to your
> > > rationale, it does not matter! The company needs only to buy 1 million
> > > bikes, and now it can boast about delivering 5 million packages per day.
> > > You can say the company is a large corporation, but when it comes to
> > > efficiency it is more like a dinosaur. It has a high chance of being
> > > replaced by other more efficient companies in the future.
> > >
> > > To summarize, low latency is crucial for finance applications. You can't
> > > just say: "don't worry, it is proven and it can do 7 trillion messages
> > > per day". That just won't do it. A ceiling benchmark number, for latency
> > > and throughput, is paramount for any system that wants to operate in that
> > > industry. The answer is not "as much as you are willing to pay for".
> > >
> > > Cheers,
> > >
> > > M. Queen
> > >
> > >
> > > On Thu, Jan 6, 2022 at 8:53 PM Israel Ekpo <[email protected]>
> > > wrote:
> > >
> > > > Marisa,
> > > >
> > > > I do not agree with your assessment. There are several factors that
> > > > could influence your performance numbers even with localhost. Your
> > > > project should be configured based on your own needs.
> > > >
> > > > Your throughput could go up or down depending on how you are
> > > > configured, based on what is important for your use case(s).
> > > >
> > > > If you have other apps running on the machine, that would impact your
> > > > results. If you only have a 2 CPU, 4GB laptop, obviously you cannot
> > > > compare the results with a server that has 256GB of RAM and 64 cores.
> > > >
> > > > Also, do not measure it in terms of messages per second but rather in
> > > > terms of data volume per second. A throughput of 100 GB per second will
> > > > give you 100 messages per second at 1 GB per message, or 100,000,000
> > > > messages per second at 1 KB each. If you have smaller messages, the
> > > > same volume will give a higher count of messages for the same unit
> > > > of time.
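
As a rough illustration of the volume-versus-count relationship, the message
rate implied by a fixed data volume can be computed directly. This is only a
sketch: it assumes decimal units (1 GB = 10^9 bytes, 1 KB = 10^3 bytes), and
the 512-byte case is taken from the message size proposed earlier in this
thread.

```python
GB = 10**9  # decimal gigabyte (assumption; binary units shift the figures slightly)
KB = 10**3  # decimal kilobyte (same assumption)


def messages_per_second(bytes_per_second: float, message_size_bytes: float) -> float:
    """Message rate implied by a fixed data volume per second."""
    return bytes_per_second / message_size_bytes


volume = 100 * GB  # 100 GB per second, the figure used in the paragraph above

for label, size in [("1 GB", 1 * GB), ("1 KB", 1 * KB), ("512 B", 512)]:
    rate = messages_per_second(volume, size)
    print(f"{label:>6} per message: {rate:,.0f} messages per second")
```

The same volume yields a very different message count depending on message
size, which is why a bare messages-per-second figure is hard to compare across
systems.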
> > > >
> > > > Take a look at the reference architecture and this best practices
> > > > document for how to optimize your performance based on your project
> > > > goals (durability, latency, throughput and availability).
> > > >
> > > > Confluent Platform Reference Architecture - Confluent
> > > > <https://www.confluent.io/thank-you/resources/apache-kafka-confluent-enterprise-reference-architecture/>
> > > > Kafka Best Practices: Build, Monitor & Optimize Kafka in Confluent Cloud
> > > > <https://www.confluent.io/thank-you/resources/recommendations-developers-using-confluent-cloud/>
> > > >
> > > > Everybody's scenario and use case will impact how they set up their
> > > > project. You cannot look at another project and use their numbers for
> > > > your own setup. That is generally a bad idea; the better answer is
> > > > that you will need to define your project objectives and then figure
> > > > out what is needed to achieve those goals.
> > > >
> > > > The better question is to look at your throughput volume, retention
> > > > policy and period, and environment, and then figure out the capacity
> > > > planning necessary to support what you need.
> > > >
> > > > You can achieve any performance benchmark you are willing to pay for.
> > > > I am not a fan of just blindly copying other people's numbers and
> > > > using them out of context in benchmark comparisons.
> > > >
> > > > Take a look at the capacity planner and sizing calculator to figure
> > > > out what hardware and infrastructure you need for your scenario.
> > > >
> > > > Sizing Calculator for Apache Kafka and Confluent Platform (eventsizer.io)
> > > > <https://eventsizer.io/>
> > > >
> > > > I hope this is more useful.
> > > >
> > > >
> > > > Israel Ekpo
> > > > Lead Instructor, IzzyAcademy.com
> > > > https://www.youtube.com/c/izzyacademy
> > > > https://izzyacademy.com/
> > > >
> > > >
> > > > On Thu, Jan 6, 2022 at 6:07 PM Marisa Queen <[email protected]>
> > > > wrote:
> > > >
> > > > > Hi Joris,
> > > > >
> > > > > Thank you so much, friend!
> > > > >
> > > > > > I appreciate that setting up everything on localhost will be
> > > > > > easier and lead to big numbers, but bear in mind that it's
> > > > > > typically all the other real-life stuff (remote connections,
> > > > > > replication, at-least-once, ...) that causes massive slowdowns
> > > > > > compared to localhost
> > > > >
> > > > > Totally agree! But we must establish a ceiling first. If this
> > > > > super-good-loopback number doesn't look good, then one has no
> > > > > business moving forward with Kafka to the more complex (and of
> > > > > course slower) stuff.
> > > > >
> > > > > That is the purpose of the ceiling. It is your maximum ambition
> > > > > represented by a number. You can't go any higher than that. At least
> > > > > with Kafka.
> > > > >
> > > > > Agree?
> > > > >
> > > > > Cheers,
> > > > >
> > > > > M. Queen
> > > > >
> > > > >
> > > > > On Thu, Jan 6, 2022 at 3:51 PM Joris Peeters <[email protected]>
> > > > > wrote:
> > > > >
> > > > > > These tutorials - though quite a bit outdated - seem quite useful:
> > > > > > http://cloudurable.com/blog/kafka-tutorial-kafka-producer/index.html
> > > > > > (and the follow-ups).
> > > > > > It ends up being close to how I write this in Java, and tutorial 13
> > > > > > talks about batching and acks etc, which you'll need in order to
> > > > > > tune to maximise your throughput.
> > > > > >
> > > > > > I'm sure someone else has better example resources.
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Thu, Jan 6, 2022 at 6:25 PM Marisa Queen <[email protected]>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Joris,
> > > > > > >
> > > > > > > Thank you so much. I plan to write a Java Consumer and a Java
> > > > > > > Producer for my benchmark. Do you recommend an example that I can
> > > > > > > use as a reference to write my basic Java producer and simple
> > > > > > > Java consumer? I'll for sure share the throughput number I get
> > > > > > > with the community. Maybe even write a blog post about it. I hope
> > > > > > > it is more than 23 messages per second per partition :PPPPP
> > > > > > >
> > > > > > > Cheers,
> > > > > > >
> > > > > > > M. Queen
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Jan 6, 2022 at 2:14 PM Joris Peeters <[email protected]>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > I'd just follow the instructions in
> > > > > > > > https://kafka.apache.org/quickstart to set up Kafka and
> > > > > > > > Zookeeper on a single node, by running the Java processes
> > > > > > > > directly. Or you can run them in Docker.
> > > > > > > >
> > > > > > > > For the producer and consumer I'd personally use Python, as
> > > > > > > > it's the easiest to get going. You may want to look at
> > > > > > > > https://kafka-python.readthedocs.io/en/master/# (easier) and
> > > > > > > > https://github.com/confluentinc/confluent-kafka-python
> > > > > > > > (faster). Similar things exist for Go, Java, C++, ...
> > > > > > > > Or I'm sure there are some benchmark setups out there that you
> > > > > > > > can tweak a little.
> > > > > > > >
> > > > > > > > I appreciate that setting up everything on localhost will be
> > > > > > > > easier and lead to big numbers, but bear in mind that it's
> > > > > > > > typically all the other real-life stuff (remote connections,
> > > > > > > > replication, at-least-once, ...) that causes massive slowdowns
> > > > > > > > compared to localhost, and those are things banks eventually
> > > > > > > > tend to need (I work in the finance industry myself). What
> > > > > > > > you're doing is a very useful benchmark, but I'd surround it
> > > > > > > > with the above caveats to avoid overpromising.
> > > > > > > >
> > > > > > > > -J
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, Jan 6, 2022 at 4:58 PM Marisa Queen <[email protected]>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Joris,
> > > > > > > > >
> > > > > > > > > I've spoken to him. His answers are below:
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, Jan 6, 2022 at 1:37 PM Joris Peeters <[email protected]>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > There's a few unknown parameters here that might influence
> > > > > > > > > > the answer, though. Off the top of my head, at least
> > > > > > > > > > - How much replication of the data is needed (for high
> > > > > > > > > > availability), and how many acks for the producer? (If
> > > > > > > > > > fire-and-forget it can be faster, if it needs to replicate
> > > > > > > > > > and ack from 3 brokers in different DCs then it will be
> > > > > > > > > > slower)
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Let's assume no high-availability for now, for simplicity's
> > > > > > > > > sake. Fire-and-forget like he said. We don't want to
> > > > > > > > > overcomplicate this simple benchmark and we want the highest
> > > > > > > > > possible throughput number.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > - Transactions? (If end-to-end exactly-once then it's a lot
> > > > > > > > > > slower)
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Again no transactions. Let's keep it simple.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > - Size of the messages? (If each message is a GB it will
> > > > > > > > > > obviously be slower)
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Let's assume 512 bytes. Powers of two are fun!
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > - Distance and bandwidth between the producers, Kafka & the
> > > > > > > > > > consumers? (If the network links get saturated that would
> > > > > > > > > > limit the performance. Latency is likely less important
> > > > > > > > > > than throughput, but if your consumers are in Tokyo and the
> > > > > > > > > > producer in London then it will likely also be slower)
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Loopback, same machine, for the love of God. Let's not even
> > > > > > > > > go there. We want the highest possible throughput. I accept
> > > > > > > > > the limit of the speed of light. If network particularities,
> > > > > > > > > and distances, are to be included in this measurement then it
> > > > > > > > > is basically worth nothing. Loopback eliminates all those
> > > > > > > > > network variables that we surely don't want to include in the
> > > > > > > > > benchmark.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > FWIW, I find that the producer side is generally the
> > > > > > > > > > limiting factor, especially if there is only one.
> > > > > > > > > > I'd take a look at e.g. the Appendix test details on
> > > > > > > > > > https://docs.confluent.io/2.0.0/clients/librdkafka/INTRODUCTION_8md.html .
> > > > > > > > > > I haven't yet seen a faster Kafka impl than rdkafka, so
> > > > > > > > > > those would be reasonable upper bounds.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Thanks for your reply, Joris. Can you point me to a Hello
> > > > > > > > > World Kafka example, so I can set up this very basic and BARE
> > > > > > > > > BONES Kafka system, without any of the complications you
> > > > > > > > > correctly mentioned above? I have 10 million messages that I
> > > > > > > > > need to send from producers to consumers. I have 1 topic, 1
> > > > > > > > > producer for this topic, 4 partitions for this topic and 4
> > > > > > > > > consumers, one for each partition. Everything loopback, same
> > > > > > > > > machine, no high-availability, transactions, etc. just KAFKA
> > > > > > > > > BARE BONES. What can be more trivial and basic than that?
> > > > > > > > >
> > > > > > > > > Cheers,
> > > > > > > > >
> > > > > > > > > M. Queen
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Thu, Jan 6, 2022 at 4:25 PM Marisa Queen <[email protected]>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Israel,
> > > > > > > > > > >
> > > > > > > > > > > Your email is great, but I'm afraid to forward it to my
> > > > > > > > > > > customer because it doesn't answer his question.
> > > > > > > > > > >
> > > > > > > > > > > I'm hoping that other members from this list will be able
> > > > > > > > > > > to give me a more NUMERIC answer, let's wait to see.
> > > > > > > > > > >
> > > > > > > > > > > Just to give you some follow up on your answer, when you
> > > > > > > > > > > say:
> > > > > > > > > > >
> > > > > > > > > > > > 30 passengers per driver or aircraft per day may not
> > > > > > > > > > > > sound impressive but 750,000 passengers per day all
> > > > > > > > > > > > together is how you should look at it
> > > > > > > > > > >
> > > > > > > > > > > Well, with this rationale one can come up with any desired
> > > > > > > > > > > throughput number by just adding more partitions. Do you
> > > > > > > > > > > see my customer's point that this does not make any sense?
> > > > > > > > > > > Adding more partitions also does not come for free,
> > > > > > > > > > > because messages need to be separated into the newly
> > > > > > > > > > > created partition and ordering will be lost. Order is
> > > > > > > > > > > important for some messages, so to keep adding more
> > > > > > > > > > > partitions towards an infinite throughput is not an
> > > > > > > > > > > option.
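
The ordering concern can be illustrated with a small sketch. Kafka only
guarantees order within a partition, and keyed records are assigned by hashing
the key modulo the partition count. The code below uses `zlib.crc32` purely as
a stand-in hash (an assumption for illustration, not the hash Kafka uses), and
the `account-N` keys are hypothetical.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition index; crc32 is an illustrative stand-in hash."""
    return zlib.crc32(key) % num_partitions


# Hypothetical keys; each key's records must stay in order.
keys = [f"account-{i}".encode() for i in range(8)]

before = {key: partition_for(key, 4) for key in keys}  # topic with 4 partitions
after = {key: partition_for(key, 8) for key in keys}   # same topic grown to 8

moved = [key for key in keys if before[key] != after[key]]

for key in keys:
    print(key.decode(), "partition", before[key], "->", after[key])
print(len(moved), "of", len(keys), "keys now map to a different partition")
```

Any key that moves has its older records in one partition and its newer
records in another, so consumers can no longer rely on per-key ordering across
that boundary.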
> > > > > > > > > > >
> > > > > > > > > > > I've just spoken to him here, his reply was:
> > > > > > > > > > >
> > > > > > > > > > > "Marisa, I'm asking a very simple question for a very
> > > > > > > > > > > basic Kafka scenario. If I can't get an answer for that,
> > > > > > > > > > > then I'm in trouble. Can you please find out with your
> > > > > > > > > > > peers/community what is a good throughput number to have
> > > > > > > > > > > in mind for the scenario I've been describing. Again it is
> > > > > > > > > > > a very basic and simple scenario: I have 10 million
> > > > > > > > > > > messages that I need to send from producers to consumers.
> > > > > > > > > > > Let's assume I have 1 topic, 1 producer for this topic, 4
> > > > > > > > > > > partitions for this topic and 4 consumers, one for each
> > > > > > > > > > > partition. What I would like to know is: How long is it
> > > > > > > > > > > going to take for these 10 million messages to travel all
> > > > > > > > > > > the way from the producer to the consumers? That's the
> > > > > > > > > > > throughput performance number I'm interested in."
> > > > > > > > > > >
> > > > > > > > > > > I surely won't tell him: "Hey, that's easy, you have 4
> > > > > > > > > > > partitions, each partition according to LinkedIn can
> > > > > > > > > > > handle 23 messages per second, so we are looking for a 92
> > > > > > > > > > > messages per second throughput here!"
> > > > > > > > > > >
> > > > > > > > > > > Cheers,
> > > > > > > > > > >
> > > > > > > > > > > M. Queen
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Jan 6, 2022 at 12:58 PM Israel Ekpo <[email protected]>
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Marisa
> > > > > > > > > > > >
> > > > > > > > > > > > I think there may be some confusion about the throughput
> > > > > > > > > > > > for each partition and I want to explain briefly using
> > > > > > > > > > > > some analogies.
> > > > > > > > > > > >
> > > > > > > > > > > > Using transportation for example, if we were to pick an
> > > > > > > > > > > > airline or ridesharing organization to describe the
> > > > > > > > > > > > volume of customers they can support per day, we would
> > > > > > > > > > > > have to look at how many total customers American
> > > > > > > > > > > > Airlines can service in a day or how many customers Uber
> > > > > > > > > > > > or Lyft can serve in a day. We would not zero in on only
> > > > > > > > > > > > the number of customers a particular driver can service
> > > > > > > > > > > > or the number of passengers a particular aircraft can
> > > > > > > > > > > > service in a day. That would be very limiting
> > > > > > > > > > > > considering the hundreds of thousands of aircraft or
> > > > > > > > > > > > drivers actively transporting passengers in real time.
> > > > > > > > > > > >
> > > > > > > > > > > > 30 passengers per driver or aircraft per day may not
> > > > > > > > > > > > sound impressive but 750,000 passengers per day all
> > > > > > > > > > > > together is how you should look at it.
> > > > > > > > > > > >
> > > > > > > > > > > > Partitions in Kafka are just a logical unit for
> > > > > > > > > > > > organizing and storing data within a Kafka topic. You
> > > > > > > > > > > > should not base your analysis on just what a subunit of
> > > > > > > > > > > > storage is able to support.
> > > > > > > > > > > >
> > > > > > > > > > > > I would recommend taking a look at Kafka Summit talks on
> > > > > > > > > > > > performance and benchmarks to get some understanding of
> > > > > > > > > > > > what Kafka is able to do and the applicable use cases in
> > > > > > > > > > > > the Financial Services industry.
> > > > > > > > > > > >
> > > > > > > > > > > > A lot of reputable organizations already trust Kafka
> > > > > > > > > > > > today for their needs so this is already proven.
> > > > > > > > > > > >
> > > > > > > > > > > > https://kafka.apache.org/powered-by
> > > > > > > > > > > >
> > > > > > > > > > > > I hope this helps.
> > > > > > > > > > > >
> > > > > > > > > > > > Israel Ekpo
> > > > > > > > > > > > Lead Instructor, IzzyAcademy.com
> > > > > > > > > > > > https://www.youtube.com/c/izzyacademy
> > > > > > > > > > > > https://izzyacademy.com/
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, Jan 6, 2022 at 10:01 AM Marisa Queen <[email protected]>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Cheers from NYC!
> > > > > > > > > > > > >
> > > > > > > > > > > > > I'm trying to give a performance number to a potential
> > > > > > > > > > > > > client (from the financial market) who asked me the
> > > > > > > > > > > > > following question:
> > > > > > > > > > > > >
> > > > > > > > > > > > > *"If I have a Kafka system setup in the best way
> > > > > > > > > > > > > possible for performance, what is an approximate
> > > > > > > > > > > > > number that I can have in mind for the throughput of
> > > > > > > > > > > > > this system?"*
> > > > > > > > > > > > >
> > > > > > > > > > > > > The client proceeded to say:
> > > > > > > > > > > > >
> > > > > > > > > > > > > *"What I want to know specifically, is how many
> > > > > > > > > > > > > messages per second can I send from one side of my
> > > > > > > > > > > > > distributed system to the other side with Apache
> > > > > > > > > > > > > Kafka."*
> > > > > > > > > > > > >
> > > > > > > > > > > > > And he concluded with:
> > > > > > > > > > > > >
> > > > > > > > > > > > > *"To give you an example, let's say I have 10 million
> > > > > > > > > > > > > messages that I need to send from producers to
> > > > > > > > > > > > > consumers. Let's assume I have 1 topic, 1 producer for
> > > > > > > > > > > > > this topic, 4 partitions for this topic and 4
> > > > > > > > > > > > > consumers, one for each partition. What I would like
> > > > > > > > > > > > > to know is: How long is it going to take for these 10
> > > > > > > > > > > > > million messages to travel all the way from the
> > > > > > > > > > > > > producer to the consumers? That's the throughput
> > > > > > > > > > > > > performance number I'm interested in."*
> > > > > > > > > > > > >
> > > > > > > > > > > > > I read in a reddit post yesterday (for some reason I
> > > > > > > > > > > > > can't find the post anymore) that Kafka is able to
> > > > > > > > > > > > > handle 7 trillion messages per day. The LinkedIn
> > > > > > > > > > > > > article about it says:
> > > > > > > > > > > > >
> > > > > > > > > > > > > *"We maintain over 100 Kafka clusters with more than
> > > > > > > > > > > > > 4,000 brokers, which serve more than 100,000 topics
> > > > > > > > > > > > > and 7 million partitions. The total number of messages
> > > > > > > > > > > > > handled by LinkedIn’s Kafka deployments recently
> > > > > > > > > > > > > surpassed 7 trillion per day."*
> > > > > > > > > > > > >
> > > > > > > > > > > > > The OP of the reddit post went on to say that WhatsApp
> > > > > > > > > > > > > is handling around 64 billion messages per day
> > > > > > > > > > > > > (740,000 msgs per sec x 24 x 60 x 60) and that 7
> > > > > > > > > > > > > trillion for LinkedIn is a huge number, giving a
> > > > > > > > > > > > > whopping 81 million messages per second for LinkedIn.
> > > > > > > > > > > > > But that doesn't matter for my question.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 7 Trillion messages divided by 7 million partitions
> > > > > > > > > > > > > gives us 1 million messages per day per partition. So
> > > > > > > > > > > > > to calculate the throughput we do:
> > > > > > > > > > > > >
> > > > > > > > > > > > >     1 million divided by 60 divided by 60 divided by
> > > > > > > > > > > > >     24 => *23 messages per second per partition*
> > > > > > > > > > > > >
> > > > > > > > > > > > > We'll all agree that 23 messages per second per
> > > > > > > > > > > > > partition for throughput performance is very low, so I
> > > > > > > > > > > > > can't give this number to my potential client.
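
The arithmetic in this message can be re-run as a quick sanity check. The
input figures are the ones quoted above; nothing here is a measurement. Note
that the last division evaluates to roughly 11.6 messages per second per
partition rather than 23. Either way, a fleet-wide daily average per partition
says little about what a single partition can sustain.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Figures quoted in the message above.
linkedin_messages_per_day = 7 * 10**12
linkedin_partitions = 7 * 10**6
whatsapp_messages_per_second = 740_000


def per_second(per_day: float) -> float:
    """Convert a per-day count into an average per-second rate."""
    return per_day / SECONDS_PER_DAY


whatsapp_per_day = whatsapp_messages_per_second * SECONDS_PER_DAY
linkedin_rate = per_second(linkedin_messages_per_day)
per_partition_per_day = linkedin_messages_per_day / linkedin_partitions
per_partition_rate = per_second(per_partition_per_day)

print(f"WhatsApp per day:           {whatsapp_per_day:,}")
print(f"LinkedIn per second:        {linkedin_rate:,.0f}")
print(f"Per partition per day:      {per_partition_per_day:,.0f}")
print(f"Per partition per second:   {per_partition_rate:.1f}")
```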
> > > > > > > > > > > > >
> > > > > > > > > > > > > So my question is: *What number should I give to my
> > > > > > > > > > > > > potential client?* Note that he is a stubborn and
> > > > > > > > > > > > > strict bank CTO, so he won't take any talk from me. He
> > > > > > > > > > > > > wants a mathematical answer using the scientific
> > > > > > > > > > > > > method.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Has anyone been in my shoes and can shed some light on
> > > > > > > > > > > > > this Kafka throughput performance topic?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > >
> > > > > > > > > > > > > M. Queen
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > --
> > Israel Ekpo
> > Lead Instructor, IzzyAcademy.com
> > https://www.youtube.com/c/izzyacademy
> > https://izzyacademy.com/
> >
>
