I'd just follow the instructions in https://kafka.apache.org/quickstart to
set up Kafka and ZooKeeper on a single node, by running the Java processes
directly. Or you can run them in Docker.

For the producer and consumer I'd personally use Python, as it's the
easiest to get going. You may want to look at
https://kafka-python.readthedocs.io/en/master/# (easier) and
https://github.com/confluentinc/confluent-kafka-python (faster). Similar
things exist for Go, Java, C++, ...
Or I'm sure there are some benchmark setups out there that you can tweak a
little.
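A minimal sketch of both sides, using kafka-python. The broker address,
topic name, and function names are my assumptions, and the kafka import is
kept inside the functions so the file loads even without the library:

```python
# Minimal throughput sketch for the scenario in this thread: 10M messages
# of 512 bytes each, over loopback. Assumes kafka-python is installed
# (pip install kafka-python), a broker on localhost:9092, and a topic
# named "bench" with 4 partitions -- all assumptions, adjust to taste.
import os
import time

PAYLOAD = os.urandom(512)   # 512-byte messages, as in the scenario
N_MESSAGES = 10_000_000

def run_producer(bootstrap="localhost:9092", topic="bench"):
    from kafka import KafkaProducer  # local import: sketch loads without the lib
    # acks=0 is the closest thing to fire-and-forget: the producer does not
    # wait for any broker acknowledgement.
    producer = KafkaProducer(bootstrap_servers=bootstrap, acks=0)
    start = time.monotonic()
    for _ in range(N_MESSAGES):
        producer.send(topic, PAYLOAD)
    producer.flush()
    elapsed = time.monotonic() - start
    print(f"produced {N_MESSAGES / elapsed:,.0f} msgs/s")

def run_consumer(bootstrap="localhost:9092", topic="bench"):
    from kafka import KafkaConsumer
    consumer = KafkaConsumer(topic, bootstrap_servers=bootstrap,
                             auto_offset_reset="earliest")
    count = 0
    start = time.monotonic()
    for _ in consumer:          # each iteration yields one record
        count += 1
        if count == N_MESSAGES:
            break
    print(f"consumed {count / (time.monotonic() - start):,.0f} msgs/s")
```

Run one consumer process per partition (four, in the scenario below) plus one
producer; the printed rates are the raw numbers being asked about.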

I appreciate that setting up everything on localhost will be easier and
lead to big numbers, but bear in mind that it's typically all the other
real-life stuff (remote connections, replication, at-least-once, ...) that
causes massive slowdowns compared to localhost, and those are things banks
eventually tend to need (I work in the finance industry myself). What you're
doing is a very useful benchmark, but I'd surround it with the above
caveats to avoid overpromising.
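As a back-of-envelope sanity check for whatever number the benchmark reports
(the sustained rate below is a placeholder assumption, not a measurement):

```python
# Rough arithmetic for the 10M-message / 512-byte scenario. The assumed
# sustained rate is illustrative only; plug in what your benchmark measures.
N_MESSAGES = 10_000_000
MSG_BYTES = 512

def seconds_to_drain(msgs_per_sec):
    """How long the full run takes at a given sustained rate."""
    return N_MESSAGES / msgs_per_sec

assumed_rate = 500_000                      # msgs/s, an assumption
wire_mb_per_sec = assumed_rate * MSG_BYTES / 1e6
print(f"{seconds_to_drain(assumed_rate):.0f} s end-to-end, "
      f"~{wire_mb_per_sec:.0f} MB/s of payload on the wire")
# -> 20 s end-to-end, ~256 MB/s of payload on the wire
```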

-J


On Thu, Jan 6, 2022 at 4:58 PM Marisa Queen <marisa.queen...@gmail.com>
wrote:

> Hi Joris,
>
> I've spoken to him. His answers are below:
>
>
> On Thu, Jan 6, 2022 at 1:37 PM Joris Peeters <joris.mg.peet...@gmail.com>
> wrote:
>
> > There are a few unknown parameters here that might influence the answer,
> > though. Off the top of my head, at least:
> > - How much replication of the data is needed (for high availability), and
> > how many acks for the producer? (Fire-and-forget can be faster; needing to
> > replicate and get acks from 3 brokers in different DCs will be slower)
> >
>
> Let's assume no high availability for now, for simplicity's sake.
> Fire-and-forget, like he said. We don't want to overcomplicate this simple
> benchmark, and we want the highest possible throughput number.
>
>
> > - Transactions? (End-to-end exactly-once is a lot slower)
> >
>
> Again no transactions. Let's keep it simple.
>
>
> > - Size of the messages? (If each message is a GB it will obviously be
> > slower)
> >
>
> Let's assume 512 bytes. Powers of two are fun!
>
>
> > - Distance and bandwidth between the producers, Kafka & the consumers?
> > (If the network links get saturated that would limit the performance.
> > Latency is likely less important than throughput, but if your consumers
> > are in Tokyo and the producer in London then it will likely also be
> > slower)
> >
>
>
> Loopback, same machine, for the love of God. Let's not even go there. We
> want the highest possible throughput. I accept the limit of the speed of
> light. If network particularities and distances are included in this
> measurement, then it is basically worthless. Loopback eliminates all those
> network variables that we surely don't want to include in the benchmark.
>
>
> >
> > FWIW, I find that the producer side is generally the limiting factor,
> > especially if there is only one.
> > I'd take a look at e.g. the Appendix test details on
> > https://docs.confluent.io/2.0.0/clients/librdkafka/INTRODUCTION_8md.html.
> > I haven't yet seen a faster Kafka impl than rdkafka, so those would be
> > reasonable upper bounds.
> >
>
>
> Thanks for your reply, Joris. Can you point me to a Hello World Kafka
> example, so I can set up this very basic and BARE BONES Kafka system,
> without any of the complications you correctly mentioned above? I have 10
> million messages that I need to send from producers to consumers. I have 1
> topic, 1 producer for this topic, 4 partitions for this topic and 4
> consumers, one for each partition. Everything loopback, same machine, no
> high availability, no transactions, etc., just KAFKA BARE BONES. What could
> be more trivial and basic than that?
>
> Cheers,
>
> M. Queen
>
>
> >
> > On Thu, Jan 6, 2022 at 4:25 PM Marisa Queen <marisa.queen...@gmail.com>
> > wrote:
> >
> > > Hi Israel,
> > >
> > > Your email is great, but I'm afraid to forward it to my customer
> > > because it doesn't answer his question.
> > >
> > > I'm hoping that other members of this list will be able to give me a
> > > more NUMERIC answer; let's wait and see.
> > >
> > > Just to give you some follow-up on your answer, when you say:
> > >
> > > > 30 passengers per driver or aircraft per day may not sound impressive
> > > > but 750,000 passengers per day all together is how you should look at
> > > > it
> > >
> > > Well, with this rationale one can come up with any desired throughput
> > > number just by adding more partitions. Do you see my customer's point
> > > that this does not make sense? Adding more partitions also does not come
> > > for free, because messages need to be spread across the newly created
> > > partitions and ordering will be lost. Order matters for some messages,
> > > so adding ever more partitions towards infinite throughput is not an
> > > option.
> > >
> > > I've just spoken to him here; his reply was:
> > >
> > > "Marisa, I'm asking a very simple question for a very basic Kafka
> > > scenario. If I can't get an answer for that, then I'm in trouble. Can
> > > you please find out with your peers/community what is a good throughput
> > > number to have in mind for the scenario I've been describing. Again, it
> > > is a very basic and simple scenario: I have 10 million messages that I
> > > need to send from producers to consumers. Let's assume I have 1 topic,
> > > 1 producer for this topic, 4 partitions for this topic and 4 consumers,
> > > one for each partition. What I would like to know is: How long is it
> > > going to take for these 10 million messages to travel all the way from
> > > the producer to the consumers? That's the throughput performance number
> > > I'm interested in."
> > >
> > > I surely won't tell him: "Hey, that's easy, you have 4 partitions, each
> > > partition according to LinkedIn can handle about 12 messages per second,
> > > so we are looking at a 46 messages per second throughput here!"
> > >
> > > Cheers,
> > >
> > > M. Queen
> > >
> > >
> > > On Thu, Jan 6, 2022 at 12:58 PM Israel Ekpo <israele...@gmail.com>
> > > wrote:
> > >
> > > > Hi Marisa
> > > >
> > > > I think there may be some confusion about the throughput for each
> > > > partition, and I want to explain briefly using some analogies.
> > > >
> > > > Using transportation as an example: if we were to pick an airline or
> > > > ridesharing organization to describe the volume of customers they can
> > > > support per day, we would look at how many total customers American
> > > > Airlines can service in a day, or how many customers Uber or Lyft can
> > > > serve in a day. We would not zero in on only the number of customers a
> > > > particular driver can service, or the number of passengers a
> > > > particular aircraft can service in a day. That would be very limiting,
> > > > considering the hundreds of thousands of aircraft and drivers actively
> > > > transporting passengers in real time.
> > > >
> > > > 30 passengers per driver or aircraft per day may not sound impressive,
> > > > but 750,000 passengers per day all together is how you should look at
> > > > it.
> > > >
> > > > Partitions in Kafka are just a logical unit for organizing and storing
> > > > data within a Kafka topic. You should not base your analysis on just
> > > > what a subunit of storage is able to support.
> > > >
> > > > I would recommend taking a look at Kafka Summit talks on performance
> > > > and benchmarks to get some understanding of what Kafka is able to do,
> > > > and the applicable use cases in the Financial Services industry.
> > > >
> > > > A lot of reputable organizations already trust Kafka today for their
> > > > needs, so this is already proven:
> > > >
> > > > https://kafka.apache.org/powered-by
> > > >
> > > > I hope this helps.
> > > >
> > > > Israel Ekpo
> > > > Lead Instructor, IzzyAcademy.com
> > > > https://www.youtube.com/c/izzyacademy
> > > > https://izzyacademy.com/
> > > >
> > > >
> > > > On Thu, Jan 6, 2022 at 10:01 AM Marisa Queen <marisa.queen...@gmail.com>
> > > > wrote:
> > > >
> > > > > Cheers from NYC!
> > > > >
> > > > > I'm trying to give a performance number to a potential client (from
> > > > > the financial market) who asked me the following question:
> > > > >
> > > > > *"If I have a Kafka system set up in the best way possible for
> > > > > performance, what is an approximate number that I can have in mind
> > > > > for the throughput of this system?"*
> > > > >
> > > > > The client proceeded to say:
> > > > >
> > > > > *"What I want to know specifically is how many messages per second I
> > > > > can send from one side of my distributed system to the other side
> > > > > with Apache Kafka."*
> > > > >
> > > > > And he concluded with:
> > > > >
> > > > > *"To give you an example, let's say I have 10 million messages that
> > > > > I need to send from producers to consumers. Let's assume I have 1
> > > > > topic, 1 producer for this topic, 4 partitions for this topic and 4
> > > > > consumers, one for each partition. What I would like to know is: How
> > > > > long is it going to take for these 10 million messages to travel all
> > > > > the way from the producer to the consumers? That's the throughput
> > > > > performance number I'm interested in."*
> > > > >
> > > > > I read in a reddit post yesterday (for some reason I can't find the
> > > > > post anymore) that Kafka is able to handle 7 trillion messages per
> > > > > day. The LinkedIn article about it says:
> > > > >
> > > > > *"We maintain over 100 Kafka clusters with more than 4,000 brokers,
> > > > > which serve more than 100,000 topics and 7 million partitions. The
> > > > > total number of messages handled by LinkedIn’s Kafka deployments
> > > > > recently surpassed 7 trillion per day."*
> > > > >
> > > > > The OP of the reddit post went on to say that WhatsApp is handling
> > > > > around 64 billion messages per day (740,000 msgs per sec x 24 x 60 x
> > > > > 60) and that 7 trillion for LinkedIn is a huge number, giving a
> > > > > whopping 81 million messages per second for LinkedIn. But that
> > > > > doesn't matter for my question.
> > > > >
> > > > > 7 trillion messages divided by 7 million partitions gives us 1
> > > > > million messages per day per partition. So to calculate the
> > > > > throughput we do:
> > > > >
> > > > >     1 million divided by 60 divided by 60 divided by 24 => *about 12
> > > > > messages per second per partition*
> > > > >
> > > > > We'll all agree that roughly 12 messages per second per partition
> > > > > for throughput performance is very low, so I can't give this number
> > > > > to my potential client.
> > > > >
> > > > > So my question is: *What number should I give to my potential
> > > > > client?* Note that he is a stubborn and strict bank CTO, so he won't
> > > > > take any talk from me. He wants a mathematical answer using the
> > > > > scientific method.
> > > > >
> > > > > Has anyone been in my shoes and can shed some light on this Kafka
> > > > > throughput performance topic?
> > > > >
> > > > > Cheers,
> > > > >
> > > > > M. Queen
> > > > >
> > > >
> > >
> >
>
