Wow, that's awesome! I wasn't expecting that. I truly appreciate your help and professionalism.
> Let me find some time soon and I will do a video on that scenario optimized primarily for low latency and throughput. I will also compare how this performs when adjusted for durability and high availability.

Take your time! That will be tremendously helpful. I was going to try to do that myself, but I'm sure that you have better expertise to tune the knobs for a more realistic and professional benchmark. I'm curious to see the numbers.

Perhaps you can even start with the simplest of all setups: 1 producer, 1 topic, 1 partition, 1 consumer, with 10 million messages flowing. What messages-per-second number do you get? Then move to 1 producer, 1 topic, 4 partitions, 1 consumer. Did it get better or worse with the addition of multiple partitions?

Thanks again, Israel.

Cheers,

M. Queen

On Thu, Jan 6, 2022 at 11:52 PM Israel Ekpo <israele...@gmail.com> wrote:

> Thanks for your response Marisa.
>
> This has been a very interesting discussion and I appreciate it.
>
> It is a bit of a challenge in the sense that I wish I had a demo ready to go with a similar use case and expectations to easily explain what I have been trying to convey.
>
> I am always ready for a challenge like this, and to fix this I would like to do a demo soon with the 10 million message scenario you originally mentioned in your first message, to track the time end to end while capturing other metrics.
>
> Let me find some time soon and I will do a video on that scenario optimized primarily for low latency and throughput. I will also compare how this performs when adjusted for durability and high availability.
>
> I wish I had this demo ready before now; it would have clarified a lot of what I have been trying to explain regarding tuning the knobs for latency, throughput, high availability and durability.
> By "achieving what you are willing to pay for" I was only suggesting that a highly performant system can be expensive at times, and I apologize if the tone came out wrong.
>
> I am grateful that you brought this up, and it will give the community something to reference in the future if similar questions come up regarding benchmarks.
>
> Thanks for bringing up the question. Please let us know if you have additional questions or feedback.
>
> Thanks again
>
> Sincerely
> Israel Ekpo
>
> On Thu, Jan 6, 2022 at 9:18 PM Marisa Queen <marisa.queen...@gmail.com> wrote:
>
> > Hi Israel,
> >
> > > You can achieve any performance benchmark you are willing to pay for.
> >
> > Thanks for your email. Allow me to respectfully disagree. I believe that some systems are better than others when it comes to performance. The idea that I can just take a slow system, multiply by 1 million, and then I have a super fast system, is at the least misleading. Assuming the same hardware for everyone, some languages are faster than others. Some algorithms are faster than others. Some architectures are more efficient than others. Some protocols are faster than others.
> >
> > Take a binary search vs a linear search for example. Binary search is of course much faster and more efficient than linear search (for large lists), but according to your rationale this is not a problem. Just buy enough machines to do linear search in parallel and you can boast 1 million searches per second. What an amazing search system you are deploying! It can do 1 million searches per second, that's more than enough for any system.
> >
> > 7 TRILLION messages per day for Kafka/LinkedIn sounds amazing when just thrown on the table. Using your example, a transportation company can transport 5 packages per day using one of its bicycles. Is the architecture of this company efficient?
> > Fast? According to your rationale, it does not matter! The company needs only to buy 1 million bikes, and now it can boast about delivering 5 million packages per day. You can say the company is a large corporation, but when it comes to efficiency it is more like a dinosaur. It has a high chance of being replaced by other more efficient companies in the future.
> >
> > To summarize, low latency is crucial for finance applications. You can't just say: "don't worry, it is proven and it can do 7 trillion messages per day". That just won't do it. A ceiling benchmark number, for latency and throughput, is paramount for any system that wants to operate in that industry. The answer is not "as much as you are willing to pay for".
> >
> > Cheers,
> >
> > M. Queen
> >
> > On Thu, Jan 6, 2022 at 8:53 PM Israel Ekpo <israele...@gmail.com> wrote:
> >
> > > Marisa,
> > >
> > > I do not agree with your assessment. There are several factors that could influence your performance numbers even with localhost. Your project should be configured based on your own needs.
> > >
> > > Your throughput could go up or down depending on how you are configured based on what is important for your use case(s).
> > >
> > > If you have other apps running on the machine, that would impact your results. If you only have a 2 CPU, 4GB laptop, obviously you cannot compare the results with a server that has 256GB of RAM and 64 cores.
> > >
> > > Also, do not measure it in terms of messages per second but more in terms of data volume per second. A throughput of 100GBps will give you 100 messages per second at 1 GB per message, or 100,000,000 messages per second at 1 KB each. If you have smaller messages, the same volume will give a higher count of messages for the same unit time.
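A minimal sketch of the volume-to-message-count conversion described in the paragraph above. It assumes decimal units (1 KB = 1,000 bytes, 1 GB = 1,000,000,000 bytes); the function name `messages_per_second` and the 512-byte case (a size proposed elsewhere in this thread) are illustrative choices, not part of the original message.

```python
# Convert a data-volume throughput into a message rate for a given message size.
# Assumption: decimal units (1 KB = 1,000 bytes, 1 GB = 1,000,000,000 bytes).

KB = 1_000
GB = 1_000_000_000


def messages_per_second(bytes_per_second: int, message_size_bytes: int) -> float:
    """Return the message rate implied by a byte rate and a fixed message size."""
    if message_size_bytes <= 0:
        raise ValueError("message_size_bytes must be positive")
    return bytes_per_second / message_size_bytes


# The 100 GB/s volume used as the example in the message above.
volume = 100 * GB

rate_1gb = messages_per_second(volume, 1 * GB)   # large messages
rate_1kb = messages_per_second(volume, 1 * KB)   # small messages
rate_512b = messages_per_second(volume, 512)     # 512-byte messages (illustrative)

print(f"1 GB messages:     {rate_1gb:,.0f} msg/s")
print(f"1 KB messages:     {rate_1kb:,.0f} msg/s")
print(f"512-byte messages: {rate_512b:,.0f} msg/s")
```

With decimal units, the same 100 GB/s corresponds to 100 messages per second at 1 GB each and 100 million messages per second at 1 KB each, which is why a message count on its own says little without the message size.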
> > >
> > > Take a look at the reference architecture and this best practices document for how to optimize your performance based on your project goals (durability, latency, throughput and availability):
> > >
> > > Confluent Platform Reference Architecture - Confluent
> > > <https://www.confluent.io/thank-you/resources/apache-kafka-confluent-enterprise-reference-architecture/>
> > >
> > > Kafka Best Practices: Build, Monitor & Optimize Kafka in Confluent Cloud
> > > <https://www.confluent.io/thank-you/resources/recommendations-developers-using-confluent-cloud/>
> > >
> > > Everybody's scenario and use case will impact how they set up their project. You cannot look at another project and use their numbers for your own setup. That is generally a bad idea, and the better answer is that you will need to define your project objectives and then figure out what is needed to achieve those goals.
> > >
> > > The better approach is to take a look at your volume throughput, retention policy and period as well as environment, and then figure out the capacity planning necessary to support what you need.
> > >
> > > You can achieve any performance benchmark you are willing to pay for. I am not a fan of just blindly copying other people's numbers and using them out of context in benchmark comparisons.
> > >
> > > Take a look at the capacity planner and sizing calculator to figure out what hardware and infrastructure you need for your scenario:
> > >
> > > Sizing Calculator for Apache Kafka and Confluent Platform (eventsizer.io)
> > > <https://eventsizer.io/>
> > >
> > > I hope this is more useful.
> > >
> > > Israel Ekpo
> > > Lead Instructor, IzzyAcademy.com
> > > https://www.youtube.com/c/izzyacademy
> > > https://izzyacademy.com/
> > >
> > > On Thu, Jan 6, 2022 at 6:07 PM Marisa Queen <marisa.queen...@gmail.com> wrote:
> > >
> > > > Hi Joris,
> > > >
> > > > Thank you so much, friend!
> > > >
> > > > > I appreciate that setting up everything on localhost will be easier and lead to big numbers, but bear in mind that it's typically all the other real-life stuff (remote connections, replication, at-least-once, ...) that causes massive slowdowns compared to localhost
> > > >
> > > > Totally agree! But we must establish a ceiling first. If this super-good-loopback number doesn't look good, then one has no business moving forward with Kafka to the more complex (and of course slower) stuff.
> > > >
> > > > That is the purpose of the ceiling. It is your maximum ambition represented by a number. You can't go any higher than that. At least with Kafka.
> > > >
> > > > Agree?
> > > >
> > > > Cheers,
> > > >
> > > > M. Queen
> > > >
> > > > On Thu, Jan 6, 2022 at 3:51 PM Joris Peeters <joris.mg.peet...@gmail.com> wrote:
> > > >
> > > > > These tutorials - though quite a bit outdated - seem quite useful: http://cloudurable.com/blog/kafka-tutorial-kafka-producer/index.html (and the follow-ups).
> > > > > Ends up being close to how I write this in Java, and tutorial 13 talks about batching and acks etc, which you'll need in order to tune to maximise your throughput.
> > > > >
> > > > > I'm sure someone else has better example resources.
> > > > >
> > > > > On Thu, Jan 6, 2022 at 6:25 PM Marisa Queen <marisa.queen...@gmail.com> wrote:
> > > > >
> > > > > > Hi Joris,
> > > > > >
> > > > > > Thank you so much. I plan to write a Java consumer and a Java producer for my benchmark. Do you recommend an example that I can use as a reference to write my basic Java producer and simple Java consumer? I'll for sure share the throughput number I get with the community. Maybe even write a blog post about it. I hope it is more than 23 messages per second per partition :PPPPP
> > > > > >
> > > > > > Cheers,
> > > > > >
> > > > > > M. Queen
> > > > > >
> > > > > > On Thu, Jan 6, 2022 at 2:14 PM Joris Peeters <joris.mg.peet...@gmail.com> wrote:
> > > > > >
> > > > > > > I'd just follow the instructions in https://kafka.apache.org/quickstart to set up Kafka and Zookeeper on a single node, by running the Java processes directly. Or you can run it in Docker.
> > > > > > >
> > > > > > > For the producer and consumer I'd personally use Python, as it's the easiest to get going. You may want to look at https://kafka-python.readthedocs.io/en/master/# (easier) and https://github.com/confluentinc/confluent-kafka-python (faster). Similar things exist for Go, Java, C++, ...
> > > > > > > Or I'm sure there are some benchmark setups out there that you can tweak a little.
> > > > > > >
> > > > > > > I appreciate that setting up everything on localhost will be easier and lead to big numbers, but bear in mind that it's typically all the other real-life stuff (remote connections, replication, at-least-once, ...) that causes massive slowdowns compared to localhost, and those are things banks eventually tend to need (I work in the finance industry myself). What you're doing is a very useful benchmark, but I'd surround it with the above caveats to avoid overpromising.
> > > > > > >
> > > > > > > -J
> > > > > > >
> > > > > > > On Thu, Jan 6, 2022 at 4:58 PM Marisa Queen <marisa.queen...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hi Joris,
> > > > > > > >
> > > > > > > > I've spoken to him. His answers are below:
> > > > > > > >
> > > > > > > > On Thu, Jan 6, 2022 at 1:37 PM Joris Peeters <joris.mg.peet...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > There's a few unknown parameters here that might influence the answer, though. Off the top of my head, at least
> > > > > > > > > - How much replication of the data is needed (for high availability), and how many acks for the producer? (If fire-and-forget it can be faster, if need to replicate and ack from 3 brokers in different DC's then will be slower)
> > > > > > > >
> > > > > > > > Let's assume no high-availability for now, for simplicity's sake. Fire-and-forget like he said. We don't want to overcomplicate this simple benchmark and we want the highest possible throughput number.
> > > > > > > >
> > > > > > > > > - Transactions? (If end-to-end exactly-once then it's a lot slower)
> > > > > > > >
> > > > > > > > Again no transactions. Let's keep it simple.
> > > > > > > >
> > > > > > > > > - Size of the messages? (If each message is a GB it will obviously be slower)
> > > > > > > >
> > > > > > > > Let's assume 512 bytes. Powers of two are fun!
> > > > > > > >
> > > > > > > > > - Distance and bandwidth between the producers, Kafka & the consumers? (If the network links get saturated that would limit the performance. Latency is likely less important than throughput, but if your consumers are in Tokyo and the producer in London then it will likely also be slower)
> > > > > > > >
> > > > > > > > Loopback, same machine, for the love of God. Let's not even go there. We want the highest possible throughput. I accept the limit of the speed of light. If network particularities, and distances, are to be included in this measurement then it is basically worth nothing. Loopback eliminates all those network variables that we surely don't want to include in the benchmark.
> > > > > > > >
> > > > > > > > > FWIW, I find that the producer side is generally the limiting factor, especially if there is only one.
> > > > > > > > > I'd take a look at e.g. the Appendix test details on https://docs.confluent.io/2.0.0/clients/librdkafka/INTRODUCTION_8md.html .
> > > > > > > > > I haven't yet seen a faster Kafka impl than rdkafka, so those would be reasonable upper bounds.
> > > > > > > >
> > > > > > > > Thanks for your reply, Joris. Can you point me to a Hello World Kafka example, so I can set up this very basic and BARE BONES Kafka system, without any of the complications you correctly mentioned above? I have 10 million messages that I need to send from producers to consumers. I have 1 topic, 1 producer for this topic, 4 partitions for this topic and 4 consumers, one for each partition. Everything loopback, same machine, no high-availability, transactions, etc. just KAFKA BARE BONES. What can be more trivial and basic than that?
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > >
> > > > > > > > M. Queen
> > > > > > > >
> > > > > > > > > On Thu, Jan 6, 2022 at 4:25 PM Marisa Queen <marisa.queen...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hi Israel,
> > > > > > > > > >
> > > > > > > > > > Your email is great, but I'm afraid to forward it to my customer because it doesn't answer his question.
> > > > > > > > > >
> > > > > > > > > > I'm hoping that other members from this list will be able to give me a more NUMERIC answer, let's wait to see.
> > > > > > > > > >
> > > > > > > > > > Just to give you some follow up on your answer, when you say:
> > > > > > > > > >
> > > > > > > > > > > 30 passengers per driver or aircraft per day may not sound impressive but 750,000 passengers per day all together is how you should look at it
> > > > > > > > > >
> > > > > > > > > > Well, with this rationale one can come up with any desired throughput number by just adding more partitions. Do you see my customer's point that this does not make any sense? Adding more partitions also does not come for free, because messages need to be separated into the newly created partition and ordering will be lost. Order is important for some messages, so to keep adding more partitions towards an infinite throughput is not an option.
> > > > > > > > > >
> > > > > > > > > > I've just spoken to him here, his reply was:
> > > > > > > > > >
> > > > > > > > > > "Marisa, I'm asking a very simple question for a very basic Kafka scenario. If I can't get an answer for that, then I'm in trouble. Can you please find out with your peers/community what is a good throughput number to have in mind for the scenario I've been describing. Again it is a very basic and simple scenario: I have 10 million messages that I need to send from producers to consumers.
> > > > > > > > > > Let's assume I have 1 topic, 1 producer for this topic, 4 partitions for this topic and 4 consumers, one for each partition. What I would like to know is: How long is it going to take for these 10 million messages to travel all the way from the producer to the consumers? That's the throughput performance number I'm interested in."
> > > > > > > > > >
> > > > > > > > > > I surely won't tell him: "Hey, that's easy, you have 4 partitions, each partition according to LinkedIn can handle 23 messages per second, so we are looking for a 92 messages per second throughput here!"
> > > > > > > > > >
> > > > > > > > > > Cheers,
> > > > > > > > > >
> > > > > > > > > > M. Queen
> > > > > > > > > >
> > > > > > > > > > On Thu, Jan 6, 2022 at 12:58 PM Israel Ekpo <israele...@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Marisa
> > > > > > > > > > >
> > > > > > > > > > > I think there may be some confusion about the throughput for each partition and I want to explain briefly using some analogies.
> > > > > > > > > > >
> > > > > > > > > > > Using transportation for example, if we were to pick an airline or ridesharing organization to describe the volume of customers they can support per day, we would have to look at how many total customers American Airlines can service in a day or how many customers Uber or Lyft can serve in a day.
> > > > > > > > > > > We would not zero in on only the number of customers a particular driver can service or the number of passengers a particular aircraft can service in a day. That would be very limiting considering the hundreds of thousands of aircraft or drivers actively transporting passengers in real time.
> > > > > > > > > > >
> > > > > > > > > > > 30 passengers per driver or aircraft per day may not sound impressive but 750,000 passengers per day all together is how you should look at it
> > > > > > > > > > >
> > > > > > > > > > > Partitions in Kafka are just a logical unit for organizing and storing data within a Kafka topic. You should not base your analysis on just what a subunit of storage is able to support.
> > > > > > > > > > >
> > > > > > > > > > > I would recommend taking a look at Kafka Summit talks on performance and benchmarks to get some understanding of what Kafka is able to do and the applicable use cases in the Financial Services industry
> > > > > > > > > > >
> > > > > > > > > > > A lot of reputable organizations already trust Kafka today for their needs so this is already proven
> > > > > > > > > > >
> > > > > > > > > > > https://kafka.apache.org/powered-by
> > > > > > > > > > >
> > > > > > > > > > > I hope this helps.
> > > > > > > > > > >
> > > > > > > > > > > Israel Ekpo
> > > > > > > > > > > Lead Instructor, IzzyAcademy.com
> > > > > > > > > > > https://www.youtube.com/c/izzyacademy
> > > > > > > > > > > https://izzyacademy.com/
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Jan 6, 2022 at 10:01 AM Marisa Queen <marisa.queen...@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Cheers from NYC!
> > > > > > > > > > > >
> > > > > > > > > > > > I'm trying to give a performance number to a potential client (from the financial market) who asked me the following question:
> > > > > > > > > > > >
> > > > > > > > > > > > *"If I have a Kafka system set up in the best way possible for performance, what is an approximate number that I can have in mind for the throughput of this system?"*
> > > > > > > > > > > >
> > > > > > > > > > > > The client proceeded to say:
> > > > > > > > > > > >
> > > > > > > > > > > > *"What I want to know specifically, is how many messages per second can I send from one side of my distributed system to the other side with Apache Kafka."*
> > > > > > > > > > > >
> > > > > > > > > > > > And he concluded with:
> > > > > > > > > > > >
> > > > > > > > > > > > *"To give you an example, let's say I have 10 million messages that I need to send from producers to consumers.
> > > > > > > > > > > > Let's assume I have 1 topic, 1 producer for this topic, 4 partitions for this topic and 4 consumers, one for each partition. What I would like to know is: How long is it going to take for these 10 million messages to travel all the way from the producer to the consumers? That's the throughput performance number I'm interested in."*
> > > > > > > > > > > >
> > > > > > > > > > > > I read in a reddit post yesterday (for some reason I can't find the post anymore) that Kafka is able to handle 7 trillion messages per day. The LinkedIn article about it, says:
> > > > > > > > > > > >
> > > > > > > > > > > > *"We maintain over 100 Kafka clusters with more than 4,000 brokers, which serve more than 100,000 topics and 7 million partitions.
> > > > > > > > > > > > The total number of messages handled by LinkedIn’s Kafka deployments recently surpassed 7 trillion per day."*
> > > > > > > > > > > >
> > > > > > > > > > > > The OP of the reddit post went on to say that WhatsApp is handling around 64 billion messages per day (740,000 msgs per sec x 24 x 60 x 60) and that 7 trillion for LinkedIn is a huge number, giving a whopping 81 million messages per second for LinkedIn. But that doesn't matter for my question.
> > > > > > > > > > > >
> > > > > > > > > > > > 7 Trillion messages divided by 7 million partitions gives us 1 million messages per day per partition. So to calculate the throughput we do:
> > > > > > > > > > > >
> > > > > > > > > > > >     1 million divided by 60 divided by 60 divided by 24 => *23 messages per second per partition*
> > > > > > > > > > > >
> > > > > > > > > > > > We'll all agree that 23 messages per second per partition for throughput performance is very low, so I can't give this number to my potential client.
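The arithmetic above can be checked with a short sketch that uses only the figures quoted in this message (7 trillion messages per day, 7 million partitions, 740,000 WhatsApp messages per second). The variable names are illustrative. Worked through, the per-partition division comes to roughly 11.6 messages per second rather than 23; in either case it is an average over every partition in the deployment, not a measured per-partition limit.

```python
# Re-derive the per-day and per-second figures quoted in the message above.

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

linkedin_messages_per_day = 7_000_000_000_000  # 7 trillion (quoted figure)
linkedin_partitions = 7_000_000                # 7 million (quoted figure)
whatsapp_messages_per_second = 740_000         # quoted figure

# WhatsApp: messages per second scaled up to a full day.
whatsapp_messages_per_day = whatsapp_messages_per_second * SECONDS_PER_DAY

# LinkedIn: aggregate rate across all clusters.
linkedin_messages_per_second = linkedin_messages_per_day / SECONDS_PER_DAY

# LinkedIn: average load per partition, per day and per second.
per_partition_per_day = linkedin_messages_per_day / linkedin_partitions
per_partition_per_second = per_partition_per_day / SECONDS_PER_DAY

print(f"WhatsApp per day:          {whatsapp_messages_per_day:,}")
print(f"LinkedIn per second:       {linkedin_messages_per_second:,.0f}")
print(f"Per partition per day:     {per_partition_per_day:,.0f}")
print(f"Per partition per second:  {per_partition_per_second:.2f}")
```

Because the result is an average over all partitions, many of which may be idle or lightly used, it cannot stand in for the throughput a single busy partition can sustain; that still has to be measured.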
> > > > > > > > > > > >
> > > > > > > > > > > > So my question is: *What number should I give to my potential client?* Note that he is a stubborn and strict bank CTO, so he won't take any talk from me. He wants a mathematical answer using the scientific method.
> > > > > > > > > > > >
> > > > > > > > > > > > Has anyone been in my shoes and can shed some light on this Kafka throughput performance topic?
> > > > > > > > > > > >
> > > > > > > > > > > > Cheers,
> > > > > > > > > > > >
> > > > > > > > > > > > M. Queen
>
> --
> Israel Ekpo
> Lead Instructor, IzzyAcademy.com
> https://www.youtube.com/c/izzyacademy
> https://izzyacademy.com/