Hi Luke,
> The solution I can think of is to create only one partition for the topic.
That would work, but then I lose the benefits of the partitions.
> Or you can create 4 consumers in one group, to consume from 4 partitions.
That works, too.
That does not work, because I need only one consumer
Hi Roger,
I am going to briefly add to what others have already stated.
The recommendations made by Sunil and Luke are based on the fundamentals of
how Kafka stores and organizes events as well as the retrieval mechanism of
consumer groups.
Without additional details about the objectives of your
Cheers from NYC!
I'm trying to give a performance number to a potential client (from the
financial market) who asked me the following question:
*"If I have a Kafka system set up in the best way possible for performance,
what is an approximate number that I can have in mind for the throughput of
th
Hi Roger,
What consumer are you using?
Is there a chance consumer threads would help?
For example, the Logstash Kafka consumer has a configurable number of
threads under each consumer instance. That may help to some extent.
Regards,
Sunil.
On Thu, 6 Jan 2022 at 7:27 PM, Roger Kasinsky
wrote:
> Hi Luke,
>
Hi Marisa,
I think there may be some confusion about the throughput for each partition,
and I want to explain briefly using some analogies.
Using transportation as an example: if we were to pick an airline or
ridesharing organization to describe the volume of customers they can
support per day, we would
Hi Israel,
Thanks for your detailed explanation. I understand now that Kafka can't
give me any guarantees with regards to ordering if my single consumer is
consuming from multiple partitions.
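For anyone following along, the per-partition ordering guarantee can be illustrated with a small sketch of how a keyed record is mapped to a partition. (The real Java client uses murmur2 hashing; the md5-based hash below is just a deterministic stand-in for illustration.)

```python
# Sketch: why ordering is only guaranteed per partition. Kafka's default
# partitioner for keyed records maps a key to a partition via a hash, so
# all records with the same key land in the same partition, where their
# relative order is preserved for whichever consumer owns that partition.
import hashlib

NUM_PARTITIONS = 4

def partition_for(key: bytes, num_partitions: int = NUM_PARTITIONS) -> int:
    # Deterministic stand-in for the client's murmur2-based partitioner.
    digest = hashlib.md5(key).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# Every record keyed "order-42" goes to the same partition -> ordered there.
assert partition_for(b"order-42") == partition_for(b"order-42")
# Different keys may land on different partitions -> no ordering across them.
print(f"'order-42' records always go to partition {partition_for(b'order-42')}")
```

This is also why a single consumer reading from all four partitions sees an interleaving: each partition is ordered internally, but Kafka makes no promise about the merge order between them.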
Hi Sunil,
Thanks for the thread suggestion. However I don't think increasing or
decreasing the number of
Hi, Marisa.
Kafka is well-designed to make full use of system resources, so I think
calculating based on the machine's specs is a good start.
Let's say we have servers with 10Gbps full-duplex NIC.
Also, let's say we set the topic's replication factor to 3 (so the cluster
will have minimum 3 servers),
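Okada's calculation is cut off above, but here is one back-of-envelope sketch along those lines. The assumptions are mine, not from the thread: the leader's NIC divided by the replication factor is the bottleneck, and messages average 500 bytes. Those assumptions happen to land near the 833k/sec figure quoted later in the thread.

```python
# Back-of-envelope throughput estimate (a sketch, not a benchmark).
# Assumptions (illustrative): the 10 Gbps NIC is the bottleneck, and with
# replication factor 3, each produced byte is also shipped to two follower
# brokers, so usable producer bandwidth is roughly NIC bandwidth / 3.
NIC_GBPS = 10                # full-duplex 10 Gbps NIC
REPLICATION_FACTOR = 3
AVG_MSG_BYTES = 500          # assumed average message size

nic_bytes_per_sec = NIC_GBPS * 1e9 / 8               # 1.25 GB/s
usable_bytes_per_sec = nic_bytes_per_sec / REPLICATION_FACTOR
msgs_per_sec = usable_bytes_per_sec / AVG_MSG_BYTES
print(f"{msgs_per_sec:,.0f} msgs/sec")               # prints: 833,333 msgs/sec
```

Change the assumed message size and the estimate moves proportionally, which is exactly why the thread keeps asking for the unknown parameters first.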
Hi Israel,
Your email is great, but I'm afraid to forward it to my customer because it
doesn't answer his question.
I'm hoping that other members of this list will be able to give me a more
NUMERIC answer; let's wait and see.
Just to give you some follow up on your answer, when you say:
> 30 p
There are a few unknown parameters here that might influence the answer,
though. Off the top of my head, at least:
- How much replication of the data is needed (for high availability), and
how many acks for the producer? (If fire-and-forget it can be faster, if
need to replicate and ack from 3 broker
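The replication/acks trade-off listed above maps directly onto producer configuration. A sketch using kafka-python parameter names (the values are illustrative; no producer is actually created here, since that requires a running broker):

```python
# Producer acks trade-off (kafka-python parameter names):
#   acks=0     -> fire-and-forget: fastest, can silently lose messages
#   acks=1     -> leader ack only: middle ground
#   acks="all" -> wait for all in-sync replicas: slowest, most durable
fire_and_forget = {"bootstrap_servers": "localhost:9092", "acks": 0}
balanced = {"bootstrap_servers": "localhost:9092", "acks": 1}
durable = {"bootstrap_servers": "localhost:9092", "acks": "all",
           "retries": 5}  # retry transient broker errors

# e.g. producer = KafkaProducer(**durable)  # requires a running broker
```

Benchmarking with `acks=0` and reporting that number to a client who needs replicated, acknowledged writes would be misleading, which is the point being made here.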
Hi Okada,
Thanks for your reply. Finally I see some numbers! I love numbers :)
I've shown your email to my boss (I hope he will hire me to do this
project) and he said the following:
"I would like to see this 833k/sec number for myself. Am I asking too much?
:) Can you set up a very basic and si
Hi Joris,
I've spoken to him. His answers are below:
On Thu, Jan 6, 2022 at 1:37 PM Joris Peeters
wrote:
> There are a few unknown parameters here that might influence the answer,
> though. Off the top of my head, at least:
> - How much replication of the data is needed (for high availability),
I'd just follow the instructions in https://kafka.apache.org/quickstart to
set up Kafka and Zookeeper on a single node, by running the Java processes
directly. Or you can run them in Docker.
For the producer and consumer I'd personally use Python, as it's the
easiest to get going. You may want to look at
h
Hi Joris,
Thank you so much. I plan to write a Java Consumer and a Java Producer, for
my benchmark. Do you recommend an example that I can use as a reference to
write my basic Java producer and simple Java consumer? I'll for sure share
the throughput number I get with the community. Maybe even write
These tutorials - though quite a bit outdated - seem quite useful:
http://cloudurable.com/blog/kafka-tutorial-kafka-producer/index.html (and
the follow-ups).
This ends up being close to how I write it in Java, and tutorial 13 talks
about batching and acks etc., which you'll need in order to tune to max
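For reference, the knobs that tutorial covers look like this in kafka-python terms. The values are illustrative starting points, not tuned recommendations, and creating the producer itself needs a running broker:

```python
# Throughput-oriented producer settings (kafka-python parameter names).
throughput_config = dict(
    bootstrap_servers="localhost:9092",
    acks=1,                  # leader-only ack: faster than acks="all"
    linger_ms=20,            # wait up to 20 ms so batches can fill
    batch_size=64 * 1024,    # 64 KiB batches (kafka-python default is 16 KiB)
    compression_type="lz4",  # trade CPU for network/disk bandwidth
)
# producer = KafkaProducer(**throughput_config)  # needs a running broker
print(throughput_config["batch_size"])  # prints: 65536
```

Larger batches and a small linger window usually raise throughput at the cost of per-message latency, so the "right" values depend on which of the two the benchmark is meant to showcase.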
Hi Joris,
Thank you so much, friend!
> I appreciate that setting up everything on localhost will be easier and
lead to big numbers, but bear in mind that it's typically all the other
real-life stuff (remote connections, replication, at-least once, ...) that
causes massive slowdowns compared to lo
Marisa,
I do not agree with your assessment. There are several factors that could
influence your performance numbers even with localhost. Your project should
be configured based on your own needs.
Your throughput could go up or down depending on how you are configured,
based on what is important
Hi Israel,
> You can achieve any performance benchmark you are willing to pay for.
Thanks for your email. Allow me to respectfully disagree. I believe that
some systems are better than others when it comes to performance. The idea
that I can just take a slow system, multiply by 1 million, and the
Thanks for your response Marisa.
This has been a very interesting discussion and I appreciate it.
It is a bit of a challenge, in the sense that I wish I had a demo ready to
go with a similar use case and expectations, to easily explain what I have
been trying to convey.
I am always ready for a chall
Marisa, you might consider engaging someone at Confluent, maybe they can
give you some case studies or whitepapers from similar use-cases in the
financial industry (and yes, Kafka is used in the financial industry). A
client asking you to "prove that Kafka performs/scales" seems like an
unusual
Wow, that's awesome! I wasn't expecting that. I truly appreciate your help
and professionalism.
> Let me find some time soon and I will do a video on that scenario
optimized primarily for low latency and throughput. I will also compare how
this performs when adjusted for durability and high availa
Hi Alex,
> Furthermore, setting up a localhost pub/sub demo on a single machine
(your laptop?) is so far removed from a real-world scenario I can't imagine
how any numbers derived from that would be useful.
I can't imagine either. That's why I'm planning to run this on a lab Linux
machine with 8