Re: Kafka partition problem

2023-02-21 Thread Brebner, Paul
Hi, I have a few blogs on Kafka partition performance for background if you are interested: 1. https://www.instaclustr.com/blog/the-power-of-kafka-partitions-how-to-get-the-most-out-of-your-kafka-cluster/ The Power of Apache Kafka® Partitions: How to Get the Most out of Your Kafka Cluster

Re: Kafka for IoT ingestion pipeline

2023-02-26 Thread Brebner, Paul
A Complete Guide to Apache Kafka for Developers (or, everything I know about Kafka in one place): https://www.linkedin.com/pulse/complete-guide-apache-kafka-developers-everything-i-know-paul-brebner/

Re: Kafka Cluster WITHOUT Zookeeper

2023-03-27 Thread Brebner, Paul
I have a recent 3-part blog series on KRaft (expanded version of ApacheCon 2022 talk): https://www.instaclustr.com/blog/apache-kafka-kraft-abandons-the-zookeeper-part-1-partitions-and-data-performance/ https://www.instaclustr.com/blog/apache-kafka-kraft-abandons-the-zookeeper-part-2-partitions-an

Re: Kafka Cluster WITHOUT Zookeeper

2023-03-28 Thread Brebner, Paul
nd in Kafka 3.5.0. Cheers, David On Mon, Mar 27, 2023 at 6:32 PM Brebner, Paul wrote: > I have a recent 3 part blog series on Kraft (expanded version of ApacheCon > 2022 talk): > > > > > https://www.instaclustr.com/blog/apache-kafka-kraft-abandons-the-zookeeper-part-1-parti

Re: User = partitions

2023-05-09 Thread Brebner, Paul
Hi – as each topic can have 1 or more consumers (depending on the number of partitions), it’s certainly possible to have a 1-1 relationship between topics and consumers. The downside of this is typically that if the single consumer fails (and you don’t have some automatic approach to restart it

Re: Data Stream Processing applications testing

2023-05-22 Thread Brebner, Paul
Hi Alexandre, looks interesting. Would you consider submitting something (related to performance) to the Community over Code Performance Engineering track please? https://www.linkedin.com/pulse/call-papers-2nd-performance-engineering-track-over-code-brebner/ Thanks, Paul From: Alexandre Strapa

Re: Patterns for generating ordered streams

2023-05-29 Thread Brebner, Paul
Hi Edvard, interesting problem – I’ve had similar problems with high fan out use cases, but only for demo applications where I’m more interested in scale than order – e.g. have a look at this list of blogs, examples include Anomalia Machina for Kafka+Cassandra, and Kongo, Kafka+IoT. https://www

Re: Patterns for generating ordered streams

2023-05-29 Thread Brebner, Paul
Oh, the Kafka parallel consumer may help potentially? https://www.instaclustr.com/blog/improving-apache-kafka-performance-and-scalability-with-the-parallel-consumer-part-2/ Paul From: Edvard Fagerholm Date: Tuesday, 30 May 2023 at 6:55 am To: users@kafka.apache.org Subject: Patterns for gener

CFP for the 2nd Performance Engineering track at Community Over Code NA 2023

2023-07-02 Thread Brebner, Paul
more ideas and links including the CFP submission page: https://www.linkedin.com/pulse/call-papers-2nd-performance-engineering-track-over-code-brebner/ - Paul Brebner and Roger Abelenda

Re: Franz Kafka 100th

2024-06-04 Thread Brebner, Paul
Yes, I found out just in time to mention this during my Community over Code talk today, Paul From: Edoardo Comar Date: Monday, 3 June 2024 at 2:50 PM To: users@kafka.apache.org Subject: Franz Kafka 100th

Re: Kafka 20k topics metadata update taking long time

2024-07-03 Thread Brebner, Paul
Hi – interesting, I had maybe similar problems today when “testing” the limits of a Kafka cluster for max partitions – I could create a topic with lots of partitions (ok so more than sensible, taking into account RF=3 over 1M partitions) – but trying to send a message failed with a meta-data tim

Re: Kafka 20k topics metadata update taking long time

2024-07-04 Thread Brebner, Paul
dedicated KRaft controllers. Curious if there is a timeout setting somewhere for the client metadata request? Paul From: Brebner, Paul Date: Thursday, 4 July 2024 at 3:44 PM To: users@kafka.apache.org Subject: Re: Kafka 20k topics metadata update taking

Re: Kafka 20k topics metadata update taking long time

2024-07-04 Thread Brebner, Paul
OK, so repeating with the Java Kafka producer there is no problem; it is specific to the Kafka CLI producer! Paul From: Brebner, Paul Date: Friday, 5 July 2024 at 1:21 PM To: users@kafka.apache.org Subject: Re: Kafka 20k topics metadata update taking long time

Error creating a topic with 10k partitions but not altering existing topic to 10k partitions? Why

2024-07-10 Thread Brebner, Paul
Hi – just curious if anyone can suggest why the following occurs: 1 – try to create a topic with 10,000 partitions with Kafka CLI (kafka-topics.sh) Fails with ERROR org.apache.kafka.common.errors.PolicyViolationException: Unable to perform excessively large batch operation. 2- create a topic wi

Re: Error creating a topic with 10k partitions but not altering existing topic to 10k partitions? Why

2024-07-11 Thread Brebner, Paul
duced by KIP-599, it may be triggered for non-existing topics but not for the existing resources. Hope this helps you track it down. OSB On Thu, Jul 11, 2024, 08:04 Brebner, Paul wrote: > Hi – just curious if anyone can suggest why the following occurs: > > 1 – try to create a top

Re: Q: Does Kafka log level affect performance and latency

2024-11-01 Thread Brebner, Paul
Hi Om, I asked some of our techops people about this, and their general advice is that increasing the log level from the default (INFO?) is likely to increase the I/O (the amount depending on a variety of factors including the cluster traffic etc) – and my take on this is that assuming there is

Re: Q: Does Kafka log level affect performance and latency

2024-11-01 Thread Brebner, Paul
And the flip side, if the cluster is heavily loaded then increasing log levels is likely to have a detectable impact on performance! Paul From: Brebner, Paul Date: Saturday, 2 November 2024 at 9:50 am To: users@kafka.apache.org , om22sh...@gmail.com Subject: Re: Q: Does Kafka log level affect

Re: Schema Registry on bare metal

2024-11-27 Thread Brebner, Paul
We recommend (and also provide as a managed service) the open source schema registry Karapace, https://www.karapace.io/ Looking at the docs here https://github.com/Aiven-Open/karapace it looks like it can be installed with Docker or a “Source install” which may work for you? Regards, Paul Brebne

Re: Schema Registry - serving multiple Kafka clusters

2025-01-07 Thread Brebner, Paul
Hi Karan, good question! I’ve asked our Kafka dev team and they think it may be possible in theory, but depending on if you are using a managed Kafka service maybe not supported – e.g. NetApp Instaclustr managed Kafka supports Karapace, but not with multiple clusters. Good luck, Paul Brebner F

Re: How to scale a Kafka Cluster, what all should we consider

2025-02-03 Thread Brebner, Paul
than 87, you have recommended. wrt to the number of partitions 35 seems to be OK as our consumers should rarely scale beyond 10 or something like that. Please let me know if this sounds OK given our current utilization rates. Thanks Sachin On Mon, Feb 3, 2025 at 7:51 AM Brebner, Paul wrote:

Re: How to scale a Kafka Cluster, what all should we consider

2025-02-02 Thread Brebner, Paul
Hi Sachin, I’m not an “operational” Kafka person but do have some limited experience with Kafka benchmarking etc, so here are a few ideas. I’m playing around with a Kafka tiered storage sizing model at present, designed to predict min IO and/or network with local and tiered storage enabled. Th

Re: Explicitly creating topology topics in a streams app

2024-11-21 Thread Brebner, Paul
Hi John, I’m not a Kafka streams expert but have experimented a few times – I recall that Kafka Streams does need to create/use “internal topics” – and security has to be set on clients correctly from memory. This may help? https://kafka.apache.org/23/documentation/streams/developer-guide/manag

Re: JoinGroup API response timing.

2025-01-22 Thread Brebner, Paul
Hi – short answer is consumers can read from a specific partition, but in general for a consumer group you want to balance the partitions across the available consumers for high throughput – if a consumer fails or is kicked off the group because it times out etc then the remainder of the consume
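The balancing and rebalancing described in this reply can be illustrated with a small model. This is a minimal sketch, assuming a simple round-robin spread; the function `assign` and the consumer names are hypothetical, and Kafka's real assignors and rebalance protocols are more involved.

```python
def assign(partitions, consumers):
    """Spread partitions round-robin over the live consumers of one group.

    Simplified model only: not Kafka's actual assignor implementation.
    """
    if not consumers:
        raise ValueError("a consumer group needs at least one live consumer")
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment


partitions = list(range(6))

# Three healthy consumers share six partitions evenly.
before = assign(partitions, ["c1", "c2", "c3"])

# c2 fails or times out: the remaining consumers take over its partitions.
after = assign(partitions, ["c1", "c3"])

print(before)
print(after)
```

In this model every partition still has exactly one owner after the failure, but each surviving consumer now carries more load.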

Re: Random access to kafka messages

2025-01-27 Thread Brebner, Paul
Sounds “interesting” – in theory it could work, just remember that segment size will impact latency – records are stored in segments on local/remote storage (with tiering enabled), bigger segments improve throughput, but smaller segments may improve read latency, Paul Brebner From: Greg Harris

Re: JoinGroup API response timing.

2025-01-30 Thread Brebner, Paul
em > be saved locally by consumer or Redis, etc. > > The concept of consumer group shifts the responsibility of partition > assignment to broker because only broker knows the number of partitions. > > Best regards. > > On Thu, 23 Jan, 2025, 06:25 Brebner, Paul, .invalid> > w

Re: Under replicated partition

2025-02-12 Thread Brebner, Paul
Hi Ernar, I don’t think anyone responded yet so here’s my 2 cents worth (I’m not a Kafka ops expert, but I did ask our Kafka techops people – the following are suggestions however, not professional advice – which we do also offer 😉): Looks like there is more traffic at night and the cluster struggles

Re: Under replicated partition

2025-02-12 Thread Brebner, Paul
And after a 2nd (and 3rd) opinion it does look like replica.lag.time.max.ms is below the default value, so maybe try increasing it as a first step, Paul From: Brebner, Paul Date: Thursday, 13 February 2025 at 2:33 pm To: users@kafka.apache.org Subject: Re: Under replicated partition
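To see why a low replica.lag.time.max.ms can produce under-replicated partitions, here is a minimal sketch of the in-sync check. It assumes the simplified rule that a follower stays in the ISR only if it last caught up with the leader within the configured window; the function name, broker ids, and timestamps are made up for illustration, and the 30-second default is my understanding for recent Kafka versions, so check the documentation for yours.

```python
# Assumed default for recent Kafka versions; verify against your broker docs.
DEFAULT_REPLICA_LAG_TIME_MAX_MS = 30_000


def in_sync_followers(now_ms, last_caught_up_ms,
                      lag_time_max_ms=DEFAULT_REPLICA_LAG_TIME_MAX_MS):
    """Return follower ids that caught up with the leader recently enough.

    last_caught_up_ms maps follower id -> time it last fully caught up.
    Simplified model of the ISR rule, not broker code.
    """
    return sorted(
        broker for broker, caught_up in last_caught_up_ms.items()
        if now_ms - caught_up <= lag_time_max_ms
    )


# Hypothetical followers: 1 s, 15 s and 40 s behind at time 100 s.
followers = {1: 99_000, 2: 85_000, 3: 60_000}

tight = in_sync_followers(100_000, followers, lag_time_max_ms=10_000)
default = in_sync_followers(100_000, followers)

print(tight)    # only the follower that is 1 s behind
print(default)  # the 15 s follower is now tolerated as well
```

With the tighter window, a follower that is only briefly slow during a traffic peak drops out of the ISR, which shows up as an under-replicated partition.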

CFP for Community over Code NA 2025 is still open

2025-04-13 Thread Brebner, Paul
Hi Kafka people! There's still time (7 days and counting down) to submit talks for Community Over Code NA 2025 - if you have something performance related here's the CFP for the 7th C/C Performance Engineering track https://lnkd.in/gR5wv3RD (it doesn’t need to be Kafka specific), Regards, Paul

Re: Schema Registry options for Strimzi Kafka

2025-04-12 Thread Brebner, Paul
Hi Omer, We (NetApp Instaclustr) are pretty happy with Karapace (we contributed to it), and provide it as a managed service along with our Apache Kafka offering. I did a blog series on it a while back if you are interested in reading about my experiences with it, starting here with part 1 http

Re: Documentation and meaning of configuration 'retention.bytes'

2025-02-25 Thread Brebner, Paul
Well spotted I think – I was briefly puzzled with the time retention behaviour, as segments seemed to live longer than advertised – until I realised it was min time, deletion is lazy – can occur at some (distant?) time in the future (and is async I think) – this was particularly noticeable for
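The "minimum time" behaviour can be sketched with a toy model of segment-level retention. It assumes, as I understand the broker logic, that a segment becomes eligible only once its newest record is older than the retention time, and that segments are removed oldest first by a periodic check. The function, field names, and numbers below are hypothetical.

```python
def deletable_segments(segments, now_ms, retention_ms):
    """Return base offsets of segments eligible for time-based deletion.

    segments is ordered oldest first. A segment qualifies only when its
    newest record (largest_ts) is older than retention_ms; scanning stops
    at the first segment that does not qualify. Simplified model.
    """
    deletable = []
    for seg in segments:
        if now_ms - seg["largest_ts"] > retention_ms:
            deletable.append(seg["base_offset"])
        else:
            break
    return deletable


segments = [
    {"base_offset": 0, "first_ts": 10_000, "largest_ts": 100_000},
    {"base_offset": 500, "first_ts": 101_000, "largest_ts": 150_000},
    {"base_offset": 1000, "first_ts": 151_000, "largest_ts": 195_000},
]
now_ms, retention_ms = 200_000, 60_000

to_delete = deletable_segments(segments, now_ms, retention_ms)

# Age of the oldest record in the first segment that is kept.
oldest_retained_age_ms = now_ms - segments[len(to_delete)]["first_ts"]

print(to_delete, oldest_retained_age_ms)
```

Here the oldest surviving record is 99 seconds old even though retention is 60 seconds, because its segment also holds newer records. Large segments on low-traffic topics make this gap more noticeable.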

KIP-429 vs. KIP 848?

2025-06-16 Thread Brebner, Paul
Hi all, time for me to ask a silly question please! I'm puzzled about the transition from KIP-429 https://cwiki.apache.org/confluence/display/KAFKA/KIP-429%3A+Kafka+Consumer+Incremental+Rebalance+Protocol to KIP-848 https://cwiki.apache.org/confluence/display/KAFKA/KIP-848%3A+The+Next+Generatio

Re: KIP-429 vs. KIP 848?

2025-06-16 Thread Brebner, Paul
Sorry, correction: 429 appeared in Kafka 2.4.0, and 848 appeared in 3.7 Paul From: Brebner, Paul Date: Tuesday, 17 June 2025 at 12:32 pm To: Kafka Users , dev Subject: KIP-429 vs. KIP 848? Hi all, time for me to ask a silly

Re: KafkaProducer partitionsFor v/s KafkaAdminClient describeTopics

2025-06-11 Thread Brebner, Paul
Hi Anana, Typically in Kafka, it is useful for Consumers to know about the number of partitions (as the number of consumers must be <= partitions). So one way for Consumers to find partitions is the KafkaConsumer class using the partitionsFor(topic) method, https://kafka.apache.org/40/javadoc/o
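Why the partition count matters to consumers can be shown with a one-line calculation. This sketch assumes the classic consumer-group rule that each partition is read by at most one consumer in the group, so consumers beyond the partition count have nothing to read; the function name is hypothetical, and in a real client the partition count would come from a call such as partitionsFor(topic).

```python
def busy_and_idle(num_consumers, num_partitions):
    """Return (busy, idle) consumer counts for one group on one topic.

    Assumes each partition has at most one owner within the group.
    """
    busy = min(num_consumers, num_partitions)
    return busy, num_consumers - busy


# Five consumers on a 3-partition topic: two of them sit idle.
over = busy_and_idle(5, 3)

# Two consumers on an 8-partition topic: both are busy.
under = busy_and_idle(2, 8)

print(over, under)
```

Knowing the partition count up front lets an application avoid starting consumers that would only sit idle.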

Re: Reply: KIP-429 vs. KIP 848?

2025-07-06 Thread Brebner, Paul
1...@qq.com -- Original Message -- From: "Brebner, Paul" https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/KAFKA/KIP-429*3A*Kafka*Consumer*Incremental*Rebalance*Protocol**BKIP-848https:/*cwiki.apache.org/confluence/display/KAFKA/KIP-848*3A*The*Next*Gener

Kafka Queues (KIP-932) and Kafka Connect?

2025-07-23 Thread Brebner, Paul
Hi, Curious if anyone has thought about using Kafka Queues (KIP-932) with Kafka Connect in the future? I.e. autoscaling connect tasks? There could be some Connect use cases where order isn’t as important as keeping up with load spikes, maybe. Regards, Paul Brebner NetApp Instaclustr Technology