Wow, I am super happy to see this KIP! Thanks for publishing it!

I threw the idea out there last week in an article of mine about calculating 
Kafka costs[1]

> [FUTURE KIP] - a Produce to Local Leader KIP, similar to KIP-392, can be 
> introduced to eliminate producer inter-AZ network costs for topics that do 
> not have keys.
> there is no fundamental reason that a topic without ordering guarantees needs 
> to produce to a specific partition - why not just choose the broker in the 
> closest zone?
> if all of your traffic is unkeyed, then this can further reduce Kafka’s 
> network cost by 25%.
> it sounds like a change that wouldn’t be too complicated, maybe even 
> achievable today through the Producer’s partitioner.

I don't know if you saw it from there, but I'm super happy to see it come to 
fruition! It's even easier than I thought - I didn't realize we had the 
node/rack information in the partitioner already.
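
To make the idea concrete, here is a rough sketch of the selection logic I have in mind. It is not the KIP's implementation, and it deliberately avoids the real Partitioner interface: the class RackAwareChoice, its methods, and the "list of leader racks indexed by partition" input are all hypothetical stand-ins for whatever metadata the partitioner actually sees. The assumption is simply that for an unkeyed record we prefer partitions whose leader sits in the client's rack, and fall back to all partitions when none do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical illustration only; not part of Kafka or of KIP-1123.
public class RackAwareChoice {

    // leaderRackByPartition.get(i) is the rack of partition i's leader,
    // or null if the rack is unknown (an assumed representation).
    public static List<Integer> candidates(List<String> leaderRackByPartition, String clientRack) {
        List<Integer> local = new ArrayList<>();
        List<Integer> all = new ArrayList<>();
        for (int p = 0; p < leaderRackByPartition.size(); p++) {
            all.add(p);
            if (clientRack != null && clientRack.equals(leaderRackByPartition.get(p))) {
                local.add(p);
            }
        }
        // Fall back to every partition when no leader is in the client's rack.
        return local.isEmpty() ? all : local;
    }

    // Picks one candidate uniformly at random (unkeyed records only).
    public static int choose(List<String> leaderRackByPartition, String clientRack) {
        List<Integer> c = candidates(leaderRackByPartition, clientRack);
        if (c.isEmpty()) {
            throw new IllegalArgumentException("topic has no partitions");
        }
        return c.get(ThreadLocalRandom.current().nextInt(c.size()));
    }
}
```

The fallback matters: a client in a zone with no local leaders should keep producing across zones rather than fail.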

I think it will be very impactful.
We've seen a strong trend in the industry of trading off latency for cost 
reduction. Namely, almost every vendor has introduced some sort of leaderless 
Kafka API model that outsources replication to a remote store[2][3][4][5]. 
This in turn allows them to reduce cross-zone networking costs to literally 
zero. In certain optimized deployments the networking cost can be up to 80-90% 
of the total cost![6] KIP-392 allows us to eliminate the consumer-side traffic 
cost, but there is great motivation to enable users to do the same for 
producers that don't depend on ordering.

I am +1 on the KIP as is.

One may make an argument to have a way to enable it server-side via the broker, 
but I'd like to hear a good reason for that. I believe the simplicity in the 
current state is preferred, since clients already have freedom to produce to 
any partition they explicitly choose.

Best,
Stan

[1] 
https://bigdata.2minutestreaming.com/p/the-brutal-truth-about-apache-kafka-cost-calculators
[2] WarpStream and its $220m acquisition 
https://www.linkedin.com/pulse/how-confluent-acquired-warpstream-220m-after-just-13-months-hxgyf/
[3] Confluent Freight 
https://www.confluent.io/blog/introducing-confluent-cloud-freight-clusters/
[4] Redpanda Cloud Topics 
https://www.redpanda.com/blog/cloud-topics-streaming-data-object-storage
[5] Bufstream https://buf.build/product/bufstream
[6] calculator https://akalculator.com/

On 2024/12/20 11:35:28 Ivan Yurchenko wrote:
> Hello all,
> 
> I'd like to propose a new KIP to discuss: KIP-1123: Rack-aware partitioning 
> for Kafka Producer [1].
> 
> Best,
> Ivan Yurchenko
> 
> [1] 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1123%3A+Rack-aware+partitioning+for+Kafka+Producer
> 
