Hi De Gao,

Thanks for the KIP!
"However the limit of messages in a single partition replica is very big.
This could lead to very big partitions (~TBs). Moving those partitions are
very time consuming and have a big impact on system performance."

One way to achieve a faster rebalance is to have a latest-offset replica
build strategy when expanding the replicas for a partition, and to ensure
that the expanded replica does not serve as leader until the data on the
older nodes expires by retention time/size. Currently, Kafka supports only
the earliest-offset strategy during reassignment. Also, this strategy would
only work for topics whose cleanup policy is set to "delete".

--
Kamal

On Thu, Jan 2, 2025 at 10:23 PM David Arthur <mum...@gmail.com> wrote:

> Hey De Gao, thanks for the KIP!
>
> As you’re probably aware, a Partition is a logical construct in Kafka. A
> broker hosts a partition, which is composed of physical log segments.
> Only the active segment is being written to; the others are immutable.
> The concept of a Chunk sounds quite similar to our log segments.
>
> From what I can tell reading the KIP, the main difference is that a Chunk
> can have its own assignment and can therefore be replicated across
> different brokers.
>
> > Horizontal scalability: the data was distributed more evenly to brokers
> > in cluster. Also achieving a more flexible resource allocation.
>
> I think this is only true in cases where we have a small number of
> partitions with a large amount of data. I have certainly seen cases where
> a small number of partitions can cause trouble with balancing the
> cluster.
>
> The idea of shuffling around older data in order to spread out the load
> is interesting. It does seem like it would increase the complexity of the
> client a bit when it comes to consuming the old data. Usually the client
> can just read from a single replica from the beginning of the log to the
> end. With this proposal, the client would need to hop around between
> replicas as it crossed the chunk boundaries.
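[Editor's note: a minimal sketch of the chunk-boundary hopping described
above. Nothing here is a real Kafka API; `Chunk`, the per-chunk replica
lists, and the offsets are all invented for illustration, assuming the
KIP's model of one replica set per chunk.]

```python
# Hypothetical sketch: chunk-aware replica selection. Kafka has no such
# client API today; Chunk and the replica assignments are made up.
from dataclasses import dataclass


@dataclass
class Chunk:
    start_offset: int  # first offset in the chunk (inclusive)
    end_offset: int    # first offset past the chunk (exclusive)
    replicas: list     # broker ids hosting this chunk


def replica_for_offset(chunks, offset):
    """Pick a broker to fetch `offset` from. The client would have to
    re-resolve this every time it crosses a chunk boundary."""
    for chunk in chunks:
        if chunk.start_offset <= offset < chunk.end_offset:
            return chunk.replicas[0]  # e.g. least-loaded or rack-local
    raise ValueError(f"offset {offset} not covered by any chunk")


# One partition whose three chunks live on different broker sets.
chunks = [
    Chunk(0,    1000, [1, 2]),  # oldest chunk, moved off the leader
    Chunk(1000, 2000, [2, 3]),
    Chunk(2000, 3000, [3, 4]),  # active chunk; 3 is the leader
]

# Reading the whole log forces the client to switch brokers twice,
# instead of streaming from a single replica end to end.
brokers = [replica_for_offset(chunks, o) for o in (0, 1500, 2500)]
print(brokers)  # [1, 2, 3]
```

This is the extra client complexity David mentions: today a consumer fetches
one partition from one replica; under the KIP it would track a chunk map and
reconnect at each boundary.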
> > Better load balancing: The read of partition data, especially early
> > data, can be distributed to more nodes other than just leader nodes.
>
> As you know, this is already possible with KIP-392. I guess the idea with
> the chunks is that clients would be reading older data from less busy
> brokers (i.e., brokers which are not the leader, or perhaps not even a
> follower of the active chunk). I’m not sure this would always result in
> better load balancing. It seems a bit situational.
>
> > Increased fault tolerance: failure of leader node will not impact read
> > older data.
>
> I don’t think this proposal changes the fault tolerance. A failure of a
> leader results in a failover to a follower. If a client is consuming
> using KIP-392, a leader failure will not affect the consumption (besides
> updating the client's metadata).
>
> --
>
> I guess I'm missing a key point here. What problem is this trying to
> solve? Is it a solution for the "single partition" problem? (i.e., a
> topic with one partition and a lot of data)
>
> Thanks!
> David A
>
> On Tue, Dec 31, 2024 at 3:24 PM De Gao <d...@live.co.uk> wrote:
>
> > Thanks for the comments. I have updated the proposal to compare with
> > tiered storage and fetch from replica. Please check.
> >
> > Thanks.
> >
> > On 11 December 2024 08:51:43 GMT, David Jacot
> > <dja...@confluent.io.INVALID> wrote:
> > > Hi,
> > >
> > > Thanks for the KIP. The community is pretty busy with the Apache
> > > Kafka 4.0 release, so I suppose that no one has really had the time
> > > to engage in reviewing the KIP yet. Sorry for this!
> > >
> > > I just read the motivation section. I think that it is an interesting
> > > idea. However, I wonder if this is still needed now that we have
> > > tiered storage in place. One of the big selling points of tiered
> > > storage was that clusters don't have to replicate tiered data
> > > anymore. Could you perhaps extend the motivation of the KIP to
> > > include tiered storage in the reflection?
> > > Best,
> > > David
> > >
> > > On Tue, Dec 10, 2024 at 10:46 PM De Gao <d...@live.co.uk> wrote:
> > >
> > > > Hi All:
> > > >
> > > > There was no discussion in the past week. Just want to double check
> > > > if I missed anything? What should be the expectations for KIP
> > > > discussion?
> > > >
> > > > Thank you!
> > > >
> > > > De Gao
> > > >
> > > > On 1 December 2024 19:36:37 GMT, De Gao <d...@live.co.uk> wrote:
> > > > > Hi All:
> > > > >
> > > > > I would like to start the discussion of KIP-1114: Introducing
> > > > > Chunk in Partition.
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1114%3A+Introducing+Chunk+in+Partition
> > > > >
> > > > > This KIP is complicated, so I expect the discussion will take a
> > > > > longer time.
> > > > >
> > > > > Thank you in advance.
> > > > >
> > > > > De Gao
>
> --
> David Arthur
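[Editor's note: for readers following the KIP-392 references in this thread,
fetch-from-follower in current Kafka is enabled roughly as below. The
property names come from KIP-392 as shipped in Kafka 2.4+; the rack ids are
placeholders.]

```
# Broker side (server.properties): select follower replicas whose rack
# matches the client's rack.
replica.selector.class=org.apache.kafka.common.replica.RackAwareReplicaSelector
broker.rack=us-east-1a

# Consumer side: advertise the client's rack so a matching follower
# (rather than the leader) can serve fetches.
client.rack=us-east-1a
```

This is the existing mechanism David Arthur contrasts with the KIP: it lets
a consumer read from a nearby follower, but the follower still replicates
the entire partition, whereas chunks would let old data live on brokers that
hold only part of it.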