Hi!

Thanks for the great work; I'm excited to see it happen. These KIPs look
good to me.
I have a question about the Batch Coordinator in KIP-1164.
The Batch Coordinator seems very important in the diskless implementation. Could
you explain its implementation in more detail?
In particular, I'm wondering how it "chooses the total ordering for writes" and
what the "information necessary to support idempotent producers" is.
I'm thinking about the following case:
1: The client is going to send messages A, B, C to Kafka.
2: The client sends A, B to broker1, and broker1 receives A, B.
3: broker1 goes down, and the client sends C to broker2.
4: Since broker1 is down, the client sees A, B fail and retries sending A, B to
broker2.
Then, how can the Batch Coordinator choose the total order to be A, B, C?
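
To make the question concrete, here is a minimal sketch (Python, not Kafka
code) of how per-producer sequence numbers could resolve this case. It assumes
the coordinator checks sequence numbers per producer and partition, much as
classic brokers do for idempotent producers, and that A and B were not
committed before broker1 went down. KIP-1164 may well do something different.
The names SequenceChecker and try_commit and the producer id 7 are
hypothetical and used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SequenceChecker:
    """Hypothetical coordinator-side check; not the KIP-1164 design."""

    # Next expected sequence number per (producer_id, partition).
    next_seq: dict = field(default_factory=dict)
    # Committed records in the order they were accepted.
    log: list = field(default_factory=list)

    def try_commit(self, producer_id, partition, base_seq, records):
        key = (producer_id, partition)
        expected = self.next_seq.get(key, 0)
        if base_seq + len(records) <= expected:
            # Every record in this batch was already committed: a retry.
            return "duplicate"
        if base_seq != expected:
            # A gap (or partial overlap): reject so the client retries later.
            return "out_of_order"
        self.log.extend(records)
        self.next_seq[key] = expected + len(records)
        return "accepted"


checker = SequenceChecker()
# C (sequence 2) arrives via broker2 before A, B (sequences 0, 1) are committed.
first_c = checker.try_commit(7, 0, 2, ["C"])
# The client retries A, B through broker2, then retries C.
retry_ab = checker.try_commit(7, 0, 0, ["A", "B"])
retry_c = checker.try_commit(7, 0, 2, ["C"])
# A second retry of A, B is recognised as a duplicate and not appended again.
dup_ab = checker.try_commit(7, 0, 0, ["A", "B"])
```

Under these assumptions the first attempt at C is rejected, and the log ends
up as A, B, C only after the retries. Is this roughly what the Batch
Coordinator does?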


Best regards,
Yuxia

----- Original Message -----
From: "Christo Lolov" <christolo...@gmail.com>
To: dev@kafka.apache.org
Sent: Tuesday, April 22, 2025, 9:04:06 PM
Subject: [SPAM]Re: [DISCUSS] KIP-1150 Diskless Topics

Hello!

I want to start by saying that this is a big and impressive undertaking
and I am really excited to see its progression! I am posting my initial
comments in this thread, but they span a few of the child KIPs. Let me know
which questions you would like to move elsewhere. I understand that you
first want consensus on the direction, but I think I still need designs
for a few of the core areas to form an opinion.

CL - 1: In the same vein as Luke's comment, it would be very useful to see
explicitly what will stay on disk and what won't.

CL - 2: It would also be very useful to explicitly say what the
interactions will be with the KRaft-related topic: would it be diskless or
on disk?

CL - 3: Do you envision that this feature will work with KIP-932?

CL - 4: KIP-1163 says that there won't be a production-grade implementation
of the Batch Coordinator and KIP-1164 says the opposite. Which one would it
be?

CL - 5: KIP-1163 says that the Batch Coordinator doesn't need to concern
itself with object storage and KIP-1164 says that it will manage the object
physical deletion. Which one would it be?

CL - 6: Could you go into a bit more detail on whether we would need changes
to the Kafka clients to achieve what you are proposing? If no changes are
necessary to the clients, then what changes would be necessary to brokers to
make clients believe they are communicating with the "right" brokers? Would
those make it into KIP-1163?

CL - 7: Where and how would indexes (offset, time, producer snapshot) live?
In particular, I am interested in how the reference Batch Coordinator will
quickly (for a certain definition of quickly) rebuild state?

CL - 8: I think that we try to have as few Kafka dependencies as possible.
The closure of compile + runtime broker-only dependencies is currently 16
(if I have done my analysis correctly). What problem(s) do you envision
w.r.t. spilling to disk that we wouldn't be able to solve with our own
implementation and that would require SQLite?

Once again, great work so far!

Best,
Christo

On Sun, 20 Apr 2025 at 23:04, Stanislav Kozlovski <
stanislavkozlov...@apache.org> wrote:

> This is an amazing initiative. Huge kudos for driving it. We should
> incorporate it one way or another.
>
> I have a suggestion I'd like to hear your thoughts on. I'm cognizant of
> the effort required for KIP-1150 so I don't necessarily want to increase
> the scope - but thinking about this early on can help design later on, plus
> shape the motivation.
>
> The idea is to introduce support for replicationless acks=1 writes. This
> would be very similar to how AutoMQ's WAL+S3 feature works, as far as I
> understand it.
>
> Could we have Diskless Brokers serve acks=1 produce requests by
> immediately persisting the data on disk (not sure if we should use fsync or
> not), responding to the request, and then still asynchronously batching
> said data with regular acks=all data via the
> "diskless.append.commit.interval.ms" / "diskless.append.buffer.max.bytes"
> configs?
>
> If I'm not mistaken, this would offer very similar guarantees as today's
> acks=1 requests, where a period of low durability exists b/w the time the
> leader persists to its local disk and the time all followers persist to
> their disk. Granted, in traditional Kafka this period is probably no more
> than a hundred milliseconds, and here it'd be at least 2x higher. But I
> believe that given the major savings, many acks=1 users will be happy to
> make the tradeoff.
>
> While on the topic of cost, I hastily ran some cost calculations and found
> that the KIP should reduce replication costs by more than 80x (
> https://topicpartition.io/blog/kip-1150-diskless-topics-in-apache-kafka).
> There may be some errors there as the batch coordinator RPC and merging
> aren't fully fleshed out, but I believe it's directionally correct. It may
> be worth adding that to the motivation in one way or another, so as to be
> able to quantify the savings.
>
> Best,
> Stanislav
>
> On 2025/04/19 11:02:30 Ivan Yurchenko wrote:
> > Hi Ziming,
> >
> > > 1. Is this feature available with just a minor config adjustment, or
> > > will it intrude heavily on the current code? Say, AutoMQ is 100%
> > > compatible with Kafka and doesn't intrude on the code heavily.
> >
> > If we speak about the part visible to the user, we expect:
> >  1. Minimal changes to the client code (with a potential fallback
> >  requiring 0 changes for older clients).
> >  2. A limited set of new configurations for brokers and topics.
> > Otherwise, this should be a perfectly normal Apache Kafka.
> >
> > > 2. Though we are not discussing implementation details, it's worth
> > > giving some high-level architecture ideas, and it would be better to
> > > compare with AutoMQ-like systems.
> >
> > There's quite a bit of high-level architecture in the sub-KIP-1163 [1].
> > We didn't do a comparison with AutoMQ (to the best of our knowledge, they
> > have a fairly different approach), but if this helps the community get
> > the idea then sure, we should do this.
> >
> > > 3. What will we provide through it? I think we will just provide a
> > > common interface and put implementations in other repos, just as we did
> > > for Kafka Connect and Kafka Tiered Storage.
> >
> > This is true for the component that does CRUD operations on object
> > storage. However, for the batch coordinator we would like to provide a
> > decent out-of-the-box, self-contained (i.e. no external deps like a
> > database) implementation that many Kafka users who don't have challenging
> > scaling requirements would benefit from. There's the sub-KIP-1164 [2] for
> > this.
> >
> > > 4. How do we deal with the KRaft-related protocol? Since the metadata
> > > topic (__cluster_metadata) is managed differently, will this KIP close
> > > the gap between __cluster_metadata and data topics by putting metadata
> > > in object storage? If so, will there be no standby controllers, since
> > > standby controllers are the __cluster_metadata followers and there will
> > > be no followers?
> >
> > The current plan is to not work directly with KRaft and
> > __cluster_metadata. What we need from KRaft is 3 types of events:
> > topic/partition creation, topic deletion, and topic configuration changes
> > (with the possibility to limit this set to topic deletion only). We think
> > it'd be enough to have a "bridge" that watches for these events in
> > __cluster_metadata and reflects them in the batch coordinator (basically,
> > by sending requests).
> > Does this answer the question, or did I maybe misunderstand?
> >
> > Best,
> > Ivan
> >
> > [1] https://cwiki.apache.org/confluence/display/KAFKA/KIP-1163%3A+Diskless+Core
> > [2] https://cwiki.apache.org/confluence/display/KAFKA/KIP-1164%3A+Topic+Based+Batch+Coordinator
> >
> > On Fri, Apr 18, 2025, at 12:42, Ziming Deng wrote:
> > > Hi Josep,
> > >
> > > This would be a fascinating feature; some well-known Kafka users are
> > > using Kafka in a cloud-native env. As far as I know, there are already
> > > some derivative versions of Kafka which provide this feature. For
> > > example, I am using AutoMQ (https://github.com/AutoMQ/automq) in my
> > > environment, which significantly helped me reduce the cost, so I think
> > > it's worthwhile to clarify some related details:
> > > 1. Is this feature available with just a minor config adjustment, or
> > > will it intrude heavily on the current code? Say, AutoMQ is 100%
> > > compatible with Kafka and doesn't intrude on the code heavily.
> > > 2. Though we are not discussing implementation details, it's worth
> > > giving some high-level architecture ideas, and it would be better to
> > > compare with AutoMQ-like systems.
> > > 3. What will we provide through it? I think we will just provide a
> > > common interface and put implementations in other repos, just as we did
> > > for Kafka Connect and Kafka Tiered Storage.
> > > 4. How do we deal with the KRaft-related protocol? Since the metadata
> > > topic (__cluster_metadata) is managed differently, will this KIP close
> > > the gap between __cluster_metadata and data topics by putting metadata
> > > in object storage? If so, will there be no standby controllers, since
> > > standby controllers are the __cluster_metadata followers and there will
> > > be no followers?
> > >
> > > —
> > > Ziming
> > >
> > > > On Apr 16, 2025, at 19:58, Josep Prat <josep.p...@aiven.io.INVALID>
> > > > wrote:
> > > >
> > > > Hi Kafka Devs!
> > > >
> > > > We want to start a new KIP discussion about introducing a new type of
> > > > topic that would use Object Storage as the primary source of
> > > > storage. However, as this KIP is big, we decided to split it into
> > > > multiple related KIPs.
> > > > We have the motivational KIP-1150 (
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1150%3A+Diskless+Topics
> > > > )
> > > > that aims to discuss whether Apache Kafka should aim to have this type
> > > > of feature at all. This KIP doesn't go into details on how to implement
> > > > it. This follows the same approach used when we discussed KRaft.
> > > >
> > > > But as we know that it is sometimes really hard to discuss at that
> > > > meta level, we also created several sub-KIPs (linked in KIP-1150) that
> > > > offer an implementation of this feature.
> > > >
> > > > We kindly ask you to use the proper DISCUSS threads for each type of
> > > > concern and keep this one to discuss whether Apache Kafka wants to
> > > > have this feature or not.
> > > >
> > > > Thanks in advance on behalf of all the authors of this KIP.
> > > >
> > > > ------------------
> > > > Josep Prat
> > > > Open Source Engineering Director, Aiven
> > > > josep.p...@aiven.io   |   +491715557497 | aiven.io
> > > > Aiven Deutschland GmbH
> > > > Alexanderufer 3-7, 10117 Berlin
> > > > Geschäftsführer: Oskari Saarenmaa, Hannu Valtonen,
> > > > Anna Richardson, Kenneth Chen
> > > > Amtsgericht Charlottenburg, HRB 209739 B
> > >
> > >
> >
>
