Hi Christo,

Thanks for your reply.
> keep the remote.log.disable.policy, but only allow it to take a value
> of "delete".

I agree, or maybe make it a boolean value and rename it to
`remote.log.delete.on.disable`, which is clearer.

Because of this new config, there will be a case where a topic is
configured like this:

remote.storage.enable=false
remote.log.delete.on.disable=false (default)

That means, in this case, we'd keep all remote storage data, but close
all remote log tasks and set "log start offset = local log start
offset". This leaves the remote storage metadata in an unknown state,
because the data in remote storage is no longer accessible (the log
start offset has moved to the local log start offset). And once the
topic re-enables `remote.storage.enable`, the old remote log metadata
will be picked up again, but the log start offset is no longer what
that metadata expects.

So, I'd like to propose that we disallow this configuration:

remote.storage.enable=false
remote.log.delete.on.disable=false (default)

If the topic config is set to this, or changed to this, we'll return a
ConfigException during validation, along the lines of the sketch below.
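A minimal sketch of that validation, assuming the rename above (the
validator class and method here are hypothetical; only ConfigException
is an existing Kafka class):

    import java.util.Map;
    import org.apache.kafka.common.config.ConfigException;

    // Hypothetical validator: rejects remote.storage.enable=false
    // unless the user explicitly acknowledged deletion via
    // remote.log.delete.on.disable=true.
    public class RemoteLogDisableValidator {
        public static void validate(Map<String, String> configs) {
            boolean remoteStorageEnable = Boolean.parseBoolean(
                configs.getOrDefault("remote.storage.enable", "false"));
            boolean deleteOnDisable = Boolean.parseBoolean(
                configs.getOrDefault("remote.log.delete.on.disable", "false"));
            if (!remoteStorageEnable && !deleteOnDisable) {
                throw new ConfigException(
                    "remote.storage.enable=false requires "
                        + "remote.log.delete.on.disable=true; retaining "
                        + "inaccessible remote data is not supported.");
            }
        }
    }

In practice this check would presumably only need to fire for topics
that previously had remote storage enabled, but that's an
implementation detail.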
To make it clear, this is the new proposed solution:
https://docs.google.com/document/d/1Y_cSkXr-qQiFFlFoGqfzGHE9m9MnIvZSgGpFP5l5o4I/edit

Let me know what you think.

Thanks.
Luke

On Thu, Jul 25, 2024 at 8:07 PM Christo Lolov <christolo...@gmail.com> wrote:

> Hello!
>
> Thank you for raising this!
>
> Up to now KIP-950 took the stance that you can disable tiering
> whenever you wish, as long as you specify what you would like to do
> with the data in remote. Amongst other things it also made the promise
> that it will not delete data without a user explicitly saying that
> they want their data deleted. In other words, there is a 2-step
> verification that the user truly wants their data deleted.
>
> From the table of the new proposal I am left with the impression that
> the moment a user tries to disable tiering their data will be deleted.
> In other words, there is no 2-step verification that they want their
> data deleted.
>
> On a first read, I wouldn't be opposed to this proposal, since it
> provides a neat alternative to the tiered epoch, as long as there is
> still a 2-step verification that the user is aware their data will be
> deleted. I think that a reasonable way to achieve this is to keep the
> remote.log.disable.policy, but only allow it to take a value of
> "delete".
>
> What are your thoughts?
>
> Best,
> Christo
>
>
> On Thu, 25 Jul 2024 at 12:10, Luke Chen <show...@gmail.com> wrote:
>
> > Hi all,
> >
> > I just found the table is not able to be displayed correctly in the
> > email. I've put the table content in a google doc here:
> > https://docs.google.com/document/d/1Y_cSkXr-qQiFFlFoGqfzGHE9m9MnIvZSgGpFP5l5o4I/edit?usp=sharing
> >
> > Thanks.
> > Luke
> >
> > On Thu, Jul 25, 2024 at 6:30 PM Luke Chen <show...@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > While implementing the feature in KRaft mode, I found some things
> > > we need to change in the original proposal:
> > >
> > > (1) In the KIP section "Disablement - KRaft backed Cluster"
> > > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-950%3A++Tiered+Storage+Disablement#KIP950:TieredStorageDisablement-Disablement-KRaftBackedCluster>,
> > > we said:
> > >
> > > "Controller persists configuration change and completes
> > > disablement:
> > >
> > > 1. The controller creates a ConfigRecord and persists it in the
> > > metadata topic.
> > > 2. The controller creates a TopicRecord to increment the
> > > tiered_epoch and update the tiered_state to DISABLED state.
> > > 3. This update marks the completion of the disablement process,
> > > indicating that tiered storage has been successfully disabled for
> > > the KRaft-backed clusters. Similar to topic deletion, all replicas
> > > will eventually pick up the changes from the cluster metadata
> > > topic and apply them to their own state. Any deletion failures
> > > will be picked up by the expiration threads, which should be
> > > deleting data before the log start offset. If the retention policy
> > > is delete, a new expiration thread will be started on leadership
> > > change on any historical tiered topic to confirm that there aren't
> > > any leftover segments in remote which need deletion. After a cycle
> > > in which it didn't delete anything, it will die."
> > >
> > > For step 2, I don't think the controller needs to create a
> > > TopicRecord, because:
> > > 1. The broker can fetch the "tiered_state" from the ConfigRecord.
> > > 2. The "tiered_epoch" is not necessary, because the raft protocol
> > > will keep the order for us. The broker can rely on the raft
> > > protocol and apply the records in order to get the expected
> > > results.
> > > 3. Marking the completion of the disablement process is not
> > > necessary in KRaft, because once the ConfigRecord is accepted by
> > > the controller, it must be applied by all the observers "in
> > > order".
> > >
> > > So, I'd like to propose removing step 2 in KRaft mode.
> > >
> > > (2) The current configuration is confusing for users and for the
> > > implementation. This is what we originally proposed in KIP-950:
> > >
> > > remote.storage.enable | remote.log.disable.policy (new) | remote storage data
> > > ----------------------+---------------------------------+-----------------------------
> > > true                  | null/retain/delete              | uploadable + readable
> > > false                 | retain (default)                | readable?
> > > false                 | delete                          | all remote data are deleted
> > >
> > > The "retain" row is the problem: remote storage is disabled, but
> > > the data is still readable? Users would be surprised that such a
> > > topic is still reading data from remote storage.
> > >
> > > Note: this also makes development difficult, because it's
> > > impossible to distinguish between:
> > > (1) a topic that never enabled remote storage
> > > (2) a topic that enabled and then disabled remote storage
> > >
> > > A concrete problem is setting the log start offset on broker
> > > startup. Since remote storage is disabled, we would normally set
> > > it to the "local log start offset", but in case (2) we need to
> > > treat the topic as if remote storage were enabled, which is
> > > confusing.
> > >
> > > Therefore, Kamal and I would like to propose a new version of the
> > > configuration:
> > >
> > > remote.storage.enable | remote.copy.disabled (new) | remote storage data
> > > ----------------------+----------------------------+-----------------------------
> > > true                  | false (default)            | uploadable + readable
> > > true                  | true                       | readable
> > > false                 | true/false                 | all remote data are deleted
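> > >
> > > To make the intended semantics concrete, here is a rough sketch of
> > > the broker-side decision (the enum and method are illustrative,
> > > not part of the KIP):
> > >
> > >   class RemoteStorageSemantics {
> > >       enum RemoteAction { UPLOAD_AND_READ, READ_ONLY, DELETE_ALL }
> > >
> > >       // Illustrative only: map the two topic configs to the
> > >       // behavior described in the table above.
> > >       static RemoteAction actionFor(boolean remoteStorageEnable,
> > >                                     boolean remoteCopyDisabled) {
> > >           if (!remoteStorageEnable)
> > >               return RemoteAction.DELETE_ALL;    // row 3: delete remote data
> > >           if (remoteCopyDisabled)
> > >               return RemoteAction.READ_ONLY;     // row 2: no new uploads, reads still work
> > >           return RemoteAction.UPLOAD_AND_READ;   // row 1: normal tiered topic
> > >       }
> > >   }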
> > >
> > > The advantage is that this config makes it clear to users what
> > > they are configuring, and the result is what they expect. Also, on
> > > the implementation side, we can still rely on
> > > "remote.storage.enable" to identify whether the feature is on or
> > > off.
> > >
> > > Any thoughts about it?
> > >
> > > Thank you.
> > > Luke
> > >
> > > On Thu, May 30, 2024 at 6:50 PM David Jacot <dja...@confluent.io.invalid> wrote:
> > >
> > >> Hi all,
> > >>
> > >> Thanks for the KIP. This is definitely a worthwhile feature.
> > >> However, I am a bit sceptical about the ZK part of the story. The
> > >> 3.8 release is supposed to be the last one supporting ZK, so I
> > >> don't really see how we could bring it to ZK, knowing that we
> > >> don't plan to do a 3.9 release (current plan). I strongly suggest
> > >> clarifying this before implementing the ZK part, in order to
> > >> avoid having new code [1] deleted right after 3.8 is released :).
> > >> Personally, I agree with Chia-Ping and Mickael. We could drop the
> > >> ZK part.
> > >>
> > >> [1] https://github.com/apache/kafka/pull/16131
> > >>
> > >> Best,
> > >> David
> > >>
> > >> On Tue, May 28, 2024 at 1:31 PM Mickael Maison <mickael.mai...@gmail.com> wrote:
> > >>
> > >> > Hi,
> > >> >
> > >> > I agree with Chia-Ping, I think we could drop the ZK variant
> > >> > altogether, especially if this is not going to make it into
> > >> > 3.8.0. Even if we end up needing a 3.9.0 release, I wouldn't
> > >> > write a bunch of new ZooKeeper-related code in that release
> > >> > just to delete it all right after in 4.0.
> > >> >
> > >> > Thanks,
> > >> > Mickael
> > >> >
> > >> > On Fri, May 24, 2024 at 5:03 PM Christo Lolov <christolo...@gmail.com> wrote:
> > >> > >
> > >> > > Hello!
> > >> > >
> > >> > > I am closing this vote as ACCEPTED with 3 binding +1 (Luke,
> > >> > > Chia-Ping and Satish) and 1 non-binding +1 (Kamal) - thank
> > >> > > you for the reviews!
> > >> > >
> > >> > > Realistically, I don't think I have the bandwidth to get
> > >> > > this into 3.8.0. Due to this, I will tentatively mark the
> > >> > > Zookeeper part for 3.9, if the community decides that they
> > >> > > do in fact want one more 3.x release. I will mark the KRaft
> > >> > > part as ready to be started, aiming for either 4.0 or 3.9.
> > >> > >
> > >> > > Best,
> > >> > > Christo