Hi Kamal, Thanks for taking a look at the KIP!
I briefly considered that option initially but I found it not very practical once you have more than a few cordoned log directories. I find your example is already not very easy to read, and it only has 2 entries. Also if the configuration is at the cluster level it'sis not easy to see if a broker has all its log directories cordoned, and you still need to describe a specific broker's configuration to find the "name" of a log directory you want to cordon. I think an easy way to get an overall view of the cordoned log directories/brokers will be via the kafka-log-dirs.sh tool. I am also considering adding metrics like we have today for LogDirectoryOffline to ease monitoring. Thanks, Mickael On Thu, Jul 11, 2024 at 8:41 PM Kamal Chandraprakash <kamal.chandraprak...@gmail.com> wrote: > > Hi Mickael, > > Thanks for the KIP! > > This is a useful feature which helps to decommission the nodes by > essentially > creating a new replica exclude broker list. > > To cordon a list of brokers, we have to apply the config on each of the > broker nodes > and similarly to see the list of cordoned brokers, we have to either query > individual broker > config in the cluster or query each of the broker log directory. > > Can we move the configuration from the broker to cluster level? (eg) > > bin/kafka-configs.sh --bootstrap-server localhost:9092 --broker-defaults > --alter --add-config > cordoned.log.dirs=[broker.id.0]:[log.dir.0],[broker.id.1]:[log.dir.1] > > This will provide a consistent view of the list of cordoned brokers in the > cluster. > > -- > Kamal > > On Wed, Jul 10, 2024 at 7:53 PM Mickael Maison <mickael.mai...@gmail.com> > wrote: > > > Hi Luke, > > > > 4. You're right this scenario can happen. In this case I think the > > broker should enforce its new state and not create the replica as all > > its log directories are now cordoned. The replica will be offline and > > an administrator would need to reassign it to another broker. I expect > > most users will rely on kafka-configs.sh to manage the cordoned log > > directories instead of updating the broker properties, so it should > > not be a common issue. > > > > 6. To resolve this issue, the user can uncordon a log directory and > > retry an operation that triggers the creation of the internal topics. > > For example start a consumer using a group should make the broker > > retry creating the __consumer_offsets topic. > > > > Thanks, > > Mickael > > > > On Wed, Jul 10, 2024 at 4:14 PM Mickael Maison <mickael.mai...@gmail.com> > > wrote: > > > > > > Hi Chia-Ping, > > > > > > Question 1) Yes that's a good summary. I'd also add that managing > > > cordoned log directories is intended to be done by cluster > > > administrators who also know about operations in-progress or planned > > > such as scaling or adding/removing log directories. In practice you > > > can't expect users to provide "good" explicit partition assignments as > > > they are not aware of the cluster operations. > > > > > > Question 2) I don't see an actual question :) > > > > > > Question 3) My current proposal is that in that case, brokers will > > > ignore the preferred log directory and place the new replica on a log > > > directory that is not cordoned. To use AlterReplicaLogDirs you need > > > ALTER permission on the CLUSTER resource. In most environments users > > > don't have permissions to use that API. Or if they do, I'd expect them > > > to also have ALTER_CONFIGS on the BROKER resource and be able to > > > change the cordoned log directories configuration. > > > > > > Question 4) If we don't include the CordonedLogDirs field in > > > BrokerRegistrationRequest as you said there's a small window when the > > > controller would not know if a broker can be used for assignments. If > > > all log directories are cordoned the controller should not use that > > > broker for assignments. > > > > > > Question 5) In KRaft mode offline brokers are still registered, and > > > can be used for partition assignment. So we need to persist the > > > cordoned log directories too to exclude brokers that don't have > > > uncordoned log directories. > > > > > > Thanks, > > > Mickael > > > > > > > > > On Wed, Jul 10, 2024 at 12:25 PM Luke Chen <show...@gmail.com> wrote: > > > > > > > > Hi Mickael, > > > > > > > > Thanks for the response. > > > > > > > > > 4. Cordoned log directories are persisted to the metadata log via the > > > > RegisterBrokerRecord, BrokerRegistrationChangeRecord records. If a > > > > broker is offline, the controller will use the latest known state of > > > > the broker to determine the broker's cordoned log directories. I've > > > > added a sentence clarifying this point. > > > > > > > > OK, so if the broker A goes offline, and the controller is in "fenced" > > > > state, without any cordoned log dirs, then some topic created and > > assigned > > > > to broker A. Later, broker A starts up with all its log dirs cordoned > > > > configured. At this situation, will the broker A create the partitions? > > > > > > > > > 6. I'm leaning towards considering that scenario a configuration > > > > error. If all log directories are cordoned before the internal topics > > > > are created, then the broker will not be able to create them. This > > > > seems like a pretty strange scenario, where it's the first time you > > > > start a broker, you've cordoned all its log directory, and the > > > > internal topics (offsets and transactions) have not yet been created > > > > in the rest of the cluster. > > > > > > > > Yes, I agree that this should be a configuration error. > > > > So the follow-up question is: Suppose users encounter this issue, and > > how > > > > could they resolve it? > > > > Uncordon the log dir dynamically using kafka-configs.sh? Will the > > > > uncordoning config change recreate the partitions we didn't create > > earlier > > > > because of log dir cordoned? > > > > > > > > > The metadata log is different (not managed by LogManager), so I think > > > > it should always be created regardless if its log directory is > > > > cordoned or not. > > > > > > > > I agree we should treat "__cluster_metadata" differently. > > > > > > > > > > > > Thanks. > > > > Luke > > > > > > > > > > > > On Wed, Jul 10, 2024 at 12:42 AM Mickael Maison < > > mickael.mai...@gmail.com> > > > > wrote: > > > > > > > > > Hi Luke, > > > > > > > > > > 2. isCordoned() is a new method on LogDirDescription. It does not > > take > > > > > any arguments. It just returns true if this log directory the > > > > > LogDirDescription represents is cordoned. > > > > > > > > > > 3. Sorry that was a typo. This method will only return a log > > directory > > > > > that is not cordoned. Fixed > > > > > > > > > > 4. Cordoned log directories are persisted to the metadata log via the > > > > > RegisterBrokerRecord, BrokerRegistrationChangeRecord records. If a > > > > > broker is offline, the controller will use the latest known state of > > > > > the broker to determine the broker's cordoned log directories. I've > > > > > added a sentence clarifying this point. > > > > > > > > > > 5. Yes a log directory can be uncordoned. You can either update the > > > > > properties file and restart the broker or dynamically change the > > value > > > > > at runtime using kafka-configs. I've added a paragraph about it in > > the > > > > > KIP. > > > > > > > > > > 6. I'm leaning towards considering that scenario a configuration > > > > > error. If all log directories are cordoned before the internal topics > > > > > are created, then the broker will not be able to create them. This > > > > > seems like a pretty strange scenario, where it's the first time you > > > > > start a broker, you've cordoned all its log directory, and the > > > > > internal topics (offsets and transactions) have not yet been created > > > > > in the rest of the cluster. > > > > > The metadata log is different (not managed by LogManager), so I think > > > > > it should always be created regardless if its log directory is > > > > > cordoned or not. > > > > > > > > > > Thanks, > > > > > Mickael > > > > > > > > > > On Tue, Jul 9, 2024 at 3:48 PM Chia-Ping Tsai <chia7...@gmail.com> > > wrote: > > > > > > > > > > > > hi Mickael > > > > > > > > > > > > That is totally a good idea, but I have a question about the > > > > > implementation > > > > > > > > > > > > Do we consider making pluggable ReplicaPlacer (KIP-660) first and > > then > > > > > add > > > > > > another impl of ReplicaPlacer to offer cordon mechanism? Noted that > > > > > > `ReplicaPlacer` can implement Reconfigurable to get updated at > > runtime. > > > > > > That is similar to KIP-1066 - change cordoned.log.dirs through > > configs > > > > > > update. > > > > > > > > > > > > The benefit is to let users have their optimized policy for > > specific > > > > > > scenario. Also, it can avoid that we add more and more mechanism > > to our > > > > > > code base. Of course we can merge the mechanism which can be used > > by 99% > > > > > > users :smile > > > > > > > > > > > > Best, > > > > > > Chia-Ping > > > > > > > > > > > > > > > > > > Luke Chen <show...@gmail.com> 於 2024年7月9日 週二 下午9:07寫道: > > > > > > > > > > > > > Hi Mickael, > > > > > > > > > > > > > > Thanks for the KIP! > > > > > > > This is a long waiting feature for many users! > > > > > > > > > > > > > > Questions: > > > > > > > 1. I think piggyback the "BrokerHeartbeatRequest" to forward the > > > > > corden log > > > > > > > dir to controller makes sense to me. > > > > > > > We already did similar things for fence, controller shutdown, > > failed > > > > > log > > > > > > > dir...etc. > > > > > > > > > > > > > > 2. In the admin API, what parameters will the new added > > isCordoned() > > > > > method > > > > > > > take? > > > > > > > > > > > > > > 3. In the KIP, we said: > > > > > > > "defaultDir(): This method will not return the Uuid of a log > > directory > > > > > that > > > > > > > is not cordoned." > > > > > > > --> It's hard to understand. Does that mean we will only return > > > > > cordoned > > > > > > > log dir? > > > > > > > From the current java doc of the interface, it doesn't look > > right: > > > > > > > "Get the default directory for new partitions placed in a given > > > > > broker." > > > > > > > > > > > > > > 4. Currently, if a broker is registered and then go offline. In > > this > > > > > state, > > > > > > > the controller will still distribute partitions to this broker. > > > > > > > So, if now, the broker get startup with "cordoned.log.dirs" set, > > what > > > > > will > > > > > > > happen? > > > > > > > Will the newly assigned partitions be created successfully or > > not? > > > > > > > > > > > > > > 5. I think after a log dir get cordoned, we can always uncordon > > it, > > > > > right? > > > > > > > I think we should mention it in the KIP. > > > > > > > > > > > > > > 6. If a broker is startup with "cordoned.log.dirs" set, and does > > that > > > > > mean > > > > > > > the internal topics partitions (ex: __consumer_offsets) cannot be > > > > > created, > > > > > > > either? > > > > > > > Also, if this log dir is happen to be the metadata log dir, what > > will > > > > > > > happen to the metadata topic creation? > > > > > > > > > > > > > > Thanks. > > > > > > > Luke > > > > > > > > > > > > > > > > > > > > > On Tue, Jul 9, 2024 at 12:12 AM Mickael Maison < > > > > > mickael.mai...@gmail.com> > > > > > > > wrote: > > > > > > > > > > > > > > > Hi, > > > > > > > > > > > > > > > > Thanks for taking a look. > > > > > > > > > > > > > > > > - Yes you're right, I meant AlterPartitionReassignments. Fixed. > > > > > > > > - That's a good idea. I was expecting users to discover > > cordoned log > > > > > > > > directories by describing broker configurations. But being > > able to > > > > > > > > also get this information when describing log directories makes > > > > > sense. > > > > > > > > I've added that to the KIP. > > > > > > > > > > > > > > > > Thanks, > > > > > > > > Mickael > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Jul 5, 2024 at 8:05 AM Haruki Okada < > > ocadar...@gmail.com> > > > > > wrote: > > > > > > > > > > > > > > > > > > Hi, > > > > > > > > > > > > > > > > > > Thank you for the KIP. > > > > > > > > > The motivation sounds make sense to me. > > > > > > > > > > > > > > > > > > I have a few questions: > > > > > > > > > > > > > > > > > > - [nits] "AlterPartitions request" in Error handling section > > is > > > > > > > > > "AlterPartitionReassignments request" actually, right? > > > > > > > > > - Don't we need to include cordoned information in > > DescribeLogDirs > > > > > > > > response > > > > > > > > > too? Some tools (e.g. CruiseControl) need to have a way to > > know > > > > > which > > > > > > > > > broker/log-dirs are cordoned to generate partition > > reassignment > > > > > > > proposal. > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > 2024年7月4日(木) 22:57 Mickael Maison <mickael.mai...@gmail.com > > >: > > > > > > > > > > > > > > > > > > > Hi, > > > > > > > > > > > > > > > > > > > > I'd like to start a discussion on KIP-1066 that introduces > > a > > > > > > > mechanism > > > > > > > > > > to cordon log directories and brokers. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1066%3A+Mechanism+to+cordon+brokers+and+log+directories > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > Mickael > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > ======================== > > > > > > > > > Okada Haruki > > > > > > > > > ocadar...@gmail.com > > > > > > > > > ======================== > > > > > > > > > > > > > > > > > > > > > >