Hi Chia-Ping,

As mentioned in the rejected alternatives section, the community was against KIP-660. The consensus was to not let users build optimized policies for specific scenarios. The main points were:
- the difficulty of exposing an extensive view of the cluster state (with KRaft, the cluster metadata could be huge) that is required to make optimized decisions
- how critical the placement logic is, as it runs in the controller. It needs to be fast and reliable so that it does not negatively impact the controller and the rest of the cluster

You can find the full discussions in the dev list archives:
https://lists.apache.org/list?dev@kafka.apache.org:gte=1d:KIP-660

Thanks,
Mickael

On Tue, Jul 9, 2024 at 6:42 PM Mickael Maison <mickael.mai...@gmail.com> wrote:
>
> Hi Luke,
>
> 2. isCordoned() is a new method on LogDirDescription. It does not take
> any arguments. It just returns true if the log directory that the
> LogDirDescription represents is cordoned.
>
> 3. Sorry, that was a typo. This method will only return a log directory
> that is not cordoned. Fixed.
>
> 4. Cordoned log directories are persisted to the metadata log via the
> RegisterBrokerRecord and BrokerRegistrationChangeRecord records. If a
> broker is offline, the controller will use the latest known state of
> the broker to determine the broker's cordoned log directories. I've
> added a sentence clarifying this point.
>
> 5. Yes, a log directory can be uncordoned. You can either update the
> properties file and restart the broker, or dynamically change the value
> at runtime using kafka-configs. I've added a paragraph about it in the
> KIP.
>
> 6. I'm leaning towards considering that scenario a configuration
> error. If all log directories are cordoned before the internal topics
> are created, then the broker will not be able to create them. This
> seems like a pretty strange scenario, where it's the first time you
> start a broker, you've cordoned all its log directories, and the
> internal topics (offsets and transactions) have not yet been created
> in the rest of the cluster.
> The metadata log is different (not managed by LogManager), so I think
> it should always be created regardless of whether its log directory is
> cordoned or not.
>
> Thanks,
> Mickael
>
> On Tue, Jul 9, 2024 at 3:48 PM Chia-Ping Tsai <chia7...@gmail.com> wrote:
> >
> > hi Mickael
> >
> > That is totally a good idea, but I have a question about the implementation.
> >
> > Should we consider making ReplicaPlacer pluggable (KIP-660) first and then
> > add another impl of ReplicaPlacer to offer the cordon mechanism? Note that
> > `ReplicaPlacer` can implement Reconfigurable to get updated at runtime.
> > That is similar to KIP-1066 - change cordoned.log.dirs through a config
> > update.
> >
> > The benefit is to let users have their own optimized policy for specific
> > scenarios. Also, it avoids adding more and more mechanisms to our
> > code base. Of course we can merge the mechanisms which can be used by 99%
> > of users :smile
> >
> > Best,
> > Chia-Ping
> >
> >
> > Luke Chen <show...@gmail.com> wrote on Tue, Jul 9, 2024 at 9:07 PM:
> >
> > > Hi Mickael,
> > >
> > > Thanks for the KIP!
> > > This is a long-awaited feature for many users!
> > >
> > > Questions:
> > > 1. I think piggybacking on the "BrokerHeartbeatRequest" to forward the
> > > cordoned log dir to the controller makes sense to me.
> > > We already did similar things for fence, controller shutdown, failed log
> > > dir, etc.
> > >
> > > 2. In the admin API, what parameters will the newly added isCordoned()
> > > method take?
> > >
> > > 3. In the KIP, we said:
> > > "defaultDir(): This method will not return the Uuid of a log directory
> > > that is not cordoned."
> > > --> It's hard to understand. Does that mean we will only return cordoned
> > > log dirs?
> > > From the current java doc of the interface, it doesn't look right:
> > > "Get the default directory for new partitions placed in a given broker."
> > >
> > > 4. Currently, if a broker is registered and then goes offline, the
> > > controller will still distribute partitions to this broker in this state.
> > > So, if the broker now starts up with "cordoned.log.dirs" set, what will
> > > happen?
> > > Will the newly assigned partitions be created successfully or not?
> > >
> > > 5. I think after a log dir gets cordoned, we can always uncordon it, right?
> > > I think we should mention it in the KIP.
> > >
> > > 6. If a broker is started up with "cordoned.log.dirs" set, does that mean
> > > the internal topic partitions (ex: __consumer_offsets) cannot be created
> > > either?
> > > Also, if this log dir happens to be the metadata log dir, what will
> > > happen to the metadata topic creation?
> > >
> > > Thanks.
> > > Luke
> > >
> > >
> > > On Tue, Jul 9, 2024 at 12:12 AM Mickael Maison <mickael.mai...@gmail.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > Thanks for taking a look.
> > > >
> > > > - Yes you're right, I meant AlterPartitionReassignments. Fixed.
> > > > - That's a good idea. I was expecting users to discover cordoned log
> > > > directories by describing broker configurations. But being able to
> > > > also get this information when describing log directories makes sense.
> > > > I've added that to the KIP.
> > > >
> > > > Thanks,
> > > > Mickael
> > > >
> > > >
> > > > On Fri, Jul 5, 2024 at 8:05 AM Haruki Okada <ocadar...@gmail.com> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > Thank you for the KIP.
> > > > > The motivation makes sense to me.
> > > > >
> > > > > I have a few questions:
> > > > >
> > > > > - [nits] "AlterPartitions request" in the Error handling section is
> > > > > actually "AlterPartitionReassignments request", right?
> > > > > - Don't we need to include cordoned information in the DescribeLogDirs
> > > > > response too? Some tools (e.g. CruiseControl) need to have a way to know
> > > > > which broker/log-dirs are cordoned to generate partition reassignment
> > > > > proposals.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > On Thu, Jul 4, 2024 at 22:57, Mickael Maison <mickael.mai...@gmail.com> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I'd like to start a discussion on KIP-1066 that introduces a
> > > > > > mechanism to cordon log directories and brokers.
> > > > > >
> > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1066%3A+Mechanism+to+cordon+brokers+and+log+directories
> > > > > >
> > > > > > Thanks,
> > > > > > Mickael
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > ========================
> > > > > Okada Haruki
> > > > > ocadar...@gmail.com
> > > > > ========================
> > > >
> > >
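
For readers following the defaultDir() clarification in point 3 of the thread (the method only returns a log directory that is not cordoned), here is a minimal sketch of that selection rule. It is not Kafka code: the LogDir class, the default_dir function and the example paths are hypothetical names chosen for illustration, and picking the first eligible directory is an arbitrary simplification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class LogDir:
    """Hypothetical stand-in for a broker log directory (not a Kafka class)."""
    path: str
    cordoned: bool = False


def default_dir(log_dirs: List[LogDir]) -> Optional[LogDir]:
    """Return the first log directory that is not cordoned, or None if all are.

    Assumption: "first eligible" is used only to keep the sketch short;
    the thread does not say how Kafka chooses among eligible directories.
    """
    for log_dir in log_dirs:
        if not log_dir.cordoned:
            return log_dir
    return None


# One cordoned and one regular directory: only the regular one is eligible.
dirs = [LogDir("/data/kafka-0", cordoned=True), LogDir("/data/kafka-1")]
chosen = default_dir(dirs)

# Every directory cordoned: nothing can be returned for new partitions.
all_cordoned = default_dir([LogDir("/data/kafka-0", cordoned=True)])
```

The second call mirrors the situation in point 6: when every log directory of a broker is cordoned, there is no directory left in which to place new partitions.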