Hi Kamal,

Good spot, yes this is a typo. The flexibleVersions stays as "0+". Fixed

Thanks,
Mickael

On Wed, Jul 17, 2024 at 6:14 PM Mickael Maison <mickael.mai...@gmail.com> wrote:
>
> Hi David,
>
> DA1: It's a point I considered, especially being able to cordon a
> whole broker. With the current proposal you just need to set
> cordoned.log.dirs to the same value as log.dirs. That does not seem
> excessively difficult.
>
> DA2: I did consider using a new metadata record like we do for
> fence/unfence. Since I don't expect cordoned log dirs to change very
> frequently, and the size should be small, I opted to reuse the
> BrokerRegistrationRecord metadata record. At the moment I guess it was
> mostly for the convenience while prototyping. Semantically it probably
> makes sense to have separate records. Your further point suggest
> design changes in the mechanism as a whole, so let's discuss these
> first and we can return to the exact metadata records after.
> I find the idea of having dedicated RPCs interesting. One of the
> reasons I piggybacked on the heartbeating process is for the mapping
> between log directory names and their UUIDs. Currently the mapping
> only exists on each broker instance. So if we wanted a dedicated RPC,
> we would first need to change how we manage the log directory to UUID
> mappings. I guess this could be done via the BrokerRegistration API.
> I'm not sure about storing additional metadata (reason, timestamp). We
> currently don't do that for any operations
> (AlterPartitionReassignments, UnregisterBroker). Typically these are
> stored in the tools used by the operators to drive these operations.
> You bring another interesting point about the ability to cordon
> brokers/log directories while they are offline. It's not something the
> current proposal supports. I'm not sure how useful this would turn out
> it practice. In experience is that brokers are mostly online, so I'd
> expect the need to do so relatively rare. This also kind of loops back
> to KAFKA-17094 [0] and the discussion Gantigmaa started on the dev
> list [1] about being able to identify the ids of offline (but still
> registered) brokers.
>
> DA3: With the current proposal, I don't see a reason why you would
> want to disable the new behavior. If you don't want to use it, you
> have nothing to do. It's opt-in as you need to set cordoned.log.dirs
> on some brokers to get the new behavior. If you don't want it anymore,
> you should unset cordoned.log.dirs. Can you explain why this would not
> work?
>
> DA4: Yes
>
> 0: https://issues.apache.org/jira/browse/KAFKA-17094
> 1: https://lists.apache.org/thread/1rrgbhk43d85wobcp0dqz6mhpn93j9yo
>
> Thanks,
> Mickael
>
>
> On Sun, Jul 14, 2024 at 10:37 AM Kamal Chandraprakash
> <kamal.chandraprak...@gmail.com> wrote:
> >
> > Hi Mickael,
> >
> > In the BrokerHearbeatRequest.json, the flexibleVersions are bumped from
> > "0+" to "1+". Is it a typo?
> >
> >
> > On Fri, Jul 12, 2024 at 11:42 PM David Arthur <mum...@gmail.com> wrote:
> >
> > > Mickael, thanks for the KIP! I think this could be quite a useful feature.
> > >
> > > DA1: Having to know each of the log dirs for a broker seems a bit
> > > inconvenient for cases where we want to cordon off a whole broker. I do
> > > think having the ability to cordon off a specific log dir is useful for
> > > JBOD, but I imagine a common case might be to cordon off the whole broker.
> > >
> > > DA2: Looks like the new "cordoned.log.dirs" can be configured statically
> > > and updated dynamically per-broker. What do you think about a new metadata
> > > record and RPC instead of using a config? From my understanding, the
> > > BrokerRegistration/Heartbeat is more about the lifecycle of a broker
> > > whereas cordoning a broker is an operator driven action. It might make
> > > sense to have a separate record for this. We could include additional
> > > fields like a timestamp, a reason/comment field (e.g., "decommissioning",
> > > "disk failure", "new broker" etc), stuff like that.
> > >
> > > This would also allow cordoning to be done while a broker is offline or
> > > before it has been provisioned. Not sure how likely that is, but might be
> > > useful?
> > >
> > > DA3: Can we consider having a configuration to enable/disable the new
> > > replica placer behavior? This would be separate from the new
> > > MetadataVersion for the RPC/record changes.
> > >
> > > DA4: In the Motivation section, you mention the cluster expansion 
> > > scenario.
> > > For this scenario, is the expectation that the operator will cordon off 
> > > the
> > > existing full brokers so placements only happen on the new brokers?
> > >
> > > Cheers,
> > > David
> > >
> > > On Fri, Jul 12, 2024 at 8:53 AM Mickael Maison <mickael.mai...@gmail.com>
> > > wrote:
> > >
> > > > Hi Kamal,
> > > >
> > > > Thanks for taking a look at the KIP!
> > > >
> > > > I briefly considered that option initially but I found it not very
> > > > practical once you have more than a few cordoned log directories.
> > > > I find your example is already not very easy to read, and it only has
> > > > 2 entries. Also if the configuration is at the cluster level it'sis
> > > > not easy to see if a broker has all its log directories cordoned, and
> > > > you still need to describe a specific broker's configuration to find
> > > > the "name" of a log directory you want to cordon.
> > > >
> > > > I think an easy way to get an overall view of the cordoned log
> > > > directories/brokers will be via the kafka-log-dirs.sh tool. I am also
> > > > considering adding metrics like we have today for LogDirectoryOffline
> > > > to ease monitoring.
> > > >
> > > > Thanks,
> > > > Mickael
> > > >
> > > > On Thu, Jul 11, 2024 at 8:41 PM Kamal Chandraprakash
> > > > <kamal.chandraprak...@gmail.com> wrote:
> > > > >
> > > > > Hi Mickael,
> > > > >
> > > > > Thanks for the KIP!
> > > > >
> > > > > This is a useful feature which helps to decommission the nodes by
> > > > > essentially
> > > > > creating a new replica exclude broker list.
> > > > >
> > > > > To cordon a list of brokers, we have to apply the config on each of 
> > > > > the
> > > > > broker nodes
> > > > > and similarly to see the list of cordoned brokers, we have to either
> > > > query
> > > > > individual broker
> > > > > config in the cluster or query each of the broker log directory.
> > > > >
> > > > > Can we move the configuration from the broker to cluster level? (eg)
> > > > >
> > > > > bin/kafka-configs.sh --bootstrap-server localhost:9092
> > > --broker-defaults
> > > > > --alter --add-config
> > > > > cordoned.log.dirs=[broker.id.0]:[log.dir.0],[broker.id.1]:[log.dir.1]
> > > > >
> > > > > This will provide a consistent view of the list of cordoned brokers in
> > > > the
> > > > > cluster.
> > > > >
> > > > > --
> > > > > Kamal
> > > > >
> > > > > On Wed, Jul 10, 2024 at 7:53 PM Mickael Maison <
> > > mickael.mai...@gmail.com
> > > > >
> > > > > wrote:
> > > > >
> > > > > > Hi Luke,
> > > > > >
> > > > > > 4. You're right this scenario can happen. In this case I think the
> > > > > > broker should enforce its new state and not create the replica as 
> > > > > > all
> > > > > > its log directories are now cordoned. The replica will be offline 
> > > > > > and
> > > > > > an administrator would need to reassign it to another broker. I
> > > expect
> > > > > > most users will rely on kafka-configs.sh to manage the cordoned log
> > > > > > directories instead of updating the broker properties, so it should
> > > > > > not be a common issue.
> > > > > >
> > > > > > 6. To resolve this issue, the user can uncordon a log directory and
> > > > > > retry an operation that triggers the creation of the internal 
> > > > > > topics.
> > > > > > For example start a consumer using a group should make the broker
> > > > > > retry creating the __consumer_offsets topic.
> > > > > >
> > > > > > Thanks,
> > > > > > Mickael
> > > > > >
> > > > > > On Wed, Jul 10, 2024 at 4:14 PM Mickael Maison <
> > > > mickael.mai...@gmail.com>
> > > > > > wrote:
> > > > > > >
> > > > > > > Hi Chia-Ping,
> > > > > > >
> > > > > > > Question 1) Yes that's a good summary. I'd also add that managing
> > > > > > > cordoned log directories is intended to be done by cluster
> > > > > > > administrators who also know about operations in-progress or
> > > planned
> > > > > > > such as scaling or adding/removing log directories. In practice 
> > > > > > > you
> > > > > > > can't expect users to provide "good" explicit partition 
> > > > > > > assignments
> > > > as
> > > > > > > they are not aware of the cluster operations.
> > > > > > >
> > > > > > > Question 2) I don't see an actual question :)
> > > > > > >
> > > > > > > Question 3) My current proposal is that in that case, brokers will
> > > > > > > ignore the preferred log directory and place the new replica on a
> > > log
> > > > > > > directory that is not cordoned. To use AlterReplicaLogDirs you 
> > > > > > > need
> > > > > > > ALTER permission on the CLUSTER resource. In most environments
> > > users
> > > > > > > don't have permissions to use that API. Or if they do, I'd expect
> > > > them
> > > > > > > to also have ALTER_CONFIGS on the BROKER resource and be able to
> > > > > > > change the cordoned log directories configuration.
> > > > > > >
> > > > > > > Question 4) If we don't include the CordonedLogDirs field in
> > > > > > > BrokerRegistrationRequest as you said there's a small window when
> > > the
> > > > > > > controller would not know if a broker can be used for assignments.
> > > If
> > > > > > > all log directories are cordoned the controller should not use 
> > > > > > > that
> > > > > > > broker for assignments.
> > > > > > >
> > > > > > > Question 5) In KRaft mode offline brokers are still registered, 
> > > > > > > and
> > > > > > > can be used for partition assignment. So we need to persist the
> > > > > > > cordoned log directories too to exclude brokers that don't have
> > > > > > > uncordoned log directories.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Mickael
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Jul 10, 2024 at 12:25 PM Luke Chen <show...@gmail.com>
> > > > wrote:
> > > > > > > >
> > > > > > > > Hi Mickael,
> > > > > > > >
> > > > > > > > Thanks for the response.
> > > > > > > >
> > > > > > > > > 4. Cordoned log directories are persisted to the metadata log
> > > > via the
> > > > > > > > RegisterBrokerRecord, BrokerRegistrationChangeRecord records. If
> > > a
> > > > > > > > broker is offline, the controller will use the latest known 
> > > > > > > > state
> > > > of
> > > > > > > > the broker to determine the broker's cordoned log directories.
> > > I've
> > > > > > > > added a sentence clarifying this point.
> > > > > > > >
> > > > > > > > OK, so if the broker A goes offline, and the controller is in
> > > > "fenced"
> > > > > > > > state, without any cordoned log dirs, then some topic created 
> > > > > > > > and
> > > > > > assigned
> > > > > > > > to broker A. Later, broker A starts up with all its log dirs
> > > > cordoned
> > > > > > > > configured. At this situation, will the broker A create the
> > > > partitions?
> > > > > > > >
> > > > > > > > > 6. I'm leaning towards considering that scenario a
> > > configuration
> > > > > > > > error. If all log directories are cordoned before the internal
> > > > topics
> > > > > > > > are created, then the broker will not be able to create them.
> > > This
> > > > > > > > seems like a pretty strange scenario, where it's the first time
> > > you
> > > > > > > > start a broker, you've cordoned all its log directory, and the
> > > > > > > > internal topics (offsets and transactions) have not yet been
> > > > created
> > > > > > > > in the rest of the cluster.
> > > > > > > >
> > > > > > > > Yes, I agree that this should be a configuration error.
> > > > > > > > So the follow-up question is: Suppose users encounter this 
> > > > > > > > issue,
> > > > and
> > > > > > how
> > > > > > > > could they resolve it?
> > > > > > > > Uncordon the log dir dynamically using kafka-configs.sh? Will 
> > > > > > > > the
> > > > > > > > uncordoning config change recreate the partitions we didn't
> > > create
> > > > > > earlier
> > > > > > > > because of log dir cordoned?
> > > > > > > >
> > > > > > > > > The metadata log is different (not managed by LogManager), so 
> > > > > > > > > I
> > > > think
> > > > > > > > it should always be created regardless if its log directory is
> > > > > > > > cordoned or not.
> > > > > > > >
> > > > > > > > I agree we should treat "__cluster_metadata" differently.
> > > > > > > >
> > > > > > > >
> > > > > > > > Thanks.
> > > > > > > > Luke
> > > > > > > >
> > > > > > > >
> > > > > > > > On Wed, Jul 10, 2024 at 12:42 AM Mickael Maison <
> > > > > > mickael.mai...@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Luke,
> > > > > > > > >
> > > > > > > > > 2. isCordoned() is a new method on LogDirDescription. It does
> > > not
> > > > > > take
> > > > > > > > > any arguments. It just returns true if this log directory the
> > > > > > > > > LogDirDescription represents is cordoned.
> > > > > > > > >
> > > > > > > > > 3. Sorry that was a typo. This method will only return a log
> > > > > > directory
> > > > > > > > > that is not cordoned. Fixed
> > > > > > > > >
> > > > > > > > > 4. Cordoned log directories are persisted to the metadata log
> > > > via the
> > > > > > > > > RegisterBrokerRecord, BrokerRegistrationChangeRecord records.
> > > If
> > > > a
> > > > > > > > > broker is offline, the controller will use the latest known
> > > > state of
> > > > > > > > > the broker to determine the broker's cordoned log directories.
> > > > I've
> > > > > > > > > added a sentence clarifying this point.
> > > > > > > > >
> > > > > > > > > 5. Yes a log directory can be uncordoned. You can either 
> > > > > > > > > update
> > > > the
> > > > > > > > > properties file and restart the broker or dynamically change
> > > the
> > > > > > value
> > > > > > > > > at runtime using kafka-configs. I've added a paragraph about 
> > > > > > > > > it
> > > > in
> > > > > > the
> > > > > > > > > KIP.
> > > > > > > > >
> > > > > > > > > 6. I'm leaning towards considering that scenario a
> > > configuration
> > > > > > > > > error. If all log directories are cordoned before the internal
> > > > topics
> > > > > > > > > are created, then the broker will not be able to create them.
> > > > This
> > > > > > > > > seems like a pretty strange scenario, where it's the first 
> > > > > > > > > time
> > > > you
> > > > > > > > > start a broker, you've cordoned all its log directory, and the
> > > > > > > > > internal topics (offsets and transactions) have not yet been
> > > > created
> > > > > > > > > in the rest of the cluster.
> > > > > > > > > The metadata log is different (not managed by LogManager), so 
> > > > > > > > > I
> > > > think
> > > > > > > > > it should always be created regardless if its log directory is
> > > > > > > > > cordoned or not.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Mickael
> > > > > > > > >
> > > > > > > > > On Tue, Jul 9, 2024 at 3:48 PM Chia-Ping Tsai <
> > > > chia7...@gmail.com>
> > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > hi Mickael
> > > > > > > > > >
> > > > > > > > > > That is totally a good idea, but I have a question about the
> > > > > > > > > implementation
> > > > > > > > > >
> > > > > > > > > > Do we consider making pluggable ReplicaPlacer (KIP-660) 
> > > > > > > > > > first
> > > > and
> > > > > > then
> > > > > > > > > add
> > > > > > > > > > another impl of ReplicaPlacer to offer cordon mechanism?
> > > Noted
> > > > that
> > > > > > > > > > `ReplicaPlacer` can implement Reconfigurable to get updated
> > > at
> > > > > > runtime.
> > > > > > > > > > That is similar to KIP-1066 - change cordoned.log.dirs
> > > through
> > > > > > configs
> > > > > > > > > > update.
> > > > > > > > > >
> > > > > > > > > > The benefit is to let users have their optimized policy for
> > > > > > specific
> > > > > > > > > > scenario. Also, it can avoid that we add more and more
> > > > mechanism
> > > > > > to our
> > > > > > > > > > code base. Of course we can merge the mechanism which can be
> > > > used
> > > > > > by 99%
> > > > > > > > > > users :smile
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Chia-Ping
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Luke Chen <show...@gmail.com> 於 2024年7月9日 週二 下午9:07寫道:
> > > > > > > > > >
> > > > > > > > > > > Hi Mickael,
> > > > > > > > > > >
> > > > > > > > > > > Thanks for the KIP!
> > > > > > > > > > > This is a long waiting feature for many users!
> > > > > > > > > > >
> > > > > > > > > > > Questions:
> > > > > > > > > > > 1. I think piggyback the "BrokerHeartbeatRequest" to
> > > forward
> > > > the
> > > > > > > > > corden log
> > > > > > > > > > > dir to controller makes sense to me.
> > > > > > > > > > > We already did similar things for fence, controller
> > > shutdown,
> > > > > > failed
> > > > > > > > > log
> > > > > > > > > > > dir...etc.
> > > > > > > > > > >
> > > > > > > > > > > 2. In the admin API, what parameters will the new added
> > > > > > isCordoned()
> > > > > > > > > method
> > > > > > > > > > > take?
> > > > > > > > > > >
> > > > > > > > > > > 3. In the KIP, we said:
> > > > > > > > > > > "defaultDir(): This method will not return the Uuid of a
> > > log
> > > > > > directory
> > > > > > > > > that
> > > > > > > > > > > is not cordoned."
> > > > > > > > > > > --> It's hard to understand. Does that mean we will only
> > > > return
> > > > > > > > > cordoned
> > > > > > > > > > > log dir?
> > > > > > > > > > > From the current java doc of the interface, it doesn't 
> > > > > > > > > > > look
> > > > > > right:
> > > > > > > > > > > "Get the default directory for new partitions placed in a
> > > > given
> > > > > > > > > broker."
> > > > > > > > > > >
> > > > > > > > > > > 4. Currently, if a broker is registered and then go
> > > offline.
> > > > In
> > > > > > this
> > > > > > > > > state,
> > > > > > > > > > > the controller will still distribute partitions to this
> > > > broker.
> > > > > > > > > > > So, if now, the broker get startup with 
> > > > > > > > > > > "cordoned.log.dirs"
> > > > set,
> > > > > > what
> > > > > > > > > will
> > > > > > > > > > > happen?
> > > > > > > > > > > Will the newly assigned partitions be created successfully
> > > or
> > > > > > not?
> > > > > > > > > > >
> > > > > > > > > > > 5. I think after a log dir get cordoned, we can always
> > > > uncordon
> > > > > > it,
> > > > > > > > > right?
> > > > > > > > > > > I think we should mention it in the KIP.
> > > > > > > > > > >
> > > > > > > > > > > 6. If a broker is startup with "cordoned.log.dirs" set, 
> > > > > > > > > > > and
> > > > does
> > > > > > that
> > > > > > > > > mean
> > > > > > > > > > > the internal topics partitions (ex: __consumer_offsets)
> > > > cannot be
> > > > > > > > > created,
> > > > > > > > > > > either?
> > > > > > > > > > > Also, if this log dir is happen to be the metadata log 
> > > > > > > > > > > dir,
> > > > what
> > > > > > will
> > > > > > > > > > > happen to the metadata topic creation?
> > > > > > > > > > >
> > > > > > > > > > > Thanks.
> > > > > > > > > > > Luke
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Tue, Jul 9, 2024 at 12:12 AM Mickael Maison <
> > > > > > > > > mickael.mai...@gmail.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi,
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks for taking a look.
> > > > > > > > > > > >
> > > > > > > > > > > > - Yes you're right, I meant AlterPartitionReassignments.
> > > > Fixed.
> > > > > > > > > > > > - That's a good idea. I was expecting users to discover
> > > > > > cordoned log
> > > > > > > > > > > > directories by describing broker configurations. But
> > > being
> > > > > > able to
> > > > > > > > > > > > also get this information when describing log 
> > > > > > > > > > > > directories
> > > > makes
> > > > > > > > > sense.
> > > > > > > > > > > > I've added that to the KIP.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > > Mickael
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Fri, Jul 5, 2024 at 8:05 AM Haruki Okada <
> > > > > > ocadar...@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Hi,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thank you for the KIP.
> > > > > > > > > > > > > The motivation sounds make sense to me.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I have a few questions:
> > > > > > > > > > > > >
> > > > > > > > > > > > > - [nits] "AlterPartitions request" in Error handling
> > > > section
> > > > > > is
> > > > > > > > > > > > > "AlterPartitionReassignments request" actually, right?
> > > > > > > > > > > > > - Don't we need to include cordoned information in
> > > > > > DescribeLogDirs
> > > > > > > > > > > > response
> > > > > > > > > > > > > too? Some tools (e.g. CruiseControl) need to have a 
> > > > > > > > > > > > > way
> > > > to
> > > > > > know
> > > > > > > > > which
> > > > > > > > > > > > > broker/log-dirs are cordoned to generate partition
> > > > > > reassignment
> > > > > > > > > > > proposal.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2024年7月4日(木) 22:57 Mickael Maison <
> > > > mickael.mai...@gmail.com
> > > > > > >:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I'd like to start a discussion on KIP-1066 that
> > > > introduces
> > > > > > a
> > > > > > > > > > > mechanism
> > > > > > > > > > > > > > to cordon log directories and brokers.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > >
> > > > > >
> > > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1066%3A+Mechanism+to+cordon+brokers+and+log+directories
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > Mickael
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > --
> > > > > > > > > > > > > ========================
> > > > > > > > > > > > > Okada Haruki
> > > > > > > > > > > > > ocadar...@gmail.com
> > > > > > > > > > > > > ========================
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > >
> > > > > >
> > > >
> > >
> > >
> > > --
> > > David Arthur
> > >

Reply via email to