Hi Mickael,

Thanks for the response.

> 4. Cordoned log directories are persisted to the metadata log via the
RegisterBrokerRecord, BrokerRegistrationChangeRecord records. If a
broker is offline, the controller will use the latest known state of
the broker to determine the broker's cordoned log directories. I've
added a sentence clarifying this point.

OK, so if the broker A goes offline, and the controller is in "fenced"
state, without any cordoned log dirs, then some topic created and assigned
to broker A. Later, broker A starts up with all its log dirs cordoned
configured. At this situation, will the broker A create the partitions?

> 6. I'm leaning towards considering that scenario a configuration
error. If all log directories are cordoned before the internal topics
are created, then the broker will not be able to create them. This
seems like a pretty strange scenario, where it's the first time you
start a broker, you've cordoned all its log directory, and the
internal topics (offsets and transactions) have not yet been created
in the rest of the cluster.

Yes, I agree that this should be a configuration error.
So the follow-up question is: Suppose users encounter this issue, and how
could they resolve it?
Uncordon the log dir dynamically using kafka-configs.sh? Will the
uncordoning config change recreate the partitions we didn't create earlier
because of log dir cordoned?

> The metadata log is different (not managed by LogManager), so I think
it should always be created regardless if its log directory is
cordoned or not.

I agree we should treat "__cluster_metadata" differently.


Thanks.
Luke


On Wed, Jul 10, 2024 at 12:42 AM Mickael Maison <mickael.mai...@gmail.com>
wrote:

> Hi Luke,
>
> 2. isCordoned() is a new method on LogDirDescription. It does not take
> any arguments. It just returns true if this log directory the
> LogDirDescription represents is cordoned.
>
> 3. Sorry that was a typo. This method will only return a log directory
> that is not cordoned. Fixed
>
> 4. Cordoned log directories are persisted to the metadata log via the
> RegisterBrokerRecord, BrokerRegistrationChangeRecord records. If a
> broker is offline, the controller will use the latest known state of
> the broker to determine the broker's cordoned log directories. I've
> added a sentence clarifying this point.
>
> 5. Yes a log directory can be uncordoned. You can either update the
> properties file and restart the broker or dynamically change the value
> at runtime using kafka-configs. I've added a paragraph about it in the
> KIP.
>
> 6. I'm leaning towards considering that scenario a configuration
> error. If all log directories are cordoned before the internal topics
> are created, then the broker will not be able to create them. This
> seems like a pretty strange scenario, where it's the first time you
> start a broker, you've cordoned all its log directory, and the
> internal topics (offsets and transactions) have not yet been created
> in the rest of the cluster.
> The metadata log is different (not managed by LogManager), so I think
> it should always be created regardless if its log directory is
> cordoned or not.
>
> Thanks,
> Mickael
>
> On Tue, Jul 9, 2024 at 3:48 PM Chia-Ping Tsai <chia7...@gmail.com> wrote:
> >
> > hi Mickael
> >
> > That is totally a good idea, but I have a question about the
> implementation
> >
> > Do we consider making pluggable ReplicaPlacer (KIP-660) first and then
> add
> > another impl of ReplicaPlacer to offer cordon mechanism? Noted that
> > `ReplicaPlacer` can implement Reconfigurable to get updated at runtime.
> > That is similar to KIP-1066 - change cordoned.log.dirs through configs
> > update.
> >
> > The benefit is to let users have their optimized policy for specific
> > scenario. Also, it can avoid that we add more and more mechanism to our
> > code base. Of course we can merge the mechanism which can be used by 99%
> > users :smile
> >
> > Best,
> > Chia-Ping
> >
> >
> > Luke Chen <show...@gmail.com> 於 2024年7月9日 週二 下午9:07寫道:
> >
> > > Hi Mickael,
> > >
> > > Thanks for the KIP!
> > > This is a long waiting feature for many users!
> > >
> > > Questions:
> > > 1. I think piggyback the "BrokerHeartbeatRequest" to forward the
> corden log
> > > dir to controller makes sense to me.
> > > We already did similar things for fence, controller shutdown, failed
> log
> > > dir...etc.
> > >
> > > 2. In the admin API, what parameters will the new added isCordoned()
> method
> > > take?
> > >
> > > 3. In the KIP, we said:
> > > "defaultDir(): This method will not return the Uuid of a log directory
> that
> > > is not cordoned."
> > > --> It's hard to understand. Does that mean we will only return
> cordoned
> > > log dir?
> > > From the current java doc of the interface, it doesn't look right:
> > > "Get the default directory for new partitions placed in a given
> broker."
> > >
> > > 4. Currently, if a broker is registered and then go offline. In this
> state,
> > > the controller will still distribute partitions to this broker.
> > > So, if now, the broker get startup with "cordoned.log.dirs" set, what
> will
> > > happen?
> > > Will the newly assigned partitions be created successfully or not?
> > >
> > > 5. I think after a log dir get cordoned, we can always uncordon it,
> right?
> > > I think we should mention it in the KIP.
> > >
> > > 6. If a broker is startup with "cordoned.log.dirs" set, and does that
> mean
> > > the internal topics partitions (ex: __consumer_offsets) cannot be
> created,
> > > either?
> > > Also, if this log dir is happen to be the metadata log dir, what will
> > > happen to the metadata topic creation?
> > >
> > > Thanks.
> > > Luke
> > >
> > >
> > > On Tue, Jul 9, 2024 at 12:12 AM Mickael Maison <
> mickael.mai...@gmail.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > Thanks for taking a look.
> > > >
> > > > - Yes you're right, I meant AlterPartitionReassignments. Fixed.
> > > > - That's a good idea. I was expecting users to discover cordoned log
> > > > directories by describing broker configurations. But being able to
> > > > also get this information when describing log directories makes
> sense.
> > > > I've added that to the KIP.
> > > >
> > > > Thanks,
> > > > Mickael
> > > >
> > > >
> > > > On Fri, Jul 5, 2024 at 8:05 AM Haruki Okada <ocadar...@gmail.com>
> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > Thank you for the KIP.
> > > > > The motivation sounds make sense to me.
> > > > >
> > > > > I have a few questions:
> > > > >
> > > > > - [nits] "AlterPartitions request" in Error handling section is
> > > > > "AlterPartitionReassignments request" actually, right?
> > > > > - Don't we need to include cordoned information in DescribeLogDirs
> > > > response
> > > > > too? Some tools (e.g. CruiseControl) need to have a way to know
> which
> > > > > broker/log-dirs are cordoned to generate partition reassignment
> > > proposal.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > 2024年7月4日(木) 22:57 Mickael Maison <mickael.mai...@gmail.com>:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I'd like to start a discussion on KIP-1066 that introduces a
> > > mechanism
> > > > > > to cordon log directories and brokers.
> > > > > >
> > > > > >
> > > > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1066%3A+Mechanism+to+cordon+brokers+and+log+directories
> > > > > >
> > > > > > Thanks,
> > > > > > Mickael
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > ========================
> > > > > Okada Haruki
> > > > > ocadar...@gmail.com
> > > > > ========================
> > > >
> > >
>

Reply via email to