Re: [DISCUSS] KIP-966: Eligible Leader Replicas

Jun Rao Wed, 04 Oct 2023 09:12:17 -0700

Hi, Calvin,

Thanks for the reply. Just one more comment.


54. It seems that downgrading MV is supported. Is downgrading the software
version supported? It would be useful to document that.

Thanks,

Jun

On Tue, Oct 3, 2023 at 4:55 PM Artem Livshits
<alivsh...@confluent.io.invalid> wrote:

> Hi Colin,
>
> I think in your example "do_unclean_recovery" would need to do different
> things depending on the strategy.
>
> do_unclean_recovery() {
>    if (unclean.recovery.manager.enabled) {
>     if (strategy == Aggressive)
>       use UncleanRecoveryManager(waitLastKnownERL=false)  // just inspect
> logs from whoever is available
>     else
>       use  UncleanRecoveryManager(waitLastKnownERL=true)  // must wait for
> at least last known ELR
>   } else {
>     if (strategy == Aggressive)
>       choose the last known leader if that is available, or a random leader
> if not)
>     else
>       wait for last known leader to get back
>   }
> }
>
> The idea is that the Aggressive strategy would kick in as soon as we lost
> the leader and would pick a leader from whoever is available; but the
> Balanced will only kick in when ELR is empty and will wait for the brokers
> that likely have most data to be available.
>
> On Tue, Oct 3, 2023 at 3:04 PM Colin McCabe <cmcc...@apache.org> wrote:
>
> > On Tue, Oct 3, 2023, at 10:49, Jun Rao wrote:
> > > Hi, Calvin,
> > >
> > > Thanks for the update KIP. A few more comments.
> > >
> > > 41. Why would a user choose the option to select a random replica as
> the
> > > leader instead of using unclean.recovery.strateg=Aggressive? It seems
> > that
> > > the latter is strictly better? If that's not the case, could we fold
> this
> > > option under unclean.recovery.strategy instead of introducing a
> separate
> > > config?
> >
> > Hi Jun,
> >
> > I thought the flow of control was:
> >
> > If there is no leader for the partition {
> >   If (there are unfenced ELR members) {
> >     choose_an_unfenced_ELR_member
> >   } else if (there are fenced ELR members AND strategy=Aggressive) {
> >     do_unclean_recovery
> >   } else if (there are no ELR members AND strategy != None) {
> >     do_unclean_recovery
> >   } else {
> >     do nothing about the missing leader
> >   }
> > }
> >
> > do_unclean_recovery() {
> >    if (unclean.recovery.manager.enabled) {
> >     use UncleanRecoveryManager
> >   } else {
> >     choose the last known leader if that is available, or a random leader
> > if not)
> >   }
> > }
> >
> > However, I think this could be clarified, especially the behavior when
> > unclean.recovery.manager.enabled=false. Inuitively the goal for
> > unclean.recovery.manager.enabled=false is to be "the same as now, mostly"
> > but it's very underspecified in the KIP, I agree.
> >
> > >
> > > 50. ElectLeadersRequest: "If more than 20 topics are included, only the
> > > first 20 will be served. Others will be returned with DesiredLeaders."
> > Hmm,
> > > not sure that I understand this. ElectLeadersResponse doesn't have a
> > > DesiredLeaders field.
> > >
> > > 51. GetReplicaLogInfo: "If more than 2000 partitions are included, only
> > the
> > > first 2000 will be served" Do we return an error for the remaining
> > > partitions? Actually, should we include an errorCode field at the
> > partition
> > > level in GetReplicaLogInfoResponse to cover non-existing partitions and
> > no
> > > authorization, etc?
> > >
> > > 52. The entry should matches => The entry should match
> > >
> > > 53. ElectLeadersRequest.DesiredLeaders: Should it be nullable since a
> > user
> > > may not specify DesiredLeaders?
> > >
> > > 54. Downgrade: Is that indeed possible? I thought earlier you said that
> > > once the new version of the records are in the metadata log, one can't
> > > downgrade since the old broker doesn't know how to parse the new
> version
> > of
> > > the metadata records?
> > >
> >
> > MetadataVersion downgrade is currently broken but we have fixing it on
> our
> > plate for Kafka 3.7.
> >
> > The way downgrade works is that "new features" are dropped, leaving only
> > the old ones.
> >
> > > 55. CleanShutdownFile: Should we add a version field for future
> > extension?
> > >
> > > 56. Config changes are public facing. Could we have a separate section
> to
> > > document all the config changes?
> >
> > +1. A separate section for this would be good.
> >
> > best,
> > Colin
> >
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Mon, Sep 25, 2023 at 4:29 PM Calvin Liu <ca...@confluent.io.invalid
> >
> > > wrote:
> > >
> > >> Hi Jun
> > >> Thanks for the comments.
> > >>
> > >> 40. If we change to None, it is not guaranteed for no data loss. For
> > users
> > >> who are not able to validate the data with external resources, manual
> > >> intervention does not give a better result but a loss of availability.
> > So
> > >> practically speaking, the Balance mode would be a better default
> value.
> > >>
> > >> 41. No, it represents how we want to do the unclean leader election.
> If
> > it
> > >> is false, the unclean leader election will be the old random way.
> > >> Otherwise, the unclean recovery will be used.
> > >>
> > >> 42. Good catch. Updated.
> > >>
> > >> 43. Only the first 20 topics will be served. Others will be returned
> > with
> > >> InvalidRequestError
> > >>
> > >> 44. The order matters. The desired leader entries match with the topic
> > >> partition list by the index.
> > >>
> > >> 45. Thanks! Updated.
> > >>
> > >> 46. Good advice! Updated.
> > >>
> > >> 47.1, updated the comment. Basically it will elect the replica in the
> > >> desiredLeader field to be the leader
> > >>
> > >> 47.2 We can let the admin client do the conversion. Using the
> > desiredLeader
> > >> field in the json format seems easier for users.
> > >>
> > >> 48. Once the MV version is downgraded, all the ELR related fields will
> > be
> > >> removed on the next partition change. The controller will also ignore
> > the
> > >> ELR fields. Updated the KIP.
> > >>
> > >> 49. Yes, it would be deprecated/removed.
> > >>
> > >>
> > >> On Mon, Sep 25, 2023 at 3:49 PM Jun Rao <j...@confluent.io.invalid>
> > wrote:
> > >>
> > >> > Hi, Calvin,
> > >> >
> > >> > Thanks for the updated KIP. Made another pass. A few more comments
> > below.
> > >> >
> > >> > 40. unclean.leader.election.enable.false ->
> > >> > unclean.recovery.strategy.Balanced: The Balanced mode could still
> > lead to
> > >> > data loss. So, I am wondering if
> unclean.leader.election.enable.false
> > >> > should map to None?
> > >> >
> > >> > 41. unclean.recovery.manager.enabled: I am not sure why we introduce
> > this
> > >> > additional config. Is it the same as unclean.recovery.strategy=None?
> > >> >
> > >> > 42. DescribeTopicResponse.TopicAuthorizedOperations: Should this be
> at
> > >> the
> > >> > topic level?
> > >> >
> > >> > 43. "Limit: 20 topics max per request": Could we describe what
> > happens if
> > >> > the request includes more than 20 topics?
> > >> >
> > >> > 44. ElectLeadersRequest.DesiredLeaders: Could we describe whether
> the
> > >> > ordering matters?
> > >> >
> > >> > 45. GetReplicaLogInfo.TopicPartitions: "about": "The topic
> partitions
> > to
> > >> > elect leaders.": The description in "about" is incorrect.
> > >> >
> > >> > 46. GetReplicaLogInfoResponse: Should we nest partitions under
> > topicId to
> > >> > be consistent with other types of responses?
> > >> >
> > >> > 47. kafka-leader-election.sh:
> > >> > 47.1 Could we explain DESIGNATION?
> > >> > 47.2 desiredLeader: Should it be a list to match the field in
> > >> > ElectLeadersRequest?
> > >> >
> > >> > 48. We could add a section on downgrade?
> > >> >
> > >> > 49. LastKnownLeader: This seems only needed in the first phase of
> > >> > delivering ELR. Will it be removed when the complete KIP is
> delivered?
> > >> >
> > >> > Thanks,
> > >> >
> > >> > Jun
> > >> >
> > >> > On Tue, Sep 19, 2023 at 1:30 PM Colin McCabe <cmcc...@apache.org>
> > wrote:
> > >> >
> > >> > > Hi Calvin,
> > >> > >
> > >> > > Thanks for the explanations. I like the idea of using none,
> > balanced,
> > >> > > aggressive. We also had an offline discussion about why it is good
> > to
> > >> > use a
> > >> > > new config key (basically, so that we can deprecate the old one
> > which
> > >> had
> > >> > > only false/true values in 4.0) With these changes, I am +1.
> > >> > >
> > >> > > best,
> > >> > > Colin
> > >> > >
> > >> > > On Mon, Sep 18, 2023, at 15:54, Calvin Liu wrote:
> > >> > > > Hi Colin,
> > >> > > > Also, can we deprecate unclean.leader.election.enable in 4.0?
> > Before
> > >> > > that,
> > >> > > > we can have both the config unclean.recovery.strategy and
> > >> > > > unclean.leader.election.enable
> > >> > > > and using the unclean.recovery.Enabled to determine which config
> > to
> > >> use
> > >> > > > during the unclean leader election.
> > >> > > >
> > >> > > > On Mon, Sep 18, 2023 at 3:51 PM Calvin Liu <ca...@confluent.io>
> > >> wrote:
> > >> > > >
> > >> > > >> Hi Colin,
> > >> > > >> For the unclean.recovery.strategy config name, how about we use
> > the
> > >> > > >> following
> > >> > > >> None. It basically means no unclean recovery will be performed.
> > >> > > >> Aggressive. It means availability goes first. Whenever the
> > partition
> > >> > > can't
> > >> > > >> elect a durable replica, the controller will try the unclean
> > >> recovery.
> > >> > > >> Balanced. It is the balance point of the availability
> > >> > first(Aggressive)
> > >> > > >> and least availability(None). The controller performs unclean
> > >> recovery
> > >> > > when
> > >> > > >> both ISR and ELR are empty.
> > >> > > >>
> > >> > > >>
> > >> > > >> On Fri, Sep 15, 2023 at 11:42 AM Calvin Liu <
> ca...@confluent.io>
> > >> > wrote:
> > >> > > >>
> > >> > > >>> Hi Colin,
> > >> > > >>>
> > >> > > >>> > So, the proposal is that if someone sets
> > >> > > "unclean.leader.election.enable
> > >> > > >>> = true"...
> > >> > > >>>
> > >> > > >>>
> > >> > > >>> The idea is to use one of the unclean.leader.election.enable
> and
> > >> > > >>> unclean.recovery.strategy based on the
> > unclean.recovery.Enabled. A
> > >> > > possible
> > >> > > >>> version can be
> > >> > > >>>
> > >> > > >>> If unclean.recovery.Enabled:
> > >> > > >>>
> > >> > > >>> {
> > >> > > >>>
> > >> > > >>> Check unclean.recovery.strategy. If set, use it. Otherwise,
> > check
> > >> > > >>> unclean.leader.election.enable and translate it to
> > >> > > >>> unclean.recovery.strategy.
> > >> > > >>>
> > >> > > >>> } else {
> > >> > > >>>
> > >> > > >>> Use unclean.leader.election.enable
> > >> > > >>>
> > >> > > >>> }
> > >> > > >>>
> > >> > > >>>
> > >> > > >>> —--------
> > >> > > >>>
> > >> > > >>> >The configuration key should be
> > >> "unclean.recovery.manager.enabled",
> > >> > > >>> right?
> > >> > > >>>
> > >> > > >>>
> > >> > > >>> I think we have two ways of choosing a leader uncleanly,
> unclean
> > >> > leader
> > >> > > >>> election and unclean recovery(log inspection) and we try to
> > switch
> > >> > > between
> > >> > > >>> them.
> > >> > > >>>
> > >> > > >>> Do you mean we want to develop two ways of performing the
> > unclean
> > >> > > >>> recovery and one of them is using “unclean recovery manager”?
> I
> > >> guess
> > >> > > we
> > >> > > >>> haven’t discussed the second way.
> > >> > > >>>
> > >> > > >>>
> > >> > > >>> —-------
> > >> > > >>>
> > >> > > >>> >How do these 4 levels of overrides interact with your new
> > >> > > >>> configurations?
> > >> > > >>>
> > >> > > >>>
> > >> > > >>> I do notice in the Kraft controller code, the method to check
> > >> whether
> > >> > > >>> perform unclean leader election is hard coded to false since
> > >> > > >>> 2021(uncleanLeaderElectionEnabledForTopic). Isn’t it a good
> > chance
> > >> to
> > >> > > >>> completely deprecate the unclean.leader.election.enable? We
> > don’t
> > >> > even
> > >> > > have
> > >> > > >>> to worry about the config conversion.
> > >> > > >>>
> > >> > > >>> On the other hand, whatever the override is, as long as the
> > >> > controller
> > >> > > >>> can have the final effective unclean.leader.election.enable,
> the
> > >> > topic
> > >> > > >>> level config unclean.recovery.strategy, the cluster level
> config
> > >> > > >>> unclean.recovery.Enabled, the controller can calculate the
> > correct
> > >> > > methods
> > >> > > >>> to use right?
> > >> > > >>>
> > >> > > >>>
> > >> > > >>> On Fri, Sep 15, 2023 at 10:02 AM Colin McCabe <
> > cmcc...@apache.org>
> > >> > > wrote:
> > >> > > >>>
> > >> > > >>>> On Thu, Sep 14, 2023, at 22:23, Calvin Liu wrote:
> > >> > > >>>> > Hi Colin
> > >> > > >>>> > 1. I think using the new config name is more clear.
> > >> > > >>>> >        a. The unclean leader election is actually removed
> if
> > >> > unclean
> > >> > > >>>> > recovery is in use.
> > >> > > >>>> >        b. Using multiple values in
> > >> unclean.leader.election.enable
> > >> > is
> > >> > > >>>> > confusing and it will be more confusing after people forget
> > >> about
> > >> > > this
> > >> > > >>>> > discussion.
> > >> > > >>>>
> > >> > > >>>> Hi Calvin,
> > >> > > >>>>
> > >> > > >>>> So, the proposal is that if someone sets
> > >> > > "unclean.leader.election.enable
> > >> > > >>>> = true" but then sets one of your new configurations, the
> > value of
> > >> > > >>>> unclean.leader.election.enable is ignored? That seems less
> > clear
> > >> to
> > >> > > me, not
> > >> > > >>>> more. Just in general, having multiple configuration keys to
> > >> control
> > >> > > the
> > >> > > >>>> same thing confuses users. Basically, they are sitting at a
> > giant
> > >> > > control
> > >> > > >>>> panel, and some of the levers do nothing.
> > >> > > >>>>
> > >> > > >>>> > 2. Sorry I forgot to mention in the response that I did add
> > the
> > >> > > >>>> > unclean.recovery.Enabled flag.
> > >> > > >>>>
> > >> > > >>>> The configuration key should be
> > >> "unclean.recovery.manager.enabled",
> > >> > > >>>> right? Becuase we can do "unclean recovery" without the
> > manager.
> > >> > > Disabling
> > >> > > >>>> the manager just means we use a different mechanism for
> > recovery.
> > >> > > >>>>
> > >> > > >>>> >        c. Maybe I underestimated the challenge of replacing
> > the
> > >> > > >>>> config. Any
> > >> > > >>>> > implementation problems ahead?
> > >> > > >>>>
> > >> > > >>>> There are four levels of overrides for
> > >> > unclean.leader.election.enable.
> > >> > > >>>>
> > >> > > >>>> 1. static configuration for node.
> > >> > > >>>>     This goes in the configuration file, typically named
> > >> > > >>>> server.properties
> > >> > > >>>>
> > >> > > >>>> 2. dynamic configuration for node default
> > >> > > >>>>   ConfigResource(type=BROKER, name="")
> > >> > > >>>>
> > >> > > >>>> 3. dynamic configuration for node
> > >> > > >>>>   ConfigResource(type=BROKER, name=<controller id>)
> > >> > > >>>>
> > >> > > >>>> 4. dynamic configuration for topic
> > >> > > >>>>   ConfigResource(type=TOPIC, name=<topic-name>)
> > >> > > >>>>
> > >> > > >>>> How do these 4 levels of overrides interact with your new
> > >> > > >>>> configurations? If the new configurations dominate over the
> old
> > >> > ones,
> > >> > > it
> > >> > > >>>> seems like this will get a lot more confusing to implement
> (and
> > >> also
> > >> > > to
> > >> > > >>>> use.)
> > >> > > >>>>
> > >> > > >>>> Again, I'd recommend just adding some new values to
> > >> > > >>>> unclean.leader.election.enable. It's simple and will prevent
> > user
> > >> > > confusion
> > >> > > >>>> (as well as developer confusion.)
> > >> > > >>>>
> > >> > > >>>> best,
> > >> > > >>>> Colin
> > >> > > >>>>
> > >> > > >>>>
> > >> > > >>>> > 3. About the admin client, I mentioned 3 changes in the
> > client.
> > >> > > >>>> Anything
> > >> > > >>>> > else I missed in the KIP?
> > >> > > >>>> >       a. The client will switch to using the new RPC
> instead
> > of
> > >> > > >>>> > MetadataRequest for the topics.
> > >> > > >>>> >       b. The TopicPartitionInfo used in TopicDescription
> > needs
> > >> to
> > >> > > add
> > >> > > >>>> new
> > >> > > >>>> > fields related to the ELR.
> > >> > > >>>> >       c. The outputs will add the ELR related fields.
> > >> > > >>>> >
> > >> > > >>>> > On Thu, Sep 14, 2023 at 9:19 PM Colin McCabe <
> > >> cmcc...@apache.org>
> > >> > > >>>> wrote:
> > >> > > >>>> >
> > >> > > >>>> >> Hi Calvin,
> > >> > > >>>> >>
> > >> > > >>>> >> Thanks for the changes.
> > >> > > >>>> >>
> > >> > > >>>> >> 1. Earlier I commented that creating
> > >> "unclean.recovery.strategy "
> > >> > > is
> > >> > > >>>> not
> > >> > > >>>> >> necessary, and we can just reuse the existing
> > >> > > >>>> >> "unclean.leader.election.enable" configuration key. Let's
> > >> discuss
> > >> > > >>>> that.
> > >> > > >>>> >>
> > >> > > >>>> >> 2.I also don't understand why you didn't add a
> > configuration to
> > >> > > >>>> enable or
> > >> > > >>>> >> disable the Unclean Recovery Manager. This seems like a
> very
> > >> > simple
> > >> > > >>>> way to
> > >> > > >>>> >> handle the staging issue which we discussed. The URM can
> > just
> > >> be
> > >> > > >>>> turned off
> > >> > > >>>> >> until it is production ready. Let's discuss this.
> > >> > > >>>> >>
> > >> > > >>>> >> 3. You still need to describe the changes to AdminClient
> > that
> > >> are
> > >> > > >>>> needed
> > >> > > >>>> >> to use DescribeTopicRequest.
> > >> > > >>>> >>
> > >> > > >>>> >> Keep at it. It's looking better. :)
> > >> > > >>>> >>
> > >> > > >>>> >> best,
> > >> > > >>>> >> Colin
> > >> > > >>>> >>
> > >> > > >>>> >>
> > >> > > >>>> >> On Thu, Sep 14, 2023, at 11:03, Calvin Liu wrote:
> > >> > > >>>> >> > Hi Colin
> > >> > > >>>> >> > Thanks for the comments!
> > >> > > >>>> >> >
> > >> > > >>>> >> > I did the following changes
> > >> > > >>>> >> >
> > >> > > >>>> >> >    1.
> > >> > > >>>> >> >
> > >> > > >>>> >> >    Simplified the API spec section to only include the
> > diff.
> > >> > > >>>> >> >    2.
> > >> > > >>>> >> >
> > >> > > >>>> >> >    Reordered the HWM requirement section.
> > >> > > >>>> >> >    3.
> > >> > > >>>> >> >
> > >> > > >>>> >> >    Removed the URM implementation details to keep the
> > >> necessary
> > >> > > >>>> >> >    characteristics to perform the unclean recovery.
> > >> > > >>>> >> >    1.
> > >> > > >>>> >> >
> > >> > > >>>> >> >       When to perform the unclean recovery
> > >> > > >>>> >> >       2.
> > >> > > >>>> >> >
> > >> > > >>>> >> >       Under different config, how the unclean recovery
> > finds
> > >> > the
> > >> > > >>>> leader.
> > >> > > >>>> >> >       3.
> > >> > > >>>> >> >
> > >> > > >>>> >> >       How the config unclean.leader.election.enable and
> > >> > > >>>> >> >       unclean.recovery.strategy are converted when users
> > >> > > >>>> enable/disable
> > >> > > >>>> >> the
> > >> > > >>>> >> >       unclean recovery.
> > >> > > >>>> >> >       4.
> > >> > > >>>> >> >
> > >> > > >>>> >> >    More details about how we change admin client.
> > >> > > >>>> >> >    5.
> > >> > > >>>> >> >
> > >> > > >>>> >> >    API limits on the GetReplicaLogInfoRequest and
> > >> > > >>>> DescribeTopicRequest.
> > >> > > >>>> >> >    6.
> > >> > > >>>> >> >
> > >> > > >>>> >> >    Two metrics added
> > >> > > >>>> >> >    1.
> > >> > > >>>> >> >
> > >> > > >>>> >> >
> >  Kafka.controller.global_under_min_isr_partition_count
> > >> > > >>>> >> >       2.
> > >> > > >>>> >> >
> > >> > > >>>> >> >       kafka.controller.unclean_recovery_finished_count
> > >> > > >>>> >> >
> > >> > > >>>> >> >
> > >> > > >>>> >> > On Wed, Sep 13, 2023 at 10:46 AM Colin McCabe <
> > >> > > cmcc...@apache.org>
> > >> > > >>>> >> wrote:
> > >> > > >>>> >> >
> > >> > > >>>> >> >> On Tue, Sep 12, 2023, at 17:21, Calvin Liu wrote:
> > >> > > >>>> >> >> > Hi Colin
> > >> > > >>>> >> >> > Thanks for the comments!
> > >> > > >>>> >> >> >
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> Hi Calvin,
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> Thanks again for the KIP.
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> One meta-comment: it's usually better to just do a diff
> > on a
> > >> > > >>>> message
> > >> > > >>>> >> spec
> > >> > > >>>> >> >> file or java file if you're including changes to it in
> > the
> > >> > KIP.
> > >> > > >>>> This is
> > >> > > >>>> >> >> easier to read than looking for "new fields begin" etc.
> > in
> > >> the
> > >> > > >>>> text, and
> > >> > > >>>> >> >> gracefully handles the case where existing fields were
> > >> > changed.
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> > Rewrite the Additional High Watermark advancement
> > >> > requirement
> > >> > > >>>> >> >> > There was feedback on this section that some readers
> > may
> > >> not
> > >> > > be
> > >> > > >>>> >> familiar
> > >> > > >>>> >> >> > with HWM and Ack=0,1,all requests. This can help them
> > >> > > understand
> > >> > > >>>> the
> > >> > > >>>> >> >> > proposal. I will rewrite this part for more
> > readability.
> > >> > > >>>> >> >> >
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> To be clear, I wasn't suggesting dropping either
> > section. I
> > >> > > agree
> > >> > > >>>> that
> > >> > > >>>> >> >> they add useful background. I was just suggesting that
> we
> > >> > should
> > >> > > >>>> discuss
> > >> > > >>>> >> >> the "acks" setting AFTER discussing the new high
> > watermark
> > >> > > >>>> advancement
> > >> > > >>>> >> >> conditions. We also should discuss acks=0. While it
> isn't
> > >> > > >>>> conceptually
> > >> > > >>>> >> much
> > >> > > >>>> >> >> different than acks=1 here, its omission from this
> > section
> > >> is
> > >> > > >>>> confusing.
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> > Unclean recovery
> > >> > > >>>> >> >> >
> > >> > > >>>> >> >> > The plan is to replace the
> > unclean.leader.election.enable
> > >> > with
> > >> > > >>>> >> >> > unclean.recovery.strategy. If the Unclean Recovery is
> > >> > enabled
> > >> > > >>>> then it
> > >> > > >>>> >> >> deals
> > >> > > >>>> >> >> > with the three options in the
> > unclean.recovery.strategy.
> > >> > > >>>> >> >> >
> > >> > > >>>> >> >> >
> > >> > > >>>> >> >> > Let’s refine the Unclean Recovery. We have already
> > taken a
> > >> > > lot of
> > >> > > >>>> >> >> > suggestions and I hope to enhance the durability of
> > Kafka
> > >> to
> > >> > > the
> > >> > > >>>> next
> > >> > > >>>> >> >> level
> > >> > > >>>> >> >> > with this KIP.
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> I am OK with doing the unclean leader recovery
> > improvements
> > >> in
> > >> > > >>>> this KIP.
> > >> > > >>>> >> >> However, I think we need to really work on the
> > configuration
> > >> > > >>>> settings.
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> Configuration overrides are often quite messy. For
> > example,
> > >> > the
> > >> > > >>>> cases
> > >> > > >>>> >> >> where we have log.roll.hours and log.roll.segment.ms,
> > the
> > >> > user
> > >> > > >>>> has to
> > >> > > >>>> >> >> remember which one takes precedence, and it is not
> > obvious.
> > >> > So,
> > >> > > >>>> rather
> > >> > > >>>> >> than
> > >> > > >>>> >> >> creating a new configuration, why not add additional
> > values
> > >> to
> > >> > > >>>> >> >> "unclean.leader.election.enable"? I think this will be
> > >> simpler
> > >> > > for
> > >> > > >>>> >> people
> > >> > > >>>> >> >> to understand, and simpler in the code as well.
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> What if we continued to use
> > "unclean.leader.election.enable"
> > >> > but
> > >> > > >>>> >> extended
> > >> > > >>>> >> >> it so that it took a string? Then the string could have
> > >> these
> > >> > > >>>> values:
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> never
> > >> > > >>>> >> >>     never automatically do an unclean leader election
> > under
> > >> > any
> > >> > > >>>> >> conditions
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> false / default
> > >> > > >>>> >> >>     only do an unclean leader election if there may be
> > >> > possible
> > >> > > >>>> data
> > >> > > >>>> >> loss
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> true / always
> > >> > > >>>> >> >>     always do an unclean leader election if we can't
> > >> > immediately
> > >> > > >>>> elect a
> > >> > > >>>> >> >> leader
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> It's a bit awkward that false maps to default rather
> > than to
> > >> > > >>>> never. But
> > >> > > >>>> >> >> this awkwardness exists if we use two different
> > >> configuration
> > >> > > keys
> > >> > > >>>> as
> > >> > > >>>> >> well.
> > >> > > >>>> >> >> The reason for the awkwardness is that we simply don't
> > want
> > >> > most
> > >> > > >>>> of the
> > >> > > >>>> >> >> people currently setting
> > >> unclean.leader.election.enable=false
> > >> > to
> > >> > > >>>> get the
> > >> > > >>>> >> >> "never" behavior. We have to bite that bullet. Better
> to
> > be
> > >> > > clear
> > >> > > >>>> and
> > >> > > >>>> >> >> explicit than hide it.
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> Another thing that's a bit awkward is having two
> > different
> > >> > ways
> > >> > > to
> > >> > > >>>> do
> > >> > > >>>> >> >> unclean leader election specified in the KIP. You
> > descirbe
> > >> two
> > >> > > >>>> methods:
> > >> > > >>>> >> the
> > >> > > >>>> >> >> simple "choose the last leader" method, and the
> "unclean
> > >> > > recovery
> > >> > > >>>> >> manager"
> > >> > > >>>> >> >> method. I understand why you did it this way -- "choose
> > the
> > >> > last
> > >> > > >>>> >> leader" is
> > >> > > >>>> >> >> simple, and will help us deliver an implementation
> > quickly,
> > >> > > while
> > >> > > >>>> the
> > >> > > >>>> >> URM
> > >> > > >>>> >> >> is preferable in the long term. My suggestion here is
> to
> > >> > > separate
> > >> > > >>>> the
> > >> > > >>>> >> >> decision of HOW to do unclean leader election from the
> > >> > decision
> > >> > > of
> > >> > > >>>> WHEN
> > >> > > >>>> >> to
> > >> > > >>>> >> >> do it.
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> So in other words, have
> "unclean.leader.election.enable"
> > >> > specify
> > >> > > >>>> when we
> > >> > > >>>> >> >> do unclean leader election, and have a new
> configuration
> > >> like
> > >> > > >>>> >> >> "unclean.recovery.manager.enable" to determine if we
> use
> > the
> > >> > > URM.
> > >> > > >>>> >> >> Presumably the URM will take some time to get fully
> > stable,
> > >> so
> > >> > > >>>> this can
> > >> > > >>>> >> >> default to false for a while, and we can flip the
> > default to
> > >> > > true
> > >> > > >>>> when
> > >> > > >>>> >> we
> > >> > > >>>> >> >> feel ready.
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> The URM is somewhat under-described here. I think we
> > need a
> > >> > few
> > >> > > >>>> >> >> configurations here for it. For example, we need a
> > >> > > configuration to
> > >> > > >>>> >> specify
> > >> > > >>>> >> >> how long it should wait for a broker to respond to its
> > RPCs
> > >> > > before
> > >> > > >>>> >> moving
> > >> > > >>>> >> >> on. We also need to understand how the URM interacts
> with
> > >> > > >>>> >> >> unclean.leader.election.enable=always. I assume that
> with
> > >> > > "always"
> > >> > > >>>> we
> > >> > > >>>> >> will
> > >> > > >>>> >> >> just unconditionally use the URM rather than choosing
> > >> > randomly.
> > >> > > >>>> But this
> > >> > > >>>> >> >> should be spelled out in the KIP.
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> >
> > >> > > >>>> >> >> > DescribeTopicRequest
> > >> > > >>>> >> >> >
> > >> > > >>>> >> >> >    1.
> > >> > > >>>> >> >> >    Yes, the plan is to replace the MetadataRequest
> with
> > >> the
> > >> > > >>>> >> >> >    DescribeTopicRequest for the admin clients. Will
> > check
> > >> > the
> > >> > > >>>> details.
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> Sounds good. But as I said, you need to specify how
> > >> > AdminClient
> > >> > > >>>> >> interacts
> > >> > > >>>> >> >> with the new request. This will involve adding some
> > fields
> > >> to
> > >> > > >>>> >> >> TopicDescription.java. And you need to specify the
> > changes
> > >> to
> > >> > > the
> > >> > > >>>> >> >> kafka-topics.sh command line tool. Otherwise we cannot
> > use
> > >> the
> > >> > > >>>> tool to
> > >> > > >>>> >> see
> > >> > > >>>> >> >> the new information.
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> The new requests, DescribeTopicRequest and
> > >> > > >>>> GetReplicaLogInfoRequest,
> > >> > > >>>> >> need
> > >> > > >>>> >> >> to have limits placed on them so that their size can't
> be
> > >> > > >>>> infinite. We
> > >> > > >>>> >> >> don't want to propagate the current problems of
> > >> > MetadataRequest,
> > >> > > >>>> where
> > >> > > >>>> >> >> clients can request massive responses that can mess up
> > the
> > >> JVM
> > >> > > when
> > >> > > >>>> >> handled.
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> Adding limits is simple for GetReplicaLogInfoRequest --
> > we
> > >> can
> > >> > > >>>> just say
> > >> > > >>>> >> >> that only 2000 partitions at a time can be requested.
> For
> > >> > > >>>> >> >> DescribeTopicRequest we can probably just limit to 20
> > topics
> > >> > or
> > >> > > >>>> >> something
> > >> > > >>>> >> >> like that, to avoid the complexity of doing pagination
> in
> > >> this
> > >> > > KIP.
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> >    2.
> > >> > > >>>> >> >> >    I can let the broker load the ELR info so that
> they
> > can
> > >> > > serve
> > >> > > >>>> the
> > >> > > >>>> >> >> >    DescribeTopicRequest as well.
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> Yes, it's fine to add to MetadataCache. In fact, you'll
> > be
> > >> > > loading
> > >> > > >>>> it
> > >> > > >>>> >> >> anyway once it's added to PartitionImage.
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> >    3.
> > >> > > >>>> >> >> >    Yeah, it does not make sense to have the topic id
> if
> > >> > > >>>> >> >> >    DescribeTopicRequest is only used by the admin
> > client.
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> OK. That makes things simpler. We can always create a
> new
> > >> API
> > >> > > later
> > >> > > >>>> >> >> (hopefully not in this KIP!) to query by topic ID.
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> >
> > >> > > >>>> >> >> >
> > >> > > >>>> >> >> > Metrics
> > >> > > >>>> >> >> >
> > >> > > >>>> >> >> > As for overall cluster health metrics, I think
> > >> under-min-ISR
> > >> > > is
> > >> > > >>>> still
> > >> > > >>>> >> a
> > >> > > >>>> >> >> > useful one. ELR is more like a safety belt. When the
> > ELR
> > >> is
> > >> > > >>>> used, the
> > >> > > >>>> >> >> > cluster availability has already been impacted.
> > >> > > >>>> >> >> >
> > >> > > >>>> >> >> > Maybe we can have a metric to count the partitions
> that
> > >> > > sum(ISR,
> > >> > > >>>> ELR)
> > >> > > >>>> >> <
> > >> > > >>>> >> >> min
> > >> > > >>>> >> >> > ISR. What do you think?
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> How about:
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> A.  a metric for the totoal number of under-min-isr
> > >> > partitions?
> > >> > > We
> > >> > > >>>> don't
> > >> > > >>>> >> >> have that in Apache Kafka at the moment.
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> B. a metric for the number of unclean leader elections
> we
> > >> did
> > >> > > (for
> > >> > > >>>> >> >> simplicity, it can reset to 0 on controller restart: we
> > >> expect
> > >> > > >>>> people to
> > >> > > >>>> >> >> monitor the change over time anyway)
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> best,
> > >> > > >>>> >> >> Colin
> > >> > > >>>> >> >>
> > >> > > >>>> >> >>
> > >> > > >>>> >> >> >
> > >> > > >>>> >> >> > Yeah, for the ongoing unclean recoveries, the
> > controller
> > >> can
> > >> > > >>>> keep an
> > >> > > >>>> >> >> > accurate count through failover because partition
> > >> > registration
> > >> > > >>>> can
> > >> > > >>>> >> >> indicate
> > >> > > >>>> >> >> > whether a recovery is needed. However, for the
> happened
> > >> > ones,
> > >> > > >>>> unless
> > >> > > >>>> >> we
> > >> > > >>>> >> >> > want to persist the number somewhere, we can only
> > figure
> > >> it
> > >> > > out
> > >> > > >>>> from
> > >> > > >>>> >> the
> > >> > > >>>> >> >> > log.
> > >> > > >>>> >> >> >
> > >> > > >>>> >> >> > On Tue, Sep 12, 2023 at 3:16 PM Colin McCabe <
> > >> > > cmcc...@apache.org
> > >> > > >>>> >
> > >> > > >>>> >> wrote:
> > >> > > >>>> >> >> >
> > >> > > >>>> >> >> >> Also, we should have metrics that show what is going
> > on
> > >> > with
> > >> > > >>>> regard
> > >> > > >>>> >> to
> > >> > > >>>> >> >> the
> > >> > > >>>> >> >> >> eligible replica set. I'm not sure exactly what to
> > >> suggest,
> > >> > > but
> > >> > > >>>> >> >> something
> > >> > > >>>> >> >> >> that could identify when things are going wrong in
> the
> > >> > > clsuter.
> > >> > > >>>> >> >> >>
> > >> > > >>>> >> >> >> For example, maybe a metric for partitions
> containing
> > >> > > replicas
> > >> > > >>>> that
> > >> > > >>>> >> are
> > >> > > >>>> >> >> >> ineligible to be leader? That would show a spike
> when
> > a
> > >> > > broker
> > >> > > >>>> had an
> > >> > > >>>> >> >> >> unclean restart.
> > >> > > >>>> >> >> >>
> > >> > > >>>> >> >> >> Ideally, we'd also have a metric that indicates when
> > an
> > >> > > unclear
> > >> > > >>>> >> leader
> > >> > > >>>> >> >> >> election or a recovery happened. It's a bit tricky
> > >> because
> > >> > > the
> > >> > > >>>> simple
> > >> > > >>>> >> >> >> thing, of tracking it per controller, may be a bit
> > >> > confusing
> > >> > > >>>> during
> > >> > > >>>> >> >> >> failovers.
> > >> > > >>>> >> >> >>
> > >> > > >>>> >> >> >> best,
> > >> > > >>>> >> >> >> Colin
> > >> > > >>>> >> >> >>
> > >> > > >>>> >> >> >>
> > >> > > >>>> >> >> >> On Tue, Sep 12, 2023, at 14:25, Colin McCabe wrote:
> > >> > > >>>> >> >> >> > Hi Calvin,
> > >> > > >>>> >> >> >> >
> > >> > > >>>> >> >> >> > Thanks for the KIP. I think this is a great
> > >> improvement.
> > >> > > >>>> >> >> >> >
> > >> > > >>>> >> >> >> >> Additional High Watermark advance requirement
> > >> > > >>>> >> >> >> >
> > >> > > >>>> >> >> >> > Typo: change "advance" to "advancement"
> > >> > > >>>> >> >> >> >
> > >> > > >>>> >> >> >> >> A bit recap of some key concepts.
> > >> > > >>>> >> >> >> >
> > >> > > >>>> >> >> >> > Typo: change "bit" to "quick"
> > >> > > >>>> >> >> >> >
> > >> > > >>>> >> >> >> >> Ack=1/all produce request. It defines when the
> > Kafka
> > >> > > server
> > >> > > >>>> should
> > >> > > >>>> >> >> >> respond to the produce request
> > >> > > >>>> >> >> >> >
> > >> > > >>>> >> >> >> > I think this section would be clearer if we talked
> > >> about
> > >> > > the
> > >> > > >>>> new
> > >> > > >>>> >> high
> > >> > > >>>> >> >> >> > watermark advancement requirement first, and THEN
> > >> talked
> > >> > > >>>> about its
> > >> > > >>>> >> >> >> > impact on acks=0, acks=1, and     acks=all.
> > acks=all
> > >> is
> > >> > of
> > >> > > >>>> course
> > >> > > >>>> >> the
> > >> > > >>>> >> >> >> > main case we care about here, so it would be good
> to
> > >> lead
> > >> > > with
> > >> > > >>>> >> that,
> > >> > > >>>> >> >> >> > rather than delving into the technicalities of
> > acks=0/1
> > >> > > first.
> > >> > > >>>> >> >> >> >
> > >> > > >>>> >> >> >> >> Unclean recovery
> > >> > > >>>> >> >> >> >
> > >> > > >>>> >> >> >> > So, here you are introducing a new configuration,
> > >> > > >>>> >> >> >> > unclean.recovery.strategy. The difficult thing
> here
> > is
> > >> > that
> > >> > > >>>> there
> > >> > > >>>> >> is a
> > >> > > >>>> >> >> >> > lot of overlap with
> unclean.leader.election.enable.
> > So
> > >> we
> > >> > > >>>> have 3
> > >> > > >>>> >> >> >> > different settings for unclean.recovery.strategy,
> > plus
> > >> 2
> > >> > > >>>> different
> > >> > > >>>> >> >> >> > settings for unclean.leader.election.enable,
> giving
> > a
> > >> > cross
> > >> > > >>>> >> product of
> > >> > > >>>> >> >> >> > 6 different options. The following "unclean
> recovery
> > >> > > manager"
> > >> > > >>>> >> section
> > >> > > >>>> >> >> >> > only applies to one fo those 6 different
> > possibilities
> > >> (I
> > >> > > >>>> think?)
> > >> > > >>>> >> >> >> >
> > >> > > >>>> >> >> >> > I simply don't think we need so many different
> > election
> > >> > > types.
> > >> > > >>>> >> Really
> > >> > > >>>> >> >> >> > the use-cases we need are people who want NO
> unclean
> > >> > > >>>> elections,
> > >> > > >>>> >> people
> > >> > > >>>> >> >> >> > who want "the reasonable thing" and people who
> want
> > >> > > >>>> avaialbility at
> > >> > > >>>> >> >> all
> > >> > > >>>> >> >> >> > costs.
> > >> > > >>>> >> >> >> >
> > >> > > >>>> >> >> >> > Overall, I feel like the first half of the KIP is
> > about
> > >> > the
> > >> > > >>>> ELR,
> > >> > > >>>> >> and
> > >> > > >>>> >> >> >> > the second half is about reworking unclean leader
> > >> > > election. It
> > >> > > >>>> >> might
> > >> > > >>>> >> >> be
> > >> > > >>>> >> >> >> > better to move that second half to a separate KIP
> so
> > >> that
> > >> > > we
> > >> > > >>>> can
> > >> > > >>>> >> >> figure
> > >> > > >>>> >> >> >> > it out fully. It should be fine to punt this until
> > >> later
> > >> > > and
> > >> > > >>>> just
> > >> > > >>>> >> have
> > >> > > >>>> >> >> >> > the current behavior on empty ELR be waiting for
> the
> > >> last
> > >> > > >>>> known
> > >> > > >>>> >> leader
> > >> > > >>>> >> >> >> > to return. After all, that's what we do today.
> > >> > > >>>> >> >> >> >
> > >> > > >>>> >> >> >> >> DescribeTopicRequest
> > >> > > >>>> >> >> >> >
> > >> > > >>>> >> >> >> > Is the intention for AdminClient to use this RPC
> for
> > >> > > >>>> >> >> >> > Admin#describeTopics ? If so, we need to describe
> > all
> > >> of
> > >> > > the
> > >> > > >>>> >> changes
> > >> > > >>>> >> >> to
> > >> > > >>>> >> >> >> > the admin client API, as well as changes to
> > >> command-line
> > >> > > >>>> tools like
> > >> > > >>>> >> >> >> > kafka-topics.sh (if there are any). For example,
> you
> > >> will
> > >> > > >>>> probably
> > >> > > >>>> >> >> need
> > >> > > >>>> >> >> >> > changes to TopicDescription.java. You will also
> > need to
> > >> > > >>>> provide
> > >> > > >>>> >> all of
> > >> > > >>>> >> >> >> > the things that admin client needs -- for example,
> > >> > > >>>> >> >> >> > TopicAuthorizedOperations.
> > >> > > >>>> >> >> >> >
> > >> > > >>>> >> >> >> > I also don't think the controller should serve
> this
> > >> > > request.
> > >> > > >>>> We
> > >> > > >>>> >> want
> > >> > > >>>> >> >> to
> > >> > > >>>> >> >> >> > minimize load on the controller. Just like with
> the
> > >> other
> > >> > > >>>> metadata
> > >> > > >>>> >> >> >> > requests like MetadataRequest, this should be
> > served by
> > >> > > >>>> brokers.
> > >> > > >>>> >> >> >> >
> > >> > > >>>> >> >> >> > It's a bit confusing why both topic ID and topic
> > name
> > >> are
> > >> > > >>>> provided
> > >> > > >>>> >> to
> > >> > > >>>> >> >> >> > this API. Is the intention that callers should set
> > one
> > >> > but
> > >> > > >>>> not the
> > >> > > >>>> >> >> >> > other? Or both? This needs to be clarified. Also,
> > if we
> > >> > do
> > >> > > >>>> want to
> > >> > > >>>> >> >> >> > support lookups by UUID, that is another thing
> that
> > >> needs
> > >> > > to
> > >> > > >>>> be
> > >> > > >>>> >> added
> > >> > > >>>> >> >> >> > to adminclient.
> > >> > > >>>> >> >> >> >
> > >> > > >>>> >> >> >> > In general, I feel like this should also probably
> be
> > >> its
> > >> > > own
> > >> > > >>>> KIP
> > >> > > >>>> >> since
> > >> > > >>>> >> >> >> > it's fairly complex
> > >> > > >>>> >> >> >> >
> > >> > > >>>> >> >> >> > best,
> > >> > > >>>> >> >> >> > Colin
> > >> > > >>>> >> >> >> >
> > >> > > >>>> >> >> >> >
> > >> > > >>>> >> >> >> > On Thu, Aug 10, 2023, at 15:46, Calvin Liu wrote:
> > >> > > >>>> >> >> >> >> Hi everyone,
> > >> > > >>>> >> >> >> >> I'd like to discuss a series of enhancement to
> the
> > >> > > >>>> replication
> > >> > > >>>> >> >> protocol.
> > >> > > >>>> >> >> >> >>
> > >> > > >>>> >> >> >> >> A partition replica can experience local data
> loss
> > in
> > >> > > unclean
> > >> > > >>>> >> >> shutdown
> > >> > > >>>> >> >> >> >> scenarios where unflushed data in the OS page
> > cache is
> > >> > > lost
> > >> > > >>>> - such
> > >> > > >>>> >> >> as an
> > >> > > >>>> >> >> >> >> availability zone power outage or a server error.
> > The
> > >> > > Kafka
> > >> > > >>>> >> >> replication
> > >> > > >>>> >> >> >> >> protocol is designed to handle these situations
> by
> > >> > > removing
> > >> > > >>>> such
> > >> > > >>>> >> >> >> replicas
> > >> > > >>>> >> >> >> >> from the ISR and only re-adding them once they
> have
> > >> > caught
> > >> > > >>>> up and
> > >> > > >>>> >> >> >> therefore
> > >> > > >>>> >> >> >> >> recovered any lost data. This prevents replicas
> > that
> > >> > lost
> > >> > > an
> > >> > > >>>> >> >> arbitrary
> > >> > > >>>> >> >> >> log
> > >> > > >>>> >> >> >> >> suffix, which included committed data, from being
> > >> > elected
> > >> > > >>>> leader.
> > >> > > >>>> >> >> >> >> However, there is a "last replica standing" state
> > >> which
> > >> > > when
> > >> > > >>>> >> combined
> > >> > > >>>> >> >> >> with
> > >> > > >>>> >> >> >> >> a data loss unclean shutdown event can turn a
> local
> > >> data
> > >> > > loss
> > >> > > >>>> >> >> scenario
> > >> > > >>>> >> >> >> into
> > >> > > >>>> >> >> >> >> a global data loss scenario, i.e., committed data
> > can
> > >> be
> > >> > > >>>> removed
> > >> > > >>>> >> from
> > >> > > >>>> >> >> >> all
> > >> > > >>>> >> >> >> >> replicas. When the last replica in the ISR
> > experiences
> > >> > an
> > >> > > >>>> unclean
> > >> > > >>>> >> >> >> shutdown
> > >> > > >>>> >> >> >> >> and loses committed data, it will be reelected
> > leader
> > >> > > after
> > >> > > >>>> >> starting
> > >> > > >>>> >> >> up
> > >> > > >>>> >> >> >> >> again, causing rejoining followers to truncate
> > their
> > >> > logs
> > >> > > and
> > >> > > >>>> >> thereby
> > >> > > >>>> >> >> >> >> removing the last copies of the committed records
> > >> which
> > >> > > the
> > >> > > >>>> leader
> > >> > > >>>> >> >> lost
> > >> > > >>>> >> >> >> >> initially.
> > >> > > >>>> >> >> >> >>
> > >> > > >>>> >> >> >> >> The new KIP will maximize the protection and
> > provides
> > >> > > >>>> MinISR-1
> > >> > > >>>> >> >> >> tolerance to
> > >> > > >>>> >> >> >> >> data loss unclean shutdown events.
> > >> > > >>>> >> >> >> >>
> > >> > > >>>> >> >> >> >>
> > >> > > >>>> >> >> >>
> > >> > > >>>> >> >>
> > >> > > >>>> >>
> > >> > > >>>>
> > >> > >
> > >> >
> > >>
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-966%3A+Eligible+Leader+Replicas
> > >> > > >>>> >> >> >>
> > >> > > >>>> >> >>
> > >> > > >>>> >>
> > >> > > >>>>
> > >> > > >>>
> > >> > >
> > >> >
> > >>
> >
>

Re: [DISCUSS] KIP-966: Eligible Leader Replicas

Reply via email to