We still want to give the "blacklisted" broker the leadership if nobody else is 
available.  Therefore, isn't putting a broker on the blacklist pretty much the 
same as moving it to the last entry in the replicas list and then triggering a 
preferred leader election?
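
To make that concrete, here is a rough sketch in plain Java (just an
illustration, not the actual controller code; the names are made up).  The
standard preferred-leader logic takes the first live, in-sync replica, so
demoting a broker to the end of the list and skipping it via a blacklist
produce the same choice:

import java.util.*;

public class LeaderChoice {
    // Standard preferred-leader logic: the first live, in-sync replica wins.
    static Optional<Integer> pickLeader(List<Integer> replicas, Set<Integer> liveIsr) {
        return replicas.stream().filter(liveIsr::contains).findFirst();
    }

    public static void main(String[] args) {
        Set<Integer> liveIsr = Set.of(1, 2, 3);

        // Original assignment: broker 1 is the preferred leader.
        System.out.println(pickLeader(List.of(1, 2, 3), liveIsr));   // Optional[1]

        // "Blacklisting" broker 1 by demoting it to the last position...
        System.out.println(pickLeader(List.of(2, 3, 1), liveIsr));   // Optional[2]

        // ...still lets broker 1 lead when nobody else is available.
        System.out.println(pickLeader(List.of(2, 3, 1), Set.of(1))); // Optional[1]
    }
}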

If we want this to be undone after a certain amount of time, or under certain 
conditions, that seems like something that would be more effectively done by an 
external system, rather than putting all these policies into Kafka.

best,
Colin


On Fri, Jul 19, 2019, at 18:23, George Li wrote:
>  Hi Satish,
> Thanks for the reviews and feedback.
> 
> > > The following are the requirements this KIP is trying to accomplish:
> > This can be moved to the "Proposed changes" section.
> 
> Updated the KIP-491. 
> 
> > >>The logic to determine the priority/order of which broker should be
> > preferred leader should be modified.  The broker in the preferred leader
> > blacklist should be moved to the end (lowest priority) when
> > determining leadership.
> >
> > I believe there is no change required in the ordering of the preferred
> > replica list. Brokers in the preferred leader blacklist are skipped
> > until other brokers in the list are unavailable.
> 
> Yes. The partition assignment stays the same, both the replicas and their 
> ordering. The blacklist logic can be optimized during implementation. 
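> 
> Just to illustrate the intent, here is a rough Java sketch of the
> election-time check (not actual controller code; the names are made up):
> 
> import java.util.*;
> 
> class BlacklistAwareElection {
>     // The stored assignment is never mutated; blacklisted brokers are only
>     // deprioritized when the controller picks a leader.
>     static Optional<Integer> electLeader(List<Integer> assignment,
>                                          Set<Integer> liveIsr,
>                                          Set<Integer> blacklist) {
>         Optional<Integer> preferred = assignment.stream()
>                 .filter(liveIsr::contains)
>                 .filter(b -> !blacklist.contains(b))
>                 .findFirst();
>         // Fall back to a blacklisted broker if nothing else is available.
>         return preferred.or(() -> assignment.stream()
>                 .filter(liveIsr::contains)
>                 .findFirst());
>     }
> }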
> 
> > >>The blacklist can be at the broker level. However, there might be use
> > cases where a specific topic should blacklist particular brokers, which
> > would be at the Topic level Config. For the use cases of this KIP, it
> > seems that a broker level blacklist would suffice.  Topic level preferred
> > leader blacklist might be future enhancement work.
> > 
> > I agree that the broker level preferred leader blacklist would be
> > sufficient. Do you have any use cases which require topic level
> > preferred blacklist?
> 
> 
> 
> I don't have any concrete use cases for a Topic level preferred leader 
> blacklist.  One scenario I can think of is when a broker has high CPU 
> usage: identify the big topics (high MsgIn, high BytesIn, etc.), then 
> try to move their leaders away from this broker.  Before doing an 
> actual reassignment to change the preferred leader, put this broker in 
> the preferred_leader_blacklist in the Topic Level config, run preferred 
> leader election, and see whether CPU decreases for this broker.  If 
> yes, then do the reassignments to make the preferred leader changes 
> "permanent" (the topic may have many partitions, e.g. 256, with quite 
> a few of them having this broker as the preferred leader).  So this 
> Topic Level config is an easy way of doing a trial and checking the 
> result. 
> 
> 
> > You can add the below workaround as an item in the rejected alternatives 
> > section
> > "Reassigning all the topic/partitions which the intended broker is a
> > replica for."
> 
> Updated the KIP-491. 
> 
> 
> 
> Thanks, 
> George
> 
>     On Friday, July 19, 2019, 08:20:22 AM PDT, Satish Duggana 
> <satish.dugg...@gmail.com> wrote:  
>  
>  Thanks for the KIP. I have put my comments below.
> 
> This is a nice improvement to avoid cumbersome maintenance.
> 
> >> The following are the requirements this KIP is trying to accomplish:
>   The ability to add and remove the preferred leader deprioritized
> list/blacklist. e.g. new ZK path/node or new dynamic config.
> 
> This can be moved to the "Proposed changes" section.
> 
> >>The logic to determine the priority/order of which broker should be
> preferred leader should be modified.  The broker in the preferred leader
> blacklist should be moved to the end (lowest priority) when
> determining leadership.
> 
> I believe there is no change required in the ordering of the preferred
> replica list. Brokers in the preferred leader blacklist are skipped
> until other brokers in the list are unavailable.
> 
> >>The blacklist can be at the broker level. However, there might be use cases
> where a specific topic should blacklist particular brokers, which
> would be at the
> Topic level Config. For the use cases of this KIP, it seems that broker level
> blacklist would suffice.  Topic level preferred leader blacklist might
> be future enhancement work.
> 
> I agree that the broker level preferred leader blacklist would be
> sufficient. Do you have any use cases which require topic level
> preferred blacklist?
> 
> You can add the below workaround as an item in the rejected alternatives 
> section
> "Reassigning all the topic/partitions which the intended broker is a
> replica for."
> 
> Thanks,
> Satish.
> 
> On Fri, Jul 19, 2019 at 7:33 AM Stanislav Kozlovski
> <stanis...@confluent.io> wrote:
> >
> > Hey George,
> >
> > Thanks for the KIP, it's an interesting idea.
> >
> > I was wondering whether we could achieve the same thing via the
> > kafka-reassign-partitions tool. As you had also said in the JIRA,  it is
> > true that this is currently very tedious with the tool. My thoughts are
> > that we could improve the tool and give it the notion of a "blacklisted
> > preferred leader".
> > This would have some benefits like:
> > - more fine-grained control over the blacklist. We may not want to
> > blacklist all the preferred leaders, as that would make the blacklisted
> > broker a leader of last resort only, which is not very useful. In the case
> > of an underpowered AWS machine or a controller, you might overshoot and
> > make the broker very underutilized if you completely make it leaderless.
> > - it is not permanent. If we are to have a blacklist leaders config,
> > rebalancing tools would also need to know about it and manipulate/respect
> > it to achieve a fair balance.
> > It seems like both problems are tied to balancing partitions, it's just
> > that KIP-491's use case wants to balance them against other factors in a
> > more nuanced way. It makes sense to have both be done from the same place.
> >
> > To make note of the motivation section:
> > > Avoid bouncing broker in order to lose its leadership
> > The recommended way to make a broker lose its leadership is to run a
> > reassignment on its partitions.
> > > The cross-data center cluster has AWS cloud instances which have less
> > computing power
> > We recommend running Kafka on homogeneous machines. It would be cool if the
> > system supported more flexibility in that regard but that is more nuanced
> > and a preferred leader blacklist may not be the best first approach to the
> > issue.
> >
> > Adding a new config which can fundamentally change the way replication is
> > done is complex, both for the system (the replication code is complex
> > enough) and the user. Users would have another potential config that could
> > backfire on them - e.g. if left forgotten.
> >
> > Could you think of any downsides to implementing this functionality (or a
> > variation of it) in the kafka-reassign-partitions.sh tool?
> > One downside I can see is that we would not have it handle new partitions
> > created after the "blacklist operation". As a first iteration I think that
> > may be acceptable.
> >
> > Thanks,
> > Stanislav
> >
> > On Fri, Jul 19, 2019 at 3:20 AM George Li <sql_consult...@yahoo.com.invalid>
> > wrote:
> >
> > >  Hi,
> > >
> > > Pinging the list for feedback on this KIP-491  (
> > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120736982
> > > )
> > >
> > >
> > > Thanks,
> > > George
> > >
> > >    On Saturday, July 13, 2019, 08:43:25 PM PDT, George Li <
> > > sql_consult...@yahoo.com.INVALID> wrote:
> > >
> > >  Hi,
> > >
> > > I have created KIP-491 (
> > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120736982)
> > > for putting a broker on the preferred leader blacklist or deprioritized
> > > list, so that when determining leadership, it is moved to the lowest
> > > priority for some of the listed use cases.
> > >
> > > Please provide your comments/feedbacks.
> > >
> > > Thanks,
> > > George
> > >
> > >
> > >
> > >  ----- Forwarded Message -----
> > > From: Jose Armando Garcia Sancio (JIRA) <j...@apache.org>
> > > To: "sql_consult...@yahoo.com" <sql_consult...@yahoo.com>
> > > Sent: Tuesday, July 9, 2019, 01:06:05 PM PDT
> > > Subject: [jira] [Commented] (KAFKA-8638) Preferred Leader Blacklist (deprioritized list)
> > >
> > >    [
> > > https://issues.apache.org/jira/browse/KAFKA-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16881511#comment-16881511
> > > ]
> > >
> > > Jose Armando Garcia Sancio commented on KAFKA-8638:
> > > ---------------------------------------------------
> > >
> > > Thanks for the feedback and clear use cases [~sql_consulting].
> > >
> > > > Preferred Leader Blacklist (deprioritized list)
> > > > -----------------------------------------------
> > > >
> > > >                Key: KAFKA-8638
> > > >                URL: https://issues.apache.org/jira/browse/KAFKA-8638
> > > >            Project: Kafka
> > > >          Issue Type: Improvement
> > > >          Components: config, controller, core
> > > >    Affects Versions: 1.1.1, 2.3.0, 2.2.1
> > > >            Reporter: GEORGE LI
> > > >            Assignee: GEORGE LI
> > > >            Priority: Major
> > > >
> > > > Currently, Kafka preferred leader election will pick the broker_id
> > > in the topic/partition replica assignment in priority order when the
> > > broker is in ISR. The preferred leader is the broker id in the first
> > > position of the replica list. There are use cases where, even when the
> > > first broker in the replica assignment is in ISR, there is a need for it
> > > to be moved to the end of the ordering (lowest priority) when deciding
> > > leadership during preferred leader election.
> > > > Let’s use topic/partition replicas (1,2,3) as an example. 1 is the
> > > preferred leader.  When preferred leader election is run, it will pick 1
> > > as the leader if it is in ISR; if 1 is not online and in ISR, then pick 2;
> > > if 2 is not in ISR, then pick 3 as the leader. There are use cases where,
> > > even if 1 is in ISR, we would like it to be moved to the end of the
> > > ordering (lowest priority) when deciding leadership during preferred
> > > leader election.  Below is a list of use cases:
> > > > * If broker_id 1 is a swapped failed host and brought up with last
> > > segments or latest offset without historical data (there is another effort
> > > on this), it's better for it not to serve leadership until it's caught up.
> > > > * The cross-data center cluster has AWS instances which have less
> > > computing power than the on-prem bare metal machines.  We could put the
> > > AWS broker_ids in the Preferred Leader Blacklist, so on-prem brokers can
> > > be elected leaders, without changing the replica ordering via
> > > reassignments.
> > > > * If broker_id 1 is constantly losing leadership after some time
> > > ("flapping"), we would want to exclude 1 from being a leader unless all
> > > other brokers of this topic/partition are offline.  The “flapping” effect
> > > was seen in the past when 2 or more brokers were bad: when they lost
> > > leadership constantly/quickly, the sets of partition replicas they belong
> > > to would see leadership constantly changing.  The ultimate solution is to
> > > swap these bad hosts.  But for quick mitigation, we can also put the bad
> > > hosts in the Preferred Leader Blacklist to move their priority for being
> > > elected leader to the lowest.
> > > > *  If the controller is busy serving an extra load of metadata requests
> > > and other tasks, we would like to move the controller's leaders to other
> > > brokers to lower its CPU load. Currently, bouncing to lose leadership would
> > > not work for the controller, because after the bounce, the controller fails
> > > over to another broker.
> > > > * Avoid bouncing a broker in order to lose its leadership: it would be
> > > good if we had a way to specify which broker should be excluded from
> > > serving traffic/leadership (without changing the replica assignment
> > > ordering by reassignments, even though that's quick), and then run
> > > preferred leader election.  A bouncing broker will cause temporary URPs,
> > > and sometimes other issues.  Also, bouncing a broker (e.g. broker_id 1)
> > > can make it temporarily lose all its leadership, but if another broker
> > > (e.g. broker_id 2) fails or gets bounced, some of that broker's leadership
> > > will likely fail over to broker_id 1 in a replica set with 3 brokers.  If
> > > broker_id 1 is in the blacklist, then in such a scenario, even with
> > > broker_id 2 offline, the 3rd broker can take leadership.
> > > > The current workaround for the above is to change the topic/partition's
> > > replica assignment to move broker_id 1 from the first position to the
> > > last position and run preferred leader election, e.g. (1, 2, 3) => (2,
> > > 3, 1). This changes the replica assignment, and we need to keep track of
> > > the original one and restore it if things change (e.g. the controller
> > > fails over to another broker, or the swapped empty broker catches up).
> > > That’s a rather tedious task.
> > > >
> > >
> > >
> > >
> > > --
> > > This message was sent by Atlassian JIRA
> > > (v7.6.3#76005)
