[jira] [Created] (KAFKA-13331) Slow reassignments in 2.8 because of large number of UpdateMetadataResponseReceived(UpdateMetadataResponseData(errorCode=0),) Events

2021-09-28 Thread GEORGE LI (Jira)
GEORGE LI created KAFKA-13331: - Summary: Slow reassignments in 2.8 because of large number of UpdateMetadataResponseReceived(UpdateMetadataResponseData(errorCode=0),) Events Key: KAFKA-13331 URL: https

[jira] [Created] (KAFKA-12971) Kakfa 1.1.x clients cache broker hostnames, client stuck when host is swapped for the same broker.id

2021-06-19 Thread GEORGE LI (Jira)
GEORGE LI created KAFKA-12971: - Summary: Kakfa 1.1.x clients cache broker hostnames, client stuck when host is swapped for the same broker.id Key: KAFKA-12971 URL: https://issues.apache.org/jira/browse/KAFKA-12971

Re: [DISCUSS] KIP-491: Preferred Leader Deprioritized List (Temporary Blacklist)

2020-02-03 Thread George Li
ig > for now. > > ~Satish. > > > On Sat, Sep 7, 2019 at 12:29 AM George Li > wrote: > > > > Hi, > > > > Just want to ping and bubble up the discussion of KIP-491. > > > > On a large scale of Kafka clusters with thousands of brokers in many

[jira] [Created] (KAFKA-8903) allow the new replica (offset 0) to catch up with current leader using latest offset

2019-09-12 Thread GEORGE LI (Jira)
GEORGE LI created KAFKA-8903: Summary: allow the new replica (offset 0) to catch up with current leader using latest offset Key: KAFKA-8903 URL: https://issues.apache.org/jira/browse/KAFKA-8903 Project

Re: [DISCUSS] KIP-491: Preferred Leader Deprioritized List (Temporary Blacklist)

2019-09-06 Thread George Li
ored when (or if) we needed to do so. This seems like it might be simpler and easier to maintain than a separate set of metadata about blacklists. best, Colin On Fri, Sep 6, 2019, at 11:58, George Li wrote: > Hi, > > Just want to ping and bubble up the discussion of KIP-491.

Re: [DISCUSS] KIP-491: Preferred Leader Deprioritized List (Temporary Blacklist)

2019-09-06 Thread George Li
eferred leader blacklist.  I will update the KIP-491 later for this use case of Topic Level config for Preferred Leader Blacklist. Thanks, George   On Wednesday, August 7, 2019, 07:43:55 PM PDT, George Li wrote: Hi Colin, > In your example, I think we're comparing apples

Re: KIP-352: Distinguish URPs caused by reassignment

2019-08-08 Thread George Li
Hi Jason, Can KIP-352 split the two metrics MaxLag and TotalLag for reassignment replication as well? From a dashboard of these 2 metrics, one can see whether the replication is stuck (flat line) and estimate how long the reassignments will take to complete (how fast the Lag line is going down). T

Re: [DISCUSS] KIP-491: Preferred Leader Deprioritized List (Temporary Blacklist)

2019-08-07 Thread George Li
have topic creation Policy enforced in Kafka server OR an external service. We have an external/central service managing topic creation/partition expansion which takes into account rack-diversity, replication factor (2, 3 or 4 depending on cluster/topic type), Policy replicating the

Re: [VOTE] KIP-455: Create an Administrative API for Replica Reassignment

2019-08-07 Thread George Li
This email seemed to get lost in the dev email server. Resending. On Tuesday, August 6, 2019, 10:16:57 PM PDT, George Li wrote: The partitions with pending reassignments would be reported as URP (Under Replicated Partitions), or maybe reported as a separate metric, RURP (Reassignment URP

Re: [DISCUSS] KIP-491: Preferred Leader Deprioritized List (Temporary Blacklist)

2019-08-07 Thread George Li
, it still needs the blacklist info (e.g. a zk path node, or broker level/topic level config) to "blacklist" the broker to be preferred leader? Would it be the same as KIP-491 is introducing?  Thanks, George On Wednesday, August 7, 2019, 11:01:51 AM PDT, Colin McCabe wrote:

Re: [DISCUSS] KIP-491: Preferred Leader Deprioritized List (Temporary Blacklist)

2019-08-06 Thread George Li
be excluded from being elected leaders." Thanks, George On Friday, August 2, 2019, 08:02:07 PM PDT, George Li wrote: Hi Colin, Thanks for looking into this KIP. Sorry for the late response; I have been busy. If a cluster has MANY topic partitions, moving this "blacklist"

Re: [DISCUSS] KIP-491: Preferred Leader Deprioritized List (Temporary Blacklist)

2019-08-02 Thread George Li
ing that would be more effectively done by an external system, rather than putting all these policies into Kafka. best, Colin On Fri, Jul 19, 2019, at 18:23, George Li wrote: > Hi Satish, > Thanks for the reviews and feedback. > > > > The following are the requirements this K

Re: [DISCUSS] KIP-491: Preferred Leader Deprioritized List (Temporary Blacklist)

2019-07-19 Thread George Li
stem (the replication code is complex > enough) and the user. Users would have another potential config that could > backfire on them - e.g if left forgotten. > > Could you think of any downsides to implementing this functionality (or a > variation of it) in the kafka-reassign-partitio

Re: [DISCUSS] KIP-491: Preferred Leader Deprioritized List (Temporary Blacklist)

2019-07-19 Thread George Li
em (the replication code is complex enough) and the user. Users would have another potential config that could backfire on them - e.g if left forgotten. Could you think of any downsides to implementing this functionality (or a variation of it) in the kafka-reassign-partitions.sh tool? One do

[jira] [Resolved] (KAFKA-8663) partition assignment would be better original_assignment + new_reassignment during reassignments

2019-07-19 Thread GEORGE LI (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-8663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] GEORGE LI resolved KAFKA-8663. -- Resolution: Won't Fix looks like RAR + OAR is required for KIP-455 to preserve the targetRep

Re: [DISCUSS] KIP-491: Preferred Leader Deprioritized List (Temporary Blacklist)

2019-07-18 Thread George Li
Hi, Pinging the list for feedback on KIP-491 (https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=120736982). Thanks, George On Saturday, July 13, 2019, 08:43:25 PM PDT, George Li wrote: Hi, I have created KIP-491 (https://cwiki.apache.org/confluence

Re: [VOTE] KIP-455: Create an Administrative API for Replica Reassignment

2019-07-18 Thread George Li
splay/KAFKA/KIP-455%3A+Create+an+Administrative+API+for+Replica+Reassignment > > > > > > Best, > > > Stanislav > > > > > > On Wed, May 22, 2019 at 6:12 PM Colin McCabe wrote: > > > > > > > Hi George, > > > > > > >

[DISCUSS] KIP-491: Preferred Leader Deprioritized List (Temporary Blacklist)

2019-07-13 Thread George Li
Project: Kafka >          Issue Type: Improvement >          Components: config, controller, core >    Affects Versions: 1.1.1, 2.3.0, 2.2.1 >            Reporter: GEORGE LI >            Assignee: GEORGE LI >            Priority: Major > > Currently, the kafka preferred leader elect

Re: [DISCUSS] KIP-455 Create an Admin API for Replica Reassignments

2019-07-13 Thread George Li
are kicked off - this KIP proposes that we store the `targetReplicas` in a different collection and thus preserve the original replicas info until the reassignment is fully complete. It should allow you to implement rollback functionality. Please take a look at the KIP and confirm if that is the

[jira] [Created] (KAFKA-8663) partition assignment would be better original_assignment + new_reassignment during reassignments

2019-07-13 Thread GEORGE LI (JIRA)
GEORGE LI created KAFKA-8663: Summary: partition assignment would be better original_assignment + new_reassignment during reassignments Key: KAFKA-8663 URL: https://issues.apache.org/jira/browse/KAFKA-8663

Re: [DISCUSS] KIP-455 Create an Admin API for Replica Reassignments

2019-07-08 Thread George Li
> Now that we support multiple reassignment requests, users may add execute> > them incrementally. Suppose something goes horribly wrong and they want to> > revert as quickly as possible - they would need to run the tool with> > multiple rollback JSONs.  I think that it would be useful to ha

Re: [jira] [Created] (KAFKA-8638) Preferred Leader Blacklist (deprioritized list)

2019-07-08 Thread George Li
" broker, it will still be elected as leader to avoid an offline partition. Thanks, George On Monday, July 8, 2019, 03:07:06 PM PDT, GEORGE LI (JIRA) wrote: GEORGE LI created KAFKA-8638: Summary: Preferred Leader Blacklist (deprioritized li

[jira] [Created] (KAFKA-8638) Preferred Leader Blacklist (deprioritized list)

2019-07-08 Thread GEORGE LI (JIRA)
GEORGE LI created KAFKA-8638: Summary: Preferred Leader Blacklist (deprioritized list) Key: KAFKA-8638 URL: https://issues.apache.org/jira/browse/KAFKA-8638 Project: Kafka Issue Type

Re: Partition Reassignment in Cloud

2019-06-20 Thread George Li
The new broker host's meta.properties file can have the broker.id set to the original broker_id (with the original host shut down/decommissioned), and the new host has the storage of the original host (either by copying or by changing the network storage mount from the original to the new host). This way, it saves t

Re: [VOTE] KIP-455: Create an Administrative API for Replica Reassignment

2019-05-21 Thread George Li
, 9:48:56 AM PDT, Colin McCabe wrote: Hi George, Yes, KIP-455 allows the reassignment of individual partitions to be cancelled.  I think it's very important for these operations to be at the partition level. best, Colin On Tue, May 14, 2019, at 16:34, George Li wrote: > 

Re: [VOTE] KIP-455: Create an Administrative API for Replica Reassignment

2019-05-14 Thread George Li
Hi Colin, Thanks for the updated KIP. It makes very good improvements to Kafka reassignment operations. One question: it looks like the KIP also includes cancellation of individual pending reassignments when the AlterPartitionReassignmentRequest has empty replicas for the topic/partition.

Re: [DISCUSS] KIP-455: Create an Administrative API for Replica Reassignment

2019-04-30 Thread George Li
Hi Colin, Thanks for KIP-455! Yes, KIP-236, etc. will depend on it. It is a good direction to go for the RPC. Regarding storing the new reassignments & original replicas at the topic/partition level, I have some concerns when the controller is failing over, and the scalability of scanning t

Re: [DISCUSS] KIP-236 Interruptible Partition Reassignment

2019-04-30 Thread George Li
but it seems much more flexible in the long term. Thanks, Jason On Fri, Apr 5, 2019 at 7:25 PM George Li wrote: >  Hi Jun, > > Thanks for the feedback! > > for 40,  I agree.  It makes sense to do it via RPC request to the > controller.  Maybe for KIP-236,  I will just impleme

Re: [DISCUSS] KIP-435: Incremental Partition Reassignment

2019-04-08 Thread George Li
ll the controller about, > > and > > > let it decide how to do the batching. These ideal assignments could > > change > > > continuously over time, so from the admin's point of view, there would be > > > no start/stop/cancel, but just individual partition

Re: [DISCUSS] KIP-435: Incremental Partition Reassignment

2019-04-05 Thread George Li
Hi Jason / Viktor, For the URP during a reassignment, if the "original_replicas" is kept for the current pending reassignment, I think it will be very easy to compare that with the topic/partition's ISR. If all "original_replicas" are in ISR, then URP should be 0 for that topic/partition.
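
The comparison described in this message can be sketched roughly as follows. This is only an illustration under stated assumptions, not Kafka code: the class `ReassignmentUrpCheck`, the method `isUnderReplicated`, and the use of plain integer broker ids are all hypothetical, and it assumes the "original_replicas" of the pending reassignment are available alongside the partition's current ISR.

```java
import java.util.List;
import java.util.Set;

// Hypothetical helper (not part of Kafka): a partition under reassignment
// counts as under-replicated only if one of its ORIGINAL replicas has
// fallen out of the ISR. New target replicas that are still catching up
// are ignored.
final class ReassignmentUrpCheck {
    private ReassignmentUrpCheck() {}

    // originalReplicas: broker ids assigned before the reassignment started (assumed input).
    // isr: broker ids currently in the partition's in-sync replica set.
    static boolean isUnderReplicated(List<Integer> originalReplicas, Set<Integer> isr) {
        return !isr.containsAll(originalReplicas);
    }

    public static void main(String[] args) {
        // Reassigning [1, 2, 3] -> [4, 5, 6]; broker 4 has already caught up.
        System.out.println(isUnderReplicated(List.of(1, 2, 3), Set.of(1, 2, 3, 4))); // false
        // Original replica 3 has dropped out of the ISR.
        System.out.println(isUnderReplicated(List.of(1, 2, 3), Set.of(1, 2, 4, 5))); // true
    }
}
```

With a check of this shape, target replicas that are still being copied would not inflate the URP count, which appears to be the point of keeping "original_replicas" for the pending reassignment.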

Re: [DISCUSS] KIP-236 Interruptible Partition Reassignment

2019-04-05 Thread George Li
completion or not. 41. Is it necessary to add the new "original replicas" field in /admin/reassign_partitions? The original replicas are already in the topic path in ZK. Jun On Tue, Mar 26, 2019 at 5:24 PM George Li wrote: >  Hi Ismael, > > Thanks,  I understand your points. I will

Re: [DISCUSS] KIP-236 Interruptible Partition Reassignment

2019-03-26 Thread George Li
ple then decide to avoid the recommended path, they can deal with the consequences. However, if we add another structure in ZK and no RPC mechanism, then there is no recommended path apart from updating ZK (implicitly making it an API for users). Ismael On Mon, Mar 25, 2019 at 3:57 PM George Li

Re: [DISCUSS] KIP-236 Interruptible Partition Reassignment

2019-03-25 Thread George Li
Mar 21, 2019, at 20:51, George Li wrote: > >  Hi Colin, > > > > I agree with your proposal of having administrative APIs through RPC > > instead of ZooKeeper. But seems like it will incur significant changes > > to both submitting reassignments and this KIP'

Re: [DISCUSS] KIP-236 Interruptible Partition Reassignment

2019-03-21 Thread George Li
ents(); > > class PendingReassignmentResults { >  KafkaFuture> pending; >  KafkaFuture> previous; > } best, Colin On Tue, Mar 19, 2019, at 15:04, George Li wrote: >  Hi Viktor, > > Thanks for the review.  > > If there is reassignment in-progress while

Re: [DISCUSS] KIP-236 Interruptible Partition Reassignment

2019-03-19 Thread George Li
n you have time? Thanks Can we start a vote on this KIP in one or two weeks? Thanks, George On Tuesday, March 5, 2019, 10:58:45 PM PST, George Li wrote: Hi Viktor, > 2.: One follow-up question: if the reassignment cancellation gets > interrupted and a failover happens after

Re: [DISCUSS] KIP-236 Interruptible Partition Reassignment

2019-03-18 Thread George Li
this KIP in one or two weeks? Thanks, George On Tuesday, March 5, 2019, 10:58:45 PM PST, George Li wrote: Hi Viktor, > 2.: One follow-up question: if the reassignment cancellation gets > interrupted and a failover happens after step #2 but before step #3, how will > the new c

Re: [DISCUSS] KIP-236 Interruptible Partition Reassignment

2019-03-05 Thread George Li
't know about RAR. I would suppose the new controller would start from the beginning as it only knows what's in Zookeeper. Is that true? 2.1: Another interesting question is what those replicas are doing that are online but not part of the leader and ISR. Are they still replicati

Re: [DISCUSS] KIP-236 Interruptible Partition Reassignment

2019-03-01 Thread George Li
(in controller failover scenario). Thanks, George On Monday, February 25, 2019, 11:40:08 AM PST, George Li wrote: Hi Viktor, Thanks for the response. Good questions! Answers below: > A few questions regarding the rollback algorithm: > 1. At step 2 how do you > elect

Re: [DISCUSS] KIP-236 Interruptible Partition Reassignment

2019-02-25 Thread George Li
st crashes? Is it able to continue after a controller failover? 4. I think it would be a good addition if you could add a few example scenarios for rollback. Best, Viktor On Fri, Feb 22, 2019 at 7:04 PM George Li wrote: Hi Viktor, Thanks for reading and providing feedback on KIP-236. Fo

Re: [DISCUSS] KIP-236 Interruptible Partition Reassignment

2019-02-22 Thread George Li
on a fully compatible way, meaning I won't >> change it just calculate each increment based on that and the current state >> of the ISR set for the partition in reassignment. >> I hope we could collaborate on this. >> >> Viktor >> >> On Thu, Feb 21, 2019 at

Re: [DISCUSS] KIP-236 Interruptible Partition Reassignment

2019-02-20 Thread George Li
Hi, After discussing with Tom, Harsha and I are picking up KIP-236. The work focuses on safely and cleanly cancelling / rolling back pending reassignments in a timely fashion. Pull Request #6296. Still working on more integration/system tests. Please review and provide feedback/suggestions. Thanks, Georg