Here is the entire logic used to rebalance the cluster. It is implemented
by this Groovy script:
https://github.com/Lowess/Kafka/blob/master/KafkaPartitionRebalancer.groovy

#1: Query ZooKeeper to get the list of broker ids
#2: Query ZooKeeper to get the list of topics
#3: Generate the topics-to-move.json file, which looks like:

{
  "version": 1,
  "topics": [
    {
      "topic": "SLOTS"
    },
    {
      "topic": "ASSETS"
    },
    {
      "topic": "AD_EVENTS"
    },
    {
      "topic": "B_IMPRESSION"
    },
    {
      "topic": "B_STATISTICS"
    },
    {
      "topic": "PAGES"
    },
    {
      "topic": "RTB"
    },
    {
      "topic": "D_STATISTICS"
    },
    {
      "topic": "D_REPORTING"
    }
  ]
}
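
As an illustration of step #3, here is a minimal sketch in Python (the
actual script is written in Groovy). The function name and the hard-coded
topic list are hypothetical; in practice the topics come from ZooKeeper
(step #2).

```python
import json


def build_topics_to_move(topics):
    """Return the topics-to-move document for the given topic names."""
    return {
        "version": 1,
        "topics": [{"topic": name} for name in topics],
    }


if __name__ == "__main__":
    # Hypothetical topic list, used only to show the resulting layout.
    doc = build_topics_to_move(["SLOTS", "ASSETS", "AD_EVENTS"])
    print(json.dumps(doc, indent=2))
```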

#4: Upload this file to the Kafka node (/tmp/topics-to-move.json) and
run the following command:
bin/kafka-reassign-partitions.sh --zookeeper ZK_IP:2181
--topics-to-move-json-file /tmp/topics-to-move.json --generate
--broker-list "ALL_BROKERS_THAT_ARE_RETURNED_BY_ZOOKEEPER_ON_STEP_#1"
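
The output of the command above has to be parsed before it can be split.
The Python sketch below assumes that the tool prints the current assignment
first and then a line reading "Proposed partition reassignment
configuration" followed by the proposed plan as JSON. That layout is an
assumption about the tool's output, and the function name is hypothetical,
so adjust both to what your version actually prints.

```python
import json

# Assumed header line that precedes the proposed plan in the --generate output.
PROPOSED_MARKER = "Proposed partition reassignment configuration"


def extract_proposed_plan(generate_output):
    """Return the proposed reassignment plan (a dict) from --generate output."""
    _, marker, tail = generate_output.partition(PROPOSED_MARKER)
    if not marker:
        raise ValueError("proposed reassignment section not found")
    # The JSON document is assumed to start at the first '{' after the marker.
    start = tail.find("{")
    if start < 0:
        raise ValueError("no JSON document after the marker")
    # raw_decode stops at the end of the first JSON value and ignores the rest.
    plan, _ = json.JSONDecoder().raw_decode(tail[start:])
    return plan
```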

#5: Parse the JSON returned by the previous step and slice it into
smaller JSON files that look like the one below. The number of partitions
per file is limited by the Groovy script (10 partitions in this example):

{
  "version": 1,
  "partitions": [
    {
      "topic": "D_REPORTING",
      "partition": 2,
      "replicas": [
        102311671,
        10517222,
        102311679
      ]
    },
    {
      "topic": "AD_EVENTS",
      "partition": 48,
      "replicas": [
        102311671,
        109715277,
        101531906
      ]
    },
    {
      "topic": "D_STATISTICS",
      "partition": 47,
      "replicas": [
        109715277,
        10517222,
        102311679
      ]
    },
    {
      "topic": "SLOTS",
      "partition": 46,
      "replicas": [
        101131445,
        102336284,
        10517222
      ]
    },
    {
      "topic": "RTB",
      "partition": 48,
      "replicas": [
        101021441,
        102311671,
        102336284
      ]
    },
    {
      "topic": "PAGES",
      "partition": 14,
      "replicas": [
        102311679,
        102311671,
        102336284
      ]
    },
    {
      "topic": "ASSETS",
      "partition": 35,
      "replicas": [
        10517222,
        101131445,
        102311679
      ]
    },
    {
      "topic": "B_IMPRESSION",
      "partition": 34,
      "replicas": [
        101131445,
        102311672,
        102311671
      ]
    },
    {
      "topic": "B_STATISTICS",
      "partition": 19,
      "replicas": [
        109715277,
        101531906,
        102311672
      ]
    },
    {
      "topic": "AD_EVENTS",
      "partition": 18,
      "replicas": [
        109715277,
        102311671,
        102336284
      ]
    }
  ]
}
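
The slicing in step #5 can be sketched as follows in Python (again, the
real script is in Groovy). The function name and the default of 10
partitions per part are illustrative; the plan is expected to have the same
"version" and "partitions" keys as the JSON above.

```python
def slice_plan(plan, max_partitions=10):
    """Split a reassignment plan into parts of at most max_partitions each."""
    if max_partitions < 1:
        raise ValueError("max_partitions must be at least 1")
    partitions = plan["partitions"]
    return [
        {
            "version": plan["version"],
            "partitions": partitions[i:i + max_partitions],
        }
        for i in range(0, len(partitions), max_partitions)
    ]
```

Each element of the returned list can then be written to its own file and
passed to the tool one at a time.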

#6: Upload the previous JSON to the Kafka node
(/tmp/expand-cluster-reassignment.json) and run the following command:
bin/kafka-reassign-partitions.sh --zookeeper ZK_IP:2181
--reassignment-json-file
/tmp/expand-cluster-reassignment.json --execute --broker-list
"ALL_BROKERS_THAT_ARE_RETURNED_BY_ZOOKEEPER_ON_STEP_#1"

#7: Repeat the verification step for as long as the output of the
following command reports failed partitions:
bin/kafka-reassign-partitions.sh --zookeeper ZK_IP:2181
--reassignment-json-file
/tmp/expand-cluster-reassignment.json --verify --broker-list
"ALL_BROKERS_THAT_ARE_RETURNED_BY_ZOOKEEPER_ON_STEP_#1"
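
A rough Python sketch of such a polling loop is shown here. It assumes the
verify output contains one line per partition starting with "Reassignment
of partition" and ending with "completed successfully" when the move is
done; any other ending (failed, in progress) is treated as unfinished. The
line format is an assumption, and run_verify is a hypothetical callable
that runs the command above and returns its standard output as a string.

```python
import time


def has_unfinished(verify_output):
    """True if any partition line does not end with 'completed successfully'."""
    for line in verify_output.splitlines():
        line = line.strip()
        if not line.startswith("Reassignment of partition"):
            continue  # skip headers and blank lines
        if not line.endswith("completed successfully"):
            return True
    return False


def wait_for_reassignment(run_verify, poll_seconds=60, max_polls=None,
                          sleep=time.sleep):
    """Poll run_verify() until every partition is done; return the poll count."""
    polls = 0
    while True:
        polls += 1
        if not has_unfinished(run_verify()):
            return polls
        if max_polls is not None and polls >= max_polls:
            raise TimeoutError("reassignment still unfinished after %d polls" % polls)
        sleep(poll_seconds)
```

Setting max_polls avoids looping forever when a partition stays stuck, and
passing a different sleep function makes the loop easy to test.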

#8: Execute the next JSON part file (built as in step #5), and repeat
until all of them have run.


Hope this helps.


On Tue, Jul 8, 2014 at 10:31 AM, Clark Haskins <
chask...@linkedin.com.invalid> wrote:

> Can you copy/paste the json you are passing to the reassignment tool? Plus
> the commands. Also do a describe on your topics.
>
> -Clark
>
> Clark Elliott Haskins III
> LinkedIn DDS Site Reliability Engineer
> Kafka, Zookeeper, Samza SRE
> Mobile: 505.385.1484
> BlueJeans: https://www.bluejeans.com/chaskins
>
>
> chask...@linkedin.com
> https://www.linkedin.com/in/clarkhaskins
> There is no place like 127.0.0.1
>
>
>
>
> On 7/8/14, 10:26 AM, "Florian Dambrine" <flor...@gumgum.com> wrote:
>
> >I let the tool run for an entire weekend on the test cluster and on
> >Monday it was still saying "failed"...
> >
> >I have 500 GB per Kafka node and it is an 8-node cluster.
> >
> >I am also wondering if I am using the tool correctly. Currently I am
> >running the tool to rebalance everything across the entire cluster. As I
> >have 3 replicas the tool requires at least 3 brokers.
> >
> >Should I add 3 new Kafka nodes and rebalance some topics to these new
> >nodes only? I am afraid of unbalancing the cluster with this option.
> >
> >Any suggestions?
> >
> >Thanks for your help.
> >
> >
> >On Mon, Jul 7, 2014 at 9:29 PM, Jun Rao <jun...@gmail.com> wrote:
> >
> >> The failure could mean that the reassignment is still in progress. If
> >> you have lots of data, it may take some time to move the data to new
> >> brokers. You could observe the max lag in each broker to see how far
> >> behind new replicas are (see
> >> http://kafka.apache.org/documentation.html#monitoring).
> >>
> >> Thanks,
> >>
> >> Jun
> >>
> >>
> >> On Mon, Jul 7, 2014 at 4:42 PM, Florian Dambrine <flor...@gumgum.com>
> >> wrote:
> >>
> >> > When I run the tool with the --verify option it says failed for
> >> > some partitions.
> >> >
> >> > The problem is I do not know if it is a zookeeper issue or if the tool
> >> > really failed.
> >> >
> >> > I hit the ZooKeeper issue once (
> >> > https://issues.apache.org/jira/browse/KAFKA-1382) and after killing the
> >> > responsible Kafka broker the partition switched from failed to
> >> > completed successfully.
> >> >
> >> > What should I do when the Kafka tool says that it failed to move the
> >> > partition?
> >> >
> >> >
> >> >
> >> >
> >> > On Mon, Jul 7, 2014 at 4:33 PM, Clark Haskins
> >> > <chask...@linkedin.com.invalid> wrote:
> >> >
> >> > > How does it get stuck?
> >> > >
> >> > > -Clark
> >> > >
> >> > > On 7/7/14, 3:49 PM, "Florian Dambrine" <flor...@gumgum.com> wrote:
> >> > >
> >> > > >Hi,
> >> > > >
> >> > > >I am trying to add new brokers to an existing 8-node Kafka cluster. We
> >> > > >have around 10 topics and the number of partitions is set to 50. In
> >> > > >order to test the reassign-partitions script, I tried the following
> >> > > >steps on a sandbox cluster.
> >> > > >
> >> > > >I developed a script which is able to split the partition
> >> > > >reassignment plan given by the Kafka tool into smaller pieces
> >> > > >(reassigning at most 10 partitions at a time).
> >> > > >
> >> > > >Unfortunately I faced some issues with the tool, which sometimes gets
> >> > > >stuck on one partition. In this case I have to kill and restart the
> >> > > >three Kafka brokers to which the partition has been relocated in
> >> > > >order to unblock the process (one broker at a time).
> >> > > >
> >> > > >Moreover, I have also faced these two issues that are already on Jira:
> >> > > >
> >> > > >https://issues.apache.org/jira/browse/KAFKA-1382
> >> > > >https://issues.apache.org/jira/browse/KAFKA-1479
> >> > > >
> >> > > >We really need to add new nodes to our Kafka cluster. Has anybody
> >> > > >already rebalanced a Kafka 0.8.1.1 cluster? What would you advise?
> >> > > >
> >> > > >Thanks, and feel free to ask me if you need more details.
> >> > > >
> >> > > >
> >> > > >
> >> > > >--
> >> > > >*Florian Dambrine*  |  Intern, Big Data
> >> > > >*GumGum* <http://www.gumgum.com/>  |  *Ads that stick*
> >> > > >209-797-3994  |  flor...@gumgum.com
> >> > >
> >> > >
> >> >
> >> >
> >> >
> >>
> >
> >
> >
>
>


-- 
*Florian Dambrine*  |  Intern, Big Data
*GumGum* <http://www.gumgum.com/>  |  *Ads that stick*
209-797-3994  |  flor...@gumgum.com
