OK, I understand better now. You can read more about the guts of the rebalancing protocol that Kafka Connect uses as of Apache Kafka 2.3 and onwards here: https://www.confluent.io/blog/incremental-cooperative-rebalancing-in-kafka/
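In case it helps, the behaviour described there is driven by a couple of
settings in the distributed worker properties. This is just a sketch of the
rebalance-related lines with their Apache Kafka 2.3 defaults, not a complete
worker config:

    # connect-distributed.properties (sketch: rebalance-related settings only)
    # 'compatible' (the default since AK 2.3) enables incremental cooperative
    # rebalancing; 'eager' reverts to the old stop-the-world rebalances
    connect.protocol=compatible
    # how long the group waits for a departed worker to return before its
    # connectors and tasks are reassigned to the remaining workers (default 5 minutes)
    scheduled.rebalance.max.delay.ms=300000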
One thing I'd ask at this point, though, is whether it makes any difference where the tasks execute? The point of a cluster is that Kafka Connect manages the workload allocation. If you need workload separation and guaranteed execution locality, I would suggest separate Kafka Connect distributed clusters (there is a sketch of what I mean at the bottom of this mail).


--

Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff


On Wed, 20 May 2020 at 10:24, Deepak Raghav <deepakragha...@gmail.com> wrote:

> Hi Robin
>
> Thanks for your reply.
>
> We are running two workers on different IPs. The example I gave you was
> just an example. We are using Kafka version 2.3.1.
>
> Let me explain again with a simple example.
>
> Suppose we have two EC2 nodes, N1 and N2, running worker processes W1 and
> W2 in distributed mode with the same group.id, i.e. in the same cluster,
> and two connectors with two tasks each, i.e.
>
> Node N1: W1 is running
> Node N2: W2 is running
>
> First connector (C1): task 1 with id C1T1 and task 2 with id C1T2
> Second connector (C2): task 1 with id C2T1 and task 2 with id C2T2
>
> Now, if both worker processes W1 and W2 are running and I register
> connectors C1 and C2 one after another, i.e. sequentially, via either
> worker, the tasks are divided between the workers as below, which is
> expected:
>
> *W1*    *W2*
> C1T1    C1T2
> C2T1    C2T2
>
> Now, suppose I stop one worker process, e.g. W2, and start it again after
> some time. The task division changes as below, i.e. the first connector's
> tasks move to W1 and the second connector's tasks move to W2:
>
> *W1*    *W2*
> C1T1    C2T1
> C1T2    C2T2
>
>
> Please let me know if this is understandable or not.
>
> Note: In production we are going to have 16 connectors with 10 tasks each
> and two worker nodes. With the above scenario, the first 8 connectors'
> tasks move to W1 and the next 8 connectors' tasks move to W2, which is not
> expected.
>
>
> Regards and Thanks
> Deepak Raghav
>
>
>
> On Wed, May 20, 2020 at 1:41 PM Robin Moffatt <ro...@confluent.io> wrote:
>
> > So you're running two workers on the same machine (10.0.0.4), is
> > that correct? Normally you'd run one worker per machine unless there was
> > a particular reason otherwise.
> > What version of Apache Kafka are you using?
> > I'm not clear from your question if the distribution of tasks is
> > presenting a problem to you (if so please describe why), or if you're
> > just interested in the theory behind the rebalancing protocol?
> >
> >
> > --
> >
> > Robin Moffatt | Senior Developer Advocate | ro...@confluent.io | @rmoff
> >
> >
> > On Wed, 20 May 2020 at 06:46, Deepak Raghav <deepakragha...@gmail.com>
> > wrote:
> >
> > > Hi
> > >
> > > Please, can anybody help me with this?
> > >
> > > Regards and Thanks
> > > Deepak Raghav
> > >
> > >
> > >
> > > On Tue, May 19, 2020 at 1:37 PM Deepak Raghav <deepakragha...@gmail.com>
> > > wrote:
> > >
> > > > Hi Team
> > > >
> > > > We have two worker nodes in a cluster and two connectors with 10
> > > > tasks each.
> > > >
> > > > Now, suppose we have two Kafka Connect processes, W1 (port 8080) and
> > > > W2 (port 8078), already started in distributed mode, and we then
> > > > register the connectors. The 10 tasks of each connector are divided
> > > > equally between the two workers, i.e. the first task of connector A
> > > > goes to worker W1 and the second task of connector A goes to worker
> > > > W2; similarly, the first task of connector B goes to W1 and the
> > > > second task of connector B goes to W2.
> > > > e.g.
> > > >
> > > > *#First Connector:*
> > > > {
> > > >   "name": "REGION_CODE_UPPER-Cdb_Dchchargeableevent",
> > > >   "connector": { "state": "RUNNING", "worker_id": "10.0.0.4:*8080*" },
> > > >   "tasks": [
> > > >     { "id": 0, "state": "RUNNING", "worker_id": "10.0.0.4:*8078*" },
> > > >     { "id": 1, "state": "RUNNING", "worker_id": "10.0.0.4:8080" },
> > > >     { "id": 2, "state": "RUNNING", "worker_id": "10.0.0.4:8078" },
> > > >     { "id": 3, "state": "RUNNING", "worker_id": "10.0.0.4:8080" },
> > > >     { "id": 4, "state": "RUNNING", "worker_id": "10.0.0.4:8078" },
> > > >     { "id": 5, "state": "RUNNING", "worker_id": "10.0.0.4:8080" },
> > > >     { "id": 6, "state": "RUNNING", "worker_id": "10.0.0.4:8078" },
> > > >     { "id": 7, "state": "RUNNING", "worker_id": "10.0.0.4:8080" },
> > > >     { "id": 8, "state": "RUNNING", "worker_id": "10.0.0.4:8078" },
> > > >     { "id": 9, "state": "RUNNING", "worker_id": "10.0.0.4:8080" }
> > > >   ],
> > > >   "type": "sink"
> > > > }
> > > >
> > > > *#Second Connector:*
> > > > {
> > > >   "name": "REGION_CODE_UPPER-Cdb_Neatransaction",
> > > >   "connector": { "state": "RUNNING", "worker_id": "10.0.0.4:8078" },
> > > >   "tasks": [
> > > >     { "id": 0, "state": "RUNNING", "worker_id": "10.0.0.4:8078" },
> > > >     { "id": 1, "state": "RUNNING", "worker_id": "10.0.0.4:8080" },
> > > >     { "id": 2, "state": "RUNNING", "worker_id": "10.0.0.4:8078" },
> > > >     { "id": 3, "state": "RUNNING", "worker_id": "10.0.0.4:8080" },
> > > >     { "id": 4, "state": "RUNNING", "worker_id": "10.0.0.4:8078" },
> > > >     { "id": 5, "state": "RUNNING", "worker_id": "10.0.0.4:8080" },
> > > >     { "id": 6, "state": "RUNNING", "worker_id": "10.0.0.4:8078" },
> > > >     { "id": 7, "state": "RUNNING", "worker_id": "10.0.0.4:8080" },
> > > >     { "id": 8, "state": "RUNNING", "worker_id": "10.0.0.4:8078" },
> > > >     { "id": 9, "state": "RUNNING", "worker_id": "10.0.0.4:8080" }
> > > >   ],
> > > >   "type": "sink"
> > > > }
> > > >
> > > > But I have seen strange behaviour: when I just shut down worker W2 and
> > > > start it again, the tasks are divided differently, i.e. all the tasks
> > > > of connector A end up on node W1 and all the tasks of connector B end
> > > > up on node W2.
> > > >
> > > > Can you please have a look at this?
> > > > *#First Connector*
> > > > {
> > > >   "name": "REGION_CODE_UPPER-Cdb_Dchchargeableevent",
> > > >   "connector": { "state": "RUNNING", "worker_id": "10.0.0.4:8080" },
> > > >   "tasks": [
> > > >     { "id": 0, "state": "RUNNING", "worker_id": "10.0.0.4:8080" },
> > > >     { "id": 1, "state": "RUNNING", "worker_id": "10.0.0.4:8080" },
> > > >     { "id": 2, "state": "RUNNING", "worker_id": "10.0.0.4:8080" },
> > > >     { "id": 3, "state": "RUNNING", "worker_id": "10.0.0.4:8080" },
> > > >     { "id": 4, "state": "RUNNING", "worker_id": "10.0.0.4:8080" },
> > > >     { "id": 5, "state": "RUNNING", "worker_id": "10.0.0.4:8080" },
> > > >     { "id": 6, "state": "RUNNING", "worker_id": "10.0.0.4:8080" },
> > > >     { "id": 7, "state": "RUNNING", "worker_id": "10.0.0.4:8080" },
> > > >     { "id": 8, "state": "RUNNING", "worker_id": "10.0.0.4:8080" },
> > > >     { "id": 9, "state": "RUNNING", "worker_id": "10.0.0.4:8080" }
> > > >   ],
> > > >   "type": "sink"
> > > > }
> > > >
> > > > *#Second Connector*:
> > > > {
> > > >   "name": "REGION_CODE_UPPER-Cdb_Neatransaction",
> > > >   "connector": { "state": "RUNNING", "worker_id": "10.0.0.4:8078" },
> > > >   "tasks": [
> > > >     { "id": 0, "state": "RUNNING", "worker_id": "10.0.0.4:8078" },
> > > >     { "id": 1, "state": "RUNNING", "worker_id": "10.0.0.4:8078" },
> > > >     { "id": 2, "state": "RUNNING", "worker_id": "10.0.0.4:8078" },
> > > >     { "id": 3, "state": "RUNNING", "worker_id": "10.0.0.4:8078" },
> > > >     { "id": 4, "state": "RUNNING", "worker_id": "10.0.0.4:8078" },
> > > >     { "id": 5, "state": "RUNNING", "worker_id": "10.0.0.4:8078" },
> > > >     { "id": 6, "state": "RUNNING", "worker_id": "10.0.0.4:8078" },
> > > >     { "id": 7, "state": "RUNNING", "worker_id": "10.0.0.4:8078" },
> > > >     { "id": 8, "state": "RUNNING", "worker_id": "10.0.0.4:8078" },
> > > >     { "id": 9, "state": "RUNNING", "worker_id": "10.0.0.4:8078" }
> > > >   ],
> > > >   "type": "sink"
> > > > }
> > > >
> > > >
> > > > Regards and Thanks
> > > > Deepak Raghav
> > > >
> > >
> >
>
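P.S. On the separate-clusters suggestion above: a minimal sketch of two
independent Kafka Connect distributed clusters sharing the same Kafka
brokers. The broker address, group.id values, and internal topic names here
are just placeholders; the key point is that each cluster gets its own
group.id and its own config/offset/status storage topics:

    # workers of cluster A (e.g. connect-distributed-a.properties)
    bootstrap.servers=broker:9092
    group.id=connect-cluster-a
    config.storage.topic=connect-a-configs
    offset.storage.topic=connect-a-offsets
    status.storage.topic=connect-a-status

    # workers of cluster B (e.g. connect-distributed-b.properties)
    bootstrap.servers=broker:9092
    group.id=connect-cluster-b
    config.storage.topic=connect-b-configs
    offset.storage.topic=connect-b-offsets
    status.storage.topic=connect-b-status

Connectors registered against cluster A's REST endpoint will only ever run
on cluster A's workers, and likewise for cluster B, which gives the
guaranteed execution locality.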