Hi Yeikel,

> To clarify, who initiates the step that assigns a
>  connector to a specific worker? If this process
> is controlled by the leader, wouldn't it result in a
> failure to assign tasks to workers with whom it
> cannot communicate?

This happens via the group rebalance process, in which each Kafka Connect
worker communicates with the Kafka broker that has been chosen as the group
coordinator for the Kafka Connect cluster. The assignment is indeed
computed by the leader Connect worker, but it is disseminated to the other
Connect workers via the group coordinator [1].
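
A toy model may make the flow concrete. This is only a sketch and not Kafka
Connect's actual code: the `Worker` class, the `can_reach_leader` flag, and
the round-robin plan are all hypothetical names and simplifications. The one
point it illustrates is that the plan is relayed by the coordinator, so
worker-to-worker reachability plays no part in who receives an assignment.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    name: str
    # Hypothetical flag modelling worker-to-worker network reachability.
    can_reach_leader: bool = True
    assigned: list = field(default_factory=list)


def compute_assignment(members, connectors):
    """Runs on the leader. Round-robin is used only to keep the sketch short."""
    plan = {m.name: [] for m in members}
    for i, connector in enumerate(connectors):
        plan[members[i % len(members)].name].append(connector)
    return plan


def rebalance(members, connectors):
    """Toy group coordinator: relays the leader's plan to every member."""
    plan = compute_assignment(members, connectors)
    for worker in members:
        # Delivery goes through the coordinator, so can_reach_leader
        # is never consulted here.
        worker.assigned = plan[worker.name]


workers = [Worker("dc1-worker"), Worker("dc2-worker", can_reach_leader=False)]
rebalance(workers, ["connector-a", "connector-b"])

# Workers that hold an assignment but cannot reach the leader.
stranded = [w.name for w in workers if w.assigned and not w.can_reach_leader]
print(stranded)
```

In this model the isolated worker still ends up with a connector, because
nothing in the relay step depends on it being able to reach the leader.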

> I should not find myself in a situation where a
> connector is assigned to a worker who cannot
> communicate with the leader

This can unfortunately happen, since assignments aren't delivered directly
from the leader to the non-leader Connect workers but via the Kafka
broker designated as the group coordinator for the Connect cluster.
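
One rough way to spot a connector in this state is to look at its status
and check whether it reports RUNNING while listing no tasks. The sketch
below assumes a status payload shaped like the JSON returned by the Connect
REST API's `GET /connectors/{name}/status` endpoint; the function name and
the sample payloads are hypothetical, and an empty task list can have other
causes (for example a connector that has only just started), so treat a
match as a hint rather than a diagnosis.

```python
def running_without_tasks(status: dict) -> bool:
    """Return True when the connector reports RUNNING but lists no tasks."""
    connector_state = status.get("connector", {}).get("state")
    return connector_state == "RUNNING" and not status.get("tasks")


# Hand-written sample payloads with illustrative values, not real output.
healthy = {
    "name": "example-connector",
    "connector": {"state": "RUNNING", "worker_id": "dc1-worker:8083"},
    "tasks": [{"id": 0, "state": "RUNNING", "worker_id": "dc1-worker:8083"}],
}
suspect = {
    "name": "example-connector",
    "connector": {"state": "RUNNING", "worker_id": "dc2-worker:8083"},
    "tasks": [],
}

for payload in (healthy, suspect):
    print(payload["connector"]["worker_id"], running_without_tasks(payload))
```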

[1] https://medium.com/streamthoughts/apache-kafka-rebalance-protocol-or-the-magic-behind-your-streams-applications-e94baf68e4f2

On Tue, Sep 26, 2023 at 8:25 AM Yeikel Santana <em...@yeikel.com> wrote:

> Thank you, Yash. Your explanation makes sense.
>
> To clarify, who initiates the step that assigns a connector to a specific
> worker? If this process is controlled by the leader, wouldn't it result in
> a failure to assign tasks to workers with whom it cannot communicate?
>
> Although it is not ideal, it is acceptable for now if some workers remain
> inactive as long as the data center where the leader resides remains active
> and continues to handle task assignments. I should not find myself in a
> situation where a connector is assigned to a worker who cannot communicate
> with the leader, as that would render it useless, as you mentioned.
>
> Thank you for taking the time.
>
> ---- On Mon, 25 Sep 2023 11:41:18 -0400 Yash Mayya <yash.ma...@gmail.com>
> wrote ---
>
> Hi Yeikel,
>
> Heartbeats and group coordination in Kafka Connect do occur through Kafka,
> but a Kafka Connect cluster in which some workers cannot communicate with
> each other won't work very well. You'll be able to create / update / delete
> connectors by making requests to any worker that can communicate with the
> leader, as you noted. However, certain internal operations require network
> access between Connect workers as well. For instance, after a connector is
> started, it needs to spawn tasks that do the actual work. The tasks are
> created via a POST request to the leader worker from the worker that is
> running the connector. When you issue a create connector request to a
> worker, a group rebalance ensues and the connector is assigned to a worker
> in the cluster (it could be any worker, not necessarily the one to
> which the request was issued). So if the connector that you created lands
> on a Connect worker that cannot communicate with the leader worker, it
> won't be able to create its tasks, which will render the connector
> essentially useless.
>
> Thanks,
> Yash
>
> On Mon, Sep 25, 2023 at 7:51 PM Yeikel Santana <em...@yeikel.com> wrote:
>
> > Thank you, Nikhil.
> >
> > I did notice the challenge you're describing with the REST updates when I
> > had more than one worker within the same data center.
> >
> > Luckily, solving that was relatively simple, as all my workers can
> > communicate within the same data center, and all I need to do is ensure
> > that the update is initiated from the same data center as the leader. From
> > what I have tested so far, this seems to work fine.
> >
> > My biggest concern was regarding other operations such as heartbeats or
> > general coordination. If that happens through Kafka, then I should be
> > fine.
> >
> > Thank you for taking the time
> >
> > ---- On Mon, 25 Sep 2023 09:45:43 -0400
> > nikhilsrivastava4...@gmail.com wrote ----
> >
> > Hi Yeikel,
> >
> > Sharing my two cents. I'll let others chime in to add to this.
> >
> > Based on my understanding, if the Connect workers (which are all part of
> > the same cluster) can communicate with the Kafka broker that happens to be
> > the Group Coordinator (and that facilitates Connect leader election via
> > the Group Membership Protocol), then only one Connect worker will be
> > elected leader among all the workers in the cluster. Outside of that, I
> > believe a number of REST calls to Connect workers are forwarded to the
> > Connect leader (if the REST request lands on a Connect worker that isn't
> > the leader). In case of a non-retriable network partition between the
> > non-leader worker and the leader worker, those REST requests will fail.
> > I'm referring to REST requests like CREATE / UPDATE / DELETE.
> >
> > Hope this helps a little.
> >
> > Thanks,
> > -Nikhil
> >
> > On Sun, 24 Sept 2023 at 06:36, Yeikel Santana <em...@yeikel.com> wrote:
> >
> > > Hello everyone,
> > >
> > > I'm currently designing a new Kafka Connect cluster, and I'm trying to
> > > understand how connectivity functions among workers.
> > >
> > > In my setup, I have a single Kafka Connect cluster connected to the same
> > > Kafka topics and Kafka cluster. However, the workers are deployed in
> > > geographically separated data centers, each of which is fully isolated
> > > at the network level.
> > >
> > > I suspect that this setup might not work with Kafka Connect, because my
> > > current understanding is that ALL workers need to communicate with the
> > > leader for task coordination and heartbeats.
> > >
> > > In terms of leader election, can this result in multiple leaders and
> > > other potential issues?
> > >
> > > Any input and suggestions would be appreciated.
> >
