Thanks for the great KIP Konstantine!

+1 (non-binding)

Robert

On Thu, Mar 7, 2019 at 2:56 PM Guozhang Wang <wangg...@gmail.com> wrote:

> Thanks Konstantine, I've read the updated section on
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-415%3A+Incremental+Cooperative+Rebalancing+in+Kafka+Connect
> and it lgtm.
>
> I'm +1 on the KIP.
>
>
> Guozhang
>
>
> On Thu, Mar 7, 2019 at 2:35 PM Konstantine Karantasis <
> konstant...@confluent.io> wrote:
>
> > Thanks Guozhang. This is a valid observation regarding the current status
> > of the PR.
> >
> > I updated the KIP to explicitly call out how the downgrade process should
> > work in the section Compatibility, Deprecation, and Migration.
> >
> > Additionally, I reduced the configuration modes for the connect.protocol
> to
> > only two: eager and compatible.
> > That's because there's no way at the moment to select a protocol based on
> > simple majority and not unanimity across at least one option for the
> > sub-protocol.
> > Therefore there's no way to lock a group of workers in a cooperative-only
> > mode at the moment, if we account for accidental joins of workers running
> > at an older version.
> >
> > The changes have been reflected in the KIP doc and will be reflected in
> the
> > PR in a subsequent commit.
> >
> > Thanks,
> > Konstantine
> >
> >
> > On Thu, Mar 7, 2019 at 1:17 PM Guozhang Wang <wangg...@gmail.com> wrote:
> >
> > > Hi Konstantine,
> > >
> > > Thanks for the updated KIP and the PR as well (which is huge :) I
> briefly
> > > looked through it as well as the KIP, and I have one minor comment to
> add
> > > (otherwise I'm binding +1 on it as well) about the backward
> > compatibility.
> > > I'll use one example to illustrate the issue:
> > >
> > > 1) Suppose you have workerA and B on newer version and configured the
> > > connect.protocol as "compatible", they will send both V0/V1 to the
> leader
> > > (say it's workerA) who will choose V1 as the current protocol, this
> will
> > be
> > > sent back to A and B who would remember the current protocol version is
> > > already V1. So after this rebalance everyone remembers that V1 can be
> > used,
> > > which means that upon prepareJoin they will not revoke all the assigned
> > > tasks.
> > >
> > > 2) Now let's say a new worker joins but with old version V0
> (practically
> > > this is rare, but for illustration purposes some common scenarios may
> > falls
> > > into this, e.g. an existing worker being downgraded, which is
> essentially
> > > as being kicked out of the group, and then rejoined as a new member on
> > the
> > > older version), the leader realized that at least one of the member
> does
> > > not know V1 and hence would fall back to use version V0 to perform
> > > assignment. V0 algorithm would do eager rebalance which may move some
> > tasks
> > > to the new comer immediately from the existing members, as it assumes
> > that
> > > everyone would revoke everything before join (a.k.a the sync-barrier)
> but
> > > this is actually not true, since everyone other than the old versioned
> > new
> > > comer would still follow the behavior of V1 --- not revoking anything
> ---
> > > before sending the join group request.
> > >
> > > This could be solvable though, e.g. when leader realized that he needs
> to
> > > use V0, while the previous "currentProtocol" value is V1, instead of
> just
> > > blindly follow the algorithm of V0 it could just reassign the existing
> > > partitions without migrating anything, while at the same time tell
> > everyone
> > > that the currentProtocol version is downgraded to V0; and then they can
> > > trigger another rebalance based on V0 where everything will revoke the
> > > tasks before sending join group requests.
> > >
> > >
> > > Guozhang
> > >
> > > On Wed, Mar 6, 2019 at 2:28 PM Konstantine Karantasis <
> > > konstant...@confluent.io> wrote:
> > >
> > > > I'd like to open the vote on KIP-415: Incremental Cooperative
> > Rebalancing
> > > > in Kafka Connect
> > > >
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-415%3A+Incremental+Cooperative+Rebalancing+in+Kafka+Connect
> > > >
> > > > a proposal that will allow Kafka Connect to scale significantly the
> > > number
> > > > of connectors and tasks it can run in a cluster of Connect workers.
> > > >
> > > > Thanks,
> > > > Konstantine
> > > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>
>
> --
> -- Guozhang
>

Reply via email to