Thanks Greg!

I think sorting results (at least w/r/t pagination) is only valuable if
it's part of a formal contract; otherwise, a REST extension wouldn't really
be able to take advantage of that behavior without becoming brittle across
non-major version changes. I'd err on the side of leaving it out altogether
for now; the extension can always implement sorting on its own, and it
seems unlikely that the size of the response in memory, before it's
intercepted by the extension, would be large enough to cause issues.
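To illustrate why client-side sorting is cheap, here's a quick sketch of a
client (or REST extension) imposing its own ordering on the offsets
endpoint's response. The JSON schema and field names below are illustrative
guesses based on the KIP draft, not a final format:

```python
import json

# Hypothetical response body from `GET /connectors/{name}/offsets`
# for a sink connector (schema is an assumption, for illustration only).
response = json.loads("""
{
  "offsets": [
    {"partition": {"kafka_topic": "orders", "kafka_partition": 2}, "offset": {"kafka_offset": 50}},
    {"partition": {"kafka_topic": "orders", "kafka_partition": 0}, "offset": {"kafka_offset": 10}},
    {"partition": {"kafka_topic": "audit", "kafka_partition": 1}, "offset": {"kafka_offset": 7}}
  ]
}
""")

# Sort lexicographically by topic, then numerically by partition --
# the ordering discussed above, applied downstream of the REST API.
response["offsets"].sort(
    key=lambda o: (o["partition"]["kafka_topic"], o["partition"]["kafka_partition"])
)

for entry in response["offsets"]:
    print(entry["partition"]["kafka_topic"], entry["partition"]["kafka_partition"])
```

A `jq 'sort_by(...)'` one-liner would do the same job when poking at the API
from a terminal.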

Appreciate the rest of your thoughts as well, and glad to hear that the
STOPPED state seems reasonable.

Cheers,

Chris

On Fri, Oct 14, 2022 at 1:48 PM Greg Harris <greg.har...@aiven.io.invalid>
wrote:

> Hey Chris,
>
> Thanks for your quick reply!
>
> Design 1:
>
> I think this is a completely reasonable way to implement a V1, without any
> more complicated pagination.
> I had not considered that a REST extension would be able to add pagination,
> but upon reflection that makes a lot of sense.
>
> I like the best-effort lexicographic sorting idea, if that's not too
> expensive then we should do it.
> We also don't necessarily need to have it as a formal contract in the API,
> it could just be an implementation detail that can be tweaked later.
> Maybe someone wants a ?sort= parameter later, and specifying here that the
> base endpoint can return an arbitrary order would allow that latitude in
> implementation later.
> Anyone who relies on a particular order can always sort it themselves, I
> see the lexicographic sorting as more of a pre-emptive QOL improvement that
> may avoid a `jq` step when manually using the API.
>
> Design 2:
>
> This is very good news, and it's one less interoperability concern.
> Excellent design choice!
>
> Implementation 1:
>
> Thanks for clarifying; I do wonder if anyone else cares about the
> difference. Deleting the group would also serve as a "full cleanup" of the
> connector state, whereas deleting the offsets still leaves the group on the
> Kafka broker.
>
> Implementation 2:
>
> This makes a lot of sense, and seems like a good principle to keep moving
> forward.
> Also thanks for clarifying the difference between "empty set of task
> configs" and "tombstoning existing task configs", I think I meant the
> former in my initial question.
>
> Overall:
>
> I don't think I have any other concerns with the design.
> I did want to say that I really like the new "STOPPED" state, I think
> that's almost as important as the title feature in this KIP, and could even
> be in a standalone KIP as a "connector maintenance mode."
> It definitely makes sense to add in this KIP, as you need both the
> connector config, and the connector to be offline, which previously wasn't
> possible inside of Connect.
> I also like the unified API for both source and sink, you're fixing the
> feature disparity between them and unifying the UX, which is valuable on
> its own.
>
> Thanks so much for the KIP Chris!
>
> Greg
>
>
> On Fri, Oct 14, 2022 at 7:49 AM Chris Egerton <chr...@aiven.io.invalid>
> wrote:
>
> > Hi Greg,
> >
> > Thanks for your thoughts.
> >
> > RE your design questions:
> >
> > 1. The responses in the REST API may grow fairly large for sink connectors
> > that consume from a large number of Kafka topic partitions, and source
> > connectors that store a wide range of source partitions. If there is a
> > large amount of data for a single source partition for a source connector,
> > we will only show the latest committed offset (that the worker has read
> > from the offsets topic) for that source partition. We could consider adding
> > some kind of pagination API to put a default upper bound on the size of
> > responses and allow (or really, require) users to issue multiple requests
> > in order to get a complete view of the offsets of connectors with a large
> > number of partitions. My initial instinct is to hold off on introducing
> > this complexity, though. I can imagine some of the responses getting a
> > little ugly to view raw in a terminal or a browser, but I'm not sure we'll
> > see more significant issues than that. If we receive feedback that this is
> > a serious-enough issue, then we can revisit it in a follow-up KIP for
> > offsets V2. In the meantime, it's always possible to implement this
> > behavior via a REST extension if it's a blocker for some users. Perhaps to
> > make that alternative slightly easier to implement, we can add a contract
> > to the API that responses will always be sorted lexicographically by source
> > partition for source connectors or by Kafka topic (subsorting by partition)
> > for sink connectors. What are your thoughts? If this strategy makes sense I
> > can add it to the future work section.
> >
> > 2. The new STOPPED state should "just work" with the rebalancing algorithm,
> > since it'll be implemented under the hood by publishing an empty set of
> > task configs to the config topic for the connector. That should be enough
> > to trigger a rebalance and to cause no tasks for the connector to be
> > allocated across the cluster during that rebalance, regardless of the
> > protocol (eager or incremental) that's in use.
> >
> > RE your implementation questions:
> >
> > 1. It's mostly a matter of convenience; we can issue a single admin request
> > to delete the group rather than having to identify every topic partition
> > the consumer group has committed offsets for and then issue a follow-up
> > request to delete the offsets for that group. I made note of this detail in
> > the KIP to make sure that we were comfortable with completely removing the
> > consumer group instead of wiping its offsets, since it seems possible that
> > some users may find that behavior unexpected.
> >
> > 2. The idea is to only perform zombie fencing when we know that it is
> > necessary (which is a principle borrowed from KIP-618), so in this case,
> > we'll only do it in response to an offsets reset request, and not when a
> > connector is stopped. After being stopped, it's possible that the connector
> > gets deleted, in which case, a proactive round of fencing would have served
> > no benefit. It's also worth noting that publishing an empty set of task
> > configs is not the same as tombstoning existing task configs; putting a
> > connector into the STOPPED state should require no tombstones to be emitted
> > to the config topic.
> >
> > Cheers,
> >
> > Chris
> >
> > On Thu, Oct 13, 2022 at 6:26 PM Greg Harris <greg.har...@aiven.io.invalid>
> > wrote:
> >
> > > Hey Chris,
> > >
> > > Thanks for the KIP!
> > >
> > > I think this is an important feature for both development and operations
> > > use-cases, and it's an obvious gap in the REST feature set.
> > > I also appreciate the incremental nature of the KIP and the future
> > > extensions that will now be possible.
> > >
> > > I had a couple of questions about the design and its extensibility:
> > >
> > > 1. How do you imagine the API will behave with connectors that have
> > > extremely large numbers of partitions (thousands or more) and/or source
> > > connectors with large amounts of data per partition?
> > >
> > > 2. Does the new STOPPED state need any special integration with the
> > > rebalance subsystem, or can the rebalance algorithms remain ignorant of the
> > > target state of connectors?
> > >
> > > And about the implementation:
> > >
> > > 1. For my own edification, what is the difference between deleting a
> > > consumer group and deleting all known offsets for that group? Does deleting
> > > the group offer better/easier atomicity?
> > >
> > > 2. For EOS sources, will stopping the connector and tombstoning the task
> > > configs perform a fence-out, or will that fence-out only occur when
> > > performing the offsets DELETE operation?
> > >
> > > Thanks!
> > > Greg
> > >
> > > On 2022/10/13 20:52:26 Chris Egerton wrote:
> > > > Hi all,
> > > >
> > > > I noticed a fairly large gap in the first version of this KIP that I
> > > > published last Friday, which has to do with accommodating connectors
> > > > that target different Kafka clusters than the one that the Kafka Connect
> > > > cluster uses for its internal topics and source connectors with dedicated
> > > > offsets topics. I've since updated the KIP to address this gap, which has
> > > > substantially altered the design. Wanted to give a heads-up to anyone
> > > > that's already started reviewing.
> > > >
> > > > Cheers,
> > > >
> > > > Chris
> > > >
> > > > On Fri, Oct 7, 2022 at 1:29 PM Chris Egerton <ch...@aiven.io> wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I'd like to begin discussion on a KIP to add offsets support to the Kafka
> > > > > Connect REST API:
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-875%3A+First-class+offsets+support+in+Kafka+Connect
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Chris
> > > > >
> > > >
> > >
> >
>
