Hi Greg,

Thanks for your thoughts.
RE your design questions:

1. The responses in the REST API may grow fairly large for sink connectors that consume from a large number of Kafka topic partitions, and source connectors that store a wide range of source partitions. If there is a large amount of data for a single source partition for a source connector, we will only show the latest committed offset (that the worker has read from the offsets topic) for that source partition. We could consider adding some kind of pagination API to put a default upper bound on the size of responses and allow (or really, require) users to issue multiple requests in order to get a complete view of the offsets of connectors with a large number of partitions. My initial instinct is to hold off on introducing this complexity, though. I can imagine some of the responses getting a little ugly to view raw in a terminal or a browser, but I'm not sure we'll see more significant issues than that. If we receive feedback that this is a serious-enough issue, then we can revisit it in a follow-up KIP for offsets V2. In the meantime, it's always possible to implement this behavior via a REST extension if it's a blocker for some users. Perhaps to make that alternative slightly easier to implement, we can add a contract to the API that responses will always be sorted lexicographically by source partition for source connectors, or by Kafka topic (sub-sorting by partition) for sink connectors (a rough illustration of what such a sorted response might look like is sketched at the end of this message). What are your thoughts? If this strategy makes sense I can add it to the future work section.

2. The new STOPPED state should "just work" with the rebalancing algorithm, since it'll be implemented under the hood by publishing an empty set of task configs to the config topic for the connector. That should be enough to trigger a rebalance and to cause no tasks for the connector to be allocated across the cluster during that rebalance, regardless of the protocol (eager or incremental) that's in use.

RE your implementation questions:

1. It's mostly a matter of convenience; we can issue a single admin request to delete the group, rather than having to identify every topic partition the consumer group has committed offsets for and then issue a follow-up request to delete the offsets for that group (a minimal sketch of the two approaches is included at the end of this message). I made note of this detail in the KIP to make sure that we were comfortable with completely removing the consumer group instead of wiping its offsets, since it seems possible that some users may find that behavior unexpected.

2. The idea is to only perform zombie fencing when we know that it is necessary (a principle borrowed from KIP-618), so in this case, we'll only do it in response to an offsets reset request, and not when a connector is stopped. After being stopped, it's possible that the connector gets deleted, in which case a proactive round of fencing would have served no benefit. It's also worth noting that publishing an empty set of task configs is not the same as tombstoning existing task configs; putting a connector into the STOPPED state should require no tombstones to be emitted to the config topic.

Cheers,

Chris

On Thu, Oct 13, 2022 at 6:26 PM Greg Harris <greg.har...@aiven.io.invalid> wrote:

> Hey Chris,
>
> Thanks for the KIP!
>
> I think this is an important feature for both development and operations use-cases, and it's an obvious gap in the REST feature set.
> I also appreciate the incremental nature of the KIP and the future extensions that will now be possible.
>
> I had a couple of questions about the design and its extensibility:
>
> 1. How do you imagine the API will behave with connectors that have extremely large numbers of partitions (thousands or more) and/or source connectors with large amounts of data per partition?
>
> 2. Does the new STOPPED state need any special integration with the rebalance subsystem, or can the rebalance algorithms remain ignorant of the target state of connectors?
>
> And about the implementation:
>
> 1. For my own edification, what is the difference between deleting a consumer group and deleting all known offsets for that group? Does deleting the group offer better/easier atomicity?
>
> 2. For EOS sources, will stopping the connector and tombstoning the task configs perform a fence-out, or will that fence-out only occur when performing the offsets DELETE operation?
>
> Thanks!
> Greg
>
> On 2022/10/13 20:52:26 Chris Egerton wrote:
> > Hi all,
> >
> > I noticed a fairly large gap in the first version of this KIP that I published last Friday, which has to do with accommodating connectors that target different Kafka clusters than the one that the Kafka Connect cluster uses for its internal topics, and source connectors with dedicated offsets topics. I've since updated the KIP to address this gap, which has substantially altered the design. Wanted to give a heads-up to anyone that's already started reviewing.
> >
> > Cheers,
> >
> > Chris
> >
> > On Fri, Oct 7, 2022 at 1:29 PM Chris Egerton <ch...@aiven.io> wrote:
> >
> > > Hi all,
> > >
> > > I'd like to begin discussion on a KIP to add offsets support to the Kafka Connect REST API:
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-875%3A+First-class+offsets+support+in+Kafka+Connect
> > >
> > > Cheers,
> > >
> > > Chris
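To make the sorting contract floated above a bit more concrete, here is a purely illustrative sketch of what a sorted offsets response for a sink connector might look like. The field names and values here are assumptions for the sake of the example and not the format defined by the KIP; the only point is the ordering, lexicographic by Kafka topic with a sub-sort by partition:

{
  "offsets": [
    { "partition": { "kafka_topic": "orders",   "kafka_partition": 0 }, "offset": { "kafka_offset": 512 } },
    { "partition": { "kafka_topic": "orders",   "kafka_partition": 1 }, "offset": { "kafka_offset": 497 } },
    { "partition": { "kafka_topic": "payments", "kafka_partition": 0 }, "offset": { "kafka_offset": 1024 } }
  ]
}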
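And on the question of deleting a consumer group versus deleting all of its known offsets, a minimal sketch using the standard Java Admin client. The class and helper names are made up for illustration; only the Admin calls themselves (deleteConsumerGroups, listConsumerGroupOffsets, deleteConsumerGroupOffsets) are the existing client API. It just shows why group deletion is the more convenient route: it is a single request, whereas wiping offsets requires a list-then-delete round trip.

import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ExecutionException;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class SinkOffsetResetSketch {

    // Option 1: delete the group outright. A single request removes the
    // group and, with it, every offset it has committed.
    static void deleteGroup(Admin admin, String groupId)
            throws ExecutionException, InterruptedException {
        admin.deleteConsumerGroups(Collections.singleton(groupId)).all().get();
    }

    // Option 2: keep the group but wipe its offsets. This requires first
    // discovering every topic partition the group has committed offsets for,
    // then issuing a follow-up request to delete those offsets.
    static void deleteOffsetsOnly(Admin admin, String groupId)
            throws ExecutionException, InterruptedException {
        Map<TopicPartition, OffsetAndMetadata> committed = admin
                .listConsumerGroupOffsets(groupId)
                .partitionsToOffsetAndMetadata()
                .get();
        admin.deleteConsumerGroupOffsets(groupId, committed.keySet()).all().get();
    }
}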