Oh, one more thing--can we add the KIP to the list of KIPs? https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals#KafkaImprovementProposals-KIPsunderdiscussion
On Mon, Dec 4, 2023 at 10:33 AM Chris Egerton <chr...@aiven.io> wrote: > Hi Ashwin, > > Thanks for the KIP! This would be a nice simplification to the process for > migrating connectors enabled by KIP-980, and would also add global support > for a feature I've seen implemented by hand in at least a couple connectors > over the years. > > > High-level thoughts: > > 1. If there are leftover offsets from a previous connector, what will the > impact on those offsets (if any) be if a new connector is created with the > same name with initial offsets specified? I can think of at least two > options: we leave those offsets as-are but allow any of the initial offsets > in the new connector request to overwrite them, or we automatically wipe > all existing offsets for the connector first before writing its initial > offsets and then creating it. I have a slight preference for the first > because it's simpler to implement and aligns with existing precedent for > offset handling where we never wipe them unless explicitly requested by the > user or connector, but it could be argued that the second is less likely to > generate footguns for users. Interested in your thoughts! > > 2. IMO preflight validation (i.e., the "Validate initial_offset before > creating the connector in STOPPED state rejected alternative) is a > must-have for this kind of feature. I acknowledge the possible race > condition (and I don't think it's worth it to track in-flight connector > creation requests in the herder in order to prevent this race, since > ever-blocking Connector operations would be cumbersome to deal with), and > the extra implementation effort. But I don't think either of those tip the > scales far enough to override the benefit of ensuring the submitted offsets > are valid before creating the connector. > > 3. On the topic of preflight validation--I also think it's important that > we validate both the connector config and the initial offsets before either > creating the connector or storing the initial offsets. I don't think this > point requires any changes to the KIP that aren't already proposed with > point 2. above, but wanted to see if we could adopt this as a primary goal > for the design of the feature and keep it in mind with future changes. > > 4. In the proposed changes section, the first step is "Create a connector > in STOPPED state", and the second step is "Validate the offset...". What > exactly is entailed by "Create a connector"? Submit the config to the > config topic (presumably after a preflight validation of the config)? > Participate in the ensuing rebalance? I'm a little hesitant to start > performing rebalances inside REST requests, wondering if we can find a > lighter-weight way to implement this. > > > Nits: > > 5. You can remove the italicized "This page is meant as a template for > writing a KIP" section after the table of contents. > > 6. Can you file a JIRA ticket for this change and add a link to it in the > KIP under the status section? > > 7. What do you think about changing the name for the new field from > "initial_offset" (singular) to "initial_offsets" (plural)? This is > especially relevant for connectors that read from multiple source > partitions, like MM2 and the various JDBC source connectors out there. > > 8. IMO it's not necessary to include this section: "Please note that sink > and source connectors have different schemas for offset." While it's > technically true that the fields of the objects inside the "partition" and > "offset" fields will likely differ between sink and source connectors, > they'll also likely differ in the exact same way across different sink > connectors. I think it's enough to link to the relevant section in KIP-875 > on the details for the format. > > 9. Instead of "Connector-defined source partition" and "Connector-defined > source offset" in the comments for the sample connector creation body > (which aren't strictly accurate for sink connectors), can we say something > like "Source partition" and "Desired initial offset"? > > 10. In the compatibility section the KIP states that "This new feature > will use the current OffsetStorageWriter." IMO we should refrain from > referring to internal API class names in KIPs when possible, since those > class names may change and users may also mistakenly assume that they're > part of the public API. Can we say something like "This new feature will > reuse existing logic for storing offsets" instead? > > > Thanks again for the KIP! > > Cheers, > > Chris > > On Thu, Nov 30, 2023 at 2:26 AM Ashwin <apan...@confluent.io.invalid> > wrote: > >> Hi all, >> >> I'd like to begin discussion on KIP-995 which proposes to allow users >> to specify initial offset as part of the request to create a connector >> >> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-995%3A+Allow+users+to+specify+initial+offsets+while+creating+connectors >> >> During the discussion for KIP-980 >> < >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-980%3A+Allow+creating+connectors+in+a+stopped+state >> >, >> which proposed the creation of connectors in STOPPED state, there was >> a suggestion to also allow setting the initial offset for a connector >> in the connector creation API. The proposal was deemed valid >> <https://lists.apache.org/thread/tq79l3975hy69wsycvwoh9n8gbp5j26f> >> (point no.4) but was deferred to a future KIP. This KIP proposes to >> implement that change and builds upon the changes introduced in >> KIP-875 < >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-875%3A+First-class+offsets+support+in+Kafka+Connect#KIP875:FirstclassoffsetssupportinKafkaConnect-Alteringoffsets(request) >> > >> and KIP-980 < >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-980%3A+Allow+creating+connectors+in+a+stopped+state >> >. >> >> Thanks, >> Ashwin >> >