Thanks Chris, It makes sense to call out the cleaning up of existing offsets in the KIP. I have made the change and will initiate the voting in a day.
Regards, Ashwin On Tue, Dec 12, 2023 at 6:57 PM Chris Egerton <chr...@aiven.io.invalid> wrote: > Hi Ashwin, > > LGTM! One small adjustment I'd suggest but we don't have to block on--it > may be clearer to put "Wipe all existing offsets for the connector" in > between steps 2 and 3 of the proposed changes section. > > Cheers, > > Chris > > On Mon, Dec 11, 2023 at 11:24 PM Ashwin <apan...@confluent.io.invalid> > wrote: > > > Thanks for pointing this out Chris. > > > > I have updated the KIP with the correct sequence of steps. > > > > Thanks, > > Ashwin > > > > On Wed, Dec 6, 2023 at 11:48 PM Chris Egerton <chr...@aiven.io.invalid> > > wrote: > > > > > Hi Ashwin, > > > > > > Regarding point 4--I think this is still a bit unwise. When workers > pick > > up > > > a new connector config from the config topic, they participate in a > > > rebalance. It may be safe to write offsets during that rebalance, but > in > > > the name of simplicity, do you think we can write the offsets for the > > > connector before its config? The sequence of steps would be something > > like > > > this: > > > > > > 1. Validate offsets and connector config (can be done in any order) > > > 2. Write offsets > > > 3. Write connector config (with whatever initial state is specified in > > the > > > request, or the default if none is specified) > > > > > > Cheers, > > > > > > Chris > > > > > > On Wed, Dec 6, 2023 at 9:13 AM Ashwin <apan...@confluent.io.invalid> > > > wrote: > > > > > > > Hello Chris, > > > > > > > > Thanks for the quick and detailed review. Please see my responses > below > > > > > > > > High-level thoughts: > > > > > > > > 1. I had not thought of this till now, thanks for pointing it out. I > > > > would lean towards the second option of cleaning previous offsets as > > > > it will result in the fewer surprises for the user. > > > > > > > > 2 and 3. I agree and have updated the wiki to that effect. I just > > > > wanted to use the connector name as a mutex - we can handle the race > > > > in other ways. > > > > > > > > 4. Yes, I meant submitting config to the config topic. Have updated > > > > the wiki to make it clearer. > > > > > > > > > > > > Nits: > > > > > > > > Thanks for pointing these - I have made the necessary changes. > > > > > > > > > > > > Thanks again, > > > > > > > > Ashwin > > > > > > > > > > > > On Mon, Dec 4, 2023 at 9:05 PM Chris Egerton <chr...@aiven.io.invalid > > > > > > wrote: > > > > > > > > > Hi Ashwin, > > > > > > > > > > Thanks for the KIP! This would be a nice simplification to the > > process > > > > for > > > > > migrating connectors enabled by KIP-980, and would also add global > > > > support > > > > > for a feature I've seen implemented by hand in at least a couple > > > > connectors > > > > > over the years. > > > > > > > > > > > > > > > High-level thoughts: > > > > > > > > > > 1. If there are leftover offsets from a previous connector, what > will > > > the > > > > > impact on those offsets (if any) be if a new connector is created > > with > > > > the > > > > > same name with initial offsets specified? I can think of at least > two > > > > > options: we leave those offsets as-are but allow any of the initial > > > > offsets > > > > > in the new connector request to overwrite them, or we automatically > > > wipe > > > > > all existing offsets for the connector first before writing its > > initial > > > > > offsets and then creating it. I have a slight preference for the > > first > > > > > because it's simpler to implement and aligns with existing > precedent > > > for > > > > > offset handling where we never wipe them unless explicitly > requested > > by > > > > the > > > > > user or connector, but it could be argued that the second is less > > > likely > > > > to > > > > > generate footguns for users. Interested in your thoughts! > > > > > > > > > > 2. IMO preflight validation (i.e., the "Validate initial_offset > > before > > > > > creating the connector in STOPPED state rejected alternative) is a > > > > > must-have for this kind of feature. I acknowledge the possible race > > > > > condition (and I don't think it's worth it to track in-flight > > connector > > > > > creation requests in the herder in order to prevent this race, > since > > > > > ever-blocking Connector operations would be cumbersome to deal > with), > > > and > > > > > the extra implementation effort. But I don't think either of those > > tip > > > > the > > > > > scales far enough to override the benefit of ensuring the submitted > > > > offsets > > > > > are valid before creating the connector. > > > > > > > > > > 3. On the topic of preflight validation--I also think it's > important > > > that > > > > > we validate both the connector config and the initial offsets > before > > > > either > > > > > creating the connector or storing the initial offsets. I don't > think > > > this > > > > > point requires any changes to the KIP that aren't already proposed > > with > > > > > point 2. above, but wanted to see if we could adopt this as a > primary > > > > goal > > > > > for the design of the feature and keep it in mind with future > > changes. > > > > > > > > > > 4. In the proposed changes section, the first step is "Create a > > > connector > > > > > in STOPPED state", and the second step is "Validate the offset...". > > > What > > > > > exactly is entailed by "Create a connector"? Submit the config to > the > > > > > config topic (presumably after a preflight validation of the > config)? > > > > > Participate in the ensuing rebalance? I'm a little hesitant to > start > > > > > performing rebalances inside REST requests, wondering if we can > find > > a > > > > > lighter-weight way to implement this. > > > > > > > > > > > > > > > Nits: > > > > > > > > > > 5. You can remove the italicized "This page is meant as a template > > for > > > > > writing a KIP" section after the table of contents. > > > > > > > > > > 6. Can you file a JIRA ticket for this change and add a link to it > in > > > the > > > > > KIP under the status section? > > > > > > > > > > 7. What do you think about changing the name for the new field from > > > > > "initial_offset" (singular) to "initial_offsets" (plural)? This is > > > > > especially relevant for connectors that read from multiple source > > > > > partitions, like MM2 and the various JDBC source connectors out > > there. > > > > > > > > > > 8. IMO it's not necessary to include this section: "Please note > that > > > sink > > > > > and source connectors have different schemas for offset." While > it's > > > > > technically true that the fields of the objects inside the > > "partition" > > > > and > > > > > "offset" fields will likely differ between sink and source > > connectors, > > > > > they'll also likely differ in the exact same way across different > > sink > > > > > connectors. I think it's enough to link to the relevant section in > > > > KIP-875 > > > > > on the details for the format. > > > > > > > > > > 9. Instead of "Connector-defined source partition" and > > > "Connector-defined > > > > > source offset" in the comments for the sample connector creation > body > > > > > (which aren't strictly accurate for sink connectors), can we say > > > > something > > > > > like "Source partition" and "Desired initial offset"? > > > > > > > > > > 10. In the compatibility section the KIP states that "This new > > feature > > > > will > > > > > use the current OffsetStorageWriter." IMO we should refrain from > > > > referring > > > > > to internal API class names in KIPs when possible, since those > class > > > > names > > > > > may change and users may also mistakenly assume that they're part > of > > > the > > > > > public API. Can we say something like "This new feature will reuse > > > > existing > > > > > logic for storing offsets" instead? > > > > > > > > > > > > > > > Thanks again for the KIP! > > > > > > > > > > Cheers, > > > > > > > > > > Chris > > > > > > > > > > On Thu, Nov 30, 2023 at 2:26 AM Ashwin > <apan...@confluent.io.invalid > > > > > > > > wrote: > > > > > > > > > > > Hi all, > > > > > > > > > > > > I'd like to begin discussion on KIP-995 which proposes to allow > > users > > > > > > to specify initial offset as part of the request to create a > > > connector > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-995%3A+Allow+users+to+specify+initial+offsets+while+creating+connectors > > > > > > > > > > > > During the discussion for KIP-980 > > > > > > < > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-980%3A+Allow+creating+connectors+in+a+stopped+state > > > > > > >, > > > > > > which proposed the creation of connectors in STOPPED state, there > > was > > > > > > a suggestion to also allow setting the initial offset for a > > connector > > > > > > in the connector creation API. The proposal was deemed valid > > > > > > < > https://lists.apache.org/thread/tq79l3975hy69wsycvwoh9n8gbp5j26f> > > > > > > (point no.4) but was deferred to a future KIP. This KIP proposes > to > > > > > > implement that change and builds upon the changes introduced in > > > > > > KIP-875 < > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-875%3A+First-class+offsets+support+in+Kafka+Connect#KIP875:FirstclassoffsetssupportinKafkaConnect-Alteringoffsets(request) > > > > > > > > > > > > > and KIP-980 < > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-980%3A+Allow+creating+connectors+in+a+stopped+state > > > > > > >. > > > > > > > > > > > > Thanks, > > > > > > Ashwin > > > > > > > > > > > > > > > > > > > > >