Hi Chris, Thanks for taking a look at this KIP!
1. I chose to go with simply "state" as that exact term is already exposed via some of the existing REST API responses and would be one that users are already familiar with (although admittedly something like "initial_state" wouldn't be much of a jump). Since it's a field in the request body for the connector creation endpoint, wouldn't it be implied that it is the "initial" state just like the "config" field represents the "initial" configuration? Also, I don't think x.y has been established as the field naming convention in the Connect REST API right? From what I can tell, x_y is the convention being followed for fields in requests ("kafka_topic" / "kafka_partition" / "kafka_offset" in the offsets APIs for instance) and responses ("error_count", "kafka_cluster_id", "recommended_values" etc.). 2. The connector configuration record is currently used for both connector create requests as well as connector config update requests. Since we're only allowing configuring the target state for newly created connectors, I feel like it'll be a cleaner separation of concerns to use the existing records for connector configurations and connector target states rather than bundling the "state" and "state.v2" (or equivalent) fields into the connector configuration record. The additional write should be very minimal overhead and the two writes would be an atomic operation for Connect clusters that are using a transactional producer for the config topic anyway. Thoughts? 3. I was thinking that we'd support standalone mode via the same connector creation REST API endpoint changes (addition of the "state" field). If there is further interest in adding similar support to the command line method of creating connectors as well, perhaps we could do so in a small follow-on KIP? I feel like ever since the standalone mode started supporting the full Connect REST API, the command line method of creating connectors is more of a "legacy" concept. 4. Yeah, connector / offsets migration was just used as a representative example of how this feature could be useful - I didn't intend for it to be the sole purpose of this KIP. However, that said, I really like the idea of accepting an "offsets" field in the connector creation request since it'd reduce the number of user operations required from 3 (create the connector in a STOPPED state; alter the offsets; resume the connector) to 1. I'd be happy to either create or review a separate small KIP for that if it sounds acceptable to you? 5. Whoops, thanks, I hadn't even noticed that that's where I had linked to. Fixed! Thanks, Yash On Mon, Oct 2, 2023 at 11:14 PM Chris Egerton <chr...@aiven.io.invalid> wrote: > Hi Yash, > > Thanks for the KIP! Frankly this feels like an oversight in 875 and I'm > glad we're closing that loop ASAP. > > > Here are my thoughts: > > 1. (Nit): IMO "start.state", "initial.state", or "target.state" might be > better than just "state" for the field name in the connector creation > request. > > 2. Why implement this in distributed mode with two writes to the config > topic? We could augment the existing format for connector configs in the > config topic [1] with a new field instead. > > 3. Although standalone mode is mentioned in the KIP, there's no detail on > the Java properties file format that we support for standalone mode (i.e., > `connect-standalone config/connect-standalone.properties > config/connect-file-source.properties > config/connect-file-sink.properties`). Do we plan on adding support for > that mode as well? > > 4. I suspect there will be advantages for this feature beyond offsets > migration, but if offsets migration were the only motivation for it, > wouldn't it be simpler to accept an "offsets" field in connector creation > requests? That way, users wouldn't have to start a connector in a different > state and then resume it, they could just create the connector like normal. > I think the current proposal is acceptable, but wanted to float this > alternative in case we anticipate lots of connector migrations and want to > optimize for them a bit more. > > 5. (NIt): We can link to our own Javadocs [2] instead of javadoc.io > > > [1] - > > https://github.com/apache/kafka/blob/dcd8c7d05f2f22f2d815405e7ab3ad7439669239/connect/runtime/src/main/java/org/apache/kafka/connect/storage/KafkaConfigBackingStore.java#L234-L236 > > [2] - https://kafka.apache.org/35/javadoc/index.html?overview-summary.html > > > Cheers, > > Chris > > On Thu, Sep 21, 2023 at 2:37 AM Yash Mayya <yash.ma...@gmail.com> wrote: > > > Hi Ashwin, > > > > Thanks for taking a look and sharing your thoughts! > > > > 1. Yes, the request / response formats of the two APIs were intentionally > > made identical for such use-cases. [1] > > > > 2. I'm assuming you're referring to retaining the offset / config topic > > records for a connector when it is deleted by a user? Firstly, a > > connector's offsets aren't actually currently deleted when the connector > is > > deleted - it was listed as potential future work in KIP-875 [2]. > Secondly, > > the config topic is the source of truth for the state of a Connect > cluster > > and a connector deletion is done by writing a null-valued record > > (tombstone) with the connector key to the config topic. We could > > potentially modify the delete mechanism to use a special new record > > (instead of a tombstone with the connector key) in order to retain the > > latest configuration of a connector while still deleting the actual > > connector - however, this would have its downsides and I don't see too > many > > benefits. Furthermore, connector migration between different Kafka > clusters > > was just used as a representational use case for creating connectors in a > > stopped state - it isn't the primary focus of this KIP as such. > > > > 3. KIP-875 goes into some more details about this [3], but the TLDR is > that > > the STOPPED state will be treated like the PAUSED state on older workers > > that don't recognize the STOPPED state. > > > > Thanks, > > Yash > > > > [1] - > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-875%3A+First-class+offsets+support+in+Kafka+Connect#KIP875:FirstclassoffsetssupportinKafkaConnect-Request/responseformat > > > > [2] - > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-875%3A+First-class+offsets+support+in+Kafka+Connect#KIP875:FirstclassoffsetssupportinKafkaConnect-Automaticallydeleteoffsetswithconnectors > > > > [3] - > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-875%3A+First-class+offsets+support+in+Kafka+Connect#KIP875:FirstclassoffsetssupportinKafkaConnect-STOPPEDtargetstate > > > > On Wed, Sep 20, 2023 at 7:24 PM Ashwin <apan...@confluent.io.invalid> > > wrote: > > > > > Thanks Yash! This is very useful for migrating connectors from one > > cluster > > > to another. > > > > > > I had the following comments/questions > > > > > > 1. Is the offset read using `GET /offsets` api always guaranteed to be > > in a > > > format accepted by `PATCH /offsets` ? > > > 2. I had to tackle a similar migration situation but the two connect > > > clusters in question were using the same backing Kafka cluster. The > > > challenge in this case is that when I delete the original connector, I > > want > > > to retain offsets and config topics. Do you think we should support > > > deletion of a connector without removal of these topics as part of this > > KIP > > > ? > > > 3. In the case of a downgrade, how will Connect worker handle the > > optional > > > “state” field in config topic ? > > > > > > Thanks, > > > Ashwin > > > > > > > > > > > > > > > On Sun, Sep 17, 2023 at 11:09 PM Yash Mayya <yash.ma...@gmail.com> > > wrote: > > > > > > > Hi all, > > > > > > > > I'd like to begin discussion on a KIP to allow creating connectors > in a > > > > stopped state - > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-980%3A+Allow+creating+connectors+in+a+stopped+state > > > > > > > > > > > > Thanks, > > > > Yash > > > > > > > > > >