Thanks all for the feedback. +1 on the single repo and version for AWS connectors. The reduced maintenance cost and complexity is a clear winner here.
I will open a vote thread for this matter.

Thanks all!

On Tue, 25 Oct 2022, 03:04 Jark Wu, <imj...@gmail.com> wrote:

> TBH, I am skeptical of the “a single repository per connector” approach,
> considering there are hundreds of connectors out there (Airbyte[1], Kafka[2]).
> I don’t think it is feasible for the community to maintain hundreds of
> repositories.
> It makes sense to combine some connectors to reduce the maintenance burden.
> I can imagine we would have a flink-jdbc-connector repo in the future to
> support PG, MySQL, MS SQL Server, Oracle, etc., together.
>
> Best,
> Jark
>
> [1]: https://airbyte.com/connectors
> [2]: https://www.confluent.io/product/connectors/
>
> > On 25 Oct 2022, at 06:56, Thomas Weise <t...@apache.org> wrote:
> >
> > Hi Danny,
> >
> > I'm also leaning slightly towards the single AWS connector repo direction.
> >
> > Bumps in the underlying AWS SDK would bump all of the connectors in any
> > case. And if a change occurs that is isolated to a single connector, then
> > those that do not use that connector can just skip the release.
> >
> > Cheers,
> > Thomas
> >
> >
> > On Mon, Oct 24, 2022 at 3:01 PM Teoh, Hong <lian...@amazon.co.uk.invalid>
> > wrote:
> >
> >> I like the single repo with single version idea.
> >>
> >> Pros:
> >> - Better discoverability for connectors for AWS services means a better
> >> experience for Flink users
> >> - Natural placement of AWS-related utils (Credentials, SDK Retry strategy)
> >>
> >> Caveats:
> >> - As you mentioned, it is not desirable if we have to evolve the major
> >> version of the connector just for a change in a single connector (e.g.
> >> DynamoDB). However, I think it is reasonable to only evolve the major
> >> version of the AWS connector repo when there are Flink Source/Sink API
> >> upgrades or AWS SDK major upgrades (probably quite rare). Any new features
> >> for individual connectors can be collapsed into minor releases.
> >> - An additional callout here is that we should be careful adopting any AWS
> >> connectors that don't use the AWS SDK directly (e.g. how the Kinesis
> >> connector used KPL for a long time). In my opinion, any new connectors like
> >> that would be better placed in their own repositories, otherwise we will
> >> have a complex mesh of dependencies to manage.
> >>
> >> Regards,
> >> Hong
> >>
> >>
> >> On 21/10/2022, 16:59, "Danny Cranmer" <dannycran...@apache.org> wrote:
> >>
> >> Thanks Chesnay for the suggestion, I will investigate this option.
> >>
> >> Related to the single repo idea, I have considered it in the past. Are you
> >> proposing we also use a single version between all connectors? If we have a
> >> single version then it makes sense to combine them in a single repo; if
> >> they have separate versions, then splitting them makes sense. This was
> >> discussed last year more generally [1] and the consensus was "we ultimately
> >> propose to have a single repository per connector".
> >>
> >> Combining all AWS connectors into a single repo with a single version is
> >> in line with how the AWS SDK works, so AWS users are familiar with this
> >> approach. However, it is frustrating that we would have to release all
> >> connectors to fix a bug or add a feature in one of them. Example: a user is
> >> using Kinesis Data Streams only (the most popular and mature connector),
> >> and we evolve the version from 1.x to 2.y (or 1.x to 1.y) for a DynamoDB
> >> change.
> >>
> >> I am torn and will think some more, but it would be great to hear other
> >> people's opinions.
> >>
> >> [1] https://lists.apache.org/thread/bywh947r2f5hfocxq598zhyh06zhksrm
> >>
> >> Thanks,
> >> Danny
> >>
> >> On Fri, Oct 21, 2022 at 3:11 PM Jing Ge <j...@ververica.com> wrote:
> >>
> >>> I agree with Jark. It would be easier for further development and
> >>> maintenance if all AWS-related connectors and the base module are in the
> >>> same repo. It might make sense to upgrade the flink-connector-dynamodb to
> >>> flink-connector-aws and move the other modules including the
> >>> flink-connector-aws-base into it. The AWS SDK could be managed in
> >>> flink-connector-aws-base. Any future common connector features could also
> >>> be developed in the base module.
> >>>
> >>> Best regards,
> >>> Jing
> >>>
> >>> On Fri, Oct 21, 2022 at 1:26 PM Jark Wu <imj...@gmail.com> wrote:
> >>>
> >>>> How about creating a new repository flink-connector-aws and merging
> >>>> dynamodb, kinesis firehose into it?
> >>>> This can reduce the maintenance for complex dependencies and make the
> >>>> release easy.
> >>>> I think the maintainers of aws-related connectors are the same people.
> >>>>
> >>>> Best,
> >>>> Jark
> >>>>
> >>>>> On 21 Oct 2022, at 17:41, Chesnay Schepler <ches...@apache.org> wrote:
> >>>>>
> >>>>> I would not go with 2); I think it'd just be messy.
> >>>>>
> >>>>> Here's another option:
> >>>>>
> >>>>> Create another repository (aws-connector-base) (following the
> >>>>> externalization model), add it as a sub-module to the downstream
> >>>>> repositories, and make it part of the release process of said connector.
> >>>>>
> >>>>> I.e., we never create a release for aws-connector-base, but release it
> >>>>> as part of the connector.
> >>>>> The main benefit here is that we'd always be able to make changes to
> >>>>> the aws-base code without delaying connector releases.
> >>>>> I would assume any added overhead due to _technically_ releasing
> >>>>> the aws code multiple times to be negligible.
> >>>>>
> >>>>>
> >>>>> On 20/10/2022 22:38, Danny Cranmer wrote:
> >>>>>> Hello all,
> >>>>>>
> >>>>>> Currently we have 2 AWS Flink connectors in the main Flink codebase
> >>>>>> (Kinesis Data Streams and Kinesis Data Firehose) and one new externalized
> >>>>>> connector in progress (DynamoDB). Currently all three of these use common
> >>>>>> AWS utilities from the flink-connector-aws-base module. Common code
> >>>>>> includes client builders, property keys, validation, utils etc.
> >>>>>>
> >>>>>> Once we externalize the connectors, leaving flink-connector-aws-base in
> >>>>>> the main Flink repository will restrict our ability to evolve the
> >>>>>> connectors quickly. For example, as part of the DynamoDB connector build
> >>>>>> we are considering adding a general retry strategy config that can be
> >>>>>> leveraged by all connectors. We would need to block on Flink 1.17 for
> >>>>>> this.
> >>>>>>
> >>>>>> In the past we have tried to keep the AWS SDK version consistent across
> >>>>>> connectors; with the externalization, this is more likely to diverge.
> >>>>>>
> >>>>>> Option 1: I propose we create a new repository, flink-connector-aws,
> >>>>>> which we can move the flink-connector-aws-base module to and create a new
> >>>>>> flink-connector-aws-parent to manage SDK versions. Each of the
> >>>>>> externalized AWS connectors will depend on this new module and parent.
> >>>>>> The downside is an additional module to release per Flink version;
> >>>>>> however, I will volunteer to manage this.
> >>>>>>
> >>>>>> Option 2: We can move the flink-connector-aws-base module and create
> >>>>>> flink-connector-parent within the flink-connector-shared-utils repo [2].
> >>>>>>
> >>>>>> Option 3: We do nothing.
> >>>>>>
> >>>>>> For options 1 and 2, we will follow the general externalized connector
> >>>>>> versioning strategy and rules.
> >>>>>>
> >>>>>> I am inclined towards option 1, and appreciate feedback from the
> >>>>>> community.
> >>>>>>
> >>>>>> [1]
> >>>>>> https://github.com/apache/flink/tree/master/flink-connectors/flink-connector-aws-base
> >>>>>> [2] https://github.com/apache/flink-connector-shared-utils
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Danny