Thanks all for the feedback. +1 on the single repo and version for AWS
connectors. The reduced maintenance cost and complexity make it the clear
winner here.

I will open a vote thread for this matter.

Thanks all!

On Tue, 25 Oct 2022, 03:04 Jark Wu, <imj...@gmail.com> wrote:

> TBH, I am skeptical of the “a single repository per connector” approach,
> considering there are hundreds of connectors out there (Airbyte[1], Kafka[2]).
> I don’t think it is feasible for the community to maintain hundreds of
> repositories.
> It makes sense to combine some connectors to reduce the maintenance
> burden.
> I can imagine we would have a flink-jdbc-connector repo in the future to
> support PG, MySQL, MS SqlServer, Oracle, etc., together.
>
> Best,
> Jark
>
> [1]: https://airbyte.com/connectors
> [2]: https://www.confluent.io/product/connectors/
>
> > On 25 Oct 2022, at 06:56, Thomas Weise <t...@apache.org> wrote:
> >
> > Hi Danny,
> >
> > I'm also leaning slightly towards the single AWS connector repo
> > direction.
> >
> > Bumps in the underlying AWS SDK would bump all of the connectors in any
> > case. And if a change occurs that is isolated to a single connector, then
> > those that do not use that connector can just skip the release.
> >
> > Cheers,
> > Thomas
> >
> >
> > On Mon, Oct 24, 2022 at 3:01 PM Teoh, Hong <lian...@amazon.co.uk.invalid>
> > wrote:
> >
> >> I like the single repo with single version idea.
> >>
> >> Pros:
> >> - Better discoverability for connectors for AWS services means a better
> >> experience for Flink users
> >> - Natural placement of AWS-related utils (Credentials, SDK Retry
> >> strategy)
> >>
> >> Caveats:
> >> - As you mentioned, it is not desirable if we have to evolve the major
> >> version of the connector just for a change in a single connector (e.g.
> >> DynamoDB). However, I think it is reasonable to only evolve the major
> >> version of the AWS connector repo when there are Flink Source/Sink API
> >> upgrades or AWS SDK major upgrades (probably quite rare). Any new
> >> features for individual connectors can be collapsed into minor releases.
> >> - An additional callout here is that we should be careful adopting any
> >> AWS connectors that don't use the AWS SDK directly (e.g. how the Kinesis
> >> connector used KPL for a long time). In my opinion, any new connectors
> >> like that would be better placed in their own repositories, otherwise we
> >> will have a complex mesh of dependencies to manage.
> >>
> >> Regards,
> >> Hong
> >>
> >>
> >>
> >>
> >> On 21/10/2022, 16:59, "Danny Cranmer" <dannycran...@apache.org> wrote:
> >>
> >>    Thanks Chesnay for the suggestion, I will investigate this option.
> >>
> >>    Related to the single repo idea, I have considered it in the past. Are
> >>    you proposing we also use a single version between all connectors? If
> >>    we have a single version then it makes sense to combine them in a
> >>    single repo; if they are separate versions, then splitting them makes
> >>    sense. This was discussed last year more generally [1] and the
> >>    consensus was "we ultimately propose to have a single repository per
> >>    connector".
> >>
> >>    Combining all AWS connectors into a single repo with a single version
> >>    is in line with how the AWS SDK works, therefore AWS users are familiar
> >>    with this approach. However it is frustrating that we would have to
> >>    release all connectors to fix a bug or add a feature in one of them.
> >>    Example: a user is using Kinesis Data Streams only (the most popular
> >>    and mature connector), and we evolve the version from 1.x to 2.y (or
> >>    1.x to 1.y) for a DynamoDB change.
> >>
> >>    I am torn and will think some more, but it would be great to hear other
> >>    people's opinions.
> >>
> >>    [1] https://lists.apache.org/thread/bywh947r2f5hfocxq598zhyh06zhksrm
> >>
> >>    Thanks,
> >>    Danny
> >>
> >>    On Fri, Oct 21, 2022 at 3:11 PM Jing Ge <j...@ververica.com> wrote:
> >>
> >>> I agree with Jark. It would be easier for the further development and
> >>> maintenance, if all aws related connectors and the base module are in the
> >>> same repo. It might make sense to upgrade the flink-connector-dynamodb to
> >>> flink-connector-aws and move the other modules including the
> >>> flink-connector-aws-base into it. The aws sdk could be managed in
> >>> flink-connector-aws-base. Any future common connector features could also
> >>> be developed in the base module.
> >>>
> >>> Best regards,
> >>> Jing
> >>>
> >>> On Fri, Oct 21, 2022 at 1:26 PM Jark Wu <imj...@gmail.com> wrote:
> >>>
> >>>> How about creating a new repository flink-connector-aws and merging
> >>>> dynamodb, kinesis firehose into it?
> >>>> This can reduce the maintenance for complex dependencies and make the
> >>>> release easy.
> >>>> I think the maintainers of aws-related connectors are the same people.
> >>>>
> >>>> Best,
> >>>> Jark
> >>>>
> >>>>> On 21 Oct 2022, at 17:41, Chesnay Schepler <ches...@apache.org> wrote:
> >>>>>
> >>>>> I would not go with 2); I think it'd just be messy.
> >>>>>
> >>>>> Here's another option:
> >>>>>
> >>>>> Create another repository (aws-connector-base) (following the
> >>>>> externalization model), add it as a sub-module to the downstream
> >>>>> repositories, and make it part of the release process of said connector.
> >>>>>
> >>>>> I.e., we never create a release for aws-connector-base, but release it
> >>>>> as part of the connector.
> >>>>> The main benefit here is that we'd always be able to make changes to
> >>>>> the aws-base code without delaying connector releases.
> >>>>> I would assume that any added overhead due to _technically_ releasing
> >>>>> the aws code multiple times to be negligible.
> >>>>>
> >>>>>
> >>>>> On 20/10/2022 22:38, Danny Cranmer wrote:
> >>>>>> Hello all,
> >>>>>>
> >>>>>> Currently we have 2 AWS Flink connectors in the main Flink codebase
> >>>>>> (Kinesis Data Streams and Kinesis Data Firehose) and one new
> >>>>>> externalized connector in progress (DynamoDB). Currently all three of
> >>>>>> these use common AWS utilities from the flink-connector-aws-base
> >>>>>> module. Common code includes client builders, property keys,
> >>>>>> validation, utils etc.
> >>>>>>
> >>>>>> Once we externalize the connectors, leaving flink-connector-aws-base in
> >>>>>> the main Flink repository will restrict our ability to evolve the
> >>>>>> connectors quickly. For example, as part of the DynamoDB connector
> >>>>>> build we are considering adding a general retry strategy config that
> >>>>>> can be leveraged by all connectors. We would need to block on Flink
> >>>>>> 1.17 for this.
> >>>>>>
> >>>>>> In the past we have tried to keep the AWS SDK version consistent across
> >>>>>> connectors, with the externalization this is more likely to diverge.
> >>>>>>
> >>>>>> Option 1: I propose we create a new repository, flink-connector-aws,
> >>>>>> which we can move the flink-connector-aws-base module to and create a
> >>>>>> new flink-connector-aws-parent to manage SDK versions. Each of the
> >>>>>> externalized AWS connectors will depend on this new module and parent.
> >>>>>> Downside is an additional module to release per Flink version, however
> >>>>>> I will volunteer to manage this.
> >>>>>>
> >>>>>> Option 2: We can move the flink-connector-aws-base module and create
> >>>>>> flink-connector-parent within the flink-connector-shared-utils repo [2]
> >>>>>>
> >>>>>> Option 3: We do nothing.
> >>>>>>
> >>>>>> For option 1+2 we will follow the general externalized connector
> >>>>>> versioning strategy and rules.
> >>>>>>
> >>>>>> I am inclined towards option 1, and appreciate feedback from the
> >>>>>> community.
> >>>>>>
> >>>>>> [1] https://github.com/apache/flink/tree/master/flink-connectors/flink-connector-aws-base
> >>>>>> [2] https://github.com/apache/flink-connector-shared-utils
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Danny
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>
> >>
>
>
