Hi Saurabh
I think this is going to be a valuable addition which is needed.
I have a couple of comments
- In the FLIP you mention the split is going to be 1 sqs Queue, does this
mean we would support reading from multiple queues? The builder example
seems to support a single queue
SqsSource.<T>builder.setSqsUrl("
https://sqs.us-east-1.amazonaws.com/23145433/sqs-test";)
- This is also not clear in the implementation of `addSplitsBack` whether
we are planning to support multiple sqs topics or not.
- Regarding Client creation, there has been some effort in the common
`aws-util` module like createAwsSyncClient  , we should reuse that for
`SqsClient` creation.
- On the same point for clients, Is there a reason the FLIP suggests async
clients? sync clients have proven more stable and the source threading
model already guarantees no blocking by sync clients.
- On mentioning threading, the FLIP doesn't mention the fetcher manager. Is
it going to be `SingleThreadFetcherManager`? Would it be better to make the
source reader extend the SingleThreadedMultiplexReaderBase or are we going
to implement a more simple version?
- The FLIP doesn't mention schema deserialization or the recordEmitter
implementation, Are we going to use `deserializationSchema` or some sort of
string to element converter? It is also not clear form the builder example
provided?
- Are the values mentioned in getSqsClientProperties recommended defaults?
If so we should highlight that
- Most importantly I am a bit skeptical regarding enforcing exactly-once
semantics with side effects especially with dependency on checkpointing
configuration, could we add flags to disable and disable by default if the
checkpointing is not enabled?

I am not 100% convinced we should block FLIP itself on the FLIP-438
implementation but I echo the fact that there might be some reusable code
between the 2 submodules we should make use of.

 Best Regards
Ahmed Hamdy


On Fri, 26 Jul 2024 at 17:55, Saurabh Singh <saurabhsingh9...@gmail.com>
wrote:

> Hi Li Wang,
>
> Thanks for the review and appreciate your feedback.
> I completely understand your concern and agree with it. Our goal is to
> provide users of connectors with a consistent and coherent ecosystem, free
> of any issues.
> To ensure that, we are closely monitoring/reviewing the SQS Sink
> implementation work by Dhingra. Our development work will commence once the
> AWS Sink is near completion or completed. This approach ensures that we
> take in the new learnings, do not duplicate any core modules and allow us
> to save valuable time.
>
> In the meantime, we would like to keep the discussion and process active on
> this FLIP. Gaining valuable community feedback (which is helping us) will
> help us address any potential gaps in the source connector design and
> finalize it. Behind the scenes, we are already designing and pre-planning
> our development work to adhere to feedback/best practices/ faster delivery
> when we implement this FLIP.
> Please share your thoughts on this.
>
> Regards
> Saurabh
>
> On Fri, Jul 26, 2024 at 5:52 PM Li Wang <liwang505...@gmail.com> wrote:
>
> > Hi Saurabh,
> >
> > Thanks for the FLIP. Given the ongoing effort to implement the SQS sink
> > connector (
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-438%3A+Amazon+SQS+Sink+Connector
> > ),
> > it is important to consider how the SQS source connector supports this
> > development, ensuring a unified framework for AWS integration that avoids
> > capacity overlap and maximizes synchronization. Since the implementation
> by
> > Dhingra is still WIP, I would suggest taking that into account and keep
> > this FLIP as an extension of that FLIP which could be taken up once the
> SQS
> > Sink FLIP is fully implemented. If not, this may lead to doublework in
> > multiple identity-related modules and structure of the overall SQS
> > connector codebase. What do you think?
> >
> > Thanks
> >
> >
> > On Thursday, July 25, 2024, Saurabh Singh <saurabhsingh9...@gmail.com>
> > wrote:
> >
> > > Hi Samrat,
> > >
> > > Thanks for the review and feedback.
> > > We have evaluated all the three points. Please find the answers below:
> > >
> > > 1. AWS has announced JSON protocol support in SQS [1]. Can you shed
> some
> > > light on how different protocols will be supported?
> > >  - We will utilize the AWS client library to connect with the AWS SQS
> > > Service. Versions beyond 2.21.19 now support JSON, so simply upgrading
> > the
> > > client library will suffice for the protocol switch. However, from the
> > > connector's perspective, we do not anticipate any changes in our
> > > communication process, as it is handled by the client library. [4]
> > >
> > > 2. AWS SQS has two types of queues [2]. What are the implementation
> > detail
> > > differences for the source connector?
> > > - SQS Connector is indifferent to the customer's choice of Queue type.
> If
> > > out-of-order messages are a concern, it will be the responsibility of
> the
> > > application code or main job logic to manage this.
> > >
> > > 3. Will the SQS source connector implement any kind of callbacks [3] on
> > > success to offer any kind of guarantee?
> > > - We have proposed deleting SQS messages using the notification
> provided
> > by
> > > the checkpoint framework on checkpoint completion. Thus providing
> exactly
> > > once guarantee.[5] Additionally, when deleting messages, we will
> monitor
> > > the API call responses and log any failures, along with providing
> > > observability through appropriate metrics.
> > >
> > > [1]
> > >
> > >
> >
> https://aws.amazon.com/about-aws/whats-new/2023/11/amazon-sqs-support-json-protocol/
> > > [2]
> > >
> > >
> >
> https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-queue-types.html
> > > [3]
> > >
> > >
> >
> https://docs.aws.amazon.com/step-functions/latest/dg/callback-task-sample-sqs.html
> > > [4]
> > >
> > >
> >
> https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-json-faqs.html#json-protocol-getting-started
> > > [5]
> > >
> > >
> >
> https://docs.google.com/document/d/1lreo27jNh0LkRs1Mj9B3wj3itrzMa38D4_XGryOIFks/edit#heading=h.jdcikzojx5d9
> > >
> > >
> > > [Main Proposal doc] -
> > >
> > >
> >
> https://docs.google.com/document/d/1lreo27jNh0LkRs1Mj9B3wj3itrzMa38D4_XGryOIFks/edit#heading=h.ci1rrcgbsvkl
> > >
> > > Please feel free to reach out if you have more feedback.
> > >
> > > Regards
> > > Saurabh & Abhi
> > >
> > >
> > > On Wed, Jul 24, 2024 at 8:52 AM Samrat Deb <decordea...@gmail.com>
> > wrote:
> > >
> > > > Hi Saurabh,
> > > >
> > > > Thank you for sharing the FLIP for the SQS source connector. An SQS
> > > source
> > > > connector will be a great addition to the Flink ecosystem, as there
> is
> > a
> > > > growing demand for SQS source/sink integration.
> > > >
> > > > I have a few queries:
> > > >
> > > > 1. AWS has announced JSON protocol support in SQS [1]. Can you shed
> > some
> > > > light on how different protocols will be supported?
> > > > 2. AWS SQS has two types of queues [2]. What are the implementation
> > > detail
> > > > differences for the source connector?
> > > > 3. Will the SQS source connector implement any kind of callbacks [3]
> on
> > > > success to offer any kind of guarantee?
> > > >
> > > >
> > > >
> > > > [1]
> > > >
> > > >
> > >
> >
> https://aws.amazon.com/about-aws/whats-new/2023/11/amazon-sqs-support-json-protocol/
> > > > [2]
> > > >
> > > >
> > >
> >
> https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-queue-types.html
> > > > [3]
> > > >
> > > >
> > >
> >
> https://docs.aws.amazon.com/step-functions/latest/dg/callback-task-sample-sqs.html
> > > >
> > > > Bests,
> > > > Samrat
> > > >
> > > >
> > > > On Fri, 19 Jul 2024 at 9:53 PM, Saurabh Singh <
> > > saurabhsingh9...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi Fink Devs,
> > > > >
> > > > > Our team has been working on migrating various data pipelines to
> > Flink
> > > to
> > > > > leverage the benefits of exactly-once processing, checkpointing,
> and
> > > > > stateful computing. We have several use cases built around the AWS
> > SQS
> > > > > Service. For this migration, we have developed an SQS Source
> > Connector,
> > > > > which enables us to run both stateless and stateful Flink-based
> jobs.
> > > > >
> > > > > We believe that this SQS Source Connector would be a valuable
> > addition
> > > to
> > > > > the existing connector set. Therefore, we propose a FLIP to include
> > it.
> > > > >
> > > > > For more information, please refer to the FLIP document.
> > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1lreo27jNh0LkRs1Mj9B3wj3itrzMa38D4_XGryOIFks/edit?usp=sharing
> > > > >
> > > > > Thanks
> > > > > Saurabh & Abhi
> > > > >
> > > >
> > >
> >
>

Reply via email to