Hi Saurabh I think this is going to be a valuable addition which is needed. I have a couple of comments - In the FLIP you mention the split is going to be 1 sqs Queue, does this mean we would support reading from multiple queues? The builder example seems to support a single queue SqsSource.<T>builder.setSqsUrl(" https://sqs.us-east-1.amazonaws.com/23145433/sqs-test") - This is also not clear in the implementation of `addSplitsBack` whether we are planning to support multiple sqs topics or not. - Regarding Client creation, there has been some effort in the common `aws-util` module like createAwsSyncClient , we should reuse that for `SqsClient` creation. - On the same point for clients, Is there a reason the FLIP suggests async clients? sync clients have proven more stable and the source threading model already guarantees no blocking by sync clients. - On mentioning threading, the FLIP doesn't mention the fetcher manager. Is it going to be `SingleThreadFetcherManager`? Would it be better to make the source reader extend the SingleThreadedMultiplexReaderBase or are we going to implement a more simple version? - The FLIP doesn't mention schema deserialization or the recordEmitter implementation, Are we going to use `deserializationSchema` or some sort of string to element converter? It is also not clear form the builder example provided? - Are the values mentioned in getSqsClientProperties recommended defaults? If so we should highlight that - Most importantly I am a bit skeptical regarding enforcing exactly-once semantics with side effects especially with dependency on checkpointing configuration, could we add flags to disable and disable by default if the checkpointing is not enabled?
I am not 100% convinced we should block FLIP itself on the FLIP-438 implementation but I echo the fact that there might be some reusable code between the 2 submodules we should make use of. Best Regards Ahmed Hamdy On Fri, 26 Jul 2024 at 17:55, Saurabh Singh <saurabhsingh9...@gmail.com> wrote: > Hi Li Wang, > > Thanks for the review and appreciate your feedback. > I completely understand your concern and agree with it. Our goal is to > provide users of connectors with a consistent and coherent ecosystem, free > of any issues. > To ensure that, we are closely monitoring/reviewing the SQS Sink > implementation work by Dhingra. Our development work will commence once the > AWS Sink is near completion or completed. This approach ensures that we > take in the new learnings, do not duplicate any core modules and allow us > to save valuable time. > > In the meantime, we would like to keep the discussion and process active on > this FLIP. Gaining valuable community feedback (which is helping us) will > help us address any potential gaps in the source connector design and > finalize it. Behind the scenes, we are already designing and pre-planning > our development work to adhere to feedback/best practices/ faster delivery > when we implement this FLIP. > Please share your thoughts on this. > > Regards > Saurabh > > On Fri, Jul 26, 2024 at 5:52 PM Li Wang <liwang505...@gmail.com> wrote: > > > Hi Saurabh, > > > > Thanks for the FLIP. Given the ongoing effort to implement the SQS sink > > connector ( > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-438%3A+Amazon+SQS+Sink+Connector > > ), > > it is important to consider how the SQS source connector supports this > > development, ensuring a unified framework for AWS integration that avoids > > capacity overlap and maximizes synchronization. Since the implementation > by > > Dhingra is still WIP, I would suggest taking that into account and keep > > this FLIP as an extension of that FLIP which could be taken up once the > SQS > > Sink FLIP is fully implemented. If not, this may lead to doublework in > > multiple identity-related modules and structure of the overall SQS > > connector codebase. What do you think? > > > > Thanks > > > > > > On Thursday, July 25, 2024, Saurabh Singh <saurabhsingh9...@gmail.com> > > wrote: > > > > > Hi Samrat, > > > > > > Thanks for the review and feedback. > > > We have evaluated all the three points. Please find the answers below: > > > > > > 1. AWS has announced JSON protocol support in SQS [1]. Can you shed > some > > > light on how different protocols will be supported? > > > - We will utilize the AWS client library to connect with the AWS SQS > > > Service. Versions beyond 2.21.19 now support JSON, so simply upgrading > > the > > > client library will suffice for the protocol switch. However, from the > > > connector's perspective, we do not anticipate any changes in our > > > communication process, as it is handled by the client library. [4] > > > > > > 2. AWS SQS has two types of queues [2]. What are the implementation > > detail > > > differences for the source connector? > > > - SQS Connector is indifferent to the customer's choice of Queue type. > If > > > out-of-order messages are a concern, it will be the responsibility of > the > > > application code or main job logic to manage this. > > > > > > 3. Will the SQS source connector implement any kind of callbacks [3] on > > > success to offer any kind of guarantee? > > > - We have proposed deleting SQS messages using the notification > provided > > by > > > the checkpoint framework on checkpoint completion. Thus providing > exactly > > > once guarantee.[5] Additionally, when deleting messages, we will > monitor > > > the API call responses and log any failures, along with providing > > > observability through appropriate metrics. > > > > > > [1] > > > > > > > > > https://aws.amazon.com/about-aws/whats-new/2023/11/amazon-sqs-support-json-protocol/ > > > [2] > > > > > > > > > https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-queue-types.html > > > [3] > > > > > > > > > https://docs.aws.amazon.com/step-functions/latest/dg/callback-task-sample-sqs.html > > > [4] > > > > > > > > > https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-json-faqs.html#json-protocol-getting-started > > > [5] > > > > > > > > > https://docs.google.com/document/d/1lreo27jNh0LkRs1Mj9B3wj3itrzMa38D4_XGryOIFks/edit#heading=h.jdcikzojx5d9 > > > > > > > > > [Main Proposal doc] - > > > > > > > > > https://docs.google.com/document/d/1lreo27jNh0LkRs1Mj9B3wj3itrzMa38D4_XGryOIFks/edit#heading=h.ci1rrcgbsvkl > > > > > > Please feel free to reach out if you have more feedback. > > > > > > Regards > > > Saurabh & Abhi > > > > > > > > > On Wed, Jul 24, 2024 at 8:52 AM Samrat Deb <decordea...@gmail.com> > > wrote: > > > > > > > Hi Saurabh, > > > > > > > > Thank you for sharing the FLIP for the SQS source connector. An SQS > > > source > > > > connector will be a great addition to the Flink ecosystem, as there > is > > a > > > > growing demand for SQS source/sink integration. > > > > > > > > I have a few queries: > > > > > > > > 1. AWS has announced JSON protocol support in SQS [1]. Can you shed > > some > > > > light on how different protocols will be supported? > > > > 2. AWS SQS has two types of queues [2]. What are the implementation > > > detail > > > > differences for the source connector? > > > > 3. Will the SQS source connector implement any kind of callbacks [3] > on > > > > success to offer any kind of guarantee? > > > > > > > > > > > > > > > > [1] > > > > > > > > > > > > > > https://aws.amazon.com/about-aws/whats-new/2023/11/amazon-sqs-support-json-protocol/ > > > > [2] > > > > > > > > > > > > > > https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-queue-types.html > > > > [3] > > > > > > > > > > > > > > https://docs.aws.amazon.com/step-functions/latest/dg/callback-task-sample-sqs.html > > > > > > > > Bests, > > > > Samrat > > > > > > > > > > > > On Fri, 19 Jul 2024 at 9:53 PM, Saurabh Singh < > > > saurabhsingh9...@gmail.com> > > > > wrote: > > > > > > > > > Hi Fink Devs, > > > > > > > > > > Our team has been working on migrating various data pipelines to > > Flink > > > to > > > > > leverage the benefits of exactly-once processing, checkpointing, > and > > > > > stateful computing. We have several use cases built around the AWS > > SQS > > > > > Service. For this migration, we have developed an SQS Source > > Connector, > > > > > which enables us to run both stateless and stateful Flink-based > jobs. > > > > > > > > > > We believe that this SQS Source Connector would be a valuable > > addition > > > to > > > > > the existing connector set. Therefore, we propose a FLIP to include > > it. > > > > > > > > > > For more information, please refer to the FLIP document. > > > > > > > > > > > > > > > > > > > > > > > > > https://docs.google.com/document/d/1lreo27jNh0LkRs1Mj9B3wj3itrzMa38D4_XGryOIFks/edit?usp=sharing > > > > > > > > > > Thanks > > > > > Saurabh & Abhi > > > > > > > > > > > > > > >