Hi Sanjeet, I have been using Spark Streaming to process files stored in S3 and HDFS. I am also using SQS messages for the same purpose as yours, i.e. as a pointer to an S3 file. As of now, I have a separate SQS job which receives messages from the SQS queue and fetches the corresponding file from S3. Now, I want to integrate the SQS receiver with Spark Streaming, so that my Spark Streaming job would listen for new SQS messages and proceed accordingly. I was wondering if you found any solution to this. Please let me know if you did!
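For what it's worth, one way to structure this (a hedged sketch, not tested against a real queue): Spark Streaming lets you plug in a custom source by subclassing org.apache.spark.streaming.receiver.Receiver and calling store() from a polling thread started in onStart(). The names below (SqsMessage, pollOnce, the store callback) are stand-ins I made up for the AWS SDK and Receiver APIs; the snippet only shows the polling loop, which is the part you would move from the separate SQS job into the receiver:

```scala
import scala.collection.mutable

// Stand-in for an SQS message: body holds the S3 pointer,
// receiptHandle is what a later DeleteMessage call would need.
case class SqsMessage(body: String, receiptHandle: String)

// Drain up to `max` messages from the queue and hand each one to
// `store`. Inside a real Receiver subclass this callback would be
// this.store(...), which pushes the record into Spark for the
// DStream to pick up. Returns the number of messages received.
def pollOnce(queue: mutable.Queue[SqsMessage],
             max: Int,
             store: SqsMessage => Unit): Int = {
  var n = 0
  while (n < max && queue.nonEmpty) {
    store(queue.dequeue())
    n += 1
  }
  n
}

// Example: two pending messages pointing at S3 files.
val pending = mutable.Queue(
  SqsMessage("s3://bucket/a.log", "rh-1"),
  SqsMessage("s3://bucket/b.log", "rh-2"))

val received = mutable.ListBuffer.empty[SqsMessage]
val count = pollOnce(pending, max = 10, store = received += _)
```

On the Spark side you would then get the stream with ssc.receiverStream(new YourSqsReceiver(...)), and the records carry both the S3 path and the receipt handle.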
In your above approach, you can achieve #4 in the following way: when you pass a function via foreachRDD to be applied to each RDD of the DStream, you can pass along the SQS message information (like the receipt handle needed for deleting the message) associated with that particular file. After success or failure in processing, you can delete the SQS message accordingly.

Thanks
--Lalit

-----
Lalit Yadav
[email protected]

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-streaming-at-least-once-guarantee-tp10902p11419.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
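To make the delete-after-processing idea concrete, here is a minimal sketch. The process and deleteMessage parameters are hypothetical stand-ins: in the real job, process would be your per-file foreachRDD body and deleteMessage a call to the SQS client's DeleteMessage with the receipt handle. The point is that the receipt handle travels with the record and the delete only fires on success, so a failed file's message stays on the queue and SQS redelivers it after the visibility timeout, which is what gives you the at-least-once guarantee:

```scala
import scala.util.Try

// Each record carries the S3 pointer plus the SQS receipt handle
// captured when the message was received.
case class Record(s3Path: String, receiptHandle: String)

// Process each record; delete its SQS message only if processing
// succeeded. Returns the receipt handles that were acknowledged.
def processAndAck(records: Seq[Record],
                  process: String => Unit,       // e.g. the foreachRDD body
                  deleteMessage: String => Unit  // e.g. sqs.deleteMessage(...)
                 ): Seq[String] =
  records.flatMap { r =>
    if (Try(process(r.s3Path)).isSuccess) {
      deleteMessage(r.receiptHandle) // safe to ack: file fully processed
      Some(r.receiptHandle)
    } else {
      None                           // leave message: SQS will redeliver it
    }
  }

// Example: processing of the second file fails, so only the first
// message is deleted from the queue.
val deleted = scala.collection.mutable.ListBuffer.empty[String]
val acked = processAndAck(
  Seq(Record("s3://bucket/a.log", "rh-1"), Record("s3://bucket/b.log", "rh-2")),
  process = p => if (p.endsWith("b.log")) sys.error("boom"),
  deleteMessage = deleted += _)
```

One caveat with this pattern: if the job crashes after processing a file but before the delete, the message is redelivered and the file is processed twice, so the processing step should be idempotent.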
