Thanks Akhil!
I just looked it up in the code as well.
Receiver.store(ArrayBuffer[T], ...)
ReceiverSupervisorImpl.pushArrayBuffer(ArrayBuffer[T], ...)
ReceiverSupervisorImpl.pushAndReportBlock(...)
WriteAheadLogBasedBlockHandler.storeBlock(...)
This implementation stores the block into the block
manager as well as a write ahead log. It does this in parallel, using
Scala Futures, and returns only after the block has been stored in
both places.
https://www.codatlas.com/github.com/apache/spark/master/streaming/src/main/scala/org/apache/spark/streaming/receiver/ReceivedBlockHandler.scala?keyword=WriteAheadLogBasedBlockHandler&line=160
On 13 June 2015 at 06:46, Akhil Das <[email protected]> wrote:
> Yes, if you have enabled WAL and checkpointing then after the store, you can
> simply delete the SQS Messages from your receiver.
>
> Thanks
> Best Regards
>
> On Sat, Jun 13, 2015 at 6:14 AM, Michal Čizmazia <[email protected]> wrote:
>>
>> I would like to have a Spark Streaming SQS Receiver which deletes SQS
>> messages only after they were successfully stored on S3.
>>
>> For this a Custom Receiver can be implemented with the semantics of
>> the Reliable Receiver.
>>
>> The store(multiple-records) call blocks until the given records have
>> been stored and replicated inside Spark.
>>
>> If the write-ahead logs are enabled, all the data received from a
>> receiver gets written into a write ahead log in the configuration
>> checkpoint directory. The checkpoint directory can be pointed to S3.
>>
>> After the store(multiple-records) blocking call finishes, are the
>> records already stored in the checkpoint directory (and thus can be
>> safely deleted from SQS)?
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
>>
>
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]