Hello.
I need to create a custom streaming source by extending *FileStreamSource*.
The idea is to override *commit*, so that processed files (S3 objects in my
case) are renamed to have a certain prefix. However, I don't know how to
use this custom source. Obviously I don't want to compile Spark --
this?
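
A minimal sketch of the renaming step described above, for illustration only. It does not use any Spark or S3 API: the class name `ProcessedKeys`, the helpers `processedKey` and `isProcessed`, the marker `processed_`, and the example key are all hypothetical. It also assumes "prefix" means a marker at the start of the object's file name rather than a new leading key prefix. S3 has no native rename, so the actual move would be a copy to the new key followed by a delete of the old one.

```java
public class ProcessedKeys {

    // Hypothetical helper: returns the key an object would get once it is
    // marked as processed, by putting `marker` in front of the file name
    // (the part of the key after the last '/').
    static String processedKey(String key, String marker) {
        int slash = key.lastIndexOf('/');           // -1 when the key has no '/'
        String dir = key.substring(0, slash + 1);   // "" when the key has no '/'
        String name = key.substring(slash + 1);
        return dir + marker + name;
    }

    // Hypothetical helper: true if the file name already carries the marker,
    // so a listing of the watched location can skip it.
    static boolean isProcessed(String key, String marker) {
        return key.substring(key.lastIndexOf('/') + 1).startsWith(marker);
    }

    public static void main(String[] args) {
        String key = "dms/orders/batch-0001.csv";   // made-up example key
        System.out.println(processedKey(key, "processed_"));
    }
}
```

If the renamed objects stay under the watched location, the source may see them as new files, which is why the sketch includes a check like `isProcessed`.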
On Wed, Jun 27, 2018 at 12:26 AM Steve Loughran wrote:
>
> On 25 Jun 2018, at 23:59, Farshid Zavareh wrote:
>
> I'm writing a Spark Streaming application where the input data is put into
> an S3 bucket in small batches (using Database Migration Service - DMS). The
> Spark application is the only consumer.
I'm writing a Spark Streaming application where the input data is put into
an S3 bucket in small batches (using Database Migration Service - DMS). The
Spark application is the only consumer. I'm considering two possible
architectures:
Have Spark Streaming watch an S3 prefix and pick up new objects