Not specifically; I want to be able to union any form of DStream/RDD in
general. I'm working on Apache Beam's Spark runner, so the abstraction
there does not distinguish between streaming and batch (kind of like the Dataset API).
Since I wrote my own InputDStream I will simply stream any "batch source"
instead, becau
Interestingly, I just ran into the same problem. By any chance, do you
want to process old files in the directory as well as new ones? That is my
motivation, and checkpointing is my problem as well.
2017-02-08 22:02 GMT-08:00 Amit Sela :
Not with checkpointing.
On Thu, Feb 9, 2017, 04:58 Egor Pahomov wrote:
Just guessing here, but would
http://spark.apache.org/docs/latest/streaming-programming-guide.html#basic-sources
"*Queue of RDDs as a Stream*" work? Basically, create a DStream from your RDD
and then union it with the other DStream.
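
For reference, a rough sketch of what that suggestion could look like in Scala. This is only an illustration, not code from Beam's Spark runner: the object name, the sample data, and the second queue standing in for a "real" streaming source are all made up for the example. As far as I know, queueStream does not support checkpointing, which matches the reply above.

```scala
import scala.collection.mutable

import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical example object; names and data are illustrative only.
object QueueStreamUnionSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("local[2]")
      .setAppName("QueueStreamUnionSketch")
    val ssc = new StreamingContext(conf, Seconds(1))

    // The "batch" side: an ordinary RDD placed on a queue.
    val batchRdd: RDD[String] = ssc.sparkContext.parallelize(Seq("a", "b", "c"))
    val batchQueue = mutable.Queue[RDD[String]](batchRdd)

    // oneAtATime = true means one queued RDD is consumed per batch interval.
    val batchAsStream = ssc.queueStream(batchQueue, oneAtATime = true)

    // Stand-in for the other DStream (assumption: in practice this would be
    // a socket, Kafka, or custom InputDStream rather than another queue).
    val liveQueue = mutable.Queue[RDD[String]](
      ssc.sparkContext.parallelize(Seq("x", "y")))
    val liveStream = ssc.queueStream(liveQueue, oneAtATime = true)

    // Union the two DStreams; both must have the same element type.
    val unioned = batchAsStream.union(liveStream)
    unioned.print()

    ssc.start()
    // Run briefly, then shut down so the example terminates on its own.
    ssc.awaitTerminationOrTimeout(5000)
    ssc.stop(stopSparkContext = true, stopGracefully = false)
  }
}
```

Since the queued RDDs cannot be recovered after a failure, this is probably only useful when checkpointing is not a requirement.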
2017-02-08 12:32 GMT-08:00 Amit Sela :
> Hi all,
>
> I'm looking to union