adding new elements to batch RDD from DStream RDD

Evo Eftimov Wed, 15 Apr 2015 11:38:27 -0700

The only way to join, union, or cogroup a DStream RDD with a batch RDD is via
the "transform" method, which returns another DStream RDD; the result is
therefore discarded at the end of the micro-batch.
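
To make the starting point concrete, here is a minimal sketch of a
transform-based join, assuming Spark 1.3 or later (where the pair-RDD
implicits resolve without extra imports). The object name, method names, and
element types are hypothetical and chosen only for illustration.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.dstream.DStream

// Hypothetical helper, for illustration only.
object TransformJoin {
  // Joins a single micro-batch RDD against the batch RDD by key.
  def joinOne(
      microBatch: RDD[(String, Int)],
      batch: RDD[(String, String)]): RDD[(String, (Int, String))] =
    microBatch.join(batch)

  // Applies joinOne to every micro-batch of the stream. The result is again
  // a DStream, so each joined RDD lives only as long as its micro-batch.
  def joinWithBatch(
      stream: DStream[(String, Int)],
      batch: RDD[(String, String)]): DStream[(String, (Int, String))] =
    stream.transform((microBatch: RDD[(String, Int)]) => joinOne(microBatch, batch))
}
```

Note that nothing here changes the batch RDD itself: it is only read by each
micro-batch.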


Is there any way to, e.g., union a DStream RDD with a batch RDD so as to
produce a new batch RDD containing the elements of both the DStream RDD and
the batch RDD?

And once such a batch RDD has been created in this way, can other DStream RDDs
use it, e.g. to join with it, where this time the result can be another
DStream RDD?

Effectively, the functionality described above would result in periodic
updates (additions) of elements to a batch RDD - the additional elements
would keep coming from the DStream RDDs that stream in with every
micro-batch.
Newly arriving DStream RDDs would also be able to join with the
previously updated batch RDD and produce a result DStream RDD.
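
One way the behaviour described above might be sketched is to keep a
driver-side reference to the batch RDD, replace it from foreachRDD, and read
it inside transform. This is only a sketch under assumptions: Spark 1.3 or
later, no streaming checkpointing (the closures capture driver-side state),
and hypothetical names (GrowingBatchRdd, init, add, snapshot, wire)
throughout. It is not a built-in Spark facility.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.dstream.DStream

// Hypothetical holder for a batch RDD that grows with every micro-batch.
object GrowingBatchRdd {
  // Driver-side reference to the current batch RDD.
  @volatile private var current: RDD[(String, Int)] = _

  def init(initial: RDD[(String, Int)]): Unit = {
    current = initial.cache()
  }

  // Replaces the reference with (previous batch RDD) union (new elements).
  def add(rdd: RDD[(String, Int)]): Unit = {
    val previous = current
    val updated = previous.union(rdd).cache()
    updated.count() // materialise the cache before dropping the old copy
    current = updated
    previous.unpersist(blocking = false)
  }

  def snapshot: RDD[(String, Int)] = current

  // Feeds `additions` into the batch RDD and joins `lookups` against
  // whatever reference the driver holds when each micro-batch is built.
  def wire(
      additions: DStream[(String, Int)],
      lookups: DStream[(String, String)]): DStream[(String, (String, Int))] = {
    additions.foreachRDD((rdd: RDD[(String, Int)]) => add(rdd))
    lookups.transform((rdd: RDD[(String, String)]) => rdd.join(snapshot))
  }
}
```

Two caveats with this approach: the lineage of the unioned RDD grows with
every micro-batch, so a long-running job would probably need to checkpoint
it periodically, and the join may not yet see the additions from the most
recent micro-batch if the update job has not finished when the next batch is
prepared.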

Something close to that can be achieved with updateStateByKey, but is
there a way to do it as described here?
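
For comparison, a minimal sketch of the updateStateByKey approach, assuming
Spark 1.3 or later and a checkpoint directory already set on the
StreamingContext (updateStateByKey requires one). The object and method
names are hypothetical. The accumulated state here is still a DStream of
per-key state rather than a standalone batch RDD.

```scala
import org.apache.spark.streaming.dstream.DStream

// Hypothetical helper, for illustration only.
object AccumulateByKey {
  // Appends the values seen in the current micro-batch to the values
  // already held for the key (or starts from an empty sequence).
  def append(newValues: Seq[Int], state: Option[Seq[Int]]): Option[Seq[Int]] =
    Some(state.getOrElse(Seq.empty[Int]) ++ newValues)

  // Keeps, per key, every value that has arrived on the stream so far.
  def accumulate(stream: DStream[(String, Int)]): DStream[(String, Seq[Int])] =
    stream.updateStateByKey[Seq[Int]](append _)
}
```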


