Hi, I am trying to run an I/O-intensive RDD job in parallel with a CPU-intensive RDD job within the same application, using a window like the one below:
var ssc = new StreamingContext(sc, Minutes(1))
var ds1 = ...
var ds2 = ds1.window(Minutes(2))
ds2.foreachRDD(...)
ds1.foreachRDD(...)

I would like ds1 to start its job at each 1-minute interval even if ds2's job has not completed yet, but that is not what happens when I run it: ds1's job does not start until ds2's job completes. The documentation mentions that jobs within the same SparkContext need to be submitted from different threads in order to run in parallel. Is that true? If so, how should I submit the above two jobs from different threads? (I know about spark.streaming.concurrentJobs and receiver mode, but both come with particular issues.)

Thanks a lot,
Renyi.
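In case it helps frame the question: one workaround I have seen discussed is to make each output operation asynchronous inside its foreachRDD, so the blocking action runs on its own thread and the streaming JobScheduler is not held up by the slow window job. The sketch below assumes this approach; the names ioIntensiveWork and cpuIntensiveWork are placeholders for the elided bodies above, and the caveat is that the scheduler then considers each batch "done" immediately, so a persistently slow I/O job can queue up work without backpressure:

```scala
import java.util.concurrent.Executors
import scala.concurrent.{ExecutionContext, Future}
import org.apache.spark.streaming.{Minutes, StreamingContext}

// Thread pool for submitting Spark actions concurrently; jobs submitted
// from different threads can run in parallel within one SparkContext.
implicit val ec: ExecutionContext =
  ExecutionContext.fromExecutor(Executors.newFixedThreadPool(2))

val ssc = new StreamingContext(sc, Minutes(1))
val ds1 = ...                        // source DStream, elided as in the post
val ds2 = ds1.window(Minutes(2))

ds2.foreachRDD { rdd =>
  // I/O-intensive job: run the action asynchronously so it does not
  // block the next batch's jobs. ioIntensiveWork is a placeholder.
  Future { rdd.foreachPartition(ioIntensiveWork) }
}

ds1.foreachRDD { rdd =>
  // CPU-intensive job, likewise submitted on its own thread.
  // cpuIntensiveWork is a placeholder.
  Future { rdd.foreachPartition(cpuIntensiveWork) }
}

ssc.start()
ssc.awaitTermination()
```

Pairing this with the FAIR scheduler (spark.scheduler.mode=FAIR) may help the two concurrent jobs share executors instead of running strictly first-come-first-served, though whether that is appropriate depends on the workload.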