I have two questions:

1) In a Spark Streaming program, after the various DStream transformations have been set up, the ssc.start() method is called to start the computation. Can the underlying DAG change (i.e., can another map, or perhaps a join, be added) after ssc.start() has been called, possibly after messages have already been received and processed for some batches?
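To make question 1 concrete, here is a minimal sketch of what I mean (the local master, the host/port, and the filter added after start() are just illustrative):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setMaster("local[2]").setAppName("DagAfterStart")
    val ssc = new StreamingContext(conf, Seconds(1))

    // Initial DAG: socket source -> flatMap -> print
    val lines = ssc.socketTextStream("localhost", 9999)
    val words = lines.flatMap(_.split(" "))
    words.print()

    ssc.start() // computation starts; some batches may already be processed

    // Question: can a new transformation and output operation be
    // attached to the already-running DAG at this point?
    val longWords = words.filter(_.length > 5)
    longWords.print()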
2) In a Spark Streaming program (a single process), can I have multiple series of DStream transformations, each belonging to its own StreamingContext, in the same thread or in different threads? For example:

    val lines_A = ssc_A.socketTextStream(..)
    val words_A = lines_A.flatMap(_.split(" "))
    val wordCounts_A = words_A.map(x => (x, 1)).reduceByKey(_ + _)
    wordCounts_A.print()

    val lines_B = ssc_B.socketTextStream(..)
    val words_B = lines_B.flatMap(_.split(" "))
    val wordCounts_B = words_B.map(x => (x, 1)).reduceByKey(_ + _)
    wordCounts_B.print()

    ssc_A.start()
    ssc_B.start()

I think the answer to both questions is NO, but I am wondering what the reason for this behavior is.

Thanks,
Nickos