I have two questions:

1) In a Spark Streaming program, after the various DStream transformations have
been set up, the ssc.start() method is called to start the computation.

Can the underlying DAG change (i.e., by adding another map or maybe a join) after
ssc.start() has been called (and perhaps messages have already been
received/processed for some batches)?
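
To make this concrete, here is roughly what I mean (a minimal sketch; the app
name, host/port, and the extra map stage are just placeholders):

 import org.apache.spark.SparkConf
 import org.apache.spark.streaming.{Seconds, StreamingContext}

 val conf = new SparkConf().setAppName("DagAfterStart").setMaster("local[2]")
 val ssc = new StreamingContext(conf, Seconds(1))

 val lines = ssc.socketTextStream("localhost", 9999)
 lines.print()

 ssc.start()  // computation is now running

 // Can the DAG still change at this point, e.g. by adding another stage?
 val upper = lines.map(_.toUpperCase)
 upper.print()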


2) In a Spark Streaming program (one process), can I have multiple series of
DStream transformations, each series belonging to its own StreamingContext (in
the same thread or in different threads)?

For example:
 val lines_A = ssc_A.socketTextStream(..)
 val words_A = lines_A.flatMap(_.split(" "))
 val wordCounts_A = words_A.map(x => (x, 1)).reduceByKey(_ + _)
 wordCounts_A.print()

 val lines_B = ssc_B.socketTextStream(..)
 val words_B = lines_B.flatMap(_.split(" "))
 val wordCounts_B = words_B.map(x => (x, 1)).reduceByKey(_ + _)
 wordCounts_B.print()

 ssc_A.start()
 ssc_B.start()
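
For reference, this is roughly how I would create ssc_A and ssc_B above (a
sketch sharing one SparkContext; the app name and batch intervals are
placeholders):

 import org.apache.spark.{SparkConf, SparkContext}
 import org.apache.spark.streaming.{Seconds, StreamingContext}

 val sc = new SparkContext(new SparkConf().setAppName("TwoContexts").setMaster("local[4]"))
 val ssc_A = new StreamingContext(sc, Seconds(1))
 val ssc_B = new StreamingContext(sc, Seconds(5))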

I think the answer is NO to both questions, but I am wondering what the reason
for this behavior is.


Thanks,

Nickos
