Thanks for the reply. Are there plans to allow this runtime interactions with a 
dstream context? From the surface they seem doable. What is preventing this to 
work?

Also... I implemented the modifiable windowdstream and it seemed to work good. 
Thanks for the pointer. 

Gino B.

> On Jun 2, 2014, at 7:14 PM, Tathagata Das <[email protected]> wrote:
> 
> Currently Spark Streaming does not support addition/deletion/modification of 
> DStream after the streaming context has been started. 
> Nor can you restart a stopped streaming context. 
> Also, multiple spark contexts (and therefore multiple streaming contexts) 
> cannot be run concurrently in the same JVM. 
> 
> To change the window duration, I would one of the following.
> 
> 1. Stop the previous streaming context, create a new streaming context, and 
> setup the dstreams once again with the new window duration
> 2. Create a custom DStream, say DynamicWindowDStream. Take a look at how 
> WindowedDStream is implemented (pretty simple, just a union over RDDs across 
> time). That should allow you to modify the window duration. However, do make 
> sure you have a maximum window duration that you will ever reach, and make 
> sure you define parentRememberDuration as a "rememberDuration + 
> maxWindowDuration". That fields defines which RDDs can be forgotten, so is 
> sensitive to the window duration. Then you have to take care of correctly 
> (atomically, etc.) modifying the window duration as per your requirements.
> 
> Happy streaming!
> 
> TD
> 
> 
> 
> 
>> On Mon, Jun 2, 2014 at 2:46 PM, lbustelo <[email protected]> wrote:
>> This is a general question about whether Spark Streaming can be interactive
>> like batch Spark jobs. I've read plenty of threads and done my fair bit of
>> experimentation and I'm thinking the answer is NO, but it does not hurt to
>> ask.
>> 
>> More specifically, I would like to be able to do:
>> 1. Add/Remove steps to the Streaming Job
>> 2. Modify Window durations
>> 3. Stop and Restart context.
>> 
>> I've tried the following:
>> 
>> 1. Modify the DStream after it has been started… BOOM! Exceptions
>> everywhere.
>> 
>> 2. Stop the DStream, Make modification, Start… NOT GOOD :( In 0.9.0 I was
>> getting deadlocks. I also tried 1.0.0 and it did not work.
>> 
>> 3. Based on information provided  here
>> <http://apache-spark-user-list.1001560.n3.nabble.com/spark-streaming-and-the-spark-shell-tp3347p3371.html>
>> , I was been able to prototype modifying the RDD computation within a
>> forEachRDD. That is nice, but you are then bounded to the specified batch
>> size. That got me to wanting to modify Window durations. Is changing the
>> Window duration possible?
>> 
>> 4. Tried running multiple streaming context from within a single Driver
>> application and got several exceptions. The first one was bind exception on
>> the web port. Then once the app started getting run (cores were taken but
>> 1st job) it did not run correctly. A lot of
>> "akka.pattern.AskTimeoutException: Timed out"
>> .
>> 
>> I've tried my experiments in 0.9.0, 0.9.1 and 1.0.0 running on Standalone
>> Cluster setup.
>> Thanks in advanced
>> 
>> 
>> 
>> --
>> View this message in context: 
>> http://apache-spark-user-list.1001560.n3.nabble.com/Interactive-modification-of-DStreams-tp6740.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
> 

Reply via email to