The execution of Spark Streaming (started with StreamingContext.start()) can stop in two ways:

1. streamingContext.stop() is called (possibly from a different thread).
2. An exception occurs while the data is being processed.
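
As a rough illustration of those two stop paths, here is a plain-JVM sketch. It is not Spark code: MiniContext and its methods are hypothetical names made up for this example, and the real StreamingContext is implemented differently. It only shows the general pattern of a background worker that ends either on a stop request or on an exception, with a blocking call that rethrows the failure in the waiting thread.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for a streaming context; NOT Spark's API.
class MiniContext {
    private final Runnable batch;              // work done once per "batch"
    private volatile boolean stopped = false;  // set by stop(), read by the worker
    private final CountDownLatch done = new CountDownLatch(1);
    private final AtomicReference<RuntimeException> error = new AtomicReference<>();

    MiniContext(Runnable batch) {
        this.batch = batch;
    }

    // Start processing on a background thread and return immediately.
    void start() {
        Thread worker = new Thread(() -> {
            try {
                while (!stopped) {
                    batch.run();
                    Thread.sleep(10);
                }
            } catch (RuntimeException e) {
                error.set(e);                  // way 2: a batch threw an exception
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                done.countDown();              // wake up anyone waiting
            }
        });
        // Daemon thread: it does not keep the JVM alive once main returns.
        worker.setDaemon(true);
        worker.start();
    }

    // Way 1: request a stop; may be called from any thread.
    void stop() {
        stopped = true;
    }

    // Block the caller until processing stops; rethrow a batch failure here.
    void awaitTermination() {
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        RuntimeException failure = error.get();
        if (failure != null) {
            throw failure;
        }
    }
}
```

In this sketch, a caller that wraps awaitTermination() in try/catch sees the worker's exception in its own thread, and a caller that skips the wait never sees it at all.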

awaitTermination is the right way for the main thread that started the context to stay blocked while processing continues in the background threads. Removing awaitTermination made no difference in your case because of a bug in 0.9.0 that keeps the program from terminating even though the main thread has terminated (one of the background threads is non-daemon). Also, without awaitTermination, it is very hard to catch and print exceptions that occur during the background data processing.

TD

On Thu, Mar 27, 2014 at 7:02 AM, Diana Carroll <dcarr...@cloudera.com> wrote:
> The API docs for ssc.awaitTermination say simply "Wait for the execution to
> stop. Any exceptions that occurs during the execution will be thrown in this
> thread."
>
> Can someone help me understand what this means? What causes execution to
> stop? Why do we need to wait for that to happen?
>
> I tried removing it from my simple NetworkWordCount example (running
> locally, not on a cluster) and nothing changed. In both cases, I end my
> program by hitting Ctrl-C.
>
> Thanks for any insight you can give me.
>
> Diana