Hi,

On Fri, Nov 14, 2014 at 3:20 PM, Mayur Rustagi <mayur.rust...@gmail.com>
wrote:

> I wonder if SparkConf is dynamically updated on all worker nodes or only
> during initialization. It could be used to piggyback information;
> otherwise I guess you are stuck with Broadcast.
> Primarily I have had these issues when moving legacy MR operators to
> Spark, where MR piggybacks on the Hadoop conf pretty heavily; in a
> native Spark application it's rarely required. Do you have a use case
> like that?
>

My "usecase" is
http://apache-spark-user-list.1001560.n3.nabble.com/StreamingContext-does-not-stop-td18826.html
– that is, notifying my Spark executors that the StreamingContext has been
shut down. (Even with non-graceful shutdown, Spark doesn't seem to end the
actual execution, just all the Spark-internal timers etc.) I need to do
this properly or processing will go on for a very long time.
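
For reference, the non-graceful stop I am issuing looks roughly like this
(a minimal sketch; ssc is my StreamingContext, and the flag values are
just what I have been trying):

    // stop the streaming machinery immediately (non-graceful), but
    // keep the underlying SparkContext alive
    ssc.stop(stopSparkContext = false, stopGracefully = false)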

I have been trying to misuse a broadcast variable as follows (sketch below):
- create a class with a boolean var, initialized to true
- query this boolean on the executors as a prerequisite to processing the
next item
- when I want to shut down, set the boolean to false and unpersist the
broadcast variable (which should trigger re-delivery).
This is very dirty, but it works with a "local[*]" master. Unfortunately,
when deployed on YARN, the new value never arrives at my executors.
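
In code, the hack looks roughly like this (a sketch only; sc is the
SparkContext, stream is some DStream, and process() is a placeholder for
the actual per-item work):

    // serializable holder around a mutable shutdown flag
    class StopFlag extends Serializable {
      @volatile var keepRunning: Boolean = true
    }

    val flag = new StopFlag
    val stopSignal = sc.broadcast(flag)

    stream.foreachRDD { rdd =>
      rdd.foreachPartition { iter =>
        // executors check the flag before handling each item
        iter.takeWhile(_ => stopSignal.value.keepRunning)
            .foreach(item => process(item)) // process(): placeholder
      }
    }

    // later, on the driver, when shutting down:
    flag.keepRunning = false
    stopSignal.unpersist() // expecting executors to re-fetch the new value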

Any idea what could go wrong on YARN with this approach – or what is a
"good" way to do this?

Thanks
Tobias
