Deenar, when you say "just once", have you defined "across multiple <what>"
(e.g., across multiple threads in the same JVM on the same machine)? In
principle you can have multiple executors on the same machine.

In any case, assuming it's the same JVM, have you considered using a
singleton that maintains done/not-done state and is invoked by each of
the task instances (TaskNonce.getSingleton().doThisOnce())? You can, e.g.,
mark the state boolean "transient" to prevent it from going through serdes.
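
For concreteness, here is a minimal sketch of that pattern in Scala.
TaskNonce and doThisOnce are just the illustrative names from above, not
an existing Spark API:

object TaskNonce {
  // JVM-wide done/not-done flag. @volatile makes updates visible across
  // an executor's task threads. Because the flag lives in a Scala object,
  // it never travels with a serialized closure; marking it transient is
  // the analogous fix when the flag sits on an instance that does get
  // serialized.
  @volatile private var done = false

  def getSingleton: TaskNonce.type = this

  // Runs `body` at most once per JVM, even with concurrent callers.
  def doThisOnce(body: => Unit): Unit =
    if (!done) synchronized {
      if (!done) { body; done = true }
    }
}

And one way to invoke it from the driver, assuming sc is your
SparkContext. It has the same caveat as the hash-partitioned map below:
tasks only reach every executor if there are enough partitions:

sc.parallelize(1 to 10000, 100).foreachPartition { _ =>
  TaskNonce.getSingleton.doThisOnce {
    // e.g. check the library's global state, or dump JVM stats
    println("runs at most once in this executor JVM")
  }
}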



--
Christopher T. Nguyen
Co-founder & CEO, Adatao <http://adatao.com>
linkedin.com/in/ctnguyen



On Tue, Mar 25, 2014 at 10:03 AM, deenar.toraskar <deenar.toras...@db.com> wrote:

> Hi
>
> Is there a way in Spark to run a function on each executor just once? I
> have a couple of use cases.
>
> a) I use an external library that is a singleton. It keeps some global
> state and provides some functions to manipulate it (e.g. reclaim memory,
> etc.). I want to check the global state of this library on each executor.
>
> b) To get JVM stats or instrumentation on each executor.
>
> Currently I have a crude way of achieving something similar: I just run a
> map on a large RDD that is hash partitioned, but this does not guarantee
> that the function runs just once on each executor.
>
> Deenar
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Running-a-task-once-on-each-executor-tp3203.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
