Deenar, yes, you may indeed be overthinking it a bit with respect to how Spark executes maps, filters, etc. I'll focus on the high-order bits so it's clear.
Let's assume you're doing this in Java. You'd pass some *MyMapper* instance to *JavaRDD#map(myMapper)*, so you'd have a class *MyMapper extends Function<InType, OutType>*. The *call()* method of that class is effectively the function that will be executed by the workers on your RDD's rows. Within *MyMapper#call()*, you can access static members and methods of *MyMapper* itself; you could implement your *runOnce()* there (a sketch follows below the quoted message).

--
Christopher T. Nguyen
Co-founder & CEO, Adatao <http://adatao.com>
linkedin.com/in/ctnguyen


On Thu, Mar 27, 2014 at 4:20 PM, deenar.toraskar <deenar.toras...@db.com> wrote:

> Christopher
>
> Sorry, I might be missing the obvious, but how do I get my function
> called on all Executors used by the app? I don't want to use RDDs unless
> necessary.
>
> Once I start my shell or app, how do I get
> TaskNonce.getSingleton().doThisOnce() executed on each executor?
>
> @dmpour
> >> rdd.mapPartitions and it would still work as code would only be
> >> executed once in each VM, but was wondering if there is a more
> >> efficient way of doing this by using a generated RDD with one
> >> partition per executor.
> This remark was misleading; what I meant was that, in conjunction with the
> TaskNonce pattern, my function would be called only once per executor as
> long as the RDD had at least one partition on each executor.
>
> Deenar
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Running-a-task-once-on-each-executor-tp3203p3393.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
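
To make that concrete, here is a minimal sketch of what it could look like. TaskNonce, doThisOnce(), and the concrete String types are assumptions reconstructed from this thread, not code from Spark or from your app; it also assumes the Spark 0.9-era Java API, where Function is an abstract class (in later releases it becomes an interface, so you'd write "implements" instead of "extends").

import java.util.concurrent.atomic.AtomicBoolean;
import org.apache.spark.api.java.function.Function;

// Hypothetical singleton guarding a once-per-JVM (i.e. once-per-executor) action.
class TaskNonce {
    private static final TaskNonce SINGLETON = new TaskNonce();
    private final AtomicBoolean done = new AtomicBoolean(false);

    public static TaskNonce getSingleton() { return SINGLETON; }

    public void doThisOnce() {
        // compareAndSet flips the flag exactly once, so the guarded block runs
        // at most once per JVM even if many tasks invoke it concurrently.
        if (done.compareAndSet(false, true)) {
            // ... one-time per-executor setup goes here ...
        }
    }
}

// The mapper whose call() runs on the workers. Spark serializes this instance
// and ships it to the executors, but the static TaskNonce state stays local to
// each executor's JVM, which is what makes the once-per-executor guard work.
class MyMapper extends Function<String, String> {
    @Override
    public String call(String row) throws Exception {
        TaskNonce.getSingleton().doThisOnce();
        return row; // ... your actual per-row transformation ...
    }
}

You'd then invoke it as rdd.map(new MyMapper()). The caveat, as you note yourself, is that the one-time setup only runs on executors that actually receive at least one partition of the RDD.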