SparkContext is thread safe, and RDDs just describe operations. While I generally agree that you want to model as much as possible as transformations, this is not always possible. In that case, you have no option other than to use threads.
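For what it's worth, here is a rough sketch of what "use threads" can look like from the driver, written in Python with a plain thread pool. It assumes job submission from several driver threads behaves as described above. The helper name `run_actions_concurrently`, the `demo` function, the local master URL and the sample data are all made up for illustration; none of them are part of Spark.

```python
from concurrent.futures import ThreadPoolExecutor


def run_actions_concurrently(rdd, actions, max_workers=4):
    """Run several actions against the same RDD, each from its own thread.

    `actions` maps a label to a callable that takes the RDD and triggers
    one action. Blocks until every action finishes and returns a dict of
    label -> result. (Hypothetical helper, not a Spark API.)
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit everything first so the jobs can overlap...
        futures = {name: pool.submit(action, rdd) for name, action in actions.items()}
        # ...then wait for each result; an exception in a job is re-raised here.
        return {name: future.result() for name, future in futures.items()}


def demo():
    # Needs a PySpark installation; imported lazily so the helper above
    # can be used and tested without Spark on the path.
    from pyspark import SparkContext

    sc = SparkContext("local[4]", "simultaneous-actions")  # assumed local setup
    try:
        # Cache so the concurrent actions reuse the same computed partitions.
        rdd = sc.parallelize(range(1000)).cache()
        results = run_actions_concurrently(
            rdd,
            {
                "count": lambda r: r.count(),
                "total": lambda r: r.sum(),
                "largest": lambda r: r.max(),
            },
        )
        print(results)
    finally:
        sc.stop()
```

Calling `demo()` submits the three actions at roughly the same time. Whether they actually overlap on the cluster depends on the free cores and the scheduler configuration, so treat this as a starting point rather than a recipe.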
Spark's designers should have made all actions return Futures, but alas...

On Friday, 15 January 2016, Jakob Odersky <joder...@gmail.com> wrote:

> I don't think RDDs are threadsafe.
> More fundamentally however, why would you want to run RDD actions in
> parallel? The idea behind RDDs is to provide you with an abstraction for
> computing parallel operations on distributed data. Even if you were to call
> actions from several threads at once, the individual executors of your
> spark environment would still have to perform operations sequentially.
>
> As an alternative, I would suggest to restructure your RDD transformations
> to compute the required results in one single operation.
>
> On 15 January 2016 at 06:18, Jonathan Coveney <jcove...@gmail.com> wrote:
>
>> Threads
>>
>> On Friday, 15 January 2016, Kira <mennou...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Can we run *simultaneous* actions on the *same RDD* ?; if yes how can
>>> this be done ?
>>>
>>> Thank you,
>>> Regards
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/simultaneous-actions-tp25977.html