Yes, I know that one task represents a JVM thread. That is what confused me. Usually users want to specify memory at the task level, so how can I do that if a task is thread-level and multiple tasks run in the same executor? I don't even know how many threads there will be. Besides that, if one task causes an OOM, it will cause the other tasks in the same executor to fail too. There's no isolation between tasks.
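For what it's worth, the closest thing to a per-task memory setting I could find is limiting how many tasks run concurrently in each executor. A rough sketch below, assuming the standard spark.executor.memory, spark.executor.cores and spark.task.cpus settings (the numbers are made up for illustration); note the per-task share is only an average, nothing enforces it:

import org.apache.spark.{SparkConf, SparkContext}

// Memory is granted per executor (one JVM), not per task. The only lever
// for a per-task "share" is limiting concurrency:
//   concurrent tasks per executor = spark.executor.cores / spark.task.cpus
val conf = new SparkConf()
  .setAppName("per-task-share-sketch")
  .set("spark.executor.memory", "4g") // heap for the whole executor JVM
  .set("spark.executor.cores", "4")   // 4 task slots per executor
  .set("spark.task.cpus", "2")        // each task claims 2 cores
// => at most 4 / 2 = 2 tasks run at once, so each task gets roughly
//    4g / 2 = 2g of heap on average. This is NOT enforced: one task can
//    still allocate more and OOM the shared JVM, killing its siblings.
val sc = new SparkContext(conf)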
On Tue, May 26, 2015 at 4:15 PM, Evo Eftimov <evo.efti...@isecc.com> wrote:

> An Executor is a JVM instance spawned and running on a Cluster Node
> (Server machine). A Task is essentially a JVM Thread – you can have as
> many Threads as you want per JVM. You will also hear about "Executor
> Slots" – these are essentially the CPU Cores available on the machine and
> granted for use to the Executor.
>
> Ps: what creates ongoing confusion here is that the Spark folks have
> "invented" their own terms to describe the design of what is essentially a
> Distributed OO Framework facilitating Parallel Programming and Data
> Management in a Distributed Environment, BUT have not provided a clear
> dictionary/explanation linking these "inventions" to standard concepts
> familiar to every Java, Scala etc. developer.
>
> *From:* canan chen [mailto:ccn...@gmail.com]
> *Sent:* Tuesday, May 26, 2015 9:02 AM
> *To:* user@spark.apache.org
> *Subject:* How does spark manage the memory of executor with multiple
> tasks
>
> Since Spark can run multiple tasks in one executor, I am curious to know
> how Spark manages memory across these tasks. Say one executor takes 1 GB
> of memory; if this executor can run 10 tasks simultaneously, then each
> task can consume 100 MB on average. Do I understand that correctly? It
> doesn't make sense to me that Spark runs multiple tasks in one executor.
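To put numbers on the example from my original question, here is a small sketch of the arithmetic, assuming spark.executor.memory = 1g and 10 task slots (spark.executor.cores = 10 with spark.task.cpus = 1):

// 10 concurrently running task threads share the executor's single heap;
// the "100 MB each" figure is an average share, not an enforced cap.
val executorMemoryMb = 1024              // spark.executor.memory = 1g
val concurrentTasks  = 10                // executor.cores / task.cpus = 10 / 1
val avgPerTaskMb     = executorMemoryMb / concurrentTasks
println(s"average share per task: $avgPerTaskMb MB")  // prints 102 MB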