[ https://issues.apache.org/jira/browse/HIVE-9017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14245051#comment-14245051 ]

Marcelo Vanzin commented on HIVE-9017:
--------------------------------------

In Spark-speak, an "executor" is the JVM that executes tasks. There's no 
established name for the individual threads within an executor; I guess you 
could call them "task runners", but it's rare to see anyone even talk about 
those.

About "can you run more than one executor per host", the answer is yes, but 
it's a little more complicated than that.

In Yarn mode, it's definitely possible, but Yarn doesn't suffer from this 
issue anyway.

In standalone mode, it's unusual. You can achieve that in two ways:

- run with a "local-cluster" master, which HoS uses for testing; people 
shouldn't use that in production.
- run multiple "Worker" daemons on the same host; I don't know whether that's 
supported, but right now Spark standalone has a 1:1 relationship between 
Worker daemons and executors.

But, long story short, you can't delete these files when the executor goes 
down. That would break Yarn mode, and even in standalone mode it's kinda 
sketchy: if the executor dies and is restarted, having these files around can 
avoid re-downloading a large jar from the driver node.

> Clean up temp files of RSC [Spark Branch]
> -----------------------------------------
>
>                 Key: HIVE-9017
>                 URL: https://issues.apache.org/jira/browse/HIVE-9017
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Rui Li
>
> Currently RSC will leave a lot of temp files in {{/tmp}}, including 
> {{*_lock}}, {{*_cache}}, {{spark-submit.*.properties}}, etc.
> We should clean up these files or they will exhaust disk space.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)