[ https://issues.apache.org/jira/browse/HIVE-11276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630702#comment-14630702 ]
Xuefu Zhang commented on HIVE-11276:
------------------------------------

Hi [~chengxiang li], your analysis is correct. I realized after creating this JIRA that we are not uploading the jars every time, even though refreshLocalResources() is called. This is fine. Dynamic allocation also worked well with the existing implementation. Therefore, this JIRA is "not a problem", and I'm going to close it.

I think we need to pre-warm containers for user sessions that execute only one query and then exit, such as those issued by Oozie. The Spark session can be created right after the user connects to Hive, when the execution engine is Spark. This way, the remote driver and the executors will already be up when the query comes. As part of that, some jars, such as hive-exec.jar, can also be uploaded to HDFS. Of course, connection will be slower, so we need a configuration to turn this on. What do you think?

> Optimization around job submission and adding jars [Spark Branch]
> -----------------------------------------------------------------
>
>                 Key: HIVE-11276
>                 URL: https://issues.apache.org/jira/browse/HIVE-11276
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>    Affects Versions: 1.1.0
>            Reporter: Xuefu Zhang
>            Assignee: Chengxiang Li
>
> It seems that Hive on Spark has some room for performance improvement on job
> submission. Specifically, we are calling refreshLocalResources() for every
> job submission even though there are no changes in the jar list. Since Hive on
> Spark reuses the containers for the whole user session, we might be able
> to optimize that.
> We do need to take into consideration the case of dynamic allocation, in
> which new executors might be added.
> This task is some R&D in this area.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
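The pre-warm proposal above could be sketched roughly as follows. This is a hypothetical illustration, not Hive's actual code: the configuration key `hive.prewarm.spark.session`, the `UserSession`/`SparkSession` classes, and the `onConnect()` hook are all invented for this sketch; the real change would live in HiveServer2's session handling.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the proposed behavior: when a user connects and the execution
// engine is Spark, optionally open the Spark session (remote driver plus
// executors) immediately instead of waiting for the first query.
public class PrewarmSketch {

    // Stand-in for a Spark session whose startup is expensive.
    static class SparkSession {
        final long startedAtMillis = System.currentTimeMillis();
    }

    // Stand-in for a per-user Hive session.
    static class UserSession {
        private final Map<String, String> conf;
        private SparkSession sparkSession; // null until opened

        UserSession(Map<String, String> conf) {
            this.conf = conf;
        }

        // Hypothetical hook, called right after the user connects.
        void onConnect() {
            boolean isSpark = "spark".equals(conf.get("hive.execution.engine"));
            // Hypothetical flag; it should default to false so that connection
            // latency is unchanged unless the user opts in.
            boolean prewarm = Boolean.parseBoolean(
                    conf.getOrDefault("hive.prewarm.spark.session", "false"));
            if (isSpark && prewarm) {
                getSparkSession(); // pay the startup cost now, before any query
            }
        }

        // Lazily opens the Spark session on first use; the same session is
        // then reused for the rest of the user session.
        SparkSession getSparkSession() {
            if (sparkSession == null) {
                sparkSession = new SparkSession();
            }
            return sparkSession;
        }

        boolean isPrewarmed() {
            return sparkSession != null;
        }
    }
}
```

With the flag off (the default), behavior is unchanged: the session is still created lazily on the first call to `getSparkSession()`, so only opted-in connections pay the extra startup cost.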