[ https://issues.apache.org/jira/browse/HIVE-14162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16324121#comment-16324121 ]
BELUGA BEHR commented on HIVE-14162:
------------------------------------

[~xuefuz] Thanks for the review. The issue here is that holding open a Hive session is relatively lightweight, but holding open a Spark context is very heavyweight on the cluster: it reserves YARN resources that sit idle and cannot be used by other users. For convenience, a user may want to preserve their Hive session, with their session configurations in place, for the entire work day, but would like the Spark resources to be returned to the cluster while, for example, they are out at lunch for 30 minutes. This would require a Hive session timeout of 8-12 hours but a Spark context timeout of 15 minutes. If there are 100 employees using Hue, for example, that's 200 containers reserved and not being used (1 for the AM and 1 for an executor, per session).

> Allow disabling of long running job on Hive On Spark On YARN
> ------------------------------------------------------------
>
>                 Key: HIVE-14162
>                 URL: https://issues.apache.org/jira/browse/HIVE-14162
>             Project: Hive
>          Issue Type: New Feature
>          Components: Spark
>            Reporter: Thomas Scott
>            Assignee: Aihua Xu
>            Priority: Minor
>         Attachments: HIVE-14162.1.patch
>
>
> Hive On Spark launches a long running process on the first query to handle
> all queries for that user session. In some use cases this is not desired, for
> instance when using Hue with large intervals between query executions.
> Could we have a property that would cause long running spark jobs to be
> terminated after each query execution and started again for the next one?

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
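The two-tier timeout described in the comment could be expressed in hive-site.xml roughly as below. This is only a sketch: `hive.server2.idle.session.timeout` is an existing HiveServer2 property, while `hive.spark.session.timeout` and `hive.spark.session.timeout.period` are assumed names for the Spark-session timeout this patch proposes and may differ from what is finally committed.

```xml
<!-- Sketch only: the hive.spark.session.timeout* names below are assumptions
     based on the feature proposed in HIVE-14162, not confirmed property names. -->

<!-- Keep the Hive session (and its per-session configuration) alive all day. -->
<property>
  <name>hive.server2.idle.session.timeout</name>
  <value>8h</value>
</property>

<!-- Tear down an idle Spark context much sooner, returning its YARN
     containers (AM + executors) to the cluster. -->
<property>
  <name>hive.spark.session.timeout</name>
  <value>15m</value>
</property>

<!-- How often to check whether a Spark session has been idle too long. -->
<property>
  <name>hive.spark.session.timeout.period</name>
  <value>60s</value>
</property>
```

With settings like these, a user who steps away for lunch keeps their Hive session and its configuration, but the idle Spark context is closed after 15 minutes and is transparently relaunched on their next query.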