[ 
https://issues.apache.org/jira/browse/HIVE-9395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283310#comment-14283310
 ] 

Chengxiang Li commented on HIVE-9395:
-------------------------------------

That's a good question. Hive submits the Spark job asynchronously and monitors 
the job status with SparkJobMonitor. All kinds of errors may happen before the 
job gets executed on the Spark cluster, so we need to add a timeout in 
SparkJobMonitor to make sure it does not hang when it can never get the job 
state. This is quite important for our unit tests: once SparkJobMonitor hangs, 
it blocks all the following tests.
How long should we wait before timing out when we cannot get the job state: 
30s, or 60s? Should it be configurable by the user?
My opinion is that we should make it configurable, as users may know more about 
the real cluster, which helps them decide whether it is normal that 
SparkJobMonitor cannot get the job state within a certain time.
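
To make the idea concrete, here is a minimal sketch of a monitor loop with a 
configurable timeout. It is not the code in the attached patch: the class name, 
method name, and parameters are all hypothetical, and the job state is modelled 
as a plain String supplied by a callback rather than by any real Hive or Spark 
API.

```java
import java.util.function.Supplier;

// Hypothetical sketch, not the actual SparkJobMonitor implementation.
public class JobMonitorSketch {

    /**
     * Polls stateSupplier until it returns a non-null state or until
     * timeoutMs has elapsed. Returns the state, or null on timeout
     * (or if the waiting thread is interrupted).
     */
    static String waitForState(Supplier<String> stateSupplier,
                               long timeoutMs,
                               long pollIntervalMs) {
        // Use the monotonic clock so wall-clock adjustments do not matter.
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            String state = stateSupplier.get();
            if (state != null) {
                return state;            // job state is available
            }
            if (System.nanoTime() - deadline >= 0) {
                return null;             // gave up: state was never reported
            }
            try {
                Thread.sleep(pollIntervalMs);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                return null;
            }
        }
    }
}
```

A caller would pass the user-configured timeout as timeoutMs and treat a null 
result as a failed submission, so a job whose state is never reported cannot 
block the tests that follow.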

> Make WAIT_SUBMISSION_TIMEOUT configurable and check timeout in SparkJobMonitor 
> level.[Spark Branch]
> --------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-9395
>                 URL: https://issues.apache.org/jira/browse/HIVE-9395
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Chengxiang Li
>            Assignee: Chengxiang Li
>              Labels: Spark-M5
>         Attachments: HIVE-9395.1-spark.patch
>
>
> SparkJobMonitor may hang if the job state is always returned as null; we should 
> move the timeout check here to avoid it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
