[ https://issues.apache.org/jira/browse/HIVE-9395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283310#comment-14283310 ]
Chengxiang Li commented on HIVE-9395:
-------------------------------------

That's a good question. Hive submits Spark jobs asynchronously and monitors the job status with SparkJobMonitor. All kinds of errors may happen before the job gets executed on the Spark cluster, so we need to add a timeout in SparkJobMonitor to make sure it does not hang when it can never get the job state. This is quite important for our unit tests: once SparkJobMonitor hangs, it blocks all the following tests. When should we decide to time out if we cannot get the job state: after 30s, or 60s? Should it be configurable by the user? My opinion is to make it configurable, since users know more about their actual cluster, which helps them decide whether it is normal for SparkJobMonitor to go without a job state for a certain amount of time.

> Make WAIT_SUBMISSION_TIMEOUT configurable and check timeout at the SparkJobMonitor level. [Spark Branch]
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-9395
>                 URL: https://issues.apache.org/jira/browse/HIVE-9395
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Chengxiang Li
>            Assignee: Chengxiang Li
>              Labels: Spark-M5
>         Attachments: HIVE-9395.1-spark.patch
>
>
> SparkJobMonitor may hang if the job state returns null all the time; we should move the timeout check there to avoid it.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
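To make the proposal concrete, here is a minimal, illustrative sketch of the kind of timeout check discussed above; it is not the actual HIVE-9395 patch. The class name, the JobState enum, the RemoteSparkJobRef interface, and the way the timeout value reaches the monitor are simplified assumptions (in Hive the value would come from a HiveConf property rather than a constructor argument).

{code:java}
// Illustrative sketch only, not the HIVE-9395 patch: the monitor polls the remote
// job state and gives up once no state has been reported for a configurable number
// of seconds, instead of looping forever.
public class SparkJobMonitorSketch {

  // Simplified placeholders for the real remote job abstractions.
  enum JobState { QUEUED, RUNNING, SUCCEEDED, FAILED }

  interface RemoteSparkJobRef {
    JobState getState();   // may return null before the job reaches the cluster
  }

  private final long monitorTimeoutSeconds;

  public SparkJobMonitorSketch(long monitorTimeoutSeconds) {
    // Assumed to be user-configurable; in Hive it would be read from HiveConf.
    this.monitorTimeoutSeconds = monitorTimeoutSeconds;
  }

  /** Returns 0 on success, non-zero on failure or timeout. */
  public int startMonitor(RemoteSparkJobRef jobRef) throws InterruptedException {
    long startTime = System.currentTimeMillis();
    while (true) {
      JobState state = jobRef.getState();
      if (state == null) {
        // No state yet: fail once the timeout elapses rather than hanging forever.
        if (System.currentTimeMillis() - startTime > monitorTimeoutSeconds * 1000L) {
          System.err.println("Job has not been submitted after "
              + monitorTimeoutSeconds + "s, aborting monitor.");
          return 2;
        }
      } else if (state == JobState.SUCCEEDED) {
        return 0;
      } else if (state == JobState.FAILED) {
        return 3;
      }
      Thread.sleep(1000);  // poll interval
    }
  }
}
{code}

With a loop like this, a cluster that never reports a job state causes the monitor to return an error after the configured number of seconds instead of blocking the query and, in testing, every test that follows it.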