[ https://issues.apache.org/jira/browse/HIVE-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chao Sun updated HIVE-10434:
----------------------------
    Summary: Cancel connection when remote Spark driver process has failed [Spark Branch]  (was: Cancel connection to HS2 when remote Spark driver process has failed [Spark Branch])

> Cancel connection when remote Spark driver process has failed [Spark Branch]
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-10434
>                 URL: https://issues.apache.org/jira/browse/HIVE-10434
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>    Affects Versions: 1.2.0
>            Reporter: Chao Sun
>            Assignee: Chao Sun
>         Attachments: HIVE-10434.1-spark.patch
>
>
> Currently in HoS, SparkClientImpl first launches a remote driver process
> and then waits for it to connect back to HS2. However, in certain
> situations (for instance, a permission issue), the remote process may fail
> and exit with an error code. When that happens, HS2 still waits for the
> process to connect, and a full timeout period elapses before it throws an
> exception.
> What makes it worse, the user may have to sit through two timeout periods:
> one for SparkSetReducerParallelism, and another for the actual Spark job.
> This can be very annoying.
> We should cancel the timeout task as soon as we detect that the process has
> failed, and mark the promise as failed.
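For illustration, here is a minimal sketch of the proposed behavior in plain Java. It is not the actual SparkClientImpl/RpcServer code from the attached patch; the class name DriverWatchdog, the use of CompletableFuture in place of the client's real promise type, and the field names are all hypothetical. The point is only the shape of the fix: watch the child process, and on a non-zero exit cancel the pending timeout task and fail the promise immediately rather than waiting out the full timeout.

{code:java}
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class DriverWatchdog {

  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  /**
   * Waits for the remote driver to connect back, but fails fast if the
   * child process dies first instead of sitting out the full timeout.
   * The caller should complete the returned promise (and cancel the
   * timeout) once the driver actually connects.
   */
  public CompletableFuture<Void> awaitConnection(Process driverProcess,
                                                 long timeoutMs) {
    CompletableFuture<Void> connectionPromise = new CompletableFuture<>();

    // Slow path: fail the promise if the driver never connects in time.
    ScheduledFuture<?> timeoutTask = scheduler.schedule(
        () -> connectionPromise.completeExceptionally(
            new RuntimeException("Timed out waiting for driver to connect")),
        timeoutMs, TimeUnit.MILLISECONDS);

    // Fast path: monitor the child process; if it exits with an error code
    // before connecting, cancel the timeout task and fail the promise now.
    Thread monitor = new Thread(() -> {
      try {
        int exitCode = driverProcess.waitFor();
        if (exitCode != 0 && !connectionPromise.isDone()) {
          timeoutTask.cancel(true);
          connectionPromise.completeExceptionally(
              new RuntimeException("Driver process exited with code " + exitCode));
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }, "driver-exit-monitor");
    monitor.setDaemon(true);
    monitor.start();

    return connectionPromise;
  }
}
{code}

With a watchdog like this, a driver that dies on startup (e.g. due to a permission problem) fails the promise within milliseconds, so callers such as SparkSetReducerParallelism no longer stack a second full timeout on top of the first.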