[ https://issues.apache.org/jira/browse/HIVE-20506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16607111#comment-16607111 ]
Brock Noland commented on HIVE-20506:
-------------------------------------

I had a HOS query pending all night long with a 90 second handshake timeout. I finally cleared the workload this morning and the query started just fine.

{noformat}
2018-09-07 07:11:23,771 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-3]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:12:53,801 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-4]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:14:23,832 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-5]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:15:53,862 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-6]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:17:23,892 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-7]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:18:53,922 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-0]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:20:23,951 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-1]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:21:53,981 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-2]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:23:24,011 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-3]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:24:54,042 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-4]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:26:24,072 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-5]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:27:54,103 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-6]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:29:24,133 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-7]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:30:54,163 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-0]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:32:24,193 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-1]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:33:54,222 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-2]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:35:24,252 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-3]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:36:54,282 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-4]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:38:24,311 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-5]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:39:54,342 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-6]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:41:24,372 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-7]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:42:54,401 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-0]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:44:24,445 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-1]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:45:54,474 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-2]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:47:24,504 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-3]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:48:54,536 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-4]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:50:24,565 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-5]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:51:54,595 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-6]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:53:24,625 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-7]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:54:54,655 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-0]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:56:24,686 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-1]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:57:54,716 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-2]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 07:59:24,748 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-3]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 08:00:54,778 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-4]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 08:02:24,807 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-5]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 08:03:54,837 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-6]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 08:05:24,867 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-7]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 08:06:54,898 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-0]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
2018-09-07 08:08:24,929 INFO org.apache.hive.spark.client.rpc.RpcServer: [RPC-Handler-1]: Extending timeout for client 4c0592e0-b1e4-4f5e-9f42-d31de39995e2
{noformat}

Started after I cleared the queue:

!Screen Shot 2018-09-07 at 8.10.37 AM.png!

> HOS times out when cluster is full while Hive-on-MR waits
> ---------------------------------------------------------
>
>                 Key: HIVE-20506
>                 URL: https://issues.apache.org/jira/browse/HIVE-20506
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Brock Noland
>            Assignee: Brock Noland
>            Priority: Major
>         Attachments: HIVE-20506-CDH5.14.2.patch, Screen Shot 2018-09-07 at 8.10.37 AM.png
>
>
> My understanding is as follows:
> When the cluster is full, Hive-on-MR will wait for resources to be available before submitting a job. This is because the hadoop jar command is the primary mechanism Hive uses to know if a job is complete or failed.
>
> Hive-on-Spark will time out after {{SPARK_RPC_CLIENT_CONNECT_TIMEOUT}} because the RPC client in the AppMaster doesn't connect back to the RPC Server in HS2.
> This is a behavior difference it'd be great to close.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
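For readers following the log above, here is a minimal sketch of the "extend instead of fail" behavior it suggests: each time the handshake timeout fires, the server re-arms the timer if the application is still waiting for cluster resources. This is an illustration under stated assumptions, not the code in the attached patch. The class `HandshakeTimeoutSketch`, the method `await`, the `Result` type, and the `stillQueued` check are all hypothetical names, and the wait is simulated with plain arithmetic rather than real timers or YARN calls.

```java
import java.util.function.LongPredicate;

// Hypothetical sketch, not Hive code: models a handshake timeout that is
// extended while the application is still queued for cluster resources.
class HandshakeTimeoutSketch {

    /** Outcome of waiting for the remote client to connect back. */
    static final class Result {
        final boolean connected;  // true if the client connected before a final timeout
        final int extensions;     // how many times the timeout was extended
        final long elapsedMs;     // simulated time at which the wait ended

        Result(boolean connected, int extensions, long elapsedMs) {
            this.connected = connected;
            this.extensions = extensions;
            this.elapsedMs = elapsedMs;
        }
    }

    /**
     * Simulates the wait on a virtual clock that starts at 0.
     *
     * @param timeoutMs    length of one timeout period (90_000 in the log above)
     * @param connectsAtMs simulated time at which the client would connect
     * @param stillQueued  assumed check: is the app still waiting for resources at time t?
     */
    static Result await(long timeoutMs, long connectsAtMs, LongPredicate stillQueued) {
        long deadline = timeoutMs;
        int extensions = 0;
        // Each iteration is one timeout firing before the client has connected.
        while (connectsAtMs > deadline) {
            if (!stillQueued.test(deadline)) {
                // Not queued and not connected: give up, as an unextended timeout would.
                return new Result(false, extensions, deadline);
            }
            // Still queued: extend the timeout by one more period.
            extensions++;
            deadline += timeoutMs;
        }
        return new Result(true, extensions, connectsAtMs);
    }

    public static void main(String[] args) {
        // A client that connects after 200 s while the app stays queued the whole time.
        Result r = await(90_000L, 200_000L, t -> true);
        System.out.println("connected=" + r.connected + " extensions=" + r.extensions);
    }
}
```

With `stillQueued` always returning false, the same call fails at the first deadline, which corresponds to the timeout behavior described in the issue.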