[ https://issues.apache.org/jira/browse/HIVE-16854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16044048#comment-16044048 ]
Hive QA commented on HIVE-16854:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12872191/HIVE-16854.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10822 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[create_merge_compressed] (batchId=237)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=140)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=145)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=232)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query78] (batchId=232)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5597/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5597/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5597/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12872191 - PreCommit-HIVE-Build

> SparkClientFactory is locked too aggressively
> ---------------------------------------------
>
>                 Key: HIVE-16854
>                 URL: https://issues.apache.org/jira/browse/HIVE-16854
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>    Affects Versions: 1.1.0
>            Reporter: Xuefu Zhang
>            Assignee: Rui Li
>         Attachments: 15763.jstack, HIVE-16854.patch
>
>
> Most methods in SparkClientFactory are synchronized on the SparkClientFactory
> singleton. However, some methods are very expensive, such as createClient(),
> which returns a SparkClientImpl instance. Creating a SparkClientImpl instance
> requires starting a remote driver that connects back to RPCServer. This
> process can take a long time, for example when the YARN queue is busy. When
> this happens, all pending calls on SparkClientFactory have to wait for a
> long time.
> In our case, hive.spark.client.server.connect.timeout is set to 1hr. This
> makes some queries wait for hours before starting.
> The current implementation effectively serializes all remote driver
> launches. If one of them takes time, the following ones have to wait.
> HS2 stacktrace is attached for reference. It's based on an earlier version of
> Hive, so the line numbers might be slightly off. The following shows the
> locking effect:
> {code}
> xuefu@hadoopservice20-sjc1:~$ grep org.apache.hive.spark.client.SparkClientFactory 15763.jstack
> 	at org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:79)
> 	- waiting to lock <0x00007f78fa1a9cc0> (a java.lang.Class for org.apache.hive.spark.client.SparkClientFactory)
> 	at org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:79)
> 	- waiting to lock <0x00007f78fa1a9cc0> (a java.lang.Class for org.apache.hive.spark.client.SparkClientFactory)
> 	at org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80)
> 	- locked <0x00007f78fa1a9cc0> (a java.lang.Class for org.apache.hive.spark.client.SparkClientFactory)
> 	at org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:79)
> 	- waiting to lock <0x00007f78fa1a9cc0> (a java.lang.Class for org.apache.hive.spark.client.SparkClientFactory)
> 	at org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:79)
> 	- waiting to lock <0x00007f78fa1a9cc0> (a java.lang.Class for org.apache.hive.spark.client.SparkClientFactory)
> 	at org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:79)
> 	- waiting to lock <0x00007f78fa1a9cc0> (a java.lang.Class for org.apache.hive.spark.client.SparkClientFactory)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
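
The jstack output in the issue description shows one thread holding the class-level monitor inside createClient() while five others wait for it. One possible way to narrow that lock can be sketched as below. This is only an illustration under stated assumptions, not the attached HIVE-16854.patch and not the real SparkClientFactory code: the class NarrowLockSketch, its LOCK and initialized fields, and the slowLaunch parameter are all hypothetical names standing in for the factory state and the remote driver launch.

```java
import java.util.concurrent.Callable;

// Hypothetical stand-in for a factory whose createClient() is slow.
// The lock guards only the shared state check, not the slow launch.
class NarrowLockSketch {
    private static final Object LOCK = new Object();
    private static boolean initialized = false;

    static void initialize() {
        synchronized (LOCK) {
            initialized = true;
        }
    }

    static void stop() {
        synchronized (LOCK) {
            initialized = false;
        }
    }

    // slowLaunch is an assumed placeholder for starting the remote driver.
    static <T> T createClient(Callable<T> slowLaunch) throws Exception {
        synchronized (LOCK) {
            // Short critical section: only verify the factory state.
            if (!initialized) {
                throw new IllegalStateException("factory not initialized");
            }
        }
        // The expensive part runs without holding the lock, so other
        // callers are not queued behind it.
        return slowLaunch.call();
    }

    public static void main(String[] args) throws Exception {
        initialize();
        String client = createClient(() -> "client-1");
        System.out.println(client);
        stop();
    }
}
```

A trade-off of this sketch: because the launch happens outside the lock, stop() can complete while a launch is still in progress, so a real change would need to decide how to handle a client that finishes starting after shutdown.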