[ https://issues.apache.org/jira/browse/HIVE-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14199824#comment-14199824 ]
Hive QA commented on HIVE-8509:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12679785/HIVE-8509-spark.patch

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 7099 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parallel
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join18
org.apache.hadoop.hive.ql.io.parquet.serde.TestParquetTimestampUtils.testTimezone
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json
org.apache.hive.minikdc.TestJdbcWithMiniKdc.testNegativeTokenAuth
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/316/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/316/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-316/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12679785 - PreCommit-HIVE-SPARK-Build

> UT: fix list_bucket_dml_2 test
> ------------------------------
>
>                 Key: HIVE-8509
>                 URL: https://issues.apache.org/jira/browse/HIVE-8509
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Thomas Friedrich
>            Assignee: Chinna Rao Lalam
>            Priority: Minor
>         Attachments: HIVE-8509-spark.patch, HIVE-8509-spark.patch
>
>
> The test list_bucket_dml_2 fails in FileSinkOperator.publishStats:
> org.apache.hadoop.hive.ql.metadata.HiveException: [Error 30002]:
> StatsPublisher cannot be connected to.There was a error while connecting to
> the StatsPublisher, and retrying might help. If you dont want the query to
> fail because accurate statistics could not be collected, set
> hive.stats.reliable=false
>   at org.apache.hadoop.hive.ql.exec.FileSinkOperator.publishStats(FileSinkOperator.java:1079)
>   at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:971)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:582)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:594)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:594)
>   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:594)
>   at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:175)
>   at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:57)
>   at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.hasNext(HiveBaseFunctionResultList.java:121)
>
> I debugged and found that FileSinkOperator.publishStats throws the exception
> when calling statsPublisher.connect here:
>
> if (!statsPublisher.connect(hconf)) {
>   // just return, stats gathering should not block the main query
>   LOG.error("StatsPublishing error: cannot connect to database");
>   if (isStatsReliable) {
>     throw new HiveException(ErrorMsg.STATSPUBLISHER_CONNECTION_ERROR.getErrorCodedMsg());
>   }
>   return;
> }
>
> With hive.stats.dbclass set to counter in data/conf/spark/hive-site.xml,
> the statsPublisher is of type CounterStatsPublisher.
> In CounterStatsPublisher, connect returns false, which triggers the
> exception, because getReporter() returns null for the MapredContext:
>
> MapredContext context = MapredContext.get();
> if (context == null || context.getReporter() == null) {
>   return false;
> }
>
> When I change hive.stats.dbclass to jdbc:derby in
> data/conf/spark/hive-site.xml, matching the TestCliDriver configuration,
> the test works:
>
> <property>
>   <name>hive.stats.dbclass</name>
>   <!-- <value>counter</value> -->
>   <value>jdbc:derby</value>
>   <description>The default storatge that stores temporary hive statistics.
>   Currently, jdbc, hbase and counter type is supported</description>
> </property>
>
> In addition, I had to generate the out file for the test case for Spark.
> When running this test with TestCliDriver and hive.stats.dbclass set to
> counter, the test case still works. The reporter is set to
> org.apache.hadoop.mapred.Task$TaskReporter.
> It might need some additional investigation into why the
> CounterStatsPublisher has no reporter when running on Spark.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
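
For readers following the issue description above, the interaction between the reporter check and hive.stats.reliable can be sketched in isolation. This is only a minimal model, not Hive code: the class StatsConnectSketch and its nested Reporter and Context types are hypothetical stand-ins for the reporter and MapredContext, and IllegalStateException stands in for HiveException. It assumes nothing about Hive beyond the two snippets quoted in the issue.

```java
public class StatsConnectSketch {
    /** Hypothetical stand-in for the task reporter that counters are written to. */
    interface Reporter {}

    /** Hypothetical stand-in for MapredContext; the reporter may be null. */
    static final class Context {
        private final Reporter reporter;

        Context(Reporter reporter) {
            this.reporter = reporter;
        }

        Reporter getReporter() {
            return reporter;
        }
    }

    /** Mirrors the quoted check: a missing context or reporter means connect fails. */
    static boolean connect(Context context) {
        return context != null && context.getReporter() != null;
    }

    /**
     * Mirrors the quoted publishStats branch.
     * Returns true if stats would be published, false if they are silently skipped,
     * and throws when the connection fails and reliable stats are required.
     */
    static boolean publishStats(Context context, boolean isStatsReliable) {
        if (!connect(context)) {
            if (isStatsReliable) {
                // Stand-in for HiveException with STATSPUBLISHER_CONNECTION_ERROR
                throw new IllegalStateException("StatsPublisher cannot be connected to");
            }
            return false; // stats gathering does not block the main query
        }
        return true;
    }

    public static void main(String[] args) {
        Context withReporter = new Context(new Reporter() {});
        Context withoutReporter = new Context(null);

        System.out.println(publishStats(withReporter, true));
        System.out.println(publishStats(withoutReporter, false));
        try {
            publishStats(withoutReporter, true);
        } catch (IllegalStateException e) {
            System.out.println("query fails: " + e.getMessage());
        }
    }
}
```

In this model, a context without a reporter only fails the query when reliable stats are required, which matches the behaviour reported for the Spark run; it does not explain why the reporter is missing there.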