[ https://issues.apache.org/jira/browse/HIVE-14362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15431939#comment-15431939 ]
Gopal V commented on HIVE-14362:
--------------------------------

[~pxiong]: tested this patch - running explain analyze seems to disable vectorization for all queries after that point.

{code}
+ HiveConf.setBoolVar(conf, HiveConf.ConfVars.HIVE_VECTORIZATION_ENABLED, false);
{code}

And explain analyze does not actually work.

{code}
2016-08-23T01:13:10,961 INFO [667a4e5f-6194-438f-85d6-339aca3ebecc main] physical.AnnotateRunTimeStatsOptimizer: setRuntimeStatsDir for RS_8
2016-08-23T01:13:10,962 INFO [667a4e5f-6194-438f-85d6-339aca3ebecc main] fs.FSStatsPublisher: created : file:/tmp/gopal/667a4e5f-6194-438f-85d6-339aca3ebecc/hive_2016-08-23_01-13-10_705_7555853843090786759-1/-local-10000/RS_8
{code}

The paths for output are in local dirs, not the HDFS dirs - so the stats written on a machine are not making their way back to the HiveServer2 box.

{code}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: [Error 30002]: StatsPublisher cannot be connected to.There was a error while connecting to the StatsPublisher, and retrying might help. If you dont want the query to fail because accurate statistics could not be collected, set hive.stats.reliable=false
    at org.apache.hadoop.hive.ql.exec.Operator.publishRunTimeStats(Operator.java:1444)
    at org.apache.hadoop.hive.ql.exec.Operator.closeOp(Operator.java:723)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.closeOp(TableScanOperator.java:270)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:691)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:705)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:433)
{code}

> Support explain analyze in Hive
> -------------------------------
>
>                 Key: HIVE-14362
>                 URL: https://issues.apache.org/jira/browse/HIVE-14362
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Pengcheng Xiong
>            Assignee: Pengcheng Xiong
>         Attachments: HIVE-14362.01.patch, HIVE-14362.02.patch,
> compare_on_cluster.pdf
>
>
> Right now all the explain levels only support stats before query runs. We
> would like to have an explain analyze similar to Postgres for real stats
> after query runs. This will help to identify the major gap between
> estimated/real stats and make not only query optimization better but also
> query performance debugging easier.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)