[ https://issues.apache.org/jira/browse/HIVE-16957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16718351#comment-16718351 ]
Jesus Camacho Rodriguez commented on HIVE-16957:
------------------------------------------------

ALTER MV... REBUILD is working correctly. When the incremental rebuild translates into a MERGE operation, i.e., the MV contains a Group By statement, column stats are not present because the MERGE in turn contains an UPDATE operation, which currently invalidates column stats. When the incremental rebuild translates into an INSERT operation, i.e., the MV does not contain a Group By statement, column stats for the MV are updated correctly.

> Support CTAS for auto gather column stats
> -----------------------------------------
>
>                 Key: HIVE-16957
>                 URL: https://issues.apache.org/jira/browse/HIVE-16957
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Pengcheng Xiong
>            Assignee: Jesus Camacho Rodriguez
>            Priority: Major
>         Attachments: HIVE-16957.patch
>
>
> The idea is to rely as much as possible on the logic in
> ColumnStatsSemanticAnalyzer, as other operations do. In particular, they
> create an 'analyze table t compute statistics for columns' statement, use
> ColumnStatsSemanticAnalyzer to parse it, and connect the resulting plan to
> the existing INSERT/INSERT OVERWRITE statement. The challenge for CTAS or
> CREATE MATERIALIZED VIEW is that the table object does not exist yet, hence
> we cannot rely fully on ColumnStatsSemanticAnalyzer.
> Thus, we use the same process, but ColumnStatsSemanticAnalyzer produces a
> statement for column stats collection that uses a table values clause instead
> of the original table reference:
> {code}
> select compute_stats(col1), compute_stats(col2), compute_stats(col3)
> from table(values(cast(null as int), cast(null as int), cast(null as string))) as t(col1, col2, col3);
> {code}
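
To make the shape of that generated statement concrete, here is a minimal Python sketch that derives it from a list of column names and types. It is an illustration only, not Hive's implementation (ColumnStatsSemanticAnalyzer is Java and works on the parsed plan). The function name `build_stats_query` and its input format are hypothetical, and the single-argument `compute_stats` call simply mirrors the statement quoted in the description.

```python
def build_stats_query(columns):
    """Build a column-stats query over a typed VALUES clause.

    `columns` is a list of (name, hive_type) pairs describing the table that
    the CTAS / CREATE MATERIALIZED VIEW statement is about to create.
    Hypothetical helper: it only reproduces the text of the statement.
    """
    if not columns:
        raise ValueError("at least one column is required")

    names = [name for name, _ in columns]

    # One compute_stats(...) call per column of the target table.
    select_list = ", ".join(f"compute_stats({name})" for name in names)

    # Typed NULL placeholders stand in for the table that does not exist yet,
    # so the query can be analyzed without a real table reference.
    placeholders = ", ".join(f"cast(null as {hive_type})" for _, hive_type in columns)

    # Alias the VALUES clause so the select list can refer to the column names.
    alias = ", ".join(names)

    return f"select {select_list} from table(values({placeholders})) as t({alias})"


query = build_stats_query([("col1", "int"), ("col2", "int"), ("col3", "string")])
print(query)
```

The typed NULL row carries only the column names and types, which is all that is needed to plan the stats collection before the table object exists.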