[ https://issues.apache.org/jira/browse/HIVE-16572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chaoyu Tang updated HIVE-16572:
-------------------------------
    Status: Patch Available  (was: Open)

> Rename a partition should not drop its column stats
> ---------------------------------------------------
>
>                 Key: HIVE-16572
>                 URL: https://issues.apache.org/jira/browse/HIVE-16572
>             Project: Hive
>          Issue Type: Bug
>          Components: Statistics
>            Reporter: Chaoyu Tang
>            Assignee: Chaoyu Tang
>         Attachments: HIVE-16572.patch
>
>
> The column stats for the table sample_pt partition (dummy=1) are as follows:
> {code}
> hive> describe formatted sample_pt partition (dummy=1) code;
> OK
> # col_name  data_type  min  max  num_nulls  distinct_count  avg_col_len  max_col_len  num_trues  num_falses  comment
>
> code        string               0          303             6.985        7                                   from deserializer
> Time taken: 0.259 seconds, Fetched: 3 row(s)
> {code}
> But when this partition is renamed, for example with
> alter table sample_pt partition (dummy=1) rename to partition (dummy=11);
> the COLUMN_STATS entries in the partition description still report true, but the column stats
> have actually all been deleted.
> {code}
> hive> describe formatted sample_pt partition (dummy=11);
> OK
> # col_name            data_type           comment
>
> code                  string
> description           string
> salary                int
> total_emp             int
>
> # Partition Information
> # col_name            data_type           comment
>
> dummy                 int
>
> # Detailed Partition Information
> Partition Value:      [11]
> Database:             default
> Table:                sample_pt
> CreateTime:           Thu Mar 30 23:03:59 EDT 2017
> LastAccessTime:       UNKNOWN
> Location:             file:/user/hive/warehouse/apache/sample_pt/dummy=11
>
> Partition Parameters:
>     COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
>     numFiles                1
>     numRows                 200
>     rawDataSize             10228
>     totalSize               10428
>     transient_lastDdlTime   1490929439
>
> # Storage Information
> SerDe Library:        org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> InputFormat:          org.apache.hadoop.mapred.TextInputFormat
> OutputFormat:         org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> Compressed:           No
> Num Buckets:          -1
> Bucket Columns:       []
> Sort Columns:         []
> Storage Desc Params:
>     serialization.format    1
> Time taken: 6.783 seconds, Fetched: 37 row(s)
> ===
> hive> describe formatted sample_pt partition (dummy=11) code;
> OK
> # col_name            data_type           comment
>
> code                  string              from deserializer
>
> Time taken: 9.429 seconds, Fetched: 3 row(s)
> {code}
> The column stats should not be dropped when a partition is renamed.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)