[ https://issues.apache.org/jira/browse/HIVE-25293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Work on HIVE-25293 started by yongtaoliao.
------------------------------------------
> Alter partitioned table with "cascade" option create too many columns records.
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-25293
>                 URL: https://issues.apache.org/jira/browse/HIVE-25293
>             Project: Hive
>          Issue Type: Improvement
>          Components: Metastore
>    Affects Versions: 2.3.3, 3.1.2
>            Reporter: yongtaoliao
>            Assignee: yongtaoliao
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> When a partitioned table is altered with the "cascade" option, all of its
> partitions are supposed to be updated. Currently, a new CD_ID is created for
> each partition, each associated with its own set of columns, which causes a
> large amount of redundant data in the metastore database.
> The following DDL statements reproduce this scenario:
>
> {code:java}
> create table test_table (f1 int) partitioned by (p string);
> alter table test_table add partition(p='a');
> alter table test_table add partition(p='b');
> alter table test_table add partition(p='c');
> alter table test_table add columns (f2 int) cascade;{code}
> All partitions use the table's `CD_ID` before the columns are added, while
> each partition uses its own `CD_ID` afterwards.
>
> My proposal is that all partitions should use the same `CD_ID` when the table
> is altered with the "cascade" option.
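
The redundancy described above can be illustrated with a small toy model. This is only a sketch, not Hive code: it uses an in-memory SQLite database with heavily reduced stand-ins for the metastore tables `SDS`, `PARTITIONS` and `COLUMNS_V2`, and the helpers `build_metastore` and `count_rows` are hypothetical names introduced for this illustration. It compares the number of column records when every partition gets its own `CD_ID` with the number when all partitions share the table's `CD_ID`.

```python
import sqlite3


def build_metastore(partitions, share_cd):
    """Toy model of the metastore state after 'add columns (f2 int) cascade'.

    share_cd=False: every partition gets its own CD_ID (behaviour reported in the issue).
    share_cd=True:  every partition reuses the table's CD_ID (the proposal).
    """
    db = sqlite3.connect(":memory:")
    # Reduced, assumed stand-ins for the real metastore tables.
    db.executescript("""
        CREATE TABLE SDS (SD_ID INTEGER PRIMARY KEY, CD_ID INTEGER);
        CREATE TABLE PARTITIONS (PART_ID INTEGER PRIMARY KEY, PART_NAME TEXT, SD_ID INTEGER);
        CREATE TABLE COLUMNS_V2 (CD_ID INTEGER, COLUMN_NAME TEXT, TYPE_NAME TEXT, INTEGER_IDX INTEGER);
    """)
    columns = [("f1", "int"), ("f2", "int")]  # schema after f2 was added
    next_cd = 1

    def new_cd():
        # Allocate a column descriptor and write one row per column for it.
        nonlocal next_cd
        cd_id = next_cd
        next_cd += 1
        db.executemany(
            "INSERT INTO COLUMNS_V2 VALUES (?, ?, ?, ?)",
            [(cd_id, name, typ, idx) for idx, (name, typ) in enumerate(columns)],
        )
        return cd_id

    table_cd = new_cd()  # the table's own column descriptor
    for sd_id, value in enumerate(partitions, start=1):
        cd_id = table_cd if share_cd else new_cd()
        db.execute("INSERT INTO SDS VALUES (?, ?)", (sd_id, cd_id))
        db.execute("INSERT INTO PARTITIONS VALUES (?, ?, ?)", (sd_id, f"p={value}", sd_id))
    return db


def count_rows(db):
    """Return (distinct CD_IDs used by partitions, total rows in COLUMNS_V2)."""
    distinct_cds = db.execute(
        "SELECT COUNT(DISTINCT s.CD_ID) FROM PARTITIONS p JOIN SDS s ON s.SD_ID = p.SD_ID"
    ).fetchone()[0]
    column_rows = db.execute("SELECT COUNT(*) FROM COLUMNS_V2").fetchone()[0]
    return distinct_cds, column_rows


if __name__ == "__main__":
    for share in (False, True):
        print("shared CD_ID:", share, count_rows(build_metastore(["a", "b", "c"], share)))
```

In this model the number of column records grows with (partitions x columns) when each partition has its own `CD_ID`, but stays at the column count when the descriptor is shared. A real metastore holds more tables and constraints than shown here.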