yongtaoliao created HIVE-25293:
----------------------------------

             Summary: Alter partitioned table with "cascade" option create too many columns records.
                 Key: HIVE-25293
                 URL: https://issues.apache.org/jira/browse/HIVE-25293
             Project: Hive
          Issue Type: Improvement
          Components: Metastore
    Affects Versions: 3.1.2, 2.3.3
            Reporter: yongtaoliao
            Assignee: yongtaoliao

When a partitioned table is altered with the "cascade" option, all partitions are supposed to be updated. Currently, a new CD_ID is created for each partition, each associated with its own set of column records, which causes a large amount of redundant data in the metastore database.

The following DDL statements reproduce this scenario:

{code:sql}
create table test_table (f1 int) partitioned by (p string);
alter table test_table add partition(p='a');
alter table test_table add partition(p='b');
alter table test_table add partition(p='c');
alter table test_table add columns (f2 int) cascade;
{code}

Before the columns are added, all partitions use the table's `CD_ID`; after the columns are added, each partition uses its own `CD_ID`.

My proposal is that all partitions should use the same `CD_ID` when the table is altered with the "cascade" option.
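
To make the row growth concrete, here is a rough Python sketch that models the state after the `add columns ... cascade` statement in an in-memory SQLite database. It is not Hive code: the table names `CDS`, `SDS` and `COLUMNS_V2` mirror the metastore tables, but their schemas are heavily reduced, and `build_metastore`, `count_rows` and the `share_cd` flag are hypothetical names introduced only for this illustration. Old column descriptors left over from before the alter are ignored.

```python
import sqlite3


def build_metastore(num_partitions, share_cd):
    """Model the column-descriptor rows after ADD COLUMNS ... CASCADE.

    share_cd=False: every partition gets its own CD_ID (behaviour reported here).
    share_cd=True:  every partition reuses the table's CD_ID (the proposal).
    """
    db = sqlite3.connect(":memory:")
    # Simplified stand-ins for the metastore tables (assumption: most
    # columns and all foreign-key constraints are omitted).
    db.executescript("""
        CREATE TABLE CDS (CD_ID INTEGER PRIMARY KEY);
        CREATE TABLE COLUMNS_V2 (
            CD_ID INTEGER, COLUMN_NAME TEXT, TYPE_NAME TEXT, INTEGER_IDX INTEGER
        );
        CREATE TABLE SDS (SD_ID INTEGER PRIMARY KEY, CD_ID INTEGER);
    """)
    columns = [("f1", "int"), ("f2", "int")]  # schema after adding f2

    def new_cd():
        # One CDS row plus one COLUMNS_V2 row per column.
        cd_id = db.execute("INSERT INTO CDS DEFAULT VALUES").lastrowid
        db.executemany(
            "INSERT INTO COLUMNS_V2 VALUES (?, ?, ?, ?)",
            [(cd_id, name, type_, idx) for idx, (name, type_) in enumerate(columns)],
        )
        return cd_id

    table_cd = new_cd()
    db.execute("INSERT INTO SDS (CD_ID) VALUES (?)", (table_cd,))  # table's storage descriptor
    for _ in range(num_partitions):
        cd_id = table_cd if share_cd else new_cd()
        db.execute("INSERT INTO SDS (CD_ID) VALUES (?)", (cd_id,))  # partition's storage descriptor
    return db


def count_rows(db, table):
    """Return the number of rows in one of the model tables."""
    return db.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
```

For the three partitions in the reproduction above, `build_metastore(3, share_cd=False)` ends up with 4 `CDS` rows and 8 `COLUMNS_V2` rows in this model, while `build_metastore(3, share_cd=True)` ends up with 1 and 2. In the first case the column rows scale with the number of partitions times the number of columns, which is the redundancy this issue is about.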