[ https://issues.apache.org/jira/browse/HIVE-15530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812396#comment-15812396 ]
Chaoyu Tang commented on HIVE-15530:
------------------------------------

[~Yibing] The patch looks good. However, I have a small question about this:
{code}
  static boolean columnsIncluded(List<FieldSchema> oldCols, List<FieldSchema> newCols) {
    if (oldCols.size() > newCols.size()) {
      return false;
    } else if (oldCols.size() == newCols.size()) {
      return areSameColumns(oldCols, newCols);
    } else {
      return areSameColumns(oldCols, newCols.subList(0, oldCols.size()));
    }
  }
{code}
When an alter table changes only a column's name and/or position, oldCols.size() equals newCols.size(), but areSameColumns(oldCols, newCols) might return false. In this case, should we still update the column statistics?

> Optimize the column stats update logic in table alteration
> ----------------------------------------------------------
>
>                 Key: HIVE-15530
>                 URL: https://issues.apache.org/jira/browse/HIVE-15530
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Yibing Shi
>            Assignee: Yibing Shi
>         Attachments: HIVE-15530.1.patch, HIVE-15530.2.patch, HIVE-15530.3.patch, HIVE-15530.4.patch
>
>
> Currently, when a table is altered, HMS tries to update the column statistics for the table if any of the conditions below is true:
> # database name is changed
> # table name is changed
> # old columns and new columns are not the same
> As a result, when a column is added to a table, Hive also tries to update column statistics, which is not necessary. We can loosen the last condition by checking whether any existing column has changed. If none has, we don't have to update the stats info.
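
To make the rename/reorder question concrete, here is a minimal, self-contained Java sketch of the check under discussion. It is an illustration only, not the code from the patch: the Col class is a hypothetical stand-in for FieldSchema (name and type only), and areSameColumns is assumed to compare columns position by position, which may differ from the actual helper.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

public class ColumnsIncludedSketch {

    /** Hypothetical stand-in for Hive's FieldSchema: only a name and a type. */
    static final class Col {
        final String name;
        final String type;

        Col(String name, String type) {
            this.name = name;
            this.type = type;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Col)) {
                return false;
            }
            Col c = (Col) o;
            return name.equals(c.name) && type.equals(c.type);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, type);
        }
    }

    // Assumption: columns are "the same" when they match position by position.
    static boolean areSameColumns(List<Col> oldCols, List<Col> newCols) {
        return oldCols.equals(newCols);
    }

    // Same branching as the snippet quoted in the comment.
    static boolean columnsIncluded(List<Col> oldCols, List<Col> newCols) {
        if (oldCols.size() > newCols.size()) {
            return false;
        } else if (oldCols.size() == newCols.size()) {
            return areSameColumns(oldCols, newCols);
        } else {
            return areSameColumns(oldCols, newCols.subList(0, oldCols.size()));
        }
    }

    public static void main(String[] args) {
        List<Col> oldCols = Arrays.asList(new Col("a", "int"), new Col("b", "string"));

        // A column appended at the end: old columns are a prefix of the new ones.
        List<Col> added = Arrays.asList(
                new Col("a", "int"), new Col("b", "string"), new Col("c", "double"));
        // Same size, one column renamed.
        List<Col> renamed = Arrays.asList(new Col("a", "int"), new Col("b2", "string"));
        // Same size, columns swapped.
        List<Col> reordered = Arrays.asList(new Col("b", "string"), new Col("a", "int"));

        System.out.println(columnsIncluded(oldCols, added));     // true
        System.out.println(columnsIncluded(oldCols, renamed));   // false
        System.out.println(columnsIncluded(oldCols, reordered)); // false
    }
}
```

Under these assumptions, only the appended-column case is treated as "included". A pure rename or reorder returns false even though the sizes match, which is the case the question above is about.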