[ https://issues.apache.org/jira/browse/HIVE-26335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
zhangdonglin updated HIVE-26335:
--------------------------------
    Description: 
Hi,

I found that when partition A already exists and Hive.loadPartition is called to load data into partition A again, the partition parameters stored in the PARTITION_PARAMS table are not updated, even when I set hasFollowingStatsTask=false.

The reason is as follows. In Hive.loadPartition, newTPart is set to oldPart when the old partition exists, so when alter_partition is called, the oldPart info is sent to the metastore and the partition params are not updated.
{code:java}
Partition newTPart = oldPart != null ? oldPart : new Partition(tbl, partSpec, newPartPath);
...
if (oldPart == null) {
  // ...
} else {
  setStatsPropAndAlterPartition(hasFollowingStatsTask, tbl, newTPart);
}

private void setStatsPropAndAlterPartition(boolean hasFollowingStatsTask, Table tbl,
    Partition newTPart) throws MetaException, TException {
  EnvironmentContext environmentContext = null;
  if (hasFollowingStatsTask) {
    environmentContext = new EnvironmentContext();
    environmentContext.putToProperties(StatsSetupConst.DO_NOT_UPDATE_STATS, StatsSetupConst.TRUE);
  }

  LOG.debug("Altering existing partition " + newTPart.getSpec());
  getSychronizedMSC().alter_partition(tbl.getDbName(), tbl.getTableName(),
      newTPart.getTPartition(), environmentContext);
}{code}
I think we should recompute the numFiles and totalSize of the new partition before calling alter_partition in setStatsPropAndAlterPartition.

  was:
Hi,

I found that when partition A already exists and Hive.loadPartition is called to load data into partition A again, the partition parameters stored in the PARTITION_PARAMS table are not updated, even when I set hasFollowingStatsTask=false.

The reason is as follows. In Hive.loadPartition, newTPart is set to oldPart when the old partition exists.
{code:java}
Partition newTPart = oldPart != null ? oldPart : new Partition(tbl, partSpec, newPartPath);
...
if (oldPart == null) {
  // ...
} else {
  setStatsPropAndAlterPartition(hasFollowingStatsTask, tbl, newTPart);
}

private void setStatsPropAndAlterPartition(boolean hasFollowingStatsTask, Table tbl,
    Partition newTPart) throws MetaException, TException {
  EnvironmentContext environmentContext = null;
  if (hasFollowingStatsTask) {
    environmentContext = new EnvironmentContext();
    environmentContext.putToProperties(StatsSetupConst.DO_NOT_UPDATE_STATS, StatsSetupConst.TRUE);
  }

  LOG.debug("Altering existing partition " + newTPart.getSpec());
  getSychronizedMSC().alter_partition(tbl.getDbName(), tbl.getTableName(),
      newTPart.getTPartition(), environmentContext);
}{code}
Due to this, when alter_partition is called, the oldPart info is sent to the metastore and the partition params are not updated.

I think we should recompute the numFiles and totalSize of the new partition before calling alter_partition.


> Metadata of Partition params is not updated after calling Hive.loadPartition
> ----------------------------------------------------------------------------
>
>                 Key: HIVE-26335
>                 URL: https://issues.apache.org/jira/browse/HIVE-26335
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: All Versions
>            Reporter: zhangdonglin
>            Priority: Major
>
> Hi,
> I found that when partition A already exists and Hive.loadPartition is
> called to load data into partition A again, the partition parameters stored
> in the PARTITION_PARAMS table are not updated, even when I set
> hasFollowingStatsTask=false.
> The reason is as follows. In Hive.loadPartition, newTPart is set to oldPart
> when the old partition exists, so when alter_partition is called, the oldPart
> info is sent to the metastore and the partition params are not updated.
> {code:java}
> Partition newTPart = oldPart != null ? oldPart : new Partition(tbl, partSpec, newPartPath);
> ...
> if (oldPart == null) {
>   // ...
> } else {
>   setStatsPropAndAlterPartition(hasFollowingStatsTask, tbl, newTPart);
> }
>
> private void setStatsPropAndAlterPartition(boolean hasFollowingStatsTask, Table tbl,
>     Partition newTPart) throws MetaException, TException {
>   EnvironmentContext environmentContext = null;
>   if (hasFollowingStatsTask) {
>     environmentContext = new EnvironmentContext();
>     environmentContext.putToProperties(StatsSetupConst.DO_NOT_UPDATE_STATS, StatsSetupConst.TRUE);
>   }
>
>   LOG.debug("Altering existing partition " + newTPart.getSpec());
>   getSychronizedMSC().alter_partition(tbl.getDbName(), tbl.getTableName(),
>       newTPart.getTPartition(), environmentContext);
> }{code}
> I think we should recompute the numFiles and totalSize of the new
> partition before calling alter_partition in setStatsPropAndAlterPartition.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
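
The fix suggested in the description (recompute numFiles and totalSize before calling alter_partition) could look roughly like the sketch below. This is only an illustration of the idea, not Hive code: the class PartitionStatsSketch and the method refreshBasicStats are hypothetical names, the partition parameters are modelled as a plain Map, and the local filesystem (java.nio.file) stands in for the Hadoop FileSystem that Hive would actually use. Only the parameter names numFiles and totalSize come from the report.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Stream;

public class PartitionStatsSketch {
  // Parameter names taken from the report; everything else here is hypothetical.
  static final String NUM_FILES = "numFiles";
  static final String TOTAL_SIZE = "totalSize";

  /**
   * Recounts the regular files directly under partitionDir and overwrites the
   * stale numFiles/totalSize entries in params. Assumption: a flat partition
   * directory on the local filesystem; subdirectories are not descended into.
   */
  static void refreshBasicStats(Path partitionDir, Map<String, String> params)
      throws IOException {
    long numFiles = 0;
    long totalSize = 0;
    try (Stream<Path> entries = Files.list(partitionDir)) {
      for (Path p : (Iterable<Path>) entries::iterator) {
        if (Files.isRegularFile(p)) {
          numFiles++;
          totalSize += Files.size(p);
        }
      }
    }
    params.put(NUM_FILES, Long.toString(numFiles));
    params.put(TOTAL_SIZE, Long.toString(totalSize));
  }

  public static void main(String[] args) throws IOException {
    // Simulate an existing partition whose stored params are stale.
    Path dir = Files.createTempDirectory("partitionA");
    Files.write(dir.resolve("000000_0"), new byte[10]);
    Files.write(dir.resolve("000001_0"), new byte[5]);

    Map<String, String> params = new HashMap<>();
    params.put(NUM_FILES, "1");
    params.put(TOTAL_SIZE, "10");

    // In the real code path this step would run before alter_partition.
    refreshBasicStats(dir, params);
    System.out.println(params.get(NUM_FILES) + " files, "
        + params.get(TOTAL_SIZE) + " bytes");
  }
}
```

In Hive itself, the equivalent step would presumably update the parameters of newTPart before setStatsPropAndAlterPartition sends it to the metastore, and only when hasFollowingStatsTask is false; how that interacts with other stats handling is not covered by this sketch.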