[ 
https://issues.apache.org/jira/browse/HIVE-26335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangdonglin updated HIVE-26335:
--------------------------------
    Description: 
Hi,

   I found that when partition A already exists, calling 
Hive.loadPartition to load data into partition A again does not update the 
partition parameters stored in the PARTITION_PARAMS table, even when 
hasFollowingStatsTask=false.

   The reason is as follows. In Hive.loadPartition, newTPart is set to 
oldPart when the old partition exists, so when alter_partition is called, 
the oldPart info is sent to the metastore and the partition params are not 
updated.
{code:java}
Partition newTPart = oldPart != null
    ? oldPart
    : new Partition(tbl, partSpec, newPartPath);
...
if (oldPart == null) {
  // ...
} else {
  setStatsPropAndAlterPartition(hasFollowingStatsTask, tbl, newTPart);
}

private void setStatsPropAndAlterPartition(boolean hasFollowingStatsTask,
    Table tbl, Partition newTPart) throws MetaException, TException {
  EnvironmentContext environmentContext = null;
  if (hasFollowingStatsTask) {
    environmentContext = new EnvironmentContext();
    environmentContext.putToProperties(
        StatsSetupConst.DO_NOT_UPDATE_STATS, StatsSetupConst.TRUE);
  }

  LOG.debug("Altering existing partition " + newTPart.getSpec());
  getSychronizedMSC().alter_partition(tbl.getDbName(), tbl.getTableName(),
    newTPart.getTPartition(), environmentContext);
}{code}
   I think we should recompute the numFiles and totalSize of the new partition 
before calling alter_partition in setStatsPropAndAlterPartition.
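
   To make the suggestion above concrete, here is a rough, standalone sketch 
of the recomputation step. It is not Hive code: it uses java.nio on a local 
directory as a stand-in for the partition location, and the class and method 
names (PartitionStatsSketch, refreshBasicStats) are hypothetical. Only the 
parameter keys numFiles and totalSize are taken from the issue. It also 
assumes that every regular file under the directory counts, which may differ 
from how Hive itself filters files.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only; not part of Hive.
class PartitionStatsSketch {

  // Parameter keys named in the issue.
  static final String NUM_FILES = "numFiles";
  static final String TOTAL_SIZE = "totalSize";

  /**
   * Returns a copy of oldParams in which numFiles and totalSize are
   * recomputed from the regular files currently under partitionDir.
   * All other parameters are carried over unchanged.
   */
  static Map<String, String> refreshBasicStats(
      Path partitionDir, Map<String, String> oldParams) throws IOException {
    List<Path> files;
    try (Stream<Path> walk = Files.walk(partitionDir)) {
      files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
    }

    long totalSize = 0;
    for (Path file : files) {
      totalSize += Files.size(file);
    }

    Map<String, String> newParams = new HashMap<>(oldParams);
    newParams.put(NUM_FILES, Long.toString(files.size()));
    newParams.put(TOTAL_SIZE, Long.toString(totalSize));
    return newParams;
  }

  public static void main(String[] args) throws IOException {
    // Simulate a partition that already held one file when its params were stored.
    Path dir = Files.createTempDirectory("partitionA");
    Files.write(dir.resolve("part-0"), new byte[3]);
    Map<String, String> staleParams = new HashMap<>();
    staleParams.put(NUM_FILES, "1");
    staleParams.put(TOTAL_SIZE, "3");

    // A second load adds another file; the stale params no longer match.
    Files.write(dir.resolve("part-1"), new byte[5]);

    Map<String, String> fresh = refreshBasicStats(dir, staleParams);
    System.out.println(NUM_FILES + "=" + fresh.get(NUM_FILES));
    System.out.println(TOTAL_SIZE + "=" + fresh.get(TOTAL_SIZE));
  }
}
```

   In Hive itself the equivalent step would presumably go through the Hadoop 
FileSystem API and the existing stats utilities, and the refreshed values 
would be set on newTPart before alter_partition is called.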

  was:
Hi,

   I found that when partition A already exists, calling 
Hive.loadPartition to load data into partition A again does not update the 
partition parameters stored in the PARTITION_PARAMS table, even when 
hasFollowingStatsTask=false.

   The reason is as follows. In Hive.loadPartition, newTPart is set to 
oldPart when the old partition exists:
{code:java}
Partition newTPart = oldPart != null
    ? oldPart
    : new Partition(tbl, partSpec, newPartPath);
...
if (oldPart == null) {
  // ...
} else {
  setStatsPropAndAlterPartition(hasFollowingStatsTask, tbl, newTPart);
}

private void setStatsPropAndAlterPartition(boolean hasFollowingStatsTask,
    Table tbl, Partition newTPart) throws MetaException, TException {
  EnvironmentContext environmentContext = null;
  if (hasFollowingStatsTask) {
    environmentContext = new EnvironmentContext();
    environmentContext.putToProperties(
        StatsSetupConst.DO_NOT_UPDATE_STATS, StatsSetupConst.TRUE);
  }

  LOG.debug("Altering existing partition " + newTPart.getSpec());
  getSychronizedMSC().alter_partition(tbl.getDbName(), tbl.getTableName(),
    newTPart.getTPartition(), environmentContext);
}{code}
   Because of this, when alter_partition is called, the oldPart info is sent 
to the metastore and the partition params are not updated.

   I think we should recompute the numFiles and totalSize of the new partition 
before calling alter_partition


> Metadata of Partition params was not updated after calling Hive.loadPartition
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-26335
>                 URL: https://issues.apache.org/jira/browse/HIVE-26335
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: All Versions
>            Reporter: zhangdonglin
>            Priority: Major
>
> Hi,
>    I found that when partition A already exists, calling 
> Hive.loadPartition to load data into partition A again does not update the 
> partition parameters stored in the PARTITION_PARAMS table, even when 
> hasFollowingStatsTask=false.
>    The reason is as follows. In Hive.loadPartition, newTPart is set to 
> oldPart when the old partition exists, so when alter_partition is called, 
> the oldPart info is sent to the metastore and the partition params are not 
> updated.
> {code:java}
> Partition newTPart = oldPart != null
>     ? oldPart
>     : new Partition(tbl, partSpec, newPartPath);
> ...
> if (oldPart == null) {
>   // ...
> } else {
>   setStatsPropAndAlterPartition(hasFollowingStatsTask, tbl, newTPart);
> }
>
> private void setStatsPropAndAlterPartition(boolean hasFollowingStatsTask,
>     Table tbl, Partition newTPart) throws MetaException, TException {
>   EnvironmentContext environmentContext = null;
>   if (hasFollowingStatsTask) {
>     environmentContext = new EnvironmentContext();
>     environmentContext.putToProperties(
>         StatsSetupConst.DO_NOT_UPDATE_STATS, StatsSetupConst.TRUE);
>   }
>
>   LOG.debug("Altering existing partition " + newTPart.getSpec());
>   getSychronizedMSC().alter_partition(tbl.getDbName(), tbl.getTableName(),
>     newTPart.getTPartition(), environmentContext);
> }{code}
>    I think we should recompute the numFiles and totalSize of the new 
> partition before calling alter_partition in setStatsPropAndAlterPartition.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
