Dongwook Kwon created HIVE-10631:
------------------------------------

             Summary: create_table_core method has invalid update for Fast Stats
                 Key: HIVE-10631
                 URL: https://issues.apache.org/jira/browse/HIVE-10631
             Project: Hive
          Issue Type: Bug
          Components: Metastore
    Affects Versions: 1.0.0, 0.13.0
            Reporter: Dongwook Kwon
            Priority: Minor


The HiveMetaStore.create_table_core method calls 
MetaStoreUtils.updateUnpartitionedTableStatsFast when hive.stats.autogather is 
on. For a partitioned table, however, this updateUnpartitionedTableStatsFast 
call scans the warehouse directory and does not appear to use the result. 

"Fast Stats" was implemented in HIVE-3959.

https://github.com/apache/hive/blob/branch-1.0/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L1363

From the create_table_core method:
{code}
        if (HiveConf.getBoolVar(hiveConf, HiveConf.ConfVars.HIVESTATSAUTOGATHER) &&
            !MetaStoreUtils.isView(tbl)) {
          if (tbl.getPartitionKeysSize() == 0)  { // Unpartitioned table
            MetaStoreUtils.updateUnpartitionedTableStatsFast(db, tbl, wh, madeDir);
          } else { // Partitioned table with no partitions.
            MetaStoreUtils.updateUnpartitionedTableStatsFast(db, tbl, wh, true);
          }
        }
{code}

In particular, line 1363 (the "Partitioned table with no partitions" branch):
{code}
MetaStoreUtils.updateUnpartitionedTableStatsFast(db, tbl, wh, true);
{code}

This call ends up invoking Warehouse.getFileStatusesForUnpartitionedTable, but 
MetaStoreUtils.updateUnpartitionedTableStatsFast then does nothing with the 
result because the newDir flag is always true.
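
The pattern described above can be sketched in isolation. This is a minimal, 
self-contained model and not the Hive source: CountingLister, updateStatsEager 
and updateStatsLazy are hypothetical names, and the "numFiles" key is used only 
for illustration. It assumes, as the report suggests, that the directory 
listing is performed before the newDir flag is checked.

```java
import java.util.HashMap;
import java.util.Map;

public class FastStatsSketch {

    /** Hypothetical stand-in for a warehouse that lists a table directory. */
    static class CountingLister {
        private final int filesInDir;
        int listCalls = 0; // how many (potentially expensive) listings were issued

        CountingLister(int filesInDir) {
            this.filesInDir = filesInDir;
        }

        int listFiles() {
            listCalls++;
            return filesInDir;
        }
    }

    /** Models the reported behavior: list first, then discard when newDir is true. */
    static boolean updateStatsEager(Map<String, String> params,
                                    CountingLister lister, boolean newDir) {
        int numFiles = lister.listFiles(); // cost is paid even if the result is unused
        if (newDir) {
            return false; // nothing is updated for a freshly created directory
        }
        params.put("numFiles", Integer.toString(numFiles));
        return true;
    }

    /** One possible alternative (an assumption, not the actual fix): check the flag first. */
    static boolean updateStatsLazy(Map<String, String> params,
                                   CountingLister lister, boolean newDir) {
        if (newDir) {
            return false; // skip the listing entirely
        }
        params.put("numFiles", Integer.toString(lister.listFiles()));
        return true;
    }

    public static void main(String[] args) {
        CountingLister eager = new CountingLister(1000);
        CountingLister lazy = new CountingLister(1000);
        updateStatsEager(new HashMap<>(), eager, true);
        updateStatsLazy(new HashMap<>(), lazy, true);
        System.out.println("eager listings: " + eager.listCalls);
        System.out.println("lazy listings: " + lazy.listCalls);
    }
}
```

With newDir hard-coded to true, as in the partitioned-table branch, the eager 
variant still issues one listing per call while the lazy variant issues none. 
On an object store such as S3 that single wasted listing may be a recursive 
scan, which is where the cost would come from.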

The impact of this bug is minor with an HDFS warehouse location 
(hive.metastore.warehouse.dir), but it could be large with an S3 warehouse 
location, especially for tables with large existing partitions.
The impact is heightened with HIVE-6727 when the warehouse location is S3: 
it could recursively scan the wrong S3 directory and do nothing with the result. 
I will add more detail on these cases in the comments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
