Chaoyu Tang created HIVE-16147:
----------------------------------

             Summary: Rename a partitioned table should not drop its partition 
columns stats
                 Key: HIVE-16147
                 URL: https://issues.apache.org/jira/browse/HIVE-16147
             Project: Hive
          Issue Type: Bug
            Reporter: Chaoyu Tang
            Assignee: Chaoyu Tang


When a partitioned table (e.g. sample_pt) is renamed (e.g. to sample_pt_rename), 
describing its partition shows that the partition column stats are still 
accurate, but in fact they have all been dropped.
It can be reproduced as follows:
1. analyze table sample_pt compute statistics for columns;
2. describe formatted default.sample_pt partition (dummy = 3): COLUMN_STATS 
is true for all columns
{code}
...
# Detailed Partition Information                 
Partition Value:        [3]                      
Database:               default                  
Table:                  sample_pt                
CreateTime:             Fri Jan 20 15:42:30 EST 2017     
LastAccessTime:         UNKNOWN                  
Location:               file:/user/hive/warehouse/apache/sample_pt/dummy=3
Partition Parameters:            
        COLUMN_STATS_ACCURATE   
{\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
        last_modified_by        ctang               
        last_modified_time      1485217063          
        numFiles                1                   
        numRows                 100                 
        rawDataSize             5143                
        totalSize               5243                
        transient_lastDdlTime   1488842358    
... 
{code}
3. describe formatted default.sample_pt partition (dummy = 3) salary: the column 
stats exist
{code}
# col_name              data_type               min                     max                     num_nulls               distinct_count          avg_col_len             max_col_len             num_trues               num_falses              comment

salary                  int                     1                       151370                  0                       94                                                                                                                              from deserializer
{code}
4. alter table sample_pt rename to sample_pt_rename;
5. describe formatted default.sample_pt_rename partition (dummy = 3): the 
renamed table's partition (dummy = 3) still shows COLUMN_STATS as true for 
all columns.
{code}
# Detailed Partition Information                 
Partition Value:        [3]                      
Database:               default                  
Table:                  sample_pt_rename         
CreateTime:             Fri Jan 20 15:42:30 EST 2017     
LastAccessTime:         UNKNOWN                  
Location:               file:/user/hive/warehouse/apache/sample_pt_rename/dummy=3
Partition Parameters:            
        COLUMN_STATS_ACCURATE   
{\"BASIC_STATS\":\"true\",\"COLUMN_STATS\":{\"code\":\"true\",\"description\":\"true\",\"salary\":\"true\",\"total_emp\":\"true\"}}
        last_modified_by        ctang               
        last_modified_time      1485217063          
        numFiles                1                   
        numRows                 100                 
        rawDataSize             5143                
        totalSize               5243                
        transient_lastDdlTime   1488842358  
{code}
6. describe formatted default.sample_pt_rename partition (dummy = 3) salary: the 
column stats have been dropped.
{code}
# col_name              data_type               comment

salary                  int                     from deserializer
Time taken: 0.131 seconds, Fetched: 3 row(s)
{code}
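
The inconsistency between the last two outputs can be checked mechanically. The following is a minimal Python sketch, not part of Hive: the helper name stale_column_stats is hypothetical, the parameter value is copied from the output above with the display escaping (\") removed, and the set of columns that actually have stored stats is supplied by hand, assuming (as the description states) that all column stats were dropped by the rename.

```python
import json


def stale_column_stats(column_stats_accurate, columns_with_stats):
    """Return columns that COLUMN_STATS_ACCURATE flags as "true"
    but that have no stored column stats (hypothetical helper)."""
    params = json.loads(column_stats_accurate)
    flagged = params.get("COLUMN_STATS", {})
    return sorted(
        col
        for col, accurate in flagged.items()
        if accurate == "true" and col not in columns_with_stats
    )


# Partition parameter as shown by "describe formatted", unescaped.
param = (
    '{"BASIC_STATS":"true","COLUMN_STATS":{"code":"true",'
    '"description":"true","salary":"true","total_emp":"true"}}'
)

# Before the rename: stats exist for every flagged column (assumed).
print(stale_column_stats(param, {"code", "description", "salary", "total_emp"}))

# After the rename: the stats are gone, but the flags are unchanged.
print(stale_column_stats(param, set()))
```

A non-empty result for the renamed table is the symptom reported here: the partition parameters still claim accurate column stats although none are stored.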



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
