[ https://issues.apache.org/jira/browse/HIVE-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sahil Takiar updated HIVE-15396:
--------------------------------
    Description: 
Basic stats are not collected when a managed table is created with a specified 
{{LOCATION}} clause.

{code}
0: jdbc:hive2://localhost:10000> create table hdfs_1 (col int);
0: jdbc:hive2://localhost:10000> describe formatted hdfs_1;
+-------------------------------+----------------------------------------------------+-----------------------------+
|           col_name            |                     data_type                      |           comment           |
+-------------------------------+----------------------------------------------------+-----------------------------+
| # col_name                    | data_type                                          | comment                     |
|                               | NULL                                               | NULL                        |
| col                           | int                                                |                             |
|                               | NULL                                               | NULL                        |
| # Detailed Table Information  | NULL                                               | NULL                        |
| Database:                     | default                                            | NULL                        |
| Owner:                        | anonymous                                          | NULL                        |
| CreateTime:                   | Wed Mar 22 18:09:19 PDT 2017                       | NULL                        |
| LastAccessTime:               | UNKNOWN                                            | NULL                        |
| Retention:                    | 0                                                  | NULL                        |
| Location:                     | file:/warehouse/hdfs_1                             | NULL                        |
| Table Type:                   | MANAGED_TABLE                                      | NULL                        |
| Table Parameters:             | NULL                                               | NULL                        |
|                               | COLUMN_STATS_ACCURATE                              | {\"BASIC_STATS\":\"true\"}  |
|                               | numFiles                                           | 0                           |
|                               | numRows                                            | 0                           |
|                               | rawDataSize                                        | 0                           |
|                               | totalSize                                          | 0                           |
|                               | transient_lastDdlTime                              | 1490231359                  |
|                               | NULL                                               | NULL                        |
| # Storage Information         | NULL                                               | NULL                        |
| SerDe Library:                | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | NULL                        |
| InputFormat:                  | org.apache.hadoop.mapred.TextInputFormat           | NULL                        |
| OutputFormat:                 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | NULL                        |
| Compressed:                   | No                                                 | NULL                        |
| Num Buckets:                  | -1                                                 | NULL                        |
| Bucket Columns:               | []                                                 | NULL                        |
| Sort Columns:                 | []                                                 | NULL                        |
| Storage Desc Params:          | NULL                                               | NULL                        |
|                               | serialization.format                               | 1                           |
+-------------------------------+----------------------------------------------------+-----------------------------+
0: jdbc:hive2://localhost:10000> create table s3_1 (col int) location 's3a://[bucket]/test-tables/s3-1';
0: jdbc:hive2://localhost:10000> describe formatted s3_1;
+-------------------------------+----------------------------------------------------+-----------------------+
|           col_name            |                     data_type                      |        comment        |
+-------------------------------+----------------------------------------------------+-----------------------+
| # col_name                    | data_type                                          | comment               |
|                               | NULL                                               | NULL                  |
| col                           | int                                                |                       |
|                               | NULL                                               | NULL                  |
| # Detailed Table Information  | NULL                                               | NULL                  |
| Database:                     | default                                            | NULL                  |
| Owner:                        | anonymous                                          | NULL                  |
| CreateTime:                   | Wed Mar 22 18:10:01 PDT 2017                       | NULL                  |
| LastAccessTime:               | UNKNOWN                                            | NULL                  |
| Retention:                    | 0                                                  | NULL                  |
| Location:                     | s3a://[bucket]/test-tables/s3-1                    | NULL                  |
| Table Type:                   | MANAGED_TABLE                                      | NULL                  |
| Table Parameters:             | NULL                                               | NULL                  |
|                               | transient_lastDdlTime                              | 1490231401            |
|                               | NULL                                               | NULL                  |
| # Storage Information         | NULL                                               | NULL                  |
| SerDe Library:                | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | NULL                  |
| InputFormat:                  | org.apache.hadoop.mapred.TextInputFormat           | NULL                  |
| OutputFormat:                 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | NULL                  |
| Compressed:                   | No                                                 | NULL                  |
| Num Buckets:                  | -1                                                 | NULL                  |
| Bucket Columns:               | []                                                 | NULL                  |
| Sort Columns:                 | []                                                 | NULL                  |
| Storage Desc Params:          | NULL                                               | NULL                  |
|                               | serialization.format                               | 1                     |
+-------------------------------+----------------------------------------------------+-----------------------+
{code}

The {{describe formatted}} output for the S3 table shows none of the basic 
stats ({{numFiles}}, {{numRows}}, {{rawDataSize}}, {{totalSize}}) that the 
default-location table has. Furthermore, {{numRows}} is not collected when rows 
are inserted into the S3 table.
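
A minimal sketch of how the insert case could be checked, assuming the {{s3_1}} 
table created above. The inserted value is arbitrary, and the {{ANALYZE TABLE}} 
step is only a possible way to compute the stats manually; it has not been 
verified against this bug.

{code}
-- Insert a single row into the table created with an explicit LOCATION.
insert into table s3_1 values (1);
-- Check whether numRows and the other basic stats appear under Table Parameters.
describe formatted s3_1;
-- Ask Hive to compute basic stats explicitly, then compare the output again.
analyze table s3_1 compute statistics;
describe formatted s3_1;
{code}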

  was:
Basic stats are not collected when a managed table is created with a specified 
{{LOCATION}} clause.

{code}
0: jdbc:hive2://localhost:10000> create table hdfs_1 (col int);
0: jdbc:hive2://localhost:10000> describe formatted hdfs_1;
+-------------------------------+----------------------------------------------------+-----------------------------+
|           col_name            |                     data_type                      |           comment           |
+-------------------------------+----------------------------------------------------+-----------------------------+
| # col_name                    | data_type                                          | comment                     |
|                               | NULL                                               | NULL                        |
| col                           | int                                                |                             |
|                               | NULL                                               | NULL                        |
| # Detailed Table Information  | NULL                                               | NULL                        |
| Database:                     | default                                            | NULL                        |
| Owner:                        | anonymous                                          | NULL                        |
| CreateTime:                   | Wed Mar 22 18:09:19 PDT 2017                       | NULL                        |
| LastAccessTime:               | UNKNOWN                                            | NULL                        |
| Retention:                    | 0                                                  | NULL                        |
| Location:                     | file:/Users/stakiar/Documents/idea/apache-hive/warehouse/hdfs_2 | NULL                        |
| Table Type:                   | MANAGED_TABLE                                      | NULL                        |
| Table Parameters:             | NULL                                               | NULL                        |
|                               | COLUMN_STATS_ACCURATE                              | {\"BASIC_STATS\":\"true\"}  |
|                               | numFiles                                           | 0                           |
|                               | numRows                                            | 0                           |
|                               | rawDataSize                                        | 0                           |
|                               | totalSize                                          | 0                           |
|                               | transient_lastDdlTime                              | 1490231359                  |
|                               | NULL                                               | NULL                        |
| # Storage Information         | NULL                                               | NULL                        |
| SerDe Library:                | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | NULL                        |
| InputFormat:                  | org.apache.hadoop.mapred.TextInputFormat           | NULL                        |
| OutputFormat:                 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | NULL                        |
| Compressed:                   | No                                                 | NULL                        |
| Num Buckets:                  | -1                                                 | NULL                        |
| Bucket Columns:               | []                                                 | NULL                        |
| Sort Columns:                 | []                                                 | NULL                        |
| Storage Desc Params:          | NULL                                               | NULL                        |
|                               | serialization.format                               | 1                           |
+-------------------------------+----------------------------------------------------+-----------------------------+
0: jdbc:hive2://localhost:10000> create table s3_1 (col int) location 's3a://[bucket]/test-tables/s3-1';
0: jdbc:hive2://localhost:10000> describe formatted s3_1;
+-------------------------------+----------------------------------------------------+-----------------------+
|           col_name            |                     data_type                      |        comment        |
+-------------------------------+----------------------------------------------------+-----------------------+
| # col_name                    | data_type                                          | comment               |
|                               | NULL                                               | NULL                  |
| col                           | int                                                |                       |
|                               | NULL                                               | NULL                  |
| # Detailed Table Information  | NULL                                               | NULL                  |
| Database:                     | default                                            | NULL                  |
| Owner:                        | anonymous                                          | NULL                  |
| CreateTime:                   | Wed Mar 22 18:10:01 PDT 2017                       | NULL                  |
| LastAccessTime:               | UNKNOWN                                            | NULL                  |
| Retention:                    | 0                                                  | NULL                  |
| Location:                     | s3a://cloudera-dev-hive-on-s3/test-tables/s3-6     | NULL                  |
| Table Type:                   | MANAGED_TABLE                                      | NULL                  |
| Table Parameters:             | NULL                                               | NULL                  |
|                               | transient_lastDdlTime                              | 1490231401            |
|                               | NULL                                               | NULL                  |
| # Storage Information         | NULL                                               | NULL                  |
| SerDe Library:                | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | NULL                  |
| InputFormat:                  | org.apache.hadoop.mapred.TextInputFormat           | NULL                  |
| OutputFormat:                 | org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | NULL                  |
| Compressed:                   | No                                                 | NULL                  |
| Num Buckets:                  | -1                                                 | NULL                  |
| Bucket Columns:               | []                                                 | NULL                  |
| Sort Columns:                 | []                                                 | NULL                  |
| Storage Desc Params:          | NULL                                               | NULL                  |
|                               | serialization.format                               | 1                     |
+-------------------------------+----------------------------------------------------+-----------------------+
{code}


> Basic Stats are not collected for managed tables with LOCATION specified
> ------------------------------------------------------------------------
>
>                 Key: HIVE-15396
>                 URL: https://issues.apache.org/jira/browse/HIVE-15396
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>         Attachments: HIVE-15396.1.patch



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
