David Lavati created HIVE-21831:
-----------------------------------

             Summary: Stats should be reset correctly during load of a partitioned ACID table
                 Key: HIVE-21831
                 URL: https://issues.apache.org/jira/browse/HIVE-21831
             Project: Hive
          Issue Type: Bug
          Components: Hive, Import/Export
    Affects Versions: 3.1.1, 3.1.0, 3.0.0
            Reporter: David Lavati
            Assignee: David Lavati


While running something similar to the following example, I noticed that 
importing a partitioned ACID table stored in ORC format leaves the target 
table without correct statistics:
{code:sql}
set hive.stats.autogather=true;
set hive.stats.column.autogather=true;
set hive.fetch.task.conversion=none;


set hive.support.concurrency=true;
set hive.default.fileformat.managed=ORC;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;


create transactional table int_src (foo int, bar int);
insert into int_src select 1,1;


create transactional table int_exp(foo int) partitioned by (bar int);
insert into int_exp select * from int_src;
select count(*) from int_exp;


create transactional table int_imp(foo int) partitioned by (bar int);


EXPORT TABLE int_exp to '/tmp/expint';
IMPORT TABLE int_imp FROM '/tmp/expint';


select count(*) FROM int_imp;
{code}
The count returned 0 instead of the expected 1 (it was still 0 even with on 
the order of 100k records), and correct statistics only became available after 
running compute statistics.
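
For reference, the manual workaround mentioned above could look roughly like 
the following. This is only a sketch: it assumes the table and partition column 
names from the reproduction, and that the imported row ended up in partition 
bar=1.
{code:sql}
-- Recompute basic statistics (e.g. numRows) for all partitions of the imported table
ANALYZE TABLE int_imp PARTITION (bar) COMPUTE STATISTICS;
-- Optionally recompute column statistics as well
ANALYZE TABLE int_imp PARTITION (bar) COMPUTE STATISTICS FOR COLUMNS;

-- Inspect the partition parameters, then re-run the count
DESCRIBE FORMATTED int_imp PARTITION (bar=1);
select count(*) FROM int_imp;
{code}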

 

I could only reproduce this with the combination of ACID + partitioning + ORC. 
In any case, this is not the expected behavior.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
