David Lavati created HIVE-21831:
-----------------------------------

             Summary: Stats should be reset correctly during load of a partitioned ACID table
                 Key: HIVE-21831
                 URL: https://issues.apache.org/jira/browse/HIVE-21831
             Project: Hive
          Issue Type: Bug
          Components: Hive, Import/Export
    Affects Versions: 3.1.1, 3.1.0, 3.0.0
            Reporter: David Lavati
            Assignee: David Lavati

While running something similar to the following example, I noticed that importing a partitioned ACID table stored in the ORC format does not produce table statistics:

{code:java}
set hive.stats.autogather=true;
set hive.stats.column.autogather=true;
set hive.fetch.task.conversion=none;
set hive.support.concurrency=true;
set hive.default.fileformat.managed=ORC;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

create transactional table int_src (foo int, bar int);
insert into int_src select 1,1;

create transactional table int_exp(foo int) partitioned by (bar int);
insert into int_exp select * from int_src;
select count(*) from int_exp;

create transactional table int_imp(foo int) partitioned by (bar int);
EXPORT TABLE int_exp to '/tmp/expint';
IMPORT TABLE int_imp FROM '/tmp/expint';
select count(*) FROM int_imp;
{code}

The final count returned 0 instead of the expected 1 (it was still 0 even with on the order of 100k records), and correct statistics were only available after running compute statistics.

This occurred only with the combination of ACID + partitioning + ORC, and it isn't the expected behavior.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)