[ 
https://issues.apache.org/jira/browse/HIVE-28960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa resolved HIVE-28960.
-----------------------------------
    Fix Version/s: 4.1.0
       Resolution: Fixed

Merged to master. Thanks [~dkuzmenko] for the review.

> Compaction Stats updater does not collect column stats when 
> hive.stats.autogather is true
> -----------------------------------------------------------------------------------------
>
>                 Key: HIVE-28960
>                 URL: https://issues.apache.org/jira/browse/HIVE-28960
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>            Reporter: Krisztian Kasa
>            Assignee: Krisztian Kasa
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.1.0
>
>
> To reproduce the issue, add a test to {{TestCrudCompactorOnTez.java}}:
> {code:java}
>   private static String TABLE1 = "t1";
>
>   @Test
>   public void testMajorCompaction() throws Exception {
>     // Create a transactional ORC table and insert two rows.
>     executeStatementOnDriver("create table " + TABLE1 + "(a int, b varchar(128), c float) stored as orc TBLPROPERTIES ('transactional'='true')", driver);
>     executeStatementOnDriver("insert into " + TABLE1 + "(a, b, c) values (1, 'one', 1.1)", driver);
>     executeStatementOnDriver("insert into " + TABLE1 + "(a, b, c) values (2, 'two', 2.2)", driver);
>     // The delete makes the existing column stats stale.
>     executeStatementOnDriver("delete from " + TABLE1 + " where a = 1", driver);
>     // Run major compaction, then the cleaner, and check that compaction succeeded.
>     CompactorTestUtil.runCompaction(conf, "default", TABLE1, CompactionType.MAJOR, true);
>     CompactorTestUtil.runCleaner(conf);
>     verifySuccessfulCompaction(1);
>     // Dump the table metadata to inspect COLUMN_STATS_ACCURATE.
>     List<String> result = execSelectAndDumpData("describe formatted " + TABLE1, driver, "");
>   }
> {code}
> In the {{result}} list, the line
> {code:java}
>       COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
> {code}
> shows that only basic stats are accurate.
> The delete operation made the column stats stale, and the stats updater 
> running after the compactor did not collect the column stats because 
> {{hive.stats.autogather}} is true:
> [https://github.com/apache/hive/blob/2495898ae937de2bfd8fe72c63eec2ae905c908c/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/StatsUpdater.java#L84-L88]
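
The check described above can be sketched as a small helper that scans the {{describe formatted}} output for a {{COLUMN_STATS}} entry inside the {{COLUMN_STATS_ACCURATE}} value. This is only an illustrative sketch and is not part of the issue or the fix: the class name {{ColumnStatsCheck}}, the method name {{hasAccurateColumnStats}}, and the sample lines in {{main}} are hypothetical, and the helper assumes the key and its JSON value appear on the same output line, with quotes possibly backslash-escaped as in the dump shown above.

```java
import java.util.List;

public class ColumnStatsCheck {

    // Hypothetical helper: returns true if some COLUMN_STATS_ACCURATE line
    // contains a "COLUMN_STATS" entry in addition to "BASIC_STATS".
    static boolean hasAccurateColumnStats(List<String> describeOutput) {
        for (String line : describeOutput) {
            if (!line.contains("COLUMN_STATS_ACCURATE")) {
                continue;
            }
            // The dumped output may escape quotes with backslashes; strip them
            // so both escaped and unescaped forms are handled the same way.
            String normalized = line.replace("\\", "");
            // Match the quoted JSON key, not the COLUMN_STATS_ACCURATE label.
            if (normalized.contains("\"COLUMN_STATS\":")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed sample lines, modeled on the output quoted in the issue.
        List<String> basicOnly = List.of(
            "COLUMN_STATS_ACCURATE\t{\\\"BASIC_STATS\\\":\\\"true\\\"}");
        List<String> withColumns = List.of(
            "COLUMN_STATS_ACCURATE\t{\\\"BASIC_STATS\\\":\\\"true\\\","
                + "\\\"COLUMN_STATS\\\":{\\\"a\\\":\\\"true\\\"}}");

        System.out.println("basic only   -> " + hasAccurateColumnStats(basicOnly));
        System.out.println("with columns -> " + hasAccurateColumnStats(withColumns));
    }
}
```

In the reproduction test, a helper like this could be applied to the {{result}} list to turn the manual inspection into an assertion.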



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
