Shreepadma Venugopalan created HIVE-3516:
--------------------------------------------

             Summary: Fast incremental statistics computation on column in Hive tables
                 Key: HIVE-3516
                 URL: https://issues.apache.org/jira/browse/HIVE-3516
             Project: Hive
          Issue Type: Bug
          Components: Statistics
            Reporter: Shreepadma Venugopalan
            Assignee: Shreepadma Venugopalan


Statistics computed on Hive columns at the partition level can be rolled up to 
avoid scanning the table again when computing column statistics at the table 
(global) level. While it's straightforward to roll up some statistics, such as 
max, min, avgcollen, and maxcollen, rolling up others, such as ndv (number of 
distinct values), requires maintaining intermediate state. This ticket covers 
two tasks: a) maintaining the intermediate state needed to roll up 
partition-level statistics, and b) detecting when partition-level statistics 
can be rolled up and then computing table-level statistics from them.
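
The roll-up described in this ticket could be sketched roughly as follows. This
is an illustration only, not Hive's implementation: the class
PartitionColumnStats and the functions compute_partition_stats and roll_up are
hypothetical names. The sketch assumes a string column, assumes the
per-partition row count is stored so that avgcollen can be combined as a
weighted mean, and uses an exact set of distinct values as the intermediate
state for ndv. A real system would more likely keep a compact mergeable sketch
(for example HyperLogLog) instead of the full set.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class PartitionColumnStats:
    """Hypothetical per-partition statistics for one string column."""
    num_rows: int
    min_value: str
    max_value: str
    max_col_len: int
    avg_col_len: float
    # Intermediate state for ndv: the exact set of distinct values.
    # (Assumption for illustration; a mergeable sketch would be smaller.)
    distinct_values: FrozenSet[str]

    @property
    def ndv(self) -> int:
        return len(self.distinct_values)


def compute_partition_stats(values: List[str]) -> PartitionColumnStats:
    """Scan one partition's column values once and summarize them."""
    if not values:
        raise ValueError("partition must contain at least one value")
    lengths = [len(v) for v in values]
    return PartitionColumnStats(
        num_rows=len(values),
        min_value=min(values),
        max_value=max(values),
        max_col_len=max(lengths),
        avg_col_len=sum(lengths) / len(values),
        distinct_values=frozenset(values),
    )


def roll_up(parts: List[PartitionColumnStats]) -> PartitionColumnStats:
    """Combine partition-level stats into table-level stats without a rescan."""
    if not parts:
        raise ValueError("need at least one partition")
    total_rows = sum(p.num_rows for p in parts)
    # avgcollen needs the row counts: take a row-weighted mean.
    weighted_len = sum(p.avg_col_len * p.num_rows for p in parts)
    return PartitionColumnStats(
        num_rows=total_rows,
        min_value=min(p.min_value for p in parts),
        max_value=max(p.max_value for p in parts),
        max_col_len=max(p.max_col_len for p in parts),
        avg_col_len=weighted_len / total_rows,
        # ndv cannot be summed; the intermediate states must be merged.
        distinct_values=frozenset().union(*(p.distinct_values for p in parts)),
    )


p1 = compute_partition_stats(["a", "bb", "ccc"])
p2 = compute_partition_stats(["bb", "dddd"])
table_stats = roll_up([p1, p2])
print(table_stats.ndv, p1.ndv + p2.ndv)
```

The value "bb" appears in both partitions, so adding the per-partition ndv
values overcounts. Merging the intermediate state gives the table-level count,
which is why ndv needs more than the final per-partition number.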
