[ 
https://issues.apache.org/jira/browse/HIVE-22939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich reassigned HIVE-22939:
---------------------------------------


> Datasketches support
> --------------------
>
>                 Key: HIVE-22939
>                 URL: https://issues.apache.org/jira/browse/HIVE-22939
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Zoltan Haindrich
>            Assignee: Zoltan Haindrich
>            Priority: Major
>
> We could probably integrate more closely with Datasketches; it has very
> useful algorithms which could be utilized in various ways:
> * provide an optional way to transparently rewrite count(distinct) to use
> a distinct-counting sketch
> * frequent items could be gathered during statistics collection; knowing the
> most frequent elements could be extremely helpful in giving more accurate
> estimates for our plans
> * it also has a way to estimate a CDF, which might be useful
> in giving better estimates for range queries
> https://datasketches.apache.org/
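
To illustrate the first bullet, here is a minimal sketch of what a distinct-counting sketch does. It is a plain K-minimum-values (KMV) estimator written for illustration only; it is not the Datasketches implementation and does not use its API. The class name KMVSketch, the parameter k, and the SHA-1 based hash are all assumptions made for this example.

```python
import hashlib
import heapq


class KMVSketch:
    """Illustrative K-minimum-values distinct-count estimator (hypothetical)."""

    def __init__(self, k=1024):
        self.k = k
        self._heap = []        # negated hashes: a max-heap over the k smallest hashes
        self._members = set()  # hashes currently retained, to skip duplicates

    @staticmethod
    def _hash(value):
        # Map the value to a pseudo-uniform float in [0, 1).
        digest = hashlib.sha1(repr(value).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") / 2**64

    def update(self, value):
        h = self._hash(value)
        if h in self._members:
            return  # already retained; duplicates do not change the sketch
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, -h)
            self._members.add(h)
        elif h < -self._heap[0]:
            # New hash is smaller than the current k-th smallest: swap it in.
            evicted = -heapq.heappushpop(self._heap, -h)
            self._members.discard(evicted)
            self._members.add(h)

    def estimate(self):
        if len(self._heap) < self.k:
            # Fewer than k distinct hashes seen: the count is exact.
            return float(len(self._heap))
        # Standard KMV estimator: (k - 1) / (k-th smallest hash).
        return (self.k - 1) / (-self._heap[0])


if __name__ == "__main__":
    sketch = KMVSketch(k=1024)
    for i in range(100000):
        sketch.update(i % 50000)  # 50000 distinct values, each seen twice
    print("approximate distinct count:", round(sketch.estimate()))
```

The sketch keeps at most k hashes regardless of input size, which is why an approximate count(distinct) rewrite could be much cheaper than an exact one. With k = 1024 the relative error is expected to be on the order of a few percent.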



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
