[ https://issues.apache.org/jira/browse/HIVE-22939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zoltan Haindrich reassigned HIVE-22939:
---------------------------------------

> Datasketches support
> --------------------
>
>                 Key: HIVE-22939
>                 URL: https://issues.apache.org/jira/browse/HIVE-22939
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Zoltan Haindrich
>            Assignee: Zoltan Haindrich
>            Priority: Major
>
> We could probably integrate more closely with Datasketches; it has very
> useful algorithms which could be utilized in various ways:
> * provide an optional way to transparently rewrite count(distinct) to use
>   a distinct counting sketch
> * frequent items could be gathered during statistics collection; knowing the
>   most frequent elements could be extremely helpful in giving more accurate
>   estimates for our plans
> * it also has a way to estimate a CDF, which might be useful
>   in giving better estimates for range queries
> https://datasketches.apache.org/

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
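To make the count(distinct) idea from the issue description concrete, here is a minimal, self-contained sketch of one distinct counting technique, K-Minimum-Values (KMV). It is only an illustration under stated assumptions: the class name `KMVSketch`, its methods, and the default `k` are hypothetical, and this is neither the Datasketches library's implementation nor how Hive would necessarily perform the rewrite.

```python
import hashlib
import heapq


class KMVSketch:
    """Toy K-Minimum-Values sketch for approximate distinct counting.

    Hypothetical illustration only; not the Datasketches API.
    """

    def __init__(self, k=1024):
        self.k = k
        self._heap = []        # max-heap (negated values) of the k smallest hashes
        self._members = set()  # hashes currently retained, used to skip duplicates

    @staticmethod
    def _hash(value):
        # Map the value to a pseudo-uniform float in [0, 1).
        digest = hashlib.sha1(repr(value).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") / 2.0**64

    def update(self, value):
        h = self._hash(value)
        if h in self._members:
            return  # already retained, so a repeated value changes nothing
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, -h)
            self._members.add(h)
        elif h < -self._heap[0]:
            # New hash is smaller than the current k-th smallest: swap it in.
            evicted = -heapq.heappushpop(self._heap, -h)
            self._members.discard(evicted)
            self._members.add(h)

    def estimate(self):
        if len(self._heap) < self.k:
            # Below capacity every distinct hash is retained, so the count is exact.
            return float(len(self._heap))
        # If n distinct hashes are uniform on [0, 1), the k-th smallest is
        # about k / n, which gives the estimator (k - 1) / kth_smallest.
        return (self.k - 1) / -self._heap[0]


if __name__ == "__main__":
    sketch = KMVSketch(k=1024)
    for i in range(100_000):
        sketch.update(i % 50_000)  # 50,000 distinct values, each seen twice
    print(round(sketch.estimate()))
```

The memory used is bounded by `k` regardless of input size, and the relative error shrinks roughly as 1/sqrt(k); that trade of a small, bounded error for fixed memory is what makes an optional approximate rewrite of count(distinct) attractive.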