[ https://issues.apache.org/jira/browse/HIVE-23095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17071690#comment-17071690 ]
Zoltan Haindrich commented on HIVE-23095:
-----------------------------------------

The [getSize()|https://github.com/apache/hive/blob/d2ad5b061706a1d3cd55e59c769ed4f2af01cdbe/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/common/ndv/hll/HLLSparseRegister.java#L152] method was adjusted in HIVE-19578 to include the tempList size, which makes {{getSize}} an overestimate of the actual size. Because the SPARSE/DENSE switch happens at a limit value, that switch, triggered [in HyperLogLog.add|https://github.com/apache/hive/blob/d2ad5b061706a1d3cd55e59c769ed4f2af01cdbe/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/common/ndv/hll/HyperLogLog.java#L261], could fire for far fewer values than the limit.

> NDV might be overestimated for a table with ~70 value
> -----------------------------------------------------
>
>                 Key: HIVE-23095
>                 URL: https://issues.apache.org/jira/browse/HIVE-23095
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Zoltan Haindrich
>            Assignee: Zoltan Haindrich
>            Priority: Major
>
> uncovered while looking into HIVE-23082
> https://issues.apache.org/jira/browse/HIVE-23082?focusedCommentId=17067773&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17067773

--
This message was sent by Atlassian Jira
(v8.3.4#803005)