[ https://issues.apache.org/jira/browse/HIVE-13809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15314482#comment-15314482 ]
Wei Zheng commented on HIVE-13809:
----------------------------------

This ticket depends on HIVE-13934's work.

> Hybrid Grace Hash Join memory usage estimation didn't take into account the
> bloom filter size
> ---------------------------------------------------------------------------------------------
>
>                 Key: HIVE-13809
>                 URL: https://issues.apache.org/jira/browse/HIVE-13809
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 2.0.0, 2.1.0
>            Reporter: Wei Zheng
>            Assignee: Wei Zheng
>
> Memory estimation is important during hash table loading, because we need to
> decide whether to load the next hash partition into memory or spill it. If we
> assume there is enough memory but that turns out not to be the case, we will
> run into an OOM problem.
> Currently, hybrid grace hash join memory usage estimation doesn't take into
> account the bloom filter size. In large test cases (TB scale) the bloom
> filter grows as big as hundreds of MB, big enough to cause estimation error.
> The solution is to include the bloom filter size in the memory estimation.
> Another issue this patch will fix is a possible NPE due to object cache reuse
> during hybrid grace hash join.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)