[ 
https://issues.apache.org/jira/browse/HIVE-24251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis reassigned HIVE-24251:
------------------------------------------


> Improve bloom filter size estimation for multi column semijoin reducers
> -----------------------------------------------------------------------
>
>                 Key: HIVE-24251
>                 URL: https://issues.apache.org/jira/browse/HIVE-24251
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Planning
>            Reporter: Stamatis Zampetakis
>            Assignee: Stamatis Zampetakis
>            Priority: Major
>
> There are various cases where the expected size of the bloom filter is 
> largely underestimated, making the semijoin reducer completely ineffective. 
> This is more relevant for multi-column semijoin reducers since the current 
> [code|https://github.com/apache/hive/blob/d61c9160ffa5afbd729887c3db690eccd7ef8238/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFBloomFilter.java#L273]
>  does not take them into account.
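
To illustrate why an underestimated entry count makes a bloom filter ineffective, here is a minimal sketch based on the textbook bloom filter sizing formulas. It is not Hive's implementation: the class name BloomSizing, its methods, and the entry counts in main are hypothetical and chosen only for illustration.

```java
public class BloomSizing {

    // Textbook optimal bit count for n expected entries and target
    // false-positive probability p: m = -n * ln(p) / (ln 2)^2
    static long optimalNumBits(long n, double p) {
        return (long) Math.ceil(-n * Math.log(p) / (Math.log(2) * Math.log(2)));
    }

    // Textbook optimal number of hash functions: k = (m / n) * ln 2
    static int optimalNumHashes(long n, long m) {
        return Math.max(1, (int) Math.round((double) m / n * Math.log(2)));
    }

    // Approximate false-positive probability after inserting n entries
    // into a filter with m bits and k hash functions: (1 - e^(-k*n/m))^k
    static double expectedFpp(long n, long m, int k) {
        return Math.pow(1 - Math.exp(-(double) k * n / m), k);
    }

    public static void main(String[] args) {
        long estimatedEntries = 1_000_000L;   // hypothetical planner estimate
        long actualEntries = 20_000_000L;     // hypothetical real key count
        double targetFpp = 0.05;              // hypothetical target rate

        // The filter is sized from the estimate, not from the real count.
        long numBits = optimalNumBits(estimatedEntries, targetFpp);
        int numHashes = optimalNumHashes(estimatedEntries, numBits);

        System.out.printf("bits=%d hashes=%d%n", numBits, numHashes);
        System.out.printf("fpp at estimated entries: %.4f%n",
                expectedFpp(estimatedEntries, numBits, numHashes));
        System.out.printf("fpp at actual entries:    %.4f%n",
                expectedFpp(actualEntries, numBits, numHashes));
    }
}
```

With these assumed numbers, a filter sized for the estimate meets its target only while the estimate holds. Once it receives twenty times as many entries, the false-positive probability approaches 1, so almost every probe passes and the semijoin reducer filters out next to nothing.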



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
