[ https://issues.apache.org/jira/browse/HIVE-24221?focusedWorklogId=523706&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-523706 ]
ASF GitHub Bot logged work on HIVE-24221:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 14/Dec/20 05:02
            Start Date: 14/Dec/20 05:02
    Worklog Time Spent: 10m
      Work Description:
kgyrtkirk commented on a change in pull request #1544:
URL: https://github.com/apache/hive/pull/1544#discussion_r542113182

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java
##########
@@ -233,6 +235,23 @@ public static ExprNodeGenericFuncDesc and(List<ExprNodeDesc> exps) {
     return new ExprNodeGenericFuncDesc(TypeInfoFactory.booleanTypeInfo, new GenericUDFOPAnd(), "and", flatExps);
   }

+  /**
+   * Create an expression for computing a hash by recursively hashing given expressions by two:
+   * <pre>
+   * Input: HASH(A, B, C, D)
+   * Output: HASH(HASH(HASH(A,B),C),D)
+   * </pre>
+   */
+  public static ExprNodeGenericFuncDesc hash(List<ExprNodeDesc> exps) {
+    assert exps.size() >= 2;
+    ExprNodeDesc hashExp = exps.get(0);
+    for (int i = 1; i < exps.size(); i++) {
+      List<ExprNodeDesc> hArgs = Arrays.asList(hashExp, exps.get(i));
+      hashExp = new ExprNodeGenericFuncDesc(TypeInfoFactory.intTypeInfo, new GenericUDFMurmurHash(), "hash", hArgs);

Review comment:
       yes; exactly

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and
use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 523706)
    Time Spent: 50m  (was: 40m)

> Use vectorizable expression to combine multiple columns in semijoin bloom
> filters
> ---------------------------------------------------------------------------------
>
>                 Key: HIVE-24221
>                 URL: https://issues.apache.org/jira/browse/HIVE-24221
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Planning
>         Environment:
>            Reporter: Stamatis Zampetakis
>            Assignee: Stamatis Zampetakis
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> Currently, multi-column semijoin reducers use an n-ary call to
> GenericUDFMurmurHash to combine multiple values into one, which is used as an
> entry to the bloom filter. However, there are no vectorized operators that
> treat n-ary inputs. The same goes for the vectorized implementation of
> GenericUDFMurmurHash introduced in HIVE-23976.
> The goal of this issue is to choose an alternative way to combine multiple
> values into one to pass in the bloom filter comprising only vectorized
> operators.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
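
For readers following the thread, the pairwise combination described in the Javadoc of the diff above (HASH(A, B, C, D) becoming HASH(HASH(HASH(A,B),C),D)) can be illustrated with a small standalone sketch. This is not the Hive code: plain strings stand in for ExprNodeDesc, the text "hash(x,y)" stands in for a two-argument GenericUDFMurmurHash call, and the names BinaryHashFoldSketch and foldHash are hypothetical, chosen only for this illustration.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative stand-in only: strings play the role of Hive's ExprNodeDesc,
// and the text "hash(x,y)" plays the role of a two-argument hash call.
class BinaryHashFoldSketch {

  // Left-fold the inputs into nested two-argument hash calls, so that
  // no call in the resulting expression takes more than two arguments.
  static String foldHash(List<String> exps) {
    if (exps.size() < 2) {
      throw new IllegalArgumentException("need at least two expressions");
    }
    String hashExp = exps.get(0);
    for (int i = 1; i < exps.size(); i++) {
      // Wrap the running result and the next input in a binary call.
      hashExp = "hash(" + hashExp + "," + exps.get(i) + ")";
    }
    return hashExp;
  }

  public static void main(String[] args) {
    // prints hash(hash(hash(A,B),C),D)
    System.out.println(foldHash(Arrays.asList("A", "B", "C", "D")));
  }
}
```

The sketch only shows the shape of the rewritten expression: n inputs yield n-1 nested binary calls, which is the property the issue description asks for, since it states that no vectorized operators handle n-ary inputs.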