[ https://issues.apache.org/jira/browse/HIVE-24761?focusedWorklogId=585725&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-585725 ]
ASF GitHub Bot logged work on HIVE-24761:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 20/Apr/21 11:35
            Start Date: 20/Apr/21 11:35
    Worklog Time Spent: 10m
      Work Description: abstractdog commented on a change in pull request #2099:
URL: https://github.com/apache/hive/pull/2099#discussion_r616599645

##########
File path: ql/src/gen/vectorization/ExpressionTemplates/ColumnArithmeticColumn.txt
##########
@@ -34,20 +34,17 @@ public class <ClassName> extends VectorExpression {
   private static final long serialVersionUID = 1L;

-  private final int colNum1;
   private final int colNum2;

Review comment:
I agree that the current solution is not really clean, since only the first column is put into VectorExpression. A couple of notes here, which need to be discussed before proceeding with this huge refactor (which I'm happy to do once we are 100% certain about the "perfect" solution):

1. unary and binary are not enough; unfortunately, we even have expressions involving more columns. This is not a problem, we have the language support for that :) ternary, quaternary...

2. what's confusing is how to show with simple class names that unary/binary/... is only a story about the input columns. An expression can have constants too, e.g. in IfExprScalarScalar.txt:
```
this.arg1Column = arg1Column;
this.arg2Scalar = arg2Scalar;
this.arg3Scalar = arg3Scalar;
```
in our terminology here, this is a unary expression because of arg1Column + scalars, but in reality, it's obviously not a unary function...

3. with subclasses, we'll have to implement a general VectorExpression.setInputColumnNum(int i, int j, int k, ...vararg); otherwise, we won't be able to change the input column numbers (which is important, as this was the intention of this huge vector expression refactor). I think this will work by simply overriding the vararg method in subclasses.

--
This is an automated message from the Apache Git Service.
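To make point 3 of the review comment concrete, here is a minimal sketch of a vararg setter that subclasses override to enforce their own number of input columns. All class names (`InputColumnSketch`, `BaseExpr`, `TwoColumnExpr`, `OneColumnTwoScalarExpr`) are hypothetical stand-ins invented for this illustration; this is not Hive's VectorExpression API, and the actual refactor may look quite different.

```java
import java.util.Arrays;

public class InputColumnSketch {

  /** Hypothetical stand-in for a base expression class; NOT Hive's VectorExpression. */
  static class BaseExpr {
    protected int[] inputColumnNum = new int[0];

    /** General vararg setter; subclasses override it to enforce their own arity. */
    public void setInputColumnNum(int... cols) {
      inputColumnNum = cols.clone();
    }

    public int[] getInputColumnNum() {
      return inputColumnNum.clone();
    }
  }

  /** Two input columns, as in a column-arithmetic-column expression. */
  static class TwoColumnExpr extends BaseExpr {
    @Override
    public void setInputColumnNum(int... cols) {
      if (cols.length != 2) {
        throw new IllegalArgumentException("expected 2 input columns, got " + cols.length);
      }
      super.setInputColumnNum(cols);
    }
  }

  /** One input column plus two scalars: "unary" only with respect to input columns. */
  static class OneColumnTwoScalarExpr extends BaseExpr {
    final long arg2Scalar;
    final long arg3Scalar;

    OneColumnTwoScalarExpr(int arg1Column, long arg2Scalar, long arg3Scalar) {
      this.arg2Scalar = arg2Scalar;
      this.arg3Scalar = arg3Scalar;
      setInputColumnNum(arg1Column);
    }

    @Override
    public void setInputColumnNum(int... cols) {
      if (cols.length != 1) {
        throw new IllegalArgumentException("expected 1 input column, got " + cols.length);
      }
      super.setInputColumnNum(cols);
    }
  }

  public static void main(String[] args) {
    TwoColumnExpr add = new TwoColumnExpr();
    add.setInputColumnNum(0, 1);
    add.setInputColumnNum(3, 4); // input columns can be remapped after construction
    System.out.println(Arrays.toString(add.getInputColumnNum()));

    OneColumnTwoScalarExpr ifExpr = new OneColumnTwoScalarExpr(2, 10L, 20L);
    ifExpr.setInputColumnNum(5); // scalars are untouched, only the column changes
    System.out.println(Arrays.toString(ifExpr.getInputColumnNum()));
  }
}
```

In this sketch the scalars stay outside the input-column array, which matches the observation in point 2 that the unary/binary naming only describes the input columns.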
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 585725)
    Time Spent: 1.5h  (was: 1h 20m)

> Vectorization: Support PTF - bounded start windows
> --------------------------------------------------
>
>                 Key: HIVE-24761
>                 URL: https://issues.apache.org/jira/browse/HIVE-24761
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: László Bodor
>            Assignee: László Bodor
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> {code}
>  notVectorizedReason: PTF operator: *** only UNBOUNDED start frame is supported
> {code}
> Currently, bounded windows are not supported in VectorPTFOperator. If we
> simply remove the compile-time check:
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java#L2911
> {code}
>       if (!windowFrameDef.isStartUnbounded()) {
>         setOperatorIssue(functionName + " only UNBOUNDED start frame is supported");
>         return false;
>       }
> {code}
> we get incorrect results, because the vectorized codepath completely
> ignores boundaries and simply iterates through all the input batches in
> [VectorPTFGroupBatches|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/ptf/VectorPTFGroupBatches.java#L172]:
> {code}
>     for (VectorPTFEvaluatorBase evaluator : evaluators) {
>       evaluator.evaluateGroupBatch(batch);
>       if (isLastGroupBatch) {
>         evaluator.doLastBatchWork();
>       }
>     }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
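
As a rough illustration of why ignoring the frame start gives incorrect results, the sketch below compares a SUM over a bounded frame (ROWS BETWEEN n PRECEDING AND m FOLLOWING) with a SUM whose start is always the first row of the partition. It is plain Java over a single in-memory partition, not VectorPTFOperator code; the class and method names (`BoundedWindowSketch`, `boundedSum`, `unboundedStartSum`) are made up for this example.

```java
import java.util.Arrays;

public class BoundedWindowSketch {

  /** SUM over ROWS BETWEEN `preceding` PRECEDING AND `following` FOLLOWING in one partition. */
  static long[] boundedSum(long[] values, int preceding, int following) {
    long[] out = new long[values.length];
    for (int row = 0; row < values.length; row++) {
      // Clamp the frame to the partition boundaries.
      int start = Math.max(0, row - preceding);
      int end = Math.min(values.length - 1, row + following);
      long sum = 0;
      for (int i = start; i <= end; i++) {
        sum += values[i];
      }
      out[row] = sum;
    }
    return out;
  }

  /** SUM over ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW: the frame always starts at row 0. */
  static long[] unboundedStartSum(long[] values) {
    long[] out = new long[values.length];
    long running = 0;
    for (int row = 0; row < values.length; row++) {
      running += values[row];
      out[row] = running;
    }
    return out;
  }

  public static void main(String[] args) {
    long[] partition = {1, 2, 3, 4};
    // Frame: ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
    System.out.println(Arrays.toString(boundedSum(partition, 1, 0)));   // [1, 3, 5, 7]
    // What an evaluator that ignores the start boundary would produce
    System.out.println(Arrays.toString(unboundedStartSum(partition)));  // [1, 3, 6, 10]
  }
}
```

The two results diverge from the third row onward, which is the kind of mismatch to expect if the compile-time check is removed without making the evaluators frame-aware.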