Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/21423 )
Change subject: IMPALA-12732: Add support for MERGE statements for Iceberg tables ...................................................................... Patch Set 7: (22 comments) doing another dump. Still not finished with the analyzer part, but getting there soon. http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/DmlStatementBase.java File fe/src/main/java/org/apache/impala/analysis/DmlStatementBase.java: http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/DmlStatementBase.java@92 PS7, Line 92: protected void checkSubQuery(SlotRef lhsSlotRef, Expr rhsExpr) These functions, if I don't miss anything, can work without an actual object of this class as they receive all information via params and don't need a state. Can they be static to the class? http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java File fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java: http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java@189 PS7, Line 189: (MergeCase) Is it necessary to convert the items to MergeCase. I believe they are of MergeCase type already. http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java@225 PS7, Line 225: public Expr getRowPresentExpr() { Haven't checked all of these but at least some of them are only called within this class. They could be private then. http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java@246 PS7, Line 246: * SELECT /* +straight_join */ nit: could you indent this example SQL for better readability? http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java@247 PS7, Line 247: * CAST(TupleIsNull(0) + TupleIsNull(1) * 2 AS TINYINT) row_present, this is rather: CAST(TupleIsNull(0) + CAST(TupleIsNull(1) * 2 AS TINYINT) 1: pls update the comment 2: do we need the inner cast to TINYINT after we multiple a bool with a tinyint? http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java@262 PS7, Line 262: targetColumns this is rather 'targetSlotRefs' http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java@263 PS7, Line 263: targetTableRef_.getTable() : .getColumns() : .stream() : .map(column : -> new SlotRef( : ImmutableList.of(targetTableRef_.getUniqueAlias(), : column.getName()))) : .collect(Collectors.toList()); not sure if this is auto formatted, but I find it more compact and readable if we indent like this: tblRef.getTable.getColumns().stream() .map() .collect() http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java@362 PS7, Line 362: sortColumns this is rather sortColumnPositions or sortColumnIndexes, right? http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java@365 PS7, Line 365: resultExpressions::get could you write expressions_.resultExpressions()::get (or similar) here and get rid of L363? http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java@372 PS7, Line 372: in the same format nit: what do you mean by 'in the same format'? http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java@374 PS7, Line 374: protected static class MergeExpressions { If I'm not mistaken, this class is to store intermediate states of this class, but this doesn't really add any extra logic on top. I feel that anything that is within this class could be simply members of the outer class. And then we would save all these getters/setters. No strong feelings, though. If we decide to keep this, then I think the members should be public to get rid of these getters/setters. http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java@378 PS7, Line 378: private final List<Integer> sortingOrder_; this name is confusing with the one 2 lines below. What if we called this sortingColumnPositions or such? http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java@391 PS7, Line 391: private List<Expr> targetExpressions; nit: member names should have a '_' suffix http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java@399 PS7, Line 399: public MergeExpressions() { If we init this with empty lists then it will be pretty ambiguous what part can we expect to be populated at what point in the execution. These classes that hold a subset of the members of the upper class are more comfortable and straightforward to use if they are immutable and being populated at construction. Then there is no confusion about whether a specific member is populated when we query it. http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeCase.java File fe/src/main/java/org/apache/impala/analysis/MergeCase.java: http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeCase.java@27 PS7, Line 27: public abstract class MergeCase extends StatementBase { comments pls http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeCase.java@28 PS7, Line 28: parent_ nit: I think naming this mergeStmt_ would be more descriptive than 'parent_' http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeCase.java@42 PS7, Line 42: this.parent_ = parent; fn could fit into one line. I'll stop commenting these. Would you mind taking a look for this through the entire PS? http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeCase.java@85 PS7, Line 85: public enum MatchType { nit: pls add some linebreaks between the different sections of this enum. http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeDelete.java File fe/src/main/java/org/apache/impala/analysis/MergeDelete.java: http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeDelete.java@39 PS7, Line 39: for (Column col : parent_.table_.getColumns()) { we are expected to call analyze() for an object constructed by the no-param constructor. So in theory 'parent_' could be null here if the setParent() wasn't called. Maybe verify it with a Precondition? http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeUpdate.java File fe/src/main/java/org/apache/impala/analysis/MergeUpdate.java: http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeUpdate.java@49 PS7, Line 49: nit: I'd rather put a linebreak after L52. http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeUpdate.java@57 PS7, Line 57: Column c = lhs.getResolvedPath().destColumn(); can we expect lhs to be analyzed? I recall getResolvedPath needs that. If we can, pls add a Precondition http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeUpdate.java@61 PS7, Line 61: nit: extra space -- To view, visit http://gerrit.cloudera.org:8080/21423 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3416a79740eddc446c87f72bf1a85ed3f71af268 Gerrit-Change-Number: 21423 Gerrit-PatchSet: 7 Gerrit-Owner: Peter Rozsa <[email protected]> Gerrit-Reviewer: Daniel Becker <[email protected]> Gerrit-Reviewer: Gabor Kaszab <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]> Gerrit-Reviewer: Peter Rozsa <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Fri, 14 Jun 2024 13:09:17 +0000 Gerrit-HasComments: Yes
