Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21423 )

Change subject: IMPALA-12732: Add support for MERGE statements for Iceberg 
tables
......................................................................


Patch Set 7:

(22 comments)

doing another dump. Still not finished with the analyzer part, but getting 
there soon.

http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/DmlStatementBase.java
File fe/src/main/java/org/apache/impala/analysis/DmlStatementBase.java:

http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/DmlStatementBase.java@92
PS7, Line 92:   protected void checkSubQuery(SlotRef lhsSlotRef, Expr rhsExpr)
These functions, if I don't miss anything, can work without an actual object of 
this class as they receive all information via params and don't need a state. 
Can they be static to the class?


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java
File fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java:

http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java@189
PS7, Line 189: (MergeCase)
Is it necessary to convert the items to MergeCase. I believe they are of 
MergeCase type already.


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java@225
PS7, Line 225:   public Expr getRowPresentExpr() {
Haven't checked all of these but at least some of them are only called within 
this class. They could be private then.


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java@246
PS7, Line 246:    * SELECT /* +straight_join */
nit: could you indent this example SQL for better readability?


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java@247
PS7, Line 247:    * CAST(TupleIsNull(0) + TupleIsNull(1) * 2 AS TINYINT) 
row_present,
this is rather:
CAST(TupleIsNull(0) + CAST(TupleIsNull(1) * 2 AS TINYINT)

1: pls update the comment
2: do we need the inner cast to TINYINT after we multiple a bool with a tinyint?


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java@262
PS7, Line 262: targetColumns
this is rather 'targetSlotRefs'


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java@263
PS7, Line 263: targetTableRef_.getTable()
             :             .getColumns()
             :             .stream()
             :             .map(column
             :                 -> new SlotRef(
             :                 
ImmutableList.of(targetTableRef_.getUniqueAlias(),
             :                     column.getName())))
             :             .collect(Collectors.toList());
not sure if this is auto formatted, but I find it more compact and readable if 
we indent like this:

tblRef.getTable.getColumns().stream()
    .map()
    .collect()


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java@362
PS7, Line 362: sortColumns
this is rather sortColumnPositions or sortColumnIndexes, right?


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java@365
PS7, Line 365: resultExpressions::get
could you write expressions_.resultExpressions()::get (or similar) here and get 
rid of L363?


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java@372
PS7, Line 372: in the same format
nit: what do you mean by 'in the same format'?


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java@374
PS7, Line 374:   protected static class MergeExpressions {
If I'm not mistaken, this class is to store intermediate states of this class, 
but this doesn't really add any extra logic on top. I feel that anything that 
is within this class could be simply members of the outer class. And then we 
would save all these getters/setters. No strong feelings, though.

If we decide to keep this, then I think the members should be public to get rid 
of these getters/setters.


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java@378
PS7, Line 378:       private final List<Integer> sortingOrder_;
this name is confusing with the one 2 lines below. What if we called this 
sortingColumnPositions or such?


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java@391
PS7, Line 391:     private List<Expr> targetExpressions;
nit: member names should have a '_' suffix


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/IcebergMergeImpl.java@399
PS7, Line 399:     public MergeExpressions() {
If we init this with empty lists then it will be pretty ambiguous what part can 
we expect to be populated at what point in the execution. These classes that 
hold a subset of the members of the upper class are more comfortable and 
straightforward to use if they are immutable and being populated at 
construction. Then there is no confusion about whether a specific member is 
populated when we query it.


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeCase.java
File fe/src/main/java/org/apache/impala/analysis/MergeCase.java:

http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeCase.java@27
PS7, Line 27: public abstract class MergeCase extends StatementBase {
comments pls


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeCase.java@28
PS7, Line 28: parent_
nit: I think naming this mergeStmt_ would be more descriptive than 'parent_'


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeCase.java@42
PS7, Line 42:     this.parent_ = parent;
fn could fit into one line. I'll stop commenting these. Would you mind taking a 
look for this through the entire PS?


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeCase.java@85
PS7, Line 85:   public enum MatchType {
nit: pls add some linebreaks between the different sections of this enum.


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeDelete.java
File fe/src/main/java/org/apache/impala/analysis/MergeDelete.java:

http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeDelete.java@39
PS7, Line 39:     for (Column col : parent_.table_.getColumns()) {
we are expected to call analyze() for an object constructed by the no-param 
constructor. So in theory 'parent_' could be null here if the setParent() 
wasn't called. Maybe verify it with a Precondition?


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeUpdate.java
File fe/src/main/java/org/apache/impala/analysis/MergeUpdate.java:

http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeUpdate.java@49
PS7, Line 49:
nit: I'd rather put a linebreak after L52.


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeUpdate.java@57
PS7, Line 57:       Column c = lhs.getResolvedPath().destColumn();
can we expect lhs to be analyzed? I recall getResolvedPath needs that. If we 
can, pls add a Precondition


http://gerrit.cloudera.org:8080/#/c/21423/7/fe/src/main/java/org/apache/impala/analysis/MergeUpdate.java@61
PS7, Line 61:
nit: extra space



--
To view, visit http://gerrit.cloudera.org:8080/21423
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3416a79740eddc446c87f72bf1a85ed3f71af268
Gerrit-Change-Number: 21423
Gerrit-PatchSet: 7
Gerrit-Owner: Peter Rozsa <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Peter Rozsa <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Fri, 14 Jun 2024 13:09:17 +0000
Gerrit-HasComments: Yes

Reply via email to