Noemi Pap-Takacs has posted comments on this change. ( http://gerrit.cloudera.org:8080/22407 )
Change subject: IMPALA-12588: Don't UPDATE rows that already have the desired value ...................................................................... Patch Set 2: (12 comments) http://gerrit.cloudera.org:8080/#/c/22407/2/fe/src/main/java/org/apache/impala/analysis/UpdateStmt.java File fe/src/main/java/org/apache/impala/analysis/UpdateStmt.java: http://gerrit.cloudera.org:8080/#/c/22407/2/fe/src/main/java/org/apache/impala/analysis/UpdateStmt.java@54 PS2, Line 54: needsWherePredicateRewrite_ > To me, this name suggests that there are cases where the WHERE predicate ne Done http://gerrit.cloudera.org:8080/#/c/22407/2/fe/src/main/java/org/apache/impala/analysis/UpdateStmt.java@54 PS2, Line 54: needsWherePredicateRewrite_ > If I'm not mistaken this flag is responsible for avoiding multiple rewrites Interesting idea, Peter. The problem is that if we always restore the wherePredicate_ to the original state (without the extra negated assignments), it will not be analyzed multiple rounds, therefore we cannot iterate on the intermediate results. And also in the first round, after adding the extra predicates, the wherePredicate_ (intentionally) becomes part of the sourceStmt_ which is also reset and reanalyzed in multiple rounds and its whereClause_ would not correspond to the restored wherePredicate_ in UpdateStmt. They should be kept in sync. I propose adding the extra filtering predicates only once, the earlier the better. I moved it to ModifyImpl just before creating the selectStmt. http://gerrit.cloudera.org:8080/#/c/22407/2/fe/src/main/java/org/apache/impala/analysis/UpdateStmt.java@141 PS2, Line 141: where > Capitalise WHERE. Done http://gerrit.cloudera.org:8080/#/c/22407/2/fe/src/main/java/org/apache/impala/analysis/UpdateStmt.java@142 PS2, Line 142: (IMPALA-12588) > nit: I think it's not necessary to include it here, the ticket ID is visibl Done http://gerrit.cloudera.org:8080/#/c/22407/2/fe/src/main/java/org/apache/impala/analysis/UpdateStmt.java@146 PS2, Line 146: 10 > This could be extracted into a constant. Done http://gerrit.cloudera.org:8080/#/c/22407/2/fe/src/main/java/org/apache/impala/analysis/UpdateStmt.java@149 PS2, Line 149: predicates > You could add more details to what the predicates are, especially as this t Done http://gerrit.cloudera.org:8080/#/c/22407/2/fe/src/main/java/org/apache/impala/analysis/UpdateStmt.java@164 PS2, Line 164: Predicate next = negateAssignment(assignments_.get(i)); > It should be checked whether the second part of the assignment is a functio Good catch. Then I think it is best to switch off this optimization in case we have UDFs in the SET list. http://gerrit.cloudera.org:8080/#/c/22407/2/testdata/workloads/functional-query/queries/QueryTest/iceberg-update-basic.test File testdata/workloads/functional-query/queries/QueryTest/iceberg-update-basic.test: http://gerrit.cloudera.org:8080/#/c/22407/2/testdata/workloads/functional-query/queries/QueryTest/iceberg-update-basic.test@256 PS2, Line 256: s > nit: capital s Done http://gerrit.cloudera.org:8080/#/c/22407/2/testdata/workloads/functional-query/queries/QueryTest/iceberg-update-basic.test@332 PS2, Line 332: ref_view > Can we try add tests with the two other kinds of views: nested query and WI Nested queries are not allowed for the assignments. WITH statements are not supported for UPDATE. Added a test case with inline view in FROM clause. http://gerrit.cloudera.org:8080/#/c/22407/2/testdata/workloads/functional-query/queries/QueryTest/kudu_update.test File testdata/workloads/functional-query/queries/QueryTest/kudu_update.test: http://gerrit.cloudera.org:8080/#/c/22407/2/testdata/workloads/functional-query/queries/QueryTest/kudu_update.test@27 PS2, Line 27: IMPALA-12588 > nit: I think we mention the ticket ID when it's a regression fix Done http://gerrit.cloudera.org:8080/#/c/22407/2/testdata/workloads/functional-query/queries/QueryTest/kudu_update.test@252 PS2, Line 252: t > nit: capital t Done http://gerrit.cloudera.org:8080/#/c/22407/2/testdata/workloads/functional-query/queries/QueryTest/kudu_update.test@253 PS2, Line 253: # predicate on non-key > The phrasing sounds a bit odd and I thought it could be removed, but it can I think we should keep this comment, because it helps enumerating all different cases for coverage. Swapped the two lines so maybe it sounds better. -- To view, visit http://gerrit.cloudera.org:8080/22407 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I926c80e8110de5a4615a3624a81a330f54317c8b Gerrit-Change-Number: 22407 Gerrit-PatchSet: 2 Gerrit-Owner: Noemi Pap-Takacs <[email protected]> Gerrit-Reviewer: Daniel Becker <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]> Gerrit-Reviewer: Peter Rozsa <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Thu, 20 Feb 2025 17:05:43 +0000 Gerrit-HasComments: Yes
