Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21142 )
Change subject: IMPALA-12737: List columns in profile and query history ...................................................................... Patch Set 27: (13 comments) I have not thoroughly review this, but I want to give my initial review. Please also address clang-tidy failure in https://jenkins.impala.io/job/clang-tidy-ub2004/4252/console http://gerrit.cloudera.org:8080/#/c/21142/27//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21142/27//COMMIT_MSG@9 PS27, Line 9: Adds "Select Columns", "Where Columns", "Join Columns", "Aggregate : Columns", and "OrderBy Columns" I think this patch has grown beyond just adding new columns in WM table. Please consider containing the refactoring changes (WM versioning, initialization, result fetching, etc) into its own patch. http://gerrit.cloudera.org:8080/#/c/21142/27//COMMIT_MSG@19 PS27, Line 19: The "aggregate_columns" field contains a de-duped list of all columns : in both the order by and having clauses. Please mention Tests added/done for this patch. http://gerrit.cloudera.org:8080/#/c/21142/27/be/src/kudu/util/version_util.h File be/src/kudu/util/version_util.h: http://gerrit.cloudera.org:8080/#/c/21142/27/be/src/kudu/util/version_util.h@42 PS27, Line 42: // Compare two Version objects. Versions that contain an extra component sort before : // (less than) versions that do not contain an extra component. No special handling is : // done of the extra component. Thus '-SNAPSHOT' sorts as greater than '-RELEASE'. : // Example sort order (from least to greatest): : // 1.0.0-RELEASE, 1.0.0-SNAPSHOT, 1.0.0, 1.0.1-SNAPSHOT. : bool operator<(const Version& other) const; : bool operator<=(const Version& other) const; : bool operator>(const Version& other) const; In general, I'm worried about making changes under be/src/kudu/* since we usually copy and replace codes under this dir straight from Kudu repository (see IMPALA-9335 and IMPALA-10931). We risk losing this changes if we do another Kudu source code rebase in the future. Perhaps we can contribute this code to Upstream Kudu first? http://gerrit.cloudera.org:8080/#/c/21142/27/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/21142/27/be/src/service/impala-server.cc@760 PS27, Line 760: MonotonicNanos Use MonotonicMillis() instead? http://gerrit.cloudera.org:8080/#/c/21142/27/be/src/service/impala-server.cc@3235 PS27, Line 3235: VLOG(2) << "Internal server ready"; Seems redundant with LOG(INFO) below. http://gerrit.cloudera.org:8080/#/c/21142/27/be/src/service/internal-server.h File be/src/service/internal-server.h: http://gerrit.cloudera.org:8080/#/c/21142/27/be/src/service/internal-server.h@149 PS27, Line 149: Intended for use as a convenience method when query results are small Will be great if we can LOG(WARNING) if the query results turns out not small. http://gerrit.cloudera.org:8080/#/c/21142/27/be/src/service/internal-server.cc File be/src/service/internal-server.cc: http://gerrit.cloudera.org:8080/#/c/21142/27/be/src/service/internal-server.cc@305 PS27, Line 305: i=0; i<results_metadata->columns.size() nit: add space between '=' and '<' operator, http://gerrit.cloudera.org:8080/#/c/21142/26/common/thrift/Frontend.thrift File common/thrift/Frontend.thrift: http://gerrit.cloudera.org:8080/#/c/21142/26/common/thrift/Frontend.thrift@705 PS26, Line 705: // Columns referenced in a select list. : 21: optional list<string> select_columns : : // Columns referenced in a where clause. : 22: optional list<string> where_columns : : // Columns referenced in a join clause. : 23: optional list<string> join_columns : : // Columns referenced in an aggregation. : 24: optional list<string> aggregate_columns : : // Columns referenced in an order by clause. : 25: optional list<string> orderby_columns > Perhaps we don't want that in the profile. I think in the query log table t Ack http://gerrit.cloudera.org:8080/#/c/21142/26/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/21142/26/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@4488 PS26, Line 4488: mplified ? t > Mostly avoiding allocating another Set. But I do agree it's an unusual patt Ack http://gerrit.cloudera.org:8080/#/c/21142/26/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java: http://gerrit.cloudera.org:8080/#/c/21142/26/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@383 PS26, Line 383: selectList_.getItems().stream().filter(elem -> !elem.isStar()) : .forEach(item -> item.getExpr().collect(SlotRef.class, slotRefs)); : analyzer_.addSelectColumns(Stream.concat( > Done Done http://gerrit.cloudera.org:8080/#/c/21142/26/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@387 PS26, Line 387: > Done Done http://gerrit.cloudera.org:8080/#/c/21142/27/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java: http://gerrit.cloudera.org:8080/#/c/21142/27/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@380 PS27, Line 380: // Register all columns referenced in this statement, starting with select clause. : // Joins will be added during planning. Can you add comment explaining about how column alias is treated? Is the alias will be registered, or the real column name? An example of this is SELECT over UNION of different column names. http://gerrit.cloudera.org:8080/#/c/21142/26/tests/util/workload_management.py File tests/util/workload_management.py: http://gerrit.cloudera.org:8080/#/c/21142/26/tests/util/workload_management.py@30 PS26, Line 30: : def round_to_3(val): : # The differences betwe > Done Done -- To view, visit http://gerrit.cloudera.org:8080/21142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I78f3670b067c0c192ee8a212fba95466fbcb51d7 Gerrit-Change-Number: 21142 Gerrit-PatchSet: 27 Gerrit-Owner: Michael Smith <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Jason Fehr <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Comment-Date: Tue, 30 Jul 2024 04:11:36 +0000 Gerrit-HasComments: Yes
