Surya Hebbar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23889 )

Change subject: IMPALA-12027: Support additional details for DataSink nodes in 
ExecSummary
......................................................................


Patch Set 4:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/23889/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/23889/3//COMMIT_MSG@9
PS3, Line 9: we show table nam
> nit: 'we show'
Done


http://gerrit.cloudera.org:8080/#/c/23889/3//COMMIT_MSG@10
PS3, Line 10: s of `Pla
> nit: 'some'
Done


http://gerrit.cloudera.org:8080/#/c/23889/3//COMMIT_MSG@13
PS3, Line 13: This field is not available for any type of `DataSink` nodes.
            :
            : With this change, we support displaying table names and other 
details
            : for table sink nodes and other such `DataSink` nodes
> nit: Please avoid using 'currently' when you refer to the old behavior. You
Done


http://gerrit.cloudera.org:8080/#/c/23889/3/fe/src/main/java/org/apache/impala/planner/MultiDataSink.java
File fe/src/main/java/org/apache/impala/planner/MultiDataSink.java:

http://gerrit.cloudera.org:8080/#/c/23889/3/fe/src/main/java/org/apache/impala/planner/MultiDataSink.java@111
PS3, Line 111: minString = table_sinks_set.iterat
> Currently this returns the table name twice, concatenated: the 'table_name'
Done


http://gerrit.cloudera.org:8080/#/c/23889/3/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/23889/3/tests/query_test/test_observability.py@150
PS3, Line 150: # INSERT query.
             :     query = "create table {0}.tmp as select count(*) from 
functional.alltypes".format(
             :         unique_database)
             :     result = self.client.execute(query, fetch_exec_summary=True)
             :     # Sanity-check the HDFS writer sink.
             :     assert result.exec_summary[0]['operator'] == 'F01:HDFS 
WRITER'
             :     assert result.exec_summary[0]['max_time'] >= 0
             :     assert result.exec_summary[0]['num_rows'] == -1
             :     assert result.exec_summary[0]['est_num_rows'] == -1
             :     assert result.exec_summary[0]['peak_mem'] >= 0
             :     assert result.exec_summary[0]['est_peak_mem'] >= 0
> Maybe you could check for the table sink label detail here instead of addin
Table details are currently not being verified here even for a DQL here.

Augmenting the table verification here would require running DDL and other DQL 
here, which would change the entire definition of the current function.


http://gerrit.cloudera.org:8080/#/c/23889/3/tests/query_test/test_observability.py@268
PS3, Line 268: "select count(*) from functional.alltypes"
> Or alternatively, you could modify and extend this test case to check for t
Yes, I had initially thought of doing exactly that.

But verifying the write table requires running a DDL query involving create 
table or create table as.

This would require the method to run DDL and 3 other DQL queries.

Seperating the DDL requiring test to another function seems safe as we are 
creating and dropping tables in case of error or failure.

So, for these reasons it might be better to create a seperate method here.



--
To view, visit http://gerrit.cloudera.org:8080/23889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2652dd896f72c5c6bbe7e76facdede2a237808d5
Gerrit-Change-Number: 23889
Gerrit-PatchSet: 4
Gerrit-Owner: Surya Hebbar <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Surya Hebbar <[email protected]>
Gerrit-Comment-Date: Tue, 03 Mar 2026 12:59:17 +0000
Gerrit-HasComments: Yes

Reply via email to