Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23811 )

Change subject: IMPALA-14641: Add detailed operation page for Catalogd 
operations
......................................................................


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/23811/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/23811/2//COMMIT_MSG@12
PS2, Line 12: - Formatted Thrift request sent to Catalogd
I am a bit concerned about storing this, especially for already finished 
requests. While most requests are small, some can be large due to including a 
list of partitions or files, for example TUpdateCatalog. Keeping it for some 
time can increase memory usage.

One solution would be to have some limit, e.g. 1KB. With default 
catalog_operation_log_size=100 this would keep extra mem usage under 100KB 
which seems ignorable. Showing large json messages with lot of partitions/files 
also doesn't seem very useful. Large requests could be dropped only when 
archiving the record so that we have full diagnostics for long stuck DDL 
operations.

If there is no consensus about storing the request, it would make sense to 
split this commit in two and merge the timeline first.


http://gerrit.cloudera.org:8080/#/c/23811/2/fe/src/main/java/org/apache/impala/catalog/monitor/CatalogOperationTracker.java
File 
fe/src/main/java/org/apache/impala/catalog/monitor/CatalogOperationTracker.java:

http://gerrit.cloudera.org:8080/#/c/23811/2/fe/src/main/java/org/apache/impala/catalog/monitor/CatalogOperationTracker.java@188
PS2, Line 188:         record.setThrift_request(serializeThriftToJson(request));
It seems wasteful to me to do this when the record is added, as probably the 
json format will be never requested. It would be possible to save reference to 
the request and serialize to json only when the details are requested.

Also see my comment about the request in the commit message.


http://gerrit.cloudera.org:8080/#/c/23811/2/fe/src/main/java/org/apache/impala/catalog/monitor/CatalogOperationTracker.java@189
PS2, Line 189:         record.setRequest_size_bytes(getThriftSize(request));
The request was originally passed as a byte array, its could be saved there: 
https://github.com/apache/impala/blob/3a5a6f612a332fc509cfdc73c4566356a00ac730/fe/src/main/java/org/apache/impala/service/JniCatalog.java#L312


http://gerrit.cloudera.org:8080/#/c/23811/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/23811/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@670
PS2, Line 670:       TEventSequence timelineThrift = catalogTimeline.toThrift();
Adding the timeline only at the end limits it usefulness - I didn't think it 
through too much, but it would be nicer to see the timeline of the events 
happened so far for an active event. This is also how query profiles work, you 
can see the timeline of a running query.


http://gerrit.cloudera.org:8080/#/c/23811/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@682
PS2, Line 682: response
This should be generally null when an exception was thrown.


http://gerrit.cloudera.org:8080/#/c/23811/2/tests/webserver/test_web_pages.py
File tests/webserver/test_web_pages.py:

http://gerrit.cloudera.org:8080/#/c/23811/2/tests/webserver/test_web_pages.py@1088
PS2, Line 1088: test_catalog_operation_detail_endpoint
It would be also useful to test long running DDL that is still running and is 
not in the archive. See test_query_cancel_load_tables for an example where a 
delay is injected with a debug action to achieve this.



--
To view, visit http://gerrit.cloudera.org:8080/23811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib80ce3b5873672065b04b00a21d3419a1db0969c
Gerrit-Change-Number: 23811
Gerrit-PatchSet: 2
Gerrit-Owner: Arnab Karmakar <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Surya Hebbar <[email protected]>
Gerrit-Comment-Date: Tue, 30 Dec 2025 11:06:52 +0000
Gerrit-HasComments: Yes

Reply via email to