Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20986 )

Change subject: IMPALA-12782: Show info of the event processing in /events webUI
......................................................................


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/20986/2/common/thrift/JniCatalog.thrift
File common/thrift/JniCatalog.thrift:

http://gerrit.cloudera.org:8080/#/c/20986/2/common/thrift/JniCatalog.thrift@1066
PS2, Line 1066: }
> is_current_event_batched doesn't seem to be used ?
Nice catch! I planned to show the MetastoreEvent at the beginning but decided 
to leave it as a follow-up task. So currently only the original HMS event is 
shown. We can delete this field.


http://gerrit.cloudera.org:8080/#/c/20986/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/20986/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@1130
PS2, Line 1130:       progressInfo.max_event_time_s = lastEvent.getEventTime();
> currentEventBatch_, currentFilteredEvents_, currentBatchStartTimeMs_, curre
Yeah, I should have mentioned this somewhere. There are costs to provide 
consistency in metrics. E.g. if we use a mutex (synchronized) to protect these 
fields, the web requests could block the event processor. We already have 
inconsistent metrics, e.g. lastSyncedEventId_ and lastSyncedEventTimeSecs_ 
could be inconsistent since the updates on them are not atomic.

So currently we are best effort to provide consistency on metrics but cannot 
garuantee it. E.g. we garuantee the consistency between num_hms_events, 
min_event_*, max_event*  by using the snapshot of currentEventBatch_, But they 
could be inconsistent with other metrics like num_synced_events.

I added some codes to reset these fields to reduce the possibility of seeing 
inconsistent metrics. Another way to fix this is wrapping these metrics into a 
snapshot class and create new instance whenever any metric changes. I think it 
doesn't worth adding this complexity since the web page is mainly used by 
admins when event processing is slow. In such situation, the metrics are also 
updated slowly.


http://gerrit.cloudera.org:8080/#/c/20986/2/www/events.tmpl
File www/events.tmpl:

http://gerrit.cloudera.org:8080/#/c/20986/2/www/events.tmpl@53
PS2, Line 53: Started processing the current batch at 
{{progress-info.start_time}} ({{progress-info.elapsed_time}} ago).</br>
> Nit: does 'Started processing' sound good?
Done


http://gerrit.cloudera.org:8080/#/c/20986/2/www/events.tmpl@54
PS2, Line 54: Started processing the current event at 
{{progress-info.start_time_of_event}} 
({{progress-info.elapsed_time_current_event}} ago).
> Nit: Should we also show the current event ID here?
It's shown in the table below. Maybe it's ok to skip it here.


http://gerrit.cloudera.org:8080/#/c/20986/2/www/events.tmpl@59
PS2, Line 59: {{/progress-info.current_event_batch_size}}
> nit: Can we use the same style in both headers ? Event ID or EventId, Event
Done



--
To view, visit http://gerrit.cloudera.org:8080/20986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2e7d4952c7fd04ae89b6751204499bf9dd99f57c
Gerrit-Change-Number: 20986
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Sai Hemanth Gantasala <[email protected]>
Gerrit-Comment-Date: Tue, 20 Feb 2024 12:14:54 +0000
Gerrit-HasComments: Yes

Reply via email to