Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/23789 )
Change subject: IMPALA-14535: Improve wait for HMS events sync with hierarchical event processing ...................................................................... Patch Set 2: (13 comments) http://gerrit.cloudera.org:8080/#/c/23789/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/23789/2//COMMIT_MSG@11 PS2, Line 11: . nit: add space http://gerrit.cloudera.org:8080/#/c/23789/2/fe/src/main/java/org/apache/impala/catalog/events/DbEventExecutor.java File fe/src/main/java/org/apache/impala/catalog/events/DbEventExecutor.java: http://gerrit.cloudera.org:8080/#/c/23789/2/fe/src/main/java/org/apache/impala/catalog/events/DbEventExecutor.java@476 PS2, Line 476: prior to the given event id nit: I think we also ensure the given event id is processed here. http://gerrit.cloudera.org:8080/#/c/23789/2/fe/src/main/java/org/apache/impala/catalog/events/DbEventExecutor.java@483 PS2, Line 483: if (isTerminating()) { nit: Please add a comment for when this will happen. If an idle DbProcessor is being removed, will isTerminating() be true? Or will this only happen when we stop the EventProcessor? http://gerrit.cloudera.org:8080/#/c/23789/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java: http://gerrit.cloudera.org:8080/#/c/23789/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@1912 PS2, Line 1912: res.setStatus_code(TErrorCode.GENERAL); : res.addToError_msgs("Failed to expand views: " + e.getMessage()); : return res; This will fail the query. Shouldn't we add a warning and use latestEventId like L1923-1926? http://gerrit.cloudera.org:8080/#/c/23789/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@1969 PS2, Line 1969: nit: we tend to not align method parameters in Java. Just add 4 more whitespaces then the previous line is enough. void syncMetadataWithHierarchicalEventProcessing(TWaitForHmsEventRequest req, TStatus res, long latestEventId) { http://gerrit.cloudera.org:8080/#/c/23789/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@1976 PS2, Line 1976: getLastSyncedEventId() nit: It'd be less misleading to use a new method name like maxDispatchedEventId(). http://gerrit.cloudera.org:8080/#/c/23789/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@1981 PS2, Line 1981: // Get all db names nit: I think this is just a subset of all dbs since idle DbProcessors will be removed so not returned here. IIUC, please comment that we just make sure all pending events that could change the db list are processed. So we don't need to get all the db names from the catalog. Coordinator just wants the changes on db list to make its cached db list up-to-date. http://gerrit.cloudera.org:8080/#/c/23789/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@2006 PS2, Line 2006: // Remove the synced dbs : dbNames.removeIf(dbName -> : eventExecutorService_.isProcessed(dbName, latestEventId)); : // Remove the synced tables : Iterator<Map.Entry<String, List<String>>> it = db2Tables.entrySet().iterator(); : while (it.hasNext()) { : Map.Entry<String, List<String>> entry = it.next(); : entry.getValue().removeIf(tableName -> : eventExecutorService_.isProcessed(entry.getKey(), tableName, latestEventId)); : if (entry.getValue().isEmpty()) { : it.remove(); : } : } nit: extract these into a method and invoke it once before the loop to avoid always sleep for 100ms. http://gerrit.cloudera.org:8080/#/c/23789/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@2050 PS2, Line 2050: throws MetastoreNotificationFetchException { nit: Please add a Precondition check that isHierarchicalEventProcessingEnabled() is false since this method can only be used when hierarchical event processing is disabled. http://gerrit.cloudera.org:8080/#/c/23789/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@2086 PS2, Line 2086: getDbAndTableNames nit: might be more clear to mention we are getting names from the request, not from the catalog. E.g. "getRequestedDbTableNames" http://gerrit.cloudera.org:8080/#/c/23789/2/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@2087 PS2, Line 2087: nit: shorten the indentation http://gerrit.cloudera.org:8080/#/c/23789/2/fe/src/test/java/org/apache/impala/catalog/events/EventExecutorServiceTest.java File fe/src/test/java/org/apache/impala/catalog/events/EventExecutorServiceTest.java: http://gerrit.cloudera.org:8080/#/c/23789/2/fe/src/test/java/org/apache/impala/catalog/events/EventExecutorServiceTest.java@202 PS2, Line 202: = Should we use "|=" here? http://gerrit.cloudera.org:8080/#/c/23789/2/tests/custom_cluster/test_events_custom_configs.py File tests/custom_cluster/test_events_custom_configs.py: http://gerrit.cloudera.org:8080/#/c/23789/2/tests/custom_cluster/test_events_custom_configs.py@2005 PS2, Line 2005: assert ". Last synced event id: " in client_log Let's also verify the table names when IMPALA_TEST_CLUSTER_PROPERTIES.is_hierarchical_event_processing_enabled() is True. -- To view, visit http://gerrit.cloudera.org:8080/23789 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I55cea4cb8e04860202e56e1b1bf2596613b4946c Gerrit-Change-Number: 23789 Gerrit-PatchSet: 2 Gerrit-Owner: Anonymous Coward <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Sai Hemanth Gantasala <[email protected]> Gerrit-Comment-Date: Thu, 01 Jan 2026 02:57:44 +0000 Gerrit-HasComments: Yes
