Hello Riza Suminto, k.venureddy2...@gmail.com, Sai Hemanth Gantasala, Csaba Ringhofer, Impala Public Jenkins,
I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/22571

to look at the new patch set (#3).

Change subject: IMPALA-13684: Improve waitForHmsEvent() to only wait for related events
......................................................................

IMPALA-13684: Improve waitForHmsEvent() to only wait for related events

waitForHmsEvent is a catalogd RPC that coordinators use to send the
requested db/table names to catalogd and wait until it's safe (i.e. no
stale metadata) to start analyzing the statement. Currently, catalogd
waits until it syncs to the latest HMS event regardless of what the
query is.

This patch improves this by only checking related events and waiting
until the last of them has been processed. In the ideal case, if there
are no related pending events, the query doesn't need to wait.

We check the related pending events in the following way:
 - For queries that need the db list, i.e. SHOW DATABASES, check pending
   CREATE_DATABASE, DROP_DATABASE events on all dbs.
 - For db statements like SHOW FUNCTIONS, CREATE/ALTER/DROP DATABASE,
   check pending CREATE/ALTER/DROP events on that db. Note that ALTER
   events are checked in case the ownership changes. For db names that
   are used in the query and are missing in the catalog, also check
   their events.
 - For db statements that require the table list, i.e. SHOW TABLES,
   also check CREATE_TABLE, DROP_TABLE events under that db.
 - For loaded transactional tables, check all the pending COMMIT_TXN,
   ABORT_TXN events. Note that these events might modify multiple
   transactional tables and we don't know their table names until they
   are processed. To be safe, wait for all transactional events.
 - For all the other table names,
   - if they are all missing/unloaded in the catalog, check all the
     pending CREATE_TABLE, DROP_TABLE events on them to determine
     whether they exist.
   - Otherwise (some of them are loaded), check all the table events on
     them.
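
The rules listed above can be sketched as a small, standalone Java example. This is an illustration only and not the code in this patch: the class RelatedEventSketch, the TableState enum, the relatedEventTypes() method and the ALL_TABLE_EVENTS marker are hypothetical names, and the rule for db names missing in the catalog is left out for brevity. It also assumes that loaded transactional tables only add the transactional events, which is one reading of the list above.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of the selection rules in the commit message.
// None of these names come from the Impala code base.
final class RelatedEventSketch {
  // Catalog state of a table name referenced by the query (assumed model).
  enum TableState { MISSING_OR_UNLOADED, LOADED, LOADED_TRANSACTIONAL }

  // Returns the pending HMS event types that a request would have to check.
  // "ALL_TABLE_EVENTS" is a made-up marker meaning "every event on the table".
  static Set<String> relatedEventTypes(boolean needsDbList, boolean isDbStatement,
      boolean needsTableList, List<TableState> tables) {
    Set<String> types = new TreeSet<>();
    if (needsDbList) {
      // e.g. SHOW DATABASES: only db creation/removal changes the db list.
      types.add("CREATE_DATABASE");
      types.add("DROP_DATABASE");
    }
    if (isDbStatement) {
      // ALTER_DATABASE is included in case the ownership changes.
      types.add("CREATE_DATABASE");
      types.add("ALTER_DATABASE");
      types.add("DROP_DATABASE");
    }
    if (needsTableList) {
      // e.g. SHOW TABLES: only table creation/removal changes the table list.
      types.add("CREATE_TABLE");
      types.add("DROP_TABLE");
    }
    boolean anyOther = false;
    boolean anyOtherLoaded = false;
    for (TableState state : tables) {
      if (state == TableState.LOADED_TRANSACTIONAL) {
        // Affected table names are unknown until the event is processed,
        // so all transactional events are waited for.
        types.add("COMMIT_TXN");
        types.add("ABORT_TXN");
      } else {
        anyOther = true;
        anyOtherLoaded |= (state == TableState.LOADED);
      }
    }
    if (anyOther) {
      if (anyOtherLoaded) {
        types.add("ALL_TABLE_EVENTS");
      } else {
        // All remaining tables are missing/unloaded: only existence matters.
        types.add("CREATE_TABLE");
        types.add("DROP_TABLE");
      }
    }
    return types;
  }

  public static void main(String[] args) {
    // A query touching one loaded and one unloaded non-transactional table.
    System.out.println(relatedEventTypes(false, false, false,
        List.of(TableState.LOADED, TableState.MISSING_OR_UNLOADED)));
  }
}
```

An empty result corresponds to the ideal case described above: nothing related is pending, so the query does not need to wait.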

Note that we can fetch events on multiple tables under the same db in a
single fetch. This patch leverages the HMS API to fetch events of
several tables under the same db in batch.
MetastoreEventsProcessor.MetaDataFilter is improved for this.

Tests:
 - Ran CORE tests

Change-Id: Ic033b7e197cd19505653c3ff80c4857cc474bcfc
---
M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
6 files changed, 248 insertions(+), 13 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/22571/3
--
To view, visit http://gerrit.cloudera.org:8080/22571
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic033b7e197cd19505653c3ff80c4857cc474bcfc
Gerrit-Change-Number: 22571
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Anonymous Coward <k.venureddy2...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Sai Hemanth Gantasala <saihema...@cloudera.com>