Hello Riza Suminto, Kurt Deschler, Jason Fehr, k.venureddy2...@gmail.com, Sai 
Hemanth Gantasala, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/22571

to look at the new patch set (#7).

Change subject: IMPALA-13684: Improve waitForHmsEvent() to only wait for 
related events
......................................................................

IMPALA-13684: Improve waitForHmsEvent() to only wait for related events

waitForHmsEvent is a catalogd RPC that coordinators use to send the
requested db/table names to catalogd and wait until it's safe (i.e. no
stale metadata) to start analyzing the statement. The wait time is
configured by the query option sync_hms_events_wait_time_s. Currently,
when this option is enabled, catalogd waits until it syncs to the latest
HMS event regardless of what the query is.

This patch reduces waiting by checking only related events and waiting
until the last related event has been processed. In the ideal case, if
there are no pending events that are related, the query doesn't need to
wait.
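
As a rough illustration of the idea above (not the patch's actual code),
the wait target can be thought of as the id of the last pending event
that passes a relatedness check. PendingEvent, its fields, and
last_related_event_id are hypothetical names introduced only for this
sketch.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class PendingEvent:
    # Hypothetical stand-in for an HMS notification event that catalogd
    # has not processed yet.
    event_id: int
    event_type: str
    db_name: str
    table_name: Optional[str] = None


def last_related_event_id(
    pending: Iterable[PendingEvent],
    is_related: Callable[[PendingEvent], bool],
) -> Optional[int]:
    """Return the id of the last related pending event.

    None means no pending event is related, so the query does not need
    to wait at all.
    """
    related_ids = [e.event_id for e in pending if is_related(e)]
    return max(related_ids) if related_ids else None
```

With this shape, a query touching only db1 would wait for the last db1
event and ignore later events on unrelated dbs.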

Related pending events are determined as follows:
 - For queries that need the db list, i.e. SHOW DATABASES, check pending
   CREATE/ALTER/DROP_DATABASE events on all dbs. ALTER_DATABASE events
   are checked in case the ownership changes and impacts visibility.
 - For db statements like SHOW FUNCTIONS, CREATE/ALTER/DROP DATABASE,
   check pending CREATE/ALTER/DROP events on that db.
   - For db statements that require the table list, i.e. SHOW TABLES,
     also check CREATE_TABLE, DROP_TABLE events under that db.
 - For table statements,
   - check all database events on related db names.
   - If there are loaded transactional tables, check all the pending
     COMMIT_TXN, ABORT_TXN events. Note that these events might modify
     multiple transactional tables and we don't know their table names
     until they are processed. To be safe, wait for all transactional
     events.
   - For all the other table names,
     - If they are all missing/unloaded in the catalog, check all the
       pending CREATE_TABLE, DROP_TABLE events on them for their
       existence.
     - Otherwise (some of them are loaded), check all the table events
       on them. Note that we can fetch events on multiple tables under
       the same db in a single fetch.
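
The table-statement rules above can be sketched as a small planning
function. This is a simplified model, not the catalogd implementation:
TableState, plan_table_statement_checks, and the "ALL" placeholder (which
stands for every table-level event type) are assumptions made for
illustration. It also assumes that "the other table names" means the
names not already covered by the transactional rule.

```python
from dataclasses import dataclass
from typing import Dict

# Event type names taken from the rules listed above.
TABLE_EXISTENCE_EVENTS = ("CREATE_TABLE", "DROP_TABLE")


@dataclass(frozen=True)
class TableState:
    # Hypothetical view of what the catalog knows about a table name.
    loaded: bool
    transactional: bool = False


def plan_table_statement_checks(tables: Dict[str, TableState]) -> Dict[str, object]:
    """Decide which pending events a table statement has to check.

    `tables` maps fully qualified "db.tbl" names to their catalog state.
    """
    # Loaded transactional tables are covered by waiting for all
    # COMMIT_TXN/ABORT_TXN events (assumption: they are then excluded
    # from the per-table checks below).
    txn = {n for n, s in tables.items() if s.loaded and s.transactional}
    others = sorted(n for n in tables if n not in txn)

    if not others:
        table_event_types = None
    elif all(not tables[n].loaded for n in others):
        # Only existence matters for names the catalog has not loaded.
        table_event_types = list(TABLE_EXISTENCE_EVENTS)
    else:
        # Placeholder meaning "all table events on these names".
        table_event_types = "ALL"

    return {
        "db_events_on": sorted({n.split(".", 1)[0] for n in tables}),
        "wait_all_txn_events": bool(txn),
        "tables": others,
        "table_event_types": table_event_types,
    }
```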

TODO: Views should be expanded and underlying views/tables should be
checked recursively.

This patch leverages the HMS API to fetch events of several tables under
the same db in batch. MetastoreEventsProcessor.MetaDataFilter is
improved for this.
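
To make the batching concrete, here is a minimal sketch of grouping the
requested table names by db so that each db needs only one fetch. The
helper name group_tables_by_db is hypothetical, and the sketch does not
model the HMS API or MetaDataFilter themselves.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_tables_by_db(qualified_names: Iterable[str]) -> Dict[str, List[str]]:
    """Group "db.tbl" names by db, dropping duplicates.

    Each entry in the result corresponds to one batched fetch of events
    for several tables under the same db.
    """
    grouped = defaultdict(set)
    for name in qualified_names:
        db, tbl = name.split(".", 1)
        grouped[db].add(tbl)
    return {db: sorted(tbls) for db, tbls in sorted(grouped.items())}
```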

This patch also improves how the coordinator collects table names for
single-table statements. E.g. "DROP TABLE mydb.foo" previously had two
candidate table names - "mydb.foo" and "default.mydb" (assuming the
session db is "default"). Now it just produces "mydb.foo" since that is
the only valid interpretation in this case. This reduces the HMS RPCs
that catalogd issues to fetch notification events.
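
The "DROP TABLE mydb.foo" example can be modelled as follows. This is an
illustrative sketch limited to one- and two-part names; the function
candidate_table_names and its parameters are hypothetical and do not
mirror the Frontend code.

```python
from typing import List


def candidate_table_names(path: str, session_db: str,
                          single_table_stmt: bool) -> List[str]:
    """Return the table names a coordinator would send for `path`."""
    parts = path.split(".")
    if len(parts) == 1:
        # Unqualified name: resolve against the session db.
        return [f"{session_db}.{parts[0]}"]
    if len(parts) != 2:
        raise ValueError("sketch only handles one- or two-part names")
    qualified = f"{parts[0]}.{parts[1]}"
    if single_table_stmt:
        # The statement can only name a table, so "db.tbl" is the only
        # valid reading.
        return [qualified]
    # Otherwise the first part could also be a table in the session db.
    return [qualified, f"{session_db}.{parts[0]}"]
```

For example, candidate_table_names("mydb.foo", "default", True) yields
only "mydb.foo", while passing False keeps "default.mydb" as a second
candidate.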

Tests:
 - Added FE tests for collecting db/table names on the coordinator side.
 - Added an authorization test that alters the ownership in Hive and
   runs queries in Impala.
 - TODO: Add tests for multiple tables in a single query, including CTAS.
 - Ran CORE tests.

Change-Id: Ic033b7e197cd19505653c3ff80c4857cc474bcfc
---
M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/CommentOnDbStmt.java
M fe/src/main/java/org/apache/impala/analysis/CommentOnTableOrViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/ConvertTableToIcebergStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/DescribeHistoryStmt.java
M fe/src/main/java/org/apache/impala/analysis/DropStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/LoadDataStmt.java
M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
M fe/src/main/java/org/apache/impala/analysis/ShowCreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java
M fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java
A fe/src/main/java/org/apache/impala/analysis/SingleTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/service/FrontendTest.java
M fe/src/test/java/org/apache/impala/service/JdbcTestBase.java
M tests/authorization/test_ranger.py
M tests/common/impala_test_suite.py
M tests/metadata/test_event_processing.py
29 files changed, 649 insertions(+), 51 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/22571/7
--
To view, visit http://gerrit.cloudera.org:8080/22571
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic033b7e197cd19505653c3ff80c4857cc474bcfc
Gerrit-Change-Number: 22571
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Anonymous Coward <k.venureddy2...@gmail.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Jason Fehr <jf...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Sai Hemanth Gantasala <saihema...@cloudera.com>
