Hello Sai Hemanth Gantasala, Michael Smith, Csaba Ringhofer, Impala Public
Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/20131
to look at the new patch set (#14).
Change subject: IMPALA-12152: Add query option to wait for events sync up
......................................................................
IMPALA-12152: Add query option to wait for events sync up
The event-processor is designed to remove the need for manual RT/IM
(RefreshTable / InvalidateMetadata) commands to sync up with external
HMS modifications. However, event processing can be delayed, so queries
might still see stale metadata if the event-processor is lagging behind.
This patch adds a mechanism to let query planning wait until the metadata
is synced up. Specifically, the coordinator will not start planning until
catalogd's last synced event id reaches the event id that was latest when
the query was submitted. A new catalogd RPC, WaitForHmsEvent, is added
for this. Catalogd records the latest event id and returns the catalog
version once it catches up with that event id. The coordinator can start
planning once its local catalog version reaches the returned catalog
version.
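
The two-step wait described above can be sketched roughly as follows.
This is a toy model, not the code in this patch: FakeCatalogd,
wait_for_hms_event, can_start_planning and demo are hypothetical names,
and the real RPC runs between separate processes rather than threads.

```python
import threading


class FakeCatalogd:
    """Toy stand-in for catalogd. Every name here is hypothetical."""

    def __init__(self):
        self._cond = threading.Condition()
        self.latest_hms_event_id = 0   # newest event id known to HMS
        self.last_synced_event_id = 0  # newest event id already applied
        self.catalog_version = 0

    def process_event(self, event_id):
        # The event-processor applies one event and bumps the catalog version.
        with self._cond:
            self.last_synced_event_id = event_id
            self.catalog_version += 1
            self._cond.notify_all()

    def wait_for_hms_event(self, timeout_s):
        # Record the latest event id when the request arrives, block until the
        # event-processor catches up, then return the catalog version.
        with self._cond:
            target = self.latest_hms_event_id
            caught_up = self._cond.wait_for(
                lambda: self.last_synced_event_id >= target, timeout=timeout_s)
            if not caught_up:
                raise TimeoutError("event id %d not synced in time" % target)
            return self.catalog_version


def can_start_planning(local_catalog_version, required_catalog_version):
    # Coordinator side: plan only once the local catalog has caught up.
    return local_catalog_version >= required_catalog_version


def demo():
    catalogd = FakeCatalogd()
    catalogd.latest_hms_event_id = 3
    worker = threading.Thread(
        target=lambda: [catalogd.process_event(i) for i in (1, 2, 3)])
    worker.start()
    required = catalogd.wait_for_hms_event(timeout_s=5)
    worker.join()
    return required
```

In this model the value returned by wait_for_hms_event plays the role of
the catalog version that the coordinator's local catalog must reach
before planning starts.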
Note that the current implementation waits for the event id that is
latest when the WaitForHmsEvent RPC is received by catalogd. This can be
improved once HIVE-27499 is resolved, so that we can efficiently detect
whether given tables have unprocessed events and wait only for the
*largest* id among them. Tables that have no unprocessed events don't
need to block query planning.
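
As an illustration of that possible improvement only, the target event
id could be chosen along these lines. The function and its inputs are
hypothetical; nothing here reflects the API that HIVE-27499 may add.

```python
def event_id_to_wait_for(query_tables, pending_event_ids_by_table):
    """Return the largest pending event id among the query's tables.

    Returns None when none of the tables has unprocessed events, in which
    case planning would not need to block at all.
    """
    largest_per_table = [
        max(event_ids)
        for table, event_ids in pending_event_ids_by_table.items()
        if table in query_tables and event_ids
    ]
    return max(largest_per_table) if largest_per_table else None
```

A query on db.t1 would then ignore pending events that only touch
db.t2, instead of waiting for the globally latest event id.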
A new query option, sync_hms_events_wait_time_s, is added to configure
the timeout for waiting. It's 0 by default, which disables the waiting
mechanism. Users can turn it on for sensitive queries that depend on
external modifications.
Another new query option, sync_hms_events_strict_mode, is added to
control the behavior on errors, e.g. a timeout or the event-processor
being in an error state. It defaults to false (non-strict mode). In
strict mode, the coordinator will fail the query if it fails to wait for
HMS events to be synced in catalogd. In non-strict mode, the coordinator
will start planning with a warning message in the profile (and in client
outputs if the client consumes the get_log results, e.g. in
impala-shell).
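
How the two options interact can be sketched as below. This is a
simplified model under stated assumptions, not the coordinator code:
QueryFailed, wait_then_plan and wait_fn are hypothetical names, and
wait_fn stands in for whatever performs the actual wait.

```python
class QueryFailed(Exception):
    """Hypothetical error type standing in for a failed query."""


def wait_then_plan(wait_fn, wait_time_s, strict_mode, warnings):
    """Return True when planning may start; raise QueryFailed in strict mode.

    wait_fn(timeout_s) is a hypothetical callable that raises on a timeout or
    when the event-processor is in an error state.
    """
    if wait_time_s <= 0:
        return True  # 0 is the default and disables the wait
    try:
        wait_fn(wait_time_s)
    except Exception as err:
        if strict_mode:
            raise QueryFailed("failed to sync HMS events: %s" % err) from err
        # Non-strict mode: keep going, but surface a warning.
        warnings.append("continuing with possibly stale metadata: %s" % err)
    return True
```

With wait_time_s left at 0 the wait function is never called, which
matches the option being off by default.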
Timeline items are added to the query profile for this wait, e.g.
A successful wait:
- Query submitted: 29.791us (29.791us)
- WaitForHmsEvent finished: 1s002ms (1s002ms)
- WaitForCatalogUpdate finished: 1s002ms (86.563us)
- Planning finished: 1s036ms (34.349ms)
A failed wait:
- Query submitted: 29.736us (29.736us)
- WaitForHmsEvent failed: 21.114ms (21.084ms)
- Planning finished: 1s165ms (1s143ms)
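
For tests or scripts that inspect these items, a small parser along the
following lines may help. It is a sketch that assumes the duration
format seen in the samples above (values such as 29.791us or 1s002ms)
and that the value in parentheses is the time since the previous item,
which is consistent with the numbers shown. The helper names and the
set of units are assumptions.

```python
import re

# Seconds per unit. Only s, ms and us appear in the samples above; the
# other units are an assumption.
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0,
                 "ms": 1e-3, "us": 1e-6, "ns": 1e-9}
# Two-letter units are listed first so "ms" is not read as "m" plus "s".
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ms|us|ns|h|m|s)")
_ITEM = re.compile(r"^\s*-\s*(.+?):\s*(\S+)\s*\(([^)\s]+)\)\s*$")


def parse_duration(text):
    """Convert a duration such as '1s002ms' or '29.791us' to seconds."""
    tokens = _TOKEN.findall(text)
    if not tokens:
        raise ValueError("unrecognized duration: %r" % text)
    return sum(float(value) * _UNIT_SECONDS[unit] for value, unit in tokens)


def parse_timeline(lines):
    """Return (label, elapsed_seconds, delta_seconds) for each timeline item."""
    items = []
    for line in lines:
        match = _ITEM.match(line)
        if match:
            label, elapsed, delta = match.groups()
            items.append((label, parse_duration(elapsed), parse_duration(delta)))
    return items
```

A test could then look for a "WaitForHmsEvent finished" or
"WaitForHmsEvent failed" label in the parsed items instead of matching
raw profile text.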
For better debuggability in tests, add logs in run_stmt_in_hive to print
the Hive statements.
Tests
- Add a test to verify that planning waits until catalogd is synced with
HMS changes.
- Add a test for error handling when HMS event processing is disabled.
- Some existing tests use
EventProcessorUtils.wait_for_event_processing() to wait until events are
synced. Modify them to use the new query option in the queries that need
it.
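
As a rough illustration of the last item, a test could build the new
options once and pass them with the queries that depend on external HMS
changes. The helpers below are hypothetical and are not part of this
patch; the option names and defaults are the ones described above.

```python
def sync_hms_events_options(wait_time_s, strict=False):
    """Build the two query options added by this patch as a dict."""
    return {
        "sync_hms_events_wait_time_s": int(wait_time_s),
        "sync_hms_events_strict_mode": "true" if strict else "false",
    }


def as_set_statements(options):
    """Render options as SET statements, e.g. for pasting into impala-shell."""
    return ["SET %s=%s;" % (name, value)
            for name, value in sorted(options.items())]
```

Assuming the test suite's query helper accepts a dict of query options,
the dict can be passed alongside the query in place of an explicit
wait_for_event_processing() call.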
Change-Id: I36ac941bb2c2217b09fcfa2eb567b011b38efa2a
---
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-service-client-wrapper.h
M be/src/catalog/catalog.cc
M be/src/catalog/catalog.h
M be/src/exec/catalog-op-executor.cc
M be/src/exec/catalog-op-executor.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M common/thrift/CatalogService.thrift
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/events/ExternalEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M fe/src/main/java/org/apache/impala/catalog/events/NoOpEventProcessor.java
M fe/src/main/java/org/apache/impala/service/JniCatalog.java
M tests/common/impala_test_suite.py
M tests/custom_cluster/test_events_custom_configs.py
M tests/metadata/test_event_processing.py
M tests/metadata/test_event_processing_base.py
M tests/metadata/test_metadata_query_statements.py
29 files changed, 391 insertions(+), 43 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/20131/14
--
To view, visit http://gerrit.cloudera.org:8080/20131
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I36ac941bb2c2217b09fcfa2eb567b011b38efa2a
Gerrit-Change-Number: 20131
Gerrit-PatchSet: 14
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Sai Hemanth Gantasala <[email protected]>