Hello Impala Public Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/23805
to look at the new patch set (#4).
Change subject: IMPALA-14637: COMMIT_TXN events should trigger reload for
truncate ops
......................................................................
IMPALA-14637: COMMIT_TXN events should trigger reload for truncate ops
Truncate operations generate ALTER events in HMS, which trigger metadata
reloads when catalogd processes these events. However, for transactional
tables, a stale snapshot will be loaded if the corresponding transaction
is not committed yet. Catalogd should reload the metadata when
processing the corresponding COMMIT_TXN events.
Currently, when processing COMMIT_TXN events, catalogd fetches the
WriteEventInfo list for the transaction. This list only includes
updates from data insertions that add new data files. Truncate
operations are missing from it, so COMMIT_TXN events fail to reload the
table.
This patch fixes the issue by tracking transactional truncate operations
when receiving ALTER_TABLE, ALTER_PARTITION and ALTER_PARTITIONS events.
A map from transaction ids to the truncation info is maintained for
this. The truncation info is represented by a new class,
TableWriteEvent, which can also be generated from WriteEventInfo
instances. When processing a COMMIT_TXN event, after fetching the
WriteEventInfo list, we convert it into a list of TableWriteEvent and
then add all the truncation items of that transaction. Reloads are
triggered based on this list and the ValidWriteIds list of the table is
updated accordingly. For ABORT_TXN events, the entries for the
transaction are removed from this map and no updates happen.
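
A minimal sketch of the bookkeeping described above, not the code in
this patch. WriteEventSketch is a hypothetical stand-in for
TableWriteEvent, and PendingTruncateTracker with all its fields and
method names is assumed for illustration only:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for TableWriteEvent; the real fields may differ. */
final class WriteEventSketch {
  final String dbName;
  final String tblName;
  final long writeId;
  final boolean isTruncate;

  WriteEventSketch(String dbName, String tblName, long writeId, boolean isTruncate) {
    this.dbName = dbName;
    this.tblName = tblName;
    this.writeId = writeId;
    this.isTruncate = isTruncate;
  }
}

/** Hypothetical tracker: remembers truncations until their txn commits or aborts. */
final class PendingTruncateTracker {
  // Transaction id -> truncations observed in ALTER events of that transaction.
  private final Map<Long, List<WriteEventSketch>> txnToTruncates = new HashMap<>();

  /** Records a truncation seen in an ALTER_TABLE / ALTER_PARTITION(S) event. */
  synchronized void addTruncate(long txnId, WriteEventSketch event) {
    txnToTruncates.computeIfAbsent(txnId, k -> new ArrayList<>()).add(event);
  }

  /**
   * COMMIT_TXN: returns the events converted from the WriteEventInfo list
   * followed by the tracked truncations, and forgets the transaction.
   */
  synchronized List<WriteEventSketch> onCommit(
      long txnId, List<WriteEventSketch> fromWriteEventInfos) {
    List<WriteEventSketch> result = new ArrayList<>(fromWriteEventInfos);
    List<WriteEventSketch> truncates = txnToTruncates.remove(txnId);
    if (truncates != null) result.addAll(truncates);
    return result;
  }

  /** ABORT_TXN: drops the tracked truncations without producing any reload. */
  synchronized void onAbort(long txnId) {
    txnToTruncates.remove(txnId);
  }

  /** Number of transactions that still have tracked truncations. */
  synchronized int pendingTxnCount() {
    return txnToTruncates.size();
  }
}
```

In this sketch the caller would reload each table named in the list
returned by onCommit(); an aborted transaction leaves nothing behind.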
Note that ALTER events carry the writeIds but not the transaction ids.
To find the transaction ids, this patch adds a new map from
TableWriteId to the transaction id. It is kept consistent with the
existing txnToWriteIds_ map, i.e. both maps are updated together when
processing ALLOC_WRITE_ID_EVENT, COMMIT_TXN and ABORT_TXN events.
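
The two-map bookkeeping can be sketched as follows. This is only an
illustration under assumptions: WriteIdKey is a hypothetical stand-in
for TableWriteId, and TxnWriteIdIndex and its method names do not come
from the patch:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

/** Hypothetical stand-in for TableWriteId; the real class may carry more fields. */
final class WriteIdKey {
  final String dbName;
  final String tblName;
  final long writeId;

  WriteIdKey(String dbName, String tblName, long writeId) {
    this.dbName = dbName;
    this.tblName = tblName;
    this.writeId = writeId;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof WriteIdKey)) return false;
    WriteIdKey other = (WriteIdKey) o;
    return writeId == other.writeId && dbName.equals(other.dbName)
        && tblName.equals(other.tblName);
  }

  @Override
  public int hashCode() {
    return Objects.hash(dbName, tblName, writeId);
  }
}

/** Hypothetical index keeping the forward and reverse maps in sync. */
final class TxnWriteIdIndex {
  private final Map<Long, Set<WriteIdKey>> txnToWriteIds = new HashMap<>();
  private final Map<WriteIdKey, Long> writeIdToTxn = new HashMap<>();

  /** ALLOC_WRITE_ID_EVENT: registers the write id in both maps. */
  synchronized void onAllocWriteId(long txnId, WriteIdKey key) {
    txnToWriteIds.computeIfAbsent(txnId, k -> new HashSet<>()).add(key);
    writeIdToTxn.put(key, txnId);
  }

  /** Reverse lookup for ALTER events; returns null for an unknown write id. */
  synchronized Long findTxnId(WriteIdKey key) {
    return writeIdToTxn.get(key);
  }

  /** COMMIT_TXN or ABORT_TXN: removes the transaction from both maps. */
  synchronized void onTxnFinished(long txnId) {
    Set<WriteIdKey> keys = txnToWriteIds.remove(txnId);
    if (keys == null) return;
    for (WriteIdKey key : keys) writeIdToTxn.remove(key);
  }
}
```

Updating both maps under one lock is one simple way to keep them
consistent; the patch may synchronize differently.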
Tests
- Added FE tests for ALTER_TABLE and ALTER_PARTITION events.
- Because the dependent Hive version lacks HIVE-28668, HMS can't
generate a single ALTER_PARTITIONS event when truncating a
partitioned table, so tests for ALTER_PARTITIONS events are missing.
Change-Id: I89aac12819f08dd9ed42d5d8b21a96c04b04d75c
---
M fe/src/compat-apache-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/Hive3MetastoreShimBase.java
A fe/src/main/java/org/apache/impala/catalog/TableWriteEvent.java
M fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/catalog/events/MetastoreEventsProcessorTest.java
8 files changed, 344 insertions(+), 55 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/23805/4
--
To view, visit http://gerrit.cloudera.org:8080/23805
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I89aac12819f08dd9ed42d5d8b21a96c04b04d75c
Gerrit-Change-Number: 23805
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>