Gabor Kaszab has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21993 )


Change subject: IMPALA-13484: Don't call alter_table() on HMS when loading 
Iceberg table
......................................................................

IMPALA-13484: Don't call alter_table() on HMS when loading Iceberg table

When Impala loads an Iceberg table, it also loads the table's metastore
representation from HMS. Additionally, when Impala loads the table's
Iceberg metadata, it constructs another metastore representation from
that metadata. If these two metastore representations don't match,
Impala calls alter_table() on HMS to persist the differences.

In practice this behaviour is triggered, for instance, when another
engine creates a table with a column type that Impala can't read, so
Impala adjusts the column type instead. E.g. if an engine creates a
'timestamp with local time zone' column, Impala changes the column
type in the HMS representation to 'timestamp'.
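
To illustrate how such an adjustment makes the two representations
diverge, here is a minimal sketch. It is not the code in
IcebergTable.java: the class and method names are hypothetical, and the
mapping covers only the single type pair mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration; not taken from the Impala code base. */
class ColumnTypeAdjustment {
  /** Maps a column type to the type the reader would use instead. */
  static String adjustType(String hmsType) {
    // Assumption: only the pair named in the commit message is handled.
    if (hmsType.equalsIgnoreCase("timestamp with local time zone")) {
      return "timestamp";
    }
    return hmsType;
  }

  /** Applies the adjustment to every column type of a table. */
  static List<String> adjustAll(List<String> hmsTypes) {
    List<String> adjusted = new ArrayList<>();
    for (String type : hmsTypes) {
      adjusted.add(adjustType(type));
    }
    return adjusted;
  }

  /** True if the adjusted schema no longer matches what HMS stores. */
  static boolean differs(List<String> hmsTypes) {
    return !adjustAll(hmsTypes).equals(hmsTypes);
  }
}
```

With column types ("int", "timestamp with local time zone") differs()
returns true, which corresponds to the situation where the old code
went on to call alter_table().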

There are some issues with this approach:
1: Impala calls HMS.alter_table() directly and doesn't change the table
   through the Iceberg API. As a result no conflict checks are
   performed. Since the metadata location is a simple table property,
   this alter_table() call can overwrite the metadata pointer if there
   have been other changes to the table since Impala started to load
   it. This can result in data correctness issues.
2: Even in use cases where Impala only reads a table, it can still
   perform table modifications, which is unexpected behaviour.
3: With this approach Impala changes the table schema in HMS but doesn't
   change the Iceberg schema in the Iceberg metadata.
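
The lost update described in point 1 can be replayed with a toy model.
This is only a sketch under stated assumptions: ToyMetastore is a
hypothetical in-memory stand-in, not the HMS client or the Iceberg API,
and the file names are made up. It contrasts a blind overwrite with a
commit that first checks the metadata pointer.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for a metastore; not the HMS API. */
class ToyMetastore {
  private final Map<String, String> props = new HashMap<>();

  ToyMetastore(String initialLocation) {
    props.put("metadata_location", initialLocation);
  }

  /** Returns a copy of the table properties, like a loaded snapshot. */
  synchronized Map<String, String> getTable() {
    return new HashMap<>(props);
  }

  /** Blind overwrite: persists whatever the caller holds, stale or not. */
  synchronized void alterTable(Map<String, String> newProps) {
    props.clear();
    props.putAll(newProps);
  }

  /** Conflict-checked commit: succeeds only if the pointer is unchanged. */
  synchronized boolean commitIfUnchanged(String expected, String next) {
    if (!expected.equals(props.get("metadata_location"))) return false;
    props.put("metadata_location", next);
    return true;
  }

  synchronized String metadataLocation() {
    return props.get("metadata_location");
  }
}

class MetadataPointerRace {
  /** A reader loads v1, a writer commits v2, the reader writes back blindly. */
  static String replayBlindAlter() {
    ToyMetastore hms = new ToyMetastore("v1.metadata.json");
    Map<String, String> loaded = hms.getTable();       // reader sees v1
    hms.commitIfUnchanged("v1.metadata.json", "v2.metadata.json"); // writer
    loaded.put("schema_adjusted", "true");             // reader's adjustment
    hms.alterTable(loaded);                            // pointer reverts to v1
    return hms.metadataLocation();
  }

  /** Same interleaving, but the reader's write checks the pointer first. */
  static String replayCheckedCommit() {
    ToyMetastore hms = new ToyMetastore("v1.metadata.json");
    String seen = hms.metadataLocation();              // reader sees v1
    hms.commitIfUnchanged("v1.metadata.json", "v2.metadata.json"); // writer
    hms.commitIfUnchanged(seen, "v1-adjusted.metadata.json");      // rejected
    return hms.metadataLocation();
  }

  public static void main(String[] args) {
    System.out.println("blind alter:    " + replayBlindAlter());
    System.out.println("checked commit: " + replayCheckedCommit());
  }
}
```

In the blind variant the writer's v2 commit is silently lost, while in
the checked variant the stale write is rejected and v2 survives.
Removing the alter_table() call avoids the stale write altogether.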

As a solution we can simply get rid of the logic that makes the
alter_table() call to HMS at the end of loading an Iceberg table. This
avoids a lot of confusion, and in fact persisting the schema
adjustments Impala made during table loading is not necessary.

Testing:
Ran the whole exhaustive test suite.

Change-Id: Icd5d1eee421f22d3853833dc571b14d4e9005ab3
---
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
1 file changed, 0 insertions(+), 16 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/21993/1
--
To view, visit http://gerrit.cloudera.org:8080/21993
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Icd5d1eee421f22d3853833dc571b14d4e9005ab3
Gerrit-Change-Number: 21993
Gerrit-PatchSet: 1
Gerrit-Owner: Gabor Kaszab <[email protected]>
