Zoltan Borok-Nagy has uploaded this change for review.
( http://gerrit.cloudera.org:8080/22610 )


Change subject: IMPALA-13853: Don't adjust Iceberg field IDs for data files 
that don't have complex types
......................................................................

IMPALA-13853: Don't adjust Iceberg field IDs for data files that don't have 
complex types

In migrated Iceberg tables we can have data files with missing field
IDs. We assume that their schema corresponds to the table schema at the
point when the table migration happened. This means we can generate the
field IDs at runtime. The logic is more complicated when there are
complex types in the table and the table is partitioned. In such cases
we need to make some adjustments during field ID generation, and to do
that safely we verify that the file schema corresponds to the table
schema.

These adjustments are not needed when the table doesn't have complex
types, hence we can be a bit more relaxed and skip schema verification,
because field ID generation for top-level columns is not affected.
This means Impala is still able to read the table if there were
trivial schema changes before migration.
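
The idea can be illustrated with a minimal, self-contained sketch. This
is not the code in this patch: the Field struct and the functions
HasComplexTypes, AssignNested and GenerateFieldIds are hypothetical
names, and the numbering scheme (top-level columns by position, then IDs
reserved for partition columns, then nested fields) is an assumption
about how the IDs of a migrated table are laid out.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified schema node; not Impala's actual data structure.
struct Field {
  std::string name;
  std::vector<Field> children;  // empty for primitive types
  int field_id;                 // -1 means "no field ID in the file"

  explicit Field(std::string n, std::vector<Field> c = {})
      : name(std::move(n)), children(std::move(c)), field_id(-1) {}
};

// Returns true if any top-level column has nested fields.
bool HasComplexTypes(const std::vector<Field>& columns) {
  for (const Field& col : columns) {
    if (!col.children.empty()) return true;
  }
  return false;
}

// Assumed numbering: all fields of one level first, then their children.
void AssignNested(std::vector<Field>& fields, int& next_id) {
  for (Field& f : fields) f.field_id = next_id++;
  for (Field& f : fields) AssignNested(f.children, next_id);
}

// Generates field IDs for a data file that has none.
void GenerateFieldIds(std::vector<Field>& file_columns, int num_partition_cols) {
  int next_id = 1;
  // Top-level columns are numbered by position, with or without partitions.
  for (Field& col : file_columns) col.field_id = next_id++;
  // Without complex types there is nothing left to number, so the
  // partition-column adjustment below never applies.
  if (!HasComplexTypes(file_columns)) return;
  // Assumption: partition columns are absent from the file but occupy IDs
  // right after the top-level columns, so nested IDs must skip over them.
  next_id += num_partition_cols;
  for (Field& col : file_columns) AssignNested(col.children, next_id);
}
```

In this sketch num_partition_cols has no effect on a file with only
primitive columns, which is why schema verification can be skipped for
such files.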

With this change we allow all data files whose schema is compatible
with the table schema, which was the case before IMPALA-13364. This
behavior is also aligned with Hive.

Testing:
 * e2e tests added for both Parquet and ORC files

Change-Id: Ib1f1d0cf36792d0400de346c83e999fa50c0fa67
---
M be/src/exec/orc/orc-metadata-utils.cc
M be/src/exec/parquet/parquet-metadata-utils.cc
A testdata/workloads/functional-query/queries/QueryTest/iceberg-migrated-table-field-id-resolution-orc.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-migrated-table-field-id-resolution.test
M tests/query_test/test_iceberg.py
5 files changed, 98 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/22610/1
--
To view, visit http://gerrit.cloudera.org:8080/22610
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib1f1d0cf36792d0400de346c83e999fa50c0fa67
Gerrit-Change-Number: 22610
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
