Arnab Karmakar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/24088 )
Change subject: IMPALA-14589: Add support for Iceberg V3 default values
......................................................................

IMPALA-14589: Add support for Iceberg V3 default values

Iceberg V3 introduces two types of default values to enable
non-blocking schema evolution. This patch adds support for both in
Impala.

1. **initial-default**: Applied when READING old data files that were
   written before a column was added to the schema. This allows adding
   columns with defaults without rewriting existing data files. The
   default value is materialized at read time for missing columns.

2. **write-default**: Applied when WRITING new rows that don't specify
   a value for a column. This ensures consistent defaults across all
   engines writing to the table, whether they insert via Spark, Flink,
   or Impala.

How Default Values Work:
- Default values are stored in the Iceberg table's schema metadata
  (not in data files)
- When scanning a file that is missing a column, Impala checks the
  schema for an initial-default
- If present, the value is materialized in the template tuple before
  scanning
- For writes, InsertStmt checks for a write-default on unmentioned
  columns
- Both default types are represented as JSON-serialized literals in
  the schema

Testing:
- Updated iceberg-v3-negative.test with schema evolution expectations
- Extended iceberg-v3-default-values.test with comprehensive coverage:
  * Read path: SELECT, WHERE, aggregations, JOINs, UNION, subqueries
  * Write path: INSERT with partial columns
  * Time travel: Schema evolution across snapshots

Change-Id: I9f1be994a336b30b17b17819091417d777a39be9
---
M be/src/exec/avro/hdfs-avro-scanner.cc
M be/src/exec/file-metadata-utils.cc
M be/src/exec/orc/hdfs-orc-scanner.cc
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M common/thrift/CatalogObjects.thrift
M common/thrift/Descriptors.thrift
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/catalog/Column.java
M fe/src/main/java/org/apache/impala/catalog/IcebergColumn.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
M fe/src/main/java/org/apache/impala/util/IcebergSchemaConverter.java
A testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-default-values.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-negative.test
M tests/query_test/test_iceberg.py
16 files changed, 456 insertions(+), 94 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/24088/1
--
To view, visit http://gerrit.cloudera.org:8080/24088
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I9f1be994a336b30b17b17819091417d777a39be9
Gerrit-Change-Number: 24088
Gerrit-PatchSet: 1
Gerrit-Owner: Arnab Karmakar <[email protected]>
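
Illustrative note on the semantics described in the commit message: the difference between initial-default (read path) and write-default (write path) can be sketched roughly as below. This is a minimal Python model, not Impala's implementation. The `Field` class and the `read_row` and `write_row` helpers are hypothetical names invented for this sketch, and defaults are held as plain Python values rather than the JSON-serialized literals stored in the schema.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Set


@dataclass
class Field:
    """Hypothetical model of one schema column carrying both V3 defaults."""
    name: str
    initial_default: Optional[Any] = None  # applied when a data file lacks the column
    write_default: Optional[Any] = None    # applied when a writer omits the column


def read_row(schema: List[Field], file_columns: Set[str],
             stored: Dict[str, Any]) -> Dict[str, Any]:
    """Build a row from a data file; absent columns take initial-default."""
    row = {}
    for field in schema:
        if field.name in file_columns:
            # The column exists in the file: use the stored value, even if NULL.
            row[field.name] = stored.get(field.name)
        else:
            # The file predates the column: fall back to initial-default (or NULL).
            row[field.name] = field.initial_default
    return row


def write_row(schema: List[Field], provided: Dict[str, Any]) -> Dict[str, Any]:
    """Build a row to write; columns the statement omits take write-default."""
    row = {}
    for field in schema:
        if field.name in provided:
            # An explicitly supplied value (including NULL) is kept as-is.
            row[field.name] = provided[field.name]
        else:
            row[field.name] = field.write_default
    return row


# A column "status" added after the first data file was written.
schema = [Field("id"),
          Field("status", initial_default="unknown", write_default="active")]

old_file_row = read_row(schema, {"id"}, {"id": 1})  # file has no "status" column
new_row = write_row(schema, {"id": 2})              # INSERT mentions only "id"
```

In this model the two defaults are independent: rows in old files surface the initial-default, while newly inserted rows that omit the column get the write-default.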
