[
https://issues.apache.org/jira/browse/IMPALA-14290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dániel Gábor Vankó resolved IMPALA-14290.
-----------------------------------------
Fix Version/s: Impala 5.0.0
Resolution: Fixed
> Iceberg partitioning should be case insensitive of column names
> ---------------------------------------------------------------
>
> Key: IMPALA-14290
> URL: https://issues.apache.org/jira/browse/IMPALA-14290
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Reporter: Dániel Gábor Vankó
> Assignee: Dániel Gábor Vankó
> Priority: Major
> Labels: impala-iceberg
> Fix For: Impala 5.0.0
>
>
> When creating or altering the partition spec of an Iceberg table, Impala only
> accepts column names written in lowercase.
> E.g. using START_TIME in the PARTITIONED BY SPEC clause throws an exception.
> (The same happens with the YEAR, MONTH, TRUNCATE and BUCKET transforms.)
> {noformat}
> [localhost:21050] default> CREATE TABLE tab ( START_TIME TIMESTAMP )
> PARTITIONED BY SPEC (DAY(START_TIME))
> STORED AS ICEBERG;
> Query: CREATE TABLE tab ( START_TIME TIMESTAMP )
> PARTITIONED BY SPEC (DAY(START_TIME))
> STORED AS ICEBERG
> 2025-08-06 12:50:04 [Exception] ERROR: Query
> b7450d257090b184:8cb6c76000000000 failed:
> ImpalaRuntimeException: Error making 'createTable' RPC to Hive Metastore:
> CAUSED BY: IllegalArgumentException: Cannot find source column: START_TIME
> {noformat}
> The table is successfully created if start_time is lowercase in the
> PARTITIONED BY SPEC, but still uppercase in the column definition list:
> {noformat}
> [localhost:21050] default> CREATE TABLE tab ( START_TIME TIMESTAMP )
> PARTITIONED BY SPEC (DAY(start_time))
> STORED AS ICEBERG;
> Query: CREATE TABLE tab ( START_TIME TIMESTAMP )
> PARTITIONED BY SPEC (DAY(start_time))
> STORED AS ICEBERG
> +-------------------------+
> | summary |
> +-------------------------+
> | Table has been created. |
> +-------------------------+
> Fetched 1 row(s) in 0.21s{noformat}
> Moreover, altering the partition spec of an existing table also throws an
> exception when the column name is not lowercase:
> {noformat}
> [localhost:21050] default> ALTER TABLE tab SET PARTITION SPEC
> (MONTH(START_TIME));
> Query: ALTER TABLE tab SET PARTITION SPEC (MONTH(START_TIME))
> 2025-08-06 13:22:58 [Exception] ERROR: Query
> bd439760e55e7cd2:9211459000000000 failed:
> ImpalaRuntimeException: Failed to ALTER table 'tab': Cannot find field
> 'START_TIME' in struct: struct<1: start_time: optional timestamp>{noformat}
>
> During parsing, the column name is converted to lowercase:
> [https://github.com/apache/impala/blob/09a6f0e6cd912f573f0d8950abf40f498385c628/fe/src/main/java/org/apache/impala/analysis/ColumnDef.java#L109]
> For legacy tables, partitioning columns are handled either by ColumnDef (in
> CREATE TABLE statements) or by PartitionKeyValue (in INSERT statements):
> https://github.com/apache/impala/blob/09a6f0e6cd912f573f0d8950abf40f498385c628/fe/src/main/java/org/apache/impala/analysis/PartitionKeyValue.java#L41-L44
> In contrast, an Iceberg partition field's name is processed as-is:
> [https://github.com/apache/impala/blob/09a6f0e6cd912f573f0d8950abf40f498385c628/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java#L52-L61]
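The mismatch described above can be illustrated with a minimal, self-contained sketch. This is not Impala code: the class and method names (PartitionFieldCaseSketch, schemaOf, resolveAsIs, resolveNormalized) are hypothetical, and lowercasing the partition field name before the lookup is only an assumed direction for a fix, based on how the report says column definitions are handled.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch, not Impala code: every name here is made up for illustration.
final class PartitionFieldCaseSketch {

  // Builds a name -> type map from (name, type) pairs, lowercasing the names
  // the way the report says column definitions are lowercased during parsing.
  static Map<String, String> schemaOf(String... nameTypePairs) {
    Map<String, String> schema = new LinkedHashMap<>();
    for (int i = 0; i + 1 < nameTypePairs.length; i += 2) {
      schema.put(nameTypePairs[i].toLowerCase(Locale.ROOT), nameTypePairs[i + 1]);
    }
    return schema;
  }

  // Looks up the partition source column exactly as written,
  // which models the reported as-is handling of the field name.
  static String resolveAsIs(Map<String, String> schema, String fieldName) {
    if (!schema.containsKey(fieldName)) {
      throw new IllegalArgumentException("Cannot find source column: " + fieldName);
    }
    return fieldName;
  }

  // Lowercases the field name before the lookup (assumed direction of a fix).
  static String resolveNormalized(Map<String, String> schema, String fieldName) {
    return resolveAsIs(schema, fieldName.toLowerCase(Locale.ROOT));
  }

  public static void main(String[] args) {
    Map<String, String> schema = schemaOf("START_TIME", "TIMESTAMP");

    // Normalized lookup finds the lowercased column.
    System.out.println(resolveNormalized(schema, "START_TIME"));

    // As-is lookup fails because the schema only knows the lowercased name.
    try {
      resolveAsIs(schema, "START_TIME");
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

In this sketch the schema side and the partition side only agree once both go through the same normalization, which is the inconsistency the links above point at.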