[ 
https://issues.apache.org/jira/browse/IMPALA-14290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dániel Gábor Vankó resolved IMPALA-14290.
-----------------------------------------
    Fix Version/s: Impala 5.0.0
       Resolution: Fixed

> Iceberg partitioning should be case insensitive of column names
> ---------------------------------------------------------------
>
>                 Key: IMPALA-14290
>                 URL: https://issues.apache.org/jira/browse/IMPALA-14290
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>            Reporter: Dániel Gábor Vankó
>            Assignee: Dániel Gábor Vankó
>            Priority: Major
>              Labels: impala-iceberg
>             Fix For: Impala 5.0.0
>
>
> When creating or altering the partition spec of an Iceberg table, Impala only 
> accepts column names that are written in lower case.
> E.g. START_TIME throws an exception in the PARTITIONED BY SPEC clause. (The 
> same happens with the YEAR, MONTH, TRUNCATE and BUCKET transforms.)
> {noformat}
> [localhost:21050] default> CREATE TABLE tab ( START_TIME TIMESTAMP )
> PARTITIONED BY SPEC (DAY(START_TIME))
> STORED AS ICEBERG;
> Query: CREATE TABLE tab ( START_TIME TIMESTAMP )
> PARTITIONED BY SPEC (DAY(START_TIME))
> STORED AS ICEBERG
> 2025-08-06 12:50:04 [Exception]  ERROR: Query 
> b7450d257090b184:8cb6c76000000000 failed:
> ImpalaRuntimeException: Error making 'createTable' RPC to Hive Metastore:
> CAUSED BY: IllegalArgumentException: Cannot find source column: START_TIME
> {noformat}
> The table is successfully created if start_time is lowercase in the 
> PARTITIONED BY SPEC, but still uppercase in the column definition list:
> {noformat}
> [localhost:21050] default> CREATE TABLE tab ( START_TIME TIMESTAMP )
> PARTITIONED BY SPEC (DAY(start_time))
> STORED AS ICEBERG;
> Query: CREATE TABLE tab ( START_TIME TIMESTAMP )
> PARTITIONED BY SPEC (DAY(start_time))
> STORED AS ICEBERG
> +-------------------------+
> | summary                 |
> +-------------------------+
> | Table has been created. |
> +-------------------------+
> Fetched 1 row(s) in 0.21s{noformat}
> Moreover, altering the partition spec of an existing table also throws an 
> exception when the column name is not lowercase:
> {noformat}
> [localhost:21050] default> ALTER TABLE tab SET PARTITION SPEC 
> (MONTH(START_TIME));
> Query: ALTER TABLE tab SET PARTITION SPEC (MONTH(START_TIME))
> 2025-08-06 13:22:58 [Exception]  ERROR: Query 
> bd439760e55e7cd2:9211459000000000 failed:
> ImpalaRuntimeException: Failed to ALTER table 'tab': Cannot find field 
> 'START_TIME' in struct: struct<1: start_time: optional timestamp>{noformat}
>  
> During parsing, the column name is converted to lowercase: 
> [https://github.com/apache/impala/blob/09a6f0e6cd912f573f0d8950abf40f498385c628/fe/src/main/java/org/apache/impala/analysis/ColumnDef.java#L109]
> For legacy tables, partitioning columns are handled either by ColumnDef (in 
> CREATE TABLE statements) or by PartitionKeyValue (in INSERT statements): 
> https://github.com/apache/impala/blob/09a6f0e6cd912f573f0d8950abf40f498385c628/fe/src/main/java/org/apache/impala/analysis/PartitionKeyValue.java#L41-L44
> In contrast, an Iceberg partition field's name is processed as-is: 
> [https://github.com/apache/impala/blob/09a6f0e6cd912f573f0d8950abf40f498385c628/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java#L52-L61]
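
The mismatch described above can be illustrated with a minimal, self-contained Java sketch. It is not the actual Impala patch: the class PartitionFieldNameSketch and the helpers normalize and resolveSourceColumn are hypothetical names, and the use of Locale.ROOT is an assumption made for this example. It only shows the general idea of lowercasing the partition field's source column name the same way the column definition is lowercased, so that both sides of the lookup agree.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Impala.
class PartitionFieldNameSketch {

    // Lowercase an identifier in a locale-independent way (assumption:
    // identifiers are compared case-insensitively, as the issue requests).
    static String normalize(String name) {
        return name.toLowerCase(Locale.ROOT);
    }

    // Look up the source column of a partition transform in a schema whose
    // keys were already lowercased when the columns were defined.
    static String resolveSourceColumn(Map<String, String> schema, String specName) {
        String key = normalize(specName);
        if (!schema.containsKey(key)) {
            throw new IllegalArgumentException("Cannot find source column: " + specName);
        }
        return key;
    }

    public static void main(String[] args) {
        // Column definition list: START_TIME TIMESTAMP, stored in lowercase.
        Map<String, String> schema = new LinkedHashMap<>();
        schema.put(normalize("START_TIME"), "TIMESTAMP");

        // PARTITIONED BY SPEC (DAY(START_TIME)): resolves after normalization.
        System.out.println(resolveSourceColumn(schema, "START_TIME"));
    }
}
```

Without the normalize call in resolveSourceColumn, the lookup for START_TIME would miss the lowercased key start_time, which mirrors the "Cannot find source column" error in the first console output.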



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
