Dániel Gábor Vankó created IMPALA-14290:
-------------------------------------------
Summary: Iceberg partitioning should be case insensitive of column
names
Key: IMPALA-14290
URL: https://issues.apache.org/jira/browse/IMPALA-14290
Project: IMPALA
Issue Type: Bug
Reporter: Dániel Gábor Vankó
When creating or altering the partition spec of an Iceberg table, Impala only
accepts column names that are written in lower case.
For example, START_TIME in the PARTITIONED BY SPEC clause throws an exception.
(The same happens with the YEAR, MONTH, TRUNCATE and BUCKET transforms.)
{noformat}
[localhost:21050] default> CREATE TABLE tab ( START_TIME TIMESTAMP )
PARTITIONED BY SPEC (DAY(START_TIME))
STORED AS ICEBERG;
Query: CREATE TABLE tab ( START_TIME TIMESTAMP )
PARTITIONED BY SPEC (DAY(START_TIME))
STORED AS ICEBERG
2025-08-06 12:50:04 [Exception] ERROR: Query b7450d257090b184:8cb6c76000000000
failed:
ImpalaRuntimeException: Error making 'createTable' RPC to Hive Metastore:
CAUSED BY: IllegalArgumentException: Cannot find source column: START_TIME
{noformat}
The table is successfully created if start_time is lowercase in the PARTITIONED
BY SPEC, but still uppercase in the column definition list:
{noformat}
[localhost:21050] default> CREATE TABLE tab2 ( START_TIME TIMESTAMP )
PARTITIONED BY SPEC (DAY(start_time))
STORED AS ICEBERG;
Query: CREATE TABLE tab2 ( START_TIME TIMESTAMP )
PARTITIONED BY SPEC (DAY(start_time))
STORED AS ICEBERG
+-------------------------+
| summary |
+-------------------------+
| Table has been created. |
+-------------------------+
Fetched 1 row(s) in 0.21s
{noformat}
Moreover, altering the partition spec of an existing table also throws an
exception when the column name is not lower case:
{noformat}
[localhost:21050] default> alter table tab2 set partition spec
(month(START_TIME));
Query: alter table tab2 set partition spec (month(START_TIME))
2025-08-06 13:22:58 [Exception] ERROR: Query bd439760e55e7cd2:9211459000000000
failed:
ImpalaRuntimeException: Failed to ALTER table 'tab2': Cannot find field
'START_TIME' in struct: struct<1: start_time: optional timestamp>
{noformat}
During parsing, the column name is converted to lower case:
[https://github.com/apache/impala/blob/09a6f0e6cd912f573f0d8950abf40f498385c628/fe/src/main/java/org/apache/impala/analysis/ColumnDef.java#L109]
In contrast, a partition field name is processed as-is, so an uppercase name no
longer matches the lowercased column:
https://github.com/apache/impala/blob/09a6f0e6cd912f573f0d8950abf40f498385c628/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java#L52-L61
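The mismatch described above can be illustrated with a minimal, self-contained
sketch. It is not Impala code: the class PartitionFieldCaseSketch and its
methods buildSchema, findAsIs and findNormalized are hypothetical names used
only for illustration. It assumes that column definitions are stored
lowercased (as the ColumnDef link suggests) and shows that one possible fix is
to lowercase the partition field name before the lookup.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Impala.
class PartitionFieldCaseSketch {

    // Builds a name -> type map from (name, type) pairs. Column names are
    // lowercased on insertion, mirroring what the report says happens when
    // column definitions are parsed.
    static Map<String, String> buildSchema(String... nameTypePairs) {
        Map<String, String> schema = new LinkedHashMap<>();
        for (int i = 0; i + 1 < nameTypePairs.length; i += 2) {
            schema.put(nameTypePairs[i].toLowerCase(Locale.ROOT), nameTypePairs[i + 1]);
        }
        return schema;
    }

    // Looks up the partition source column exactly as written by the user.
    // This models the behavior reported in this issue.
    static boolean findAsIs(Map<String, String> schema, String fieldName) {
        return schema.containsKey(fieldName);
    }

    // Looks up the partition source column after lowercasing it, so the
    // lookup uses the same normalization as the column definitions.
    static boolean findNormalized(Map<String, String> schema, String fieldName) {
        return schema.containsKey(fieldName.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        Map<String, String> schema = buildSchema("START_TIME", "TIMESTAMP");
        System.out.println(findAsIs(schema, "START_TIME"));       // false
        System.out.println(findAsIs(schema, "start_time"));       // true
        System.out.println(findNormalized(schema, "START_TIME")); // true
    }
}
```

Whether the normalization belongs in IcebergPartitionField or in a later
analysis step is a design decision this sketch does not address.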