[ 
https://issues.apache.org/jira/browse/IMPALA-14290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18019895#comment-18019895
 ] 

ASF subversion and git services commented on IMPALA-14290:
----------------------------------------------------------

Commit 492b2b7f46af8a2b0e0a868f3947916bb822e2a7 in impala's branch 
refs/heads/master from Daniel Vanko
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=492b2b7f4 ]

IMPALA-14406: Fix test_column_case_sensitivity for newer Iceberg versions

The new test introduced in IMPALA-14290 depends on the Iceberg version,
because newer versions (e.g. 1.5.2) show

{"start_time_month":"646","end_time_day":"19916"}

instead of

{"start_time_day":null,"end_time_month":null,"start_time_month":"646","end_time_day":"19916"}

The test now accepts both cases.
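
One way to accept both forms is to compare the partition values after
dropping null-valued fields. The sketch below only illustrates that idea and
is not necessarily how the commit implements it; the helper name
normalize_partition is hypothetical, and the two JSON strings are the ones
quoted above.

```python
import json


def normalize_partition(partition_json):
    """Parse a partition JSON string and drop fields whose value is null.

    Hypothetical helper, not taken from query_test/test_iceberg.py.
    """
    fields = json.loads(partition_json)
    return {name: value for name, value in fields.items() if value is not None}


# Output shown by newer Iceberg versions (e.g. 1.5.2).
newer = '{"start_time_month":"646","end_time_day":"19916"}'
# Output that also lists the null-valued partition fields.
older = ('{"start_time_day":null,"end_time_month":null,'
         '"start_time_month":"646","end_time_day":"19916"}')

# Both forms reduce to the same set of non-null partition values.
assert normalize_partition(newer) == normalize_partition(older)
```

Comparing parsed dictionaries rather than raw strings also makes the check
independent of the order in which the fields are printed.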

Testing:
 * ran query_test/test_iceberg.py with both Iceberg 1.3.1 and 1.5.2

Change-Id: I17e368ac043d1fbf80a78dcac6ab1be5a297b6ea
Reviewed-on: http://gerrit.cloudera.org:8080/23389
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Iceberg partitioning should be case insensitive of column names
> ---------------------------------------------------------------
>
>                 Key: IMPALA-14290
>                 URL: https://issues.apache.org/jira/browse/IMPALA-14290
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>            Reporter: Dániel Gábor Vankó
>            Assignee: Dániel Gábor Vankó
>            Priority: Major
>              Labels: impala-iceberg
>             Fix For: Impala 5.0.0
>
>
> When creating or altering the partitioning of Iceberg tables, Impala only
> accepts column names that are in lower case.
> E.g. START_TIME throws an exception in the PARTITIONED BY SPEC clause. (The
> same happens with the YEAR, MONTH, TRUNCATE and BUCKET transforms.)
> {noformat}
> [localhost:21050] default> CREATE TABLE tab ( START_TIME TIMESTAMP )
> PARTITIONED BY SPEC (DAY(START_TIME))
> STORED AS ICEBERG;
> Query: CREATE TABLE tab ( START_TIME TIMESTAMP )
> PARTITIONED BY SPEC (DAY(START_TIME))
> STORED AS ICEBERG
> 2025-08-06 12:50:04 [Exception]  ERROR: Query 
> b7450d257090b184:8cb6c76000000000 failed:
> ImpalaRuntimeException: Error making 'createTable' RPC to Hive Metastore:
> CAUSED BY: IllegalArgumentException: Cannot find source column: START_TIME
> {noformat}
> The table is successfully created if start_time is lowercase in the 
> PARTITIONED BY SPEC, but still uppercase in the column definition list:
> {noformat}
> [localhost:21050] default> CREATE TABLE tab ( START_TIME TIMESTAMP )
> PARTITIONED BY SPEC (DAY(start_time))
> STORED AS ICEBERG;
> Query: CREATE TABLE tab ( START_TIME TIMESTAMP )
> PARTITIONED BY SPEC (DAY(start_time))
> STORED AS ICEBERG
> +-------------------------+
> | summary                 |
> +-------------------------+
> | Table has been created. |
> +-------------------------+
> Fetched 1 row(s) in 0.21s{noformat}
> Moreover, altering the partition spec of an existing table also throws an
> exception when the column name is not lowercase:
> {noformat}
> [localhost:21050] default> ALTER TABLE tab SET PARTITION SPEC 
> (MONTH(START_TIME));
> Query: ALTER TABLE tab SET PARTITION SPEC (MONTH(START_TIME))
> 2025-08-06 13:22:58 [Exception]  ERROR: Query 
> bd439760e55e7cd2:9211459000000000 failed:
> ImpalaRuntimeException: Failed to ALTER table 'tab': Cannot find field 
> 'START_TIME' in struct: struct<1: start_time: optional timestamp>{noformat}
>  
> During parsing, the column name is converted to lowercase: 
> [https://github.com/apache/impala/blob/09a6f0e6cd912f573f0d8950abf40f498385c628/fe/src/main/java/org/apache/impala/analysis/ColumnDef.java#L109]
> For legacy tables, partitioning columns are handled either by ColumnDef (in 
> CREATE TABLE statements) or by PartitionKeyValue (in INSERT statements): 
> https://github.com/apache/impala/blob/09a6f0e6cd912f573f0d8950abf40f498385c628/fe/src/main/java/org/apache/impala/analysis/PartitionKeyValue.java#L41-L44
> In contrast, an Iceberg partition field's name is processed as-is: 
> [https://github.com/apache/impala/blob/09a6f0e6cd912f573f0d8950abf40f498385c628/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java#L52-L61]
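
The mismatch described above can be modelled in a few lines. This is a
Python sketch of the behaviour only, not the Java frontend code; the function
name resolve_partition_column and the list-of-names schema are assumptions
made for illustration. The schema holds the already-lowercased column name,
and the partition field name is matched against it without regard to case.

```python
def resolve_partition_column(field_name, schema_columns):
    """Return the schema column matching field_name, ignoring case.

    Hypothetical model: schema_columns stands in for the column names
    that ColumnDef has already converted to lowercase.
    """
    wanted = field_name.lower()
    for column in schema_columns:
        if column.lower() == wanted:
            return column
    raise ValueError("Cannot find source column: " + field_name)


# START_TIME in the column definition list is stored as start_time.
schema = ["start_time"]

# An uppercase reference in PARTITIONED BY SPEC resolves to the same column.
print(resolve_partition_column("START_TIME", schema))  # prints start_time
```

Processing the field name as-is corresponds to comparing field_name directly
with each column, which is why DAY(START_TIME) fails while DAY(start_time)
succeeds.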



