[ https://issues.apache.org/jira/browse/IMPALA-13299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18055988#comment-18055988 ]
ASF subversion and git services commented on IMPALA-13299:
----------------------------------------------------------
Commit 6a0eedf4af137828257529501ace360208af0a3c in impala's branch
refs/heads/master from Arnab Karmakar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=6a0eedf4a ]
IMPALA-13299: Support CREATE TABLE LIKE for Iceberg from HDFS sources
This patch enables creating Iceberg tables from non-Iceberg HDFS source
tables (Parquet, ORC, etc.) using CREATE TABLE LIKE with STORED BY ICEBERG.
This provides a metadata-only operation to convert table schemas to Iceberg
format without copying data.
Supported source types: Parquet, ORC, Avro, Text, and other HDFS-based formats
Not supported: Kudu tables, JDBC tables, Paimon tables
Use case: This is particularly useful for Apache Hive 3.1 environments where
CTAS (CREATE TABLE AS SELECT) with STORED BY ICEBERG is not supported - that
feature requires Hive 4.0+. Users can use CREATE TABLE LIKE to create the
Iceberg schema, then use INSERT INTO to migrate data.
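The two-step workflow described above can be sketched as follows. This is
only an illustration, not taken from the patch: the names src_db.events_parquet
and ice_db.events_ice are hypothetical, and it assumes the source is a plain
HDFS table with column types that the engine running the INSERT can write.

```sql
-- Hypothetical names; substitute your own source and target tables.
-- Step 1 (Impala): metadata-only operation. Creates an empty Iceberg table
-- whose schema is derived from the non-Iceberg source; no data is copied.
CREATE TABLE ice_db.events_ice LIKE src_db.events_parquet STORED BY ICEBERG;

-- Step 2: migrate the data with a separate INSERT. Which engine runs this
-- statement (Impala or Hive) depends on the environment and on the column
-- types involved.
INSERT INTO ice_db.events_ice SELECT * FROM src_db.events_parquet;
```

Splitting the work this way avoids the CTAS form with STORED BY ICEBERG,
which per the note above is not available on Hive 3.1.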
Testing:
- Comprehensive tests covering schema conversion with various data types,
partitioned and external tables, complex types (STRUCT, ARRAY, MAP)
- Bidirectional conversion tests (non-Iceberg → Iceberg and reverse)
- Hive interoperability tests verifying data round-trips correctly
Change-Id: Id162f217e49e9f396419b09815b92eb7f351881e
Reviewed-on: http://gerrit.cloudera.org:8080/23733
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> CreateTableLike to create Iceberg table based on non-Iceberg table
> ------------------------------------------------------------------
>
> Key: IMPALA-13299
> URL: https://issues.apache.org/jira/browse/IMPALA-13299
> Project: IMPALA
> Issue Type: New Feature
> Components: Frontend
> Reporter: Quanlong Huang
> Assignee: Arnab Karmakar
> Priority: Major
> Labels: ramp-up
>
> test_delete_complextypes_mixed_files creates an Iceberg table in Hive using
> CTAS:
> {code:sql}
> create table ice_complex_delete stored by iceberg stored as orc as
> select * from functional_parquet.complextypestbl; {code}
> When migrating the test to Apache Hive 3.1, this needs to be converted into a
> CreateTable and an INSERT statement since {{STORED BY ICEBERG}} is not
> supported in Apache Hive 3. The table should be created using {{STORED BY
> 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'}}. However, CTAS in
> Hive 3 doesn't support the {{STORED BY}} storageHandler clause. It'd be
> helpful if Impala can create the table itself. Then we can still use Hive to
> insert the ORC files.
> The statement we need is
> {code:sql}
> create table my_ice_tbl like functional_parquet.complextypestbl stored by
> iceberg;
> AnalysisException: functional_parquet.complextypestbl cannot be cloned into
> an Iceberg table because it is not an Iceberg table.{code}
> Note that functional_parquet.complextypestbl is just a non-partitioned table.
> It'd be nice to support partitioned tables as well.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)