[jira] [Commented] (HIVE-28258) Use Iceberg semantics for Merge task

Sourabh Badhya (Jira) Tue, 21 May 2024 23:05:35 -0700


    [ 
https://issues.apache.org/jira/browse/HIVE-28258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17848458#comment-17848458
 ]


Sourabh Badhya commented on HIVE-28258:
---------------------------------------

[~kkasa], this task mainly reuses the existing Iceberg reader 
(IcebergRecordReader) instead of picking a file-format reader according to 
the table's file format. This way we can rely on the existing code within 
Iceberg for handling the different file formats (ORC, Parquet, Avro) and 
avoid writing any custom implementations to handle these file formats.
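
To illustrate the idea only, here is a rough sketch of what "one 
format-agnostic reader instead of per-format readers in the merge task" 
could look like. This is not the Hive or Iceberg code: the names 
MergeReaderSketch, DataFile, TableReader and merge are hypothetical, 
TableReader merely plays the role that IcebergRecordReader plays above, and 
the "decoders" just return placeholder strings instead of reading real files.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class MergeReaderSketch {

    /** Hypothetical stand-in for one small file that the merge task has to compact. */
    public static final class DataFile {
        final String path;
        final String format;

        public DataFile(String path, String format) {
            this.path = path;
            this.format = format;
        }
    }

    /**
     * Hypothetical format-agnostic reader. All file-format handling lives
     * here, inside the table-format layer, so callers never branch on it.
     */
    public static final class TableReader {
        // Placeholder decoders: a real reader would decode rows from the file.
        private static final Map<String, Function<String, String>> DECODERS = Map.of(
                "ORC", path -> "ORC:" + path,
                "PARQUET", path -> "PARQUET:" + path,
                "AVRO", path -> "AVRO:" + path);

        public List<String> read(DataFile file) {
            Function<String, String> decoder = DECODERS.get(file.format);
            if (decoder == null) {
                throw new IllegalArgumentException("Unsupported format: " + file.format);
            }
            return List.of(decoder.apply(file.path));
        }
    }

    /** The merge step: it only talks to TableReader and has no per-format code. */
    public static List<String> merge(List<DataFile> smallFiles) {
        TableReader reader = new TableReader();
        List<String> merged = new ArrayList<>();
        for (DataFile file : smallFiles) {
            merged.addAll(reader.read(file));
        }
        return merged;
    }

    public static void main(String[] args) {
        List<DataFile> files = List.of(
                new DataFile("f1.orc", "ORC"),
                new DataFile("f2.parquet", "PARQUET"));
        System.out.println(merge(files));
    }
}
```

The point of the sketch is that merge() stays unchanged if another file 
format is supported later, since only the reader needs to know about it.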

Additionally, this keeps the handling of the different schemas that Iceberg 
maintains (the data schema and the delete schema) within Iceberg, and does 
not expose them through public APIs.

Custom hacks that were done earlier, such as changing the file format of the 
merge task, are also removed.

The existing test file iceberg_merge_files.q should serve as an example for 
debugging the merge task used for Iceberg.

> Use Iceberg semantics for Merge task
> ------------------------------------
>
>                 Key: HIVE-28258
>                 URL: https://issues.apache.org/jira/browse/HIVE-28258
>             Project: Hive
>          Issue Type: Improvement
>          Components: Iceberg integration
>            Reporter: Sourabh Badhya
>            Assignee: Sourabh Badhya
>            Priority: Major
>              Labels: pull-request-available
>
> Use Iceberg semantics for Merge task, instead of the normal ORC or Parquet 
> readers.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
