[
https://issues.apache.org/jira/browse/HIVE-28258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17848458#comment-17848458
]
Sourabh Badhya commented on HIVE-28258:
---------------------------------------
[~kkasa], this task mainly reuses the existing Iceberg reader
(IcebergRecordReader) instead of picking a file-format-specific reader
based on the table's file format. This lets us rely on the existing code
that handles the different file formats (ORC, Parquet, Avro) within Iceberg
and avoids writing custom implementations for each file format.
Additionally, this keeps the handling of the different schemas that Iceberg
maintains (the data schema and the delete schema) inside Iceberg, rather
than exposing them through public APIs.
Custom hacks that were added earlier, such as changing the file format of
the merge task, are also removed.
The existing test iceberg_merge_files.q should serve as an example for
debugging the merge task used for Iceberg.
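To illustrate the idea, here is a minimal, self-contained sketch of the
design described above. It is not the Hive or Iceberg code: every name in it
(MergeReaderSketch, DataFile, TableReader, UnifiedReader, mergeInputs) is
hypothetical, and the "rows" are just strings. It only shows the shape of
the change, assuming the merge task talks to one table-level reader that
hides the per-format dispatch, in the way IcebergRecordReader is described
as doing.

```java
import java.util.ArrayList;
import java.util.List;

public class MergeReaderSketch {

    /** File formats named in the comment above. */
    enum FileFormat { ORC, PARQUET, AVRO }

    /** Hypothetical descriptor of one input file handed to the merge task. */
    static final class DataFile {
        final String path;
        final FileFormat format;

        DataFile(String path, FileFormat format) {
            this.path = path;
            this.format = format;
        }
    }

    /** Hypothetical stand-in for a table-level reader such as IcebergRecordReader. */
    interface TableReader {
        List<String> read(DataFile file);
    }

    /** A single reader that keeps the per-format dispatch internal. */
    static final class UnifiedReader implements TableReader {
        @Override
        public List<String> read(DataFile file) {
            // The format switch lives inside the reader, so callers never see it.
            switch (file.format) {
                case ORC:
                    return List.of("orc-row:" + file.path);
                case PARQUET:
                    return List.of("parquet-row:" + file.path);
                case AVRO:
                    return List.of("avro-row:" + file.path);
                default:
                    throw new IllegalArgumentException("Unsupported format: " + file.format);
            }
        }
    }

    /** The merge step depends only on TableReader and is unaware of file formats. */
    static List<String> mergeInputs(TableReader reader, List<DataFile> files) {
        List<String> merged = new ArrayList<>();
        for (DataFile file : files) {
            merged.addAll(reader.read(file));
        }
        return merged;
    }

    public static void main(String[] args) {
        List<DataFile> files = List.of(
            new DataFile("a.orc", FileFormat.ORC),
            new DataFile("b.parquet", FileFormat.PARQUET));
        System.out.println(mergeInputs(new UnifiedReader(), files));
    }
}
```

In this sketch, supporting another file format means touching only
UnifiedReader, and mergeInputs needs no format-specific branches. That is
the property the comment attributes to reusing the Iceberg reader.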
> Use Iceberg semantics for Merge task
> ------------------------------------
>
> Key: HIVE-28258
> URL: https://issues.apache.org/jira/browse/HIVE-28258
> Project: Hive
> Issue Type: Improvement
> Components: Iceberg integration
> Reporter: Sourabh Badhya
> Assignee: Sourabh Badhya
> Priority: Major
> Labels: pull-request-available
>
> Use Iceberg semantics for Merge task, instead of normal ORC or Parquet
> readers.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)