Hi,

I am trying to migrate an Iceberg Hadoop table to a table that uses the Hive
catalog. Luckily the table is append-only, so there are no delete files.
It is not clear to me which APIs were used for this in a previous post
<https://lists.apache.org/thread.html/r39f2c773bc06889cb19d7de3729d868fccbafbafcfab1922332a4dc6%40%3Cdev.iceberg.apache.org%3E>
.

The list of ManifestFiles in the current snapshot can be obtained with the
Snapshot allManifests
<https://iceberg.apache.org/javadoc/0.11.1/org/apache/iceberg/Snapshot.html#allManifests-->
API. However, these manifests cannot be added to the new table's AppendFiles
<https://iceberg.apache.org/javadoc/0.11.1/org/apache/iceberg/AppendFiles.html>
for committing: a manifest from an existing snapshot already carries a
snapshot id, and an appended manifest must have its snapshot id unset
<https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/MergeAppend.java#L55>
.

Alternatively, the Table snapshots
<https://iceberg.apache.org/javadoc/0.11.1/org/apache/iceberg/Table.html#snapshots-->
API can be used to get all snapshots of the table. From there, the data files
added in each snapshot can be obtained with the Snapshot addedFiles
<https://iceberg.apache.org/javadoc/0.11.1/org/apache/iceberg/Snapshot.html#addedFiles-->
API and then appended, through AppendFiles, to the new table in the Hive
catalog.
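
A minimal sketch of this second approach, written against the 0.11.1 API
linked above (later releases may differ). The class and method names
(AppendOnlyMigration, copyAddedFiles) are made up for illustration. It assumes
the source is append-only and that the target table already exists in the Hive
catalog with the same schema and partition spec as the source:

```java
import org.apache.iceberg.AppendFiles;
import org.apache.iceberg.DataFile;
import org.apache.iceberg.Snapshot;
import org.apache.iceberg.Table;

public class AppendOnlyMigration {

  /**
   * Re-registers the data files of an append-only source table in a target table.
   * Only metadata is written; the data files themselves are not copied or moved.
   * Assumption: both tables share the same schema and partition spec.
   *
   * @return the number of data files appended to the target table
   */
  public static long copyAddedFiles(Table source, Table target) {
    long copied = 0;
    // Walk every snapshot known to the source table.
    for (Snapshot snapshot : source.snapshots()) {
      AppendFiles append = target.newAppend();
      long inSnapshot = 0;
      // addedFiles() lists the data files that this snapshot added.
      for (DataFile file : snapshot.addedFiles()) {
        append.appendFile(file);
        inSnapshot++;
      }
      // Skip snapshots that added nothing, to avoid empty commits.
      if (inSnapshot > 0) {
        append.commit();
        copied += inSnapshot;
      }
    }
    return copied;
  }
}
```

Committing once per source snapshot keeps one target snapshot per source
append, but the target snapshots get new ids and timestamps, so the original
history is not reproduced exactly.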

I am not sure whether the latter approach is correct for the migration. Any
input is appreciated.

--
Huadong
