FYI, I was able to do the migration by casting each ManifestFile
to GenericManifestFile, resetting its sequence number and snapshot ID, and
adding the manifests to the new table's AppendFiles.

On Mon, Jun 28, 2021 at 3:49 PM Huadong Liu <huadong...@gmail.com> wrote:

> Hi,
>
> I am trying to migrate an Iceberg Hadoop table to a table using the Hive
> catalog. Luckily the table is append-only, so there are no delete files.
> It is not clear which APIs were used in a previous post
> <https://lists.apache.org/thread.html/r39f2c773bc06889cb19d7de3729d868fccbafbafcfab1922332a4dc6%40%3Cdev.iceberg.apache.org%3E>
> .
>
> The list of ManifestFiles in the current snapshot can be obtained with the
> Snapshot allManifests
> <https://iceberg.apache.org/javadoc/0.11.1/org/apache/iceberg/Snapshot.html#allManifests-->
> API. However, they cannot be added to the new table's AppendFiles
> <https://iceberg.apache.org/javadoc/0.11.1/org/apache/iceberg/AppendFiles.html>
>  for
> committing because the snapshot id needs to be blank
> <https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/MergeAppend.java#L55>
> .
>
> Alternatively, the table snapshots
> <https://iceberg.apache.org/javadoc/0.11.1/org/apache/iceberg/Table.html#snapshots-->
>  API
> can be used to get all snapshots of the table. From there, the data files
> added in each snapshot can be obtained with the addedFiles
> <https://iceberg.apache.org/javadoc/0.11.1/org/apache/iceberg/Snapshot.html#addedFiles-->
> API and then added to AppendFiles of the new table in the Hive catalog.
>
> I am not sure the latter is correct for the migration. Any input is
> appreciated.
>
> --
> Huadong
>
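
For reference, the second approach in the quoted message (walking
Table.snapshots() and appending each snapshot's addedFiles() to the new
table) could be sketched roughly as below. This is only an illustration
under stated assumptions, not the manifest-based approach described at the
top of this reply. The four interfaces are hypothetical stand-ins that
mirror just the Iceberg methods named in the message, so the sketch compiles
without the library; ReplayAppends and replay are made-up names. It also
assumes snapshots() yields snapshots oldest first and that the source table
is append-only.

```java
// Hypothetical stand-ins mirroring only the Iceberg methods named in the
// message. With Iceberg on the classpath, delete these four interfaces and
// import the org.apache.iceberg types of the same names instead.
interface DataFile { CharSequence path(); }
interface Snapshot { Iterable<DataFile> addedFiles(); }
interface AppendFiles { AppendFiles appendFile(DataFile file); void commit(); }
interface Table { Iterable<Snapshot> snapshots(); AppendFiles newAppend(); }

final class ReplayAppends {
  private ReplayAppends() {}

  /**
   * Replays the files added by each source snapshot as one append commit on
   * the target table. Returns the total number of data files appended.
   * Assumption: the source is append-only, so addedFiles() covers all data.
   */
  static int replay(Table source, Table target) {
    int copied = 0;
    for (Snapshot snapshot : source.snapshots()) {
      AppendFiles append = target.newAppend();
      int inSnapshot = 0;
      for (DataFile file : snapshot.addedFiles()) {
        append.appendFile(file);
        inSnapshot++;
      }
      // Skip snapshots that added nothing, so no empty commits are created.
      if (inSnapshot > 0) {
        append.commit();
        copied += inSnapshot;
      }
    }
    return copied;
  }
}
```

With the real library, source would be the loaded Hadoop table and target
the table already created through the Hive catalog with the same schema and
partition spec. Committing once per source snapshot keeps a comparable
snapshot history on the new table; a single newAppend() for all files would
also work if that history does not matter.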
