FYI, I was able to do the migration by casting each ManifestFile to GenericManifestFile, resetting its sequence number and snapshot id, and adding the manifests to AppendFiles.
On Mon, Jun 28, 2021 at 3:49 PM Huadong Liu <huadong...@gmail.com> wrote:

> Hi,
>
> I am trying to migrate an Iceberg Hadoop table to a table using the hive
> catalog. Luckily the table is appended only, so there are no delete files.
> It is not clear which APIs were used in a previous post
> <https://lists.apache.org/thread.html/r39f2c773bc06889cb19d7de3729d868fccbafbafcfab1922332a4dc6%40%3Cdev.iceberg.apache.org%3E>.
>
> The list of ManifestFiles in the current snapshot can be obtained with the
> Snapshot allManifests
> <https://iceberg.apache.org/javadoc/0.11.1/org/apache/iceberg/Snapshot.html#allManifests-->
> API. However, they cannot be added to the new table's AppendFiles
> <https://iceberg.apache.org/javadoc/0.11.1/org/apache/iceberg/AppendFiles.html>
> for committing because the snapshot id needs to be blank
> <https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/MergeAppend.java#L55>.
>
> Alternatively, the table snapshots
> <https://iceberg.apache.org/javadoc/0.11.1/org/apache/iceberg/Table.html#snapshots-->
> API can be used to get all snapshots of the table. From there, data files for
> each snapshot can be obtained with addedFiles
> <https://iceberg.apache.org/javadoc/0.11.1/org/apache/iceberg/Snapshot.html#addedFiles-->
> API and then added to AppendFiles of the new table with hive catalog.
>
> I am not sure the latter is correct for the migration. Any input is
> appreciated.
>
> --
> Huadong