Re: migrating Hadoop tables to tables with hive catalog

2021-07-01 Thread Huadong Liu
Thank you all. That saves rewriting all the manifest files, which is a lot. I did the following and it seems to be working fine.
1. Create an Iceberg table using the hive catalog with the table schema, partition spec etc.
2. Copy the hadoop latest vd.metadata.json to the hive table metadata js

Re: migrating Hadoop tables to tables with hive catalog

2021-07-01 Thread Ryan Murray
I had a short proposal here[1] suggesting the same as Russell. I think this is probably a more broadly useful operation, but I don't really know the best place for it to live. I'm happy to finish the proposal if there are some opinions on where in Iceberg it is appropriate to add such functionality.

Re: migrating Hadoop tables to tables with hive catalog

2021-07-01 Thread Russell Spitzer
I think you could probably also do this by just creating a Hive table and then changing the location to point to the most recent hadoop metadata.json file.

> On Jul 1, 2021, at 1:42 AM, Huadong Liu wrote:
>
> FYI, I was able to do the migration by casting ManifestFile to
> GenericManifestFile,
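Russell's suggestion depends on knowing which metadata.json file is the most recent one. The following is only a rough sketch of that lookup, not part of the thread: it assumes the Hadoop table's metadata directory is reachable as a local path and that the files follow the uncompressed v<N>.metadata.json naming. The helper name latest_hadoop_metadata is hypothetical, and the sketch does not touch the Hive catalog itself.

```python
import re
from pathlib import Path
from typing import Optional

# Assumption: metadata files are named v<N>.metadata.json and are not
# compressed (compressed files would carry an extra codec suffix).
_METADATA_RE = re.compile(r"^v(\d+)\.metadata\.json$")


def latest_hadoop_metadata(metadata_dir: str) -> Optional[Path]:
    """Return the highest-versioned v<N>.metadata.json, or None if none exist."""
    best_version = -1
    best_path = None
    for entry in Path(metadata_dir).iterdir():
        match = _METADATA_RE.match(entry.name)
        if match is None:
            continue  # skip manifests, snapshots and other files
        version = int(match.group(1))
        # Compare numerically so that v10 sorts after v9.
        if version > best_version:
            best_version = version
            best_path = entry
    return best_path
```

The numeric comparison matters because a plain string sort would rank v9.metadata.json above v10.metadata.json. For a table on HDFS or an object store the same logic would apply to a directory listing obtained through that file system's client.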

Re: migrating Hadoop tables to tables with hive catalog

2021-06-30 Thread Huadong Liu
FYI, I was able to do the migration by casting ManifestFile to GenericManifestFile, resetting sequence number and snapshot id and adding them to AppendFiles.

On Mon, Jun 28, 2021 at 3:49 PM Huadong Liu wrote:
> Hi,
>
> I am trying to migrate an Iceberg Hadoop table to a table using the hive
> ca

migrating Hadoop tables to tables with hive catalog

2021-06-28 Thread Huadong Liu
Hi, I am trying to migrate an Iceberg Hadoop table to a table using the hive catalog. Luckily the table is append-only, so there are no delete files. It is not clear which APIs were used in a previous post