I think that would be fine, but I want to throw out a quick warning: SparkTableUtil was initially written as a few handy helpers, so it wasn't well designed as an API. It's really useful, so I can understand wanting to extend it. But should we come up with a real API for these conversion tasks instead of updating the hacks?
On Mon, Mar 18, 2019 at 11:11 AM Anton Okolnychyi
<aokolnyc...@apple.com.invalid> wrote:

> Hi,
>
> SparkTableUtil can be helpful for migrating existing Spark tables into
> Iceberg. Right now, SparkTableUtil assumes that the partition information
> is always tracked in Hive metastore.
>
> What about extending SparkTableUtil to handle Spark tables that don’t rely
> on Hive metastore? I have a local prototype that makes use of Spark
> InMemoryFileIndex to infer the partitioning info.
>
> Thanks,
> Anton

--
Ryan Blue
Software Engineer
Netflix
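
For concreteness, here is a minimal sketch of the InMemoryFileIndex approach
Anton describes. This is only an illustration, not his prototype:
InMemoryFileIndex is a Spark-internal class (the Spark 2.4 constructor is
assumed here, and its signature differs across versions), and the table path
and object name below are made up.

import org.apache.hadoop.fs.Path
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.datasources.InMemoryFileIndex

object InferPartitions {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("infer-partitions")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical path-based table laid out as .../date=2019-03-18/hour=00/...
    val rootPath = new Path("file:/tmp/events")

    // InMemoryFileIndex lists the files under rootPath and infers the
    // partition columns from the directory structure; passing None for
    // userSpecifiedSchema lets Spark infer the partition value types too.
    val index = new InMemoryFileIndex(spark, Seq(rootPath), Map.empty, None)

    val spec = index.partitionSpec()
    println(s"Inferred partition columns: ${spec.partitionColumns}")

    // Each PartitionPath pairs a row of partition values with its location:
    // the (values, path) pairs a metastore-free SparkTableUtil would need
    // in order to import the table's partitions into Iceberg.
    spec.partitions.foreach { p =>
      println(s"values=${p.values} path=${p.path}")
    }

    spark.stop()
  }
}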