I think that would be fine, but I want to throw out a quick warning: SparkTableUtil was initially written as a few handy helpers, so it wasn't well designed as an API. It's really useful, so I can understand wanting to extend it. But should we come up with a real API for these conversion tasks instead of updating the hacks?
On Mon, Mar 18, 2019 at 11:11 AM Anton Okolnychyi
<aokolnyc...@apple.com.invalid> wrote:

> Hi,
>
> SparkTableUtil can be helpful for migrating existing Spark tables into
> Iceberg. Right now, SparkTableUtil assumes that the partition information
> is always tracked in Hive metastore.
>
> What about extending SparkTableUtil to handle Spark tables that don’t rely
> on Hive metastore? I have a local prototype that makes use of Spark
> InMemoryFileIndex to infer the partitioning info.
>
> Thanks,
> Anton

--
Ryan Blue
Software Engineer
Netflix
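
For concreteness, here is a minimal sketch of the InMemoryFileIndex approach
Anton describes. This is only an illustration, not his prototype:
InMemoryFileIndex is a Spark-internal class (the Spark 2.4 constructor is
assumed here, and its signature differs across versions), and the table path
and object name below are made up.

import org.apache.hadoop.fs.Path
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.datasources.InMemoryFileIndex

object InferPartitions {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("infer-partitions")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical path-based table laid out as .../date=2019-03-18/hour=00/...
    val rootPath = new Path("file:/tmp/events")

    // InMemoryFileIndex lists the files under rootPath and infers the
    // partition columns from the directory structure; passing None for
    // userSpecifiedSchema lets Spark infer the partition value types too.
    val index = new InMemoryFileIndex(spark, Seq(rootPath), Map.empty, None)

    val spec = index.partitionSpec()
    println(s"Inferred partition columns: ${spec.partitionColumns}")

    // Each PartitionPath pairs a row of partition values with its location:
    // the (values, path) pairs a metastore-free SparkTableUtil would need
    // in order to import the table's partitions into Iceberg.
    spec.partitions.foreach { p =>
      println(s"values=${p.values} path=${p.path}")
    }

    spark.stop()
  }
}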