I think it makes sense that you might want to do this. I'd be happy to review a PR with the new methods. Thanks!
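For reference, here is a rough sketch of what such a helper could look like on top of the existing UpdateSchema API. The class and method names are only illustrative, and it only handles new top-level columns; new nested fields would need a recursive comparison of struct types:

    import org.apache.iceberg.Schema;
    import org.apache.iceberg.Table;
    import org.apache.iceberg.UpdateSchema;
    import org.apache.iceberg.spark.SparkSchemaUtil;
    import org.apache.iceberg.types.Types;
    import org.apache.spark.sql.types.StructType;

    public class SchemaSyncUtil {
      private SchemaSyncUtil() {
      }

      /**
       * Adds any top-level columns that are present in the incoming Spark
       * schema but missing from the table's current schema, then commits
       * the schema change. Nested fields are not handled in this sketch.
       */
      public static void addMissingColumns(Table table, StructType sparkType) {
        // convert the inbound Spark schema to an Iceberg schema
        Schema incoming = SparkSchemaUtil.convert(sparkType);

        UpdateSchema update = table.updateSchema();
        boolean changed = false;

        for (Types.NestedField field : incoming.columns()) {
          if (table.schema().findField(field.name()) == null) {
            // new columns are added as optional, so existing data files stay valid
            update.addColumn(field.name(), field.type());
            changed = true;
          }
        }

        if (changed) {
          update.commit();
        }
      }
    }

This would be called just before committing each new file, so the table schema is always a superset of the file's schema.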
On Fri, Jun 14, 2019 at 8:24 AM Filip <filip....@gmail.com> wrote:

> Hi Iceberg devs,
>
> I was thinking about a general use case where data arrives with a partial
> schema relative to that of a dataset. With that dataset backed by an
> Iceberg table, we need to make sure that before each file is committed we
> at least add the new columns to the Iceberg schema.
>
> I was looking at
> https://github.com/apache/incubator-iceberg/blob/master/spark/src/main/java/org/apache/iceberg/spark/SparkSchemaUtil.java
> and couldn't find a matching utility method.
> Do you think it would make sense to have such a utility method, i.e.
> detect the new (nested) fields in the schema of the inbound file by
> comparing it to the current Iceberg schema and generate the corresponding
> schema commit?
>
> --
> /Filip

--
Ryan Blue
Software Engineer
Netflix