I think it makes sense that you might want to do this. I'd be happy to review a PR with the new methods. Thanks!
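For reference, here is a rough sketch of what such a helper could look like on top of the existing UpdateSchema API. The class and method names are only illustrative, and it only handles new top-level columns; new nested fields would need a recursive comparison of struct types:

    import org.apache.iceberg.Schema;
    import org.apache.iceberg.Table;
    import org.apache.iceberg.UpdateSchema;
    import org.apache.iceberg.spark.SparkSchemaUtil;
    import org.apache.iceberg.types.Types;
    import org.apache.spark.sql.types.StructType;

    public class SchemaSyncUtil {
      private SchemaSyncUtil() {
      }

      /**
       * Adds any top-level columns that are present in the incoming Spark
       * schema but missing from the table's current schema, then commits
       * the schema change. Nested fields are not handled in this sketch.
       */
      public static void addMissingColumns(Table table, StructType sparkType) {
        // convert the inbound Spark schema to an Iceberg schema
        Schema incoming = SparkSchemaUtil.convert(sparkType);

        UpdateSchema update = table.updateSchema();
        boolean changed = false;

        for (Types.NestedField field : incoming.columns()) {
          if (table.schema().findField(field.name()) == null) {
            // new columns are added as optional, so existing data files stay valid
            update.addColumn(field.name(), field.type());
            changed = true;
          }
        }

        if (changed) {
          update.commit();
        }
      }
    }

This would be called just before committing each new file, so the table schema is always a superset of the file's schema.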
On Fri, Jun 14, 2019 at 8:24 AM Filip <filip....@gmail.com> wrote:

> Hi Iceberg devs,
>
> I was thinking about a general use case where data arrives with a partial
> schema relative to that of a dataset. With that dataset backed by an
> Iceberg table, we need to make sure that before each file is committed we
> at least add the new columns to the Iceberg schema.
>
> I was looking at
> https://github.com/apache/incubator-iceberg/blob/master/spark/src/main/java/org/apache/iceberg/spark/SparkSchemaUtil.java
> and couldn't find a matching utility method.
> Do you think it would make sense to have such a utility method, i.e.
> detect the new (nested) fields in the schema of the inbound file by
> comparing it to the current Iceberg schema and generate the corresponding
> schema commit?
>
> --
> /Filip

--
Ryan Blue
Software Engineer
Netflix