Thank you for the details, Ryan. I think my initial question was too vague,
so let me rephrase it with more context.
Say you create an Iceberg table with a particular schema, in which
top-level primitives can be defined as REQUIRED or OPTIONAL. How can one
then evolve the schema with more REQUIRED top-level primitives?

I've put together a small test because it seems that
addColumn(String name, Type type) [1] adds every top-level field to the
schema as optional. Is there any way to add or update fields as required
instead? I couldn't find anything in the API for adding top-level fields
that accommodates required vs. optional, hence my question: was it an
explicit API design choice to have top-level fields implicitly added as
optional?

[1]
https://github.com/Netflix/iceberg/blob/master/core/src/main/java/com/netflix/iceberg/SchemaUpdate.java#L64

The test below fails because all fields are added as optional, not required:

  @Test
  public void testAddRequiredTopLevelPrimitives() {
    Schema schema = new Schema(
        required(1, "id", Types.IntegerType.get()));

    Schema result = new SchemaUpdate(schema, 1)
        .addColumn("binary", Types.BinaryType.get())
        .addColumn("boolean", Types.BooleanType.get())
        .addColumn("date", Types.DateType.get())
        .addColumn("decimal", Types.DecimalType.of(38, 5))
        .addColumn("double", Types.DoubleType.get())
        .addColumn("fixed", Types.FixedType.ofLength(12))
        .addColumn("float", Types.FloatType.get())
        .addColumn("long", Types.LongType.get())
        .addColumn("string", Types.StringType.get())
        .addColumn("time", Types.TimeType.get())
        .addColumn("timestamp", Types.TimestampType.withoutZone())
        .addColumn("timestamptz", Types.TimestampType.withZone())
        .addColumn("uuid", Types.UUIDType.get())
        .apply();

    Schema expected = new Schema(
        required(1, "id", Types.IntegerType.get()),
        required(2, "binary", Types.BinaryType.get()),
        required(3, "boolean", Types.BooleanType.get()),
        required(4, "date", Types.DateType.get()),
        required(5, "decimal", Types.DecimalType.of(38, 5)),
        required(6, "double", Types.DoubleType.get()),
        required(7, "fixed", Types.FixedType.ofLength(12)),
        required(8, "float", Types.FloatType.get()),
        required(9, "long", Types.LongType.get()),
        required(10, "string", Types.StringType.get()),
        required(11, "time", Types.TimeType.get()),
        required(12, "timestamp", Types.TimestampType.withoutZone()),
        required(13, "timestamptz", Types.TimestampType.withZone()),
        required(14, "uuid", Types.UUIDType.get())
    );

    Assert.assertEquals(
        "Should add required top level primitives and assign column IDs",
        expected.asStruct(), result.asStruct());
  }
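
To make the observed behaviour concrete without needing Iceberg on the
classpath, here is a toy model of it. Everything in it is a hypothetical
stand-in (OptionalAddSketch, Field, ToySchemaUpdate are not Iceberg
classes). It only mirrors what the test above suggests: each added column
gets the next ID after the last column ID, and it is always marked optional.
It is a sketch under those assumptions, not a description of the actual
SchemaUpdate implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these classes are hypothetical stand-ins, not Iceberg code.
class OptionalAddSketch {

  // Hypothetical stand-in for a schema field.
  static final class Field {
    final int id;
    final String name;
    final boolean optional;

    Field(int id, String name, boolean optional) {
      this.id = id;
      this.name = name;
      this.optional = optional;
    }

    @Override
    public String toString() {
      return id + ": " + name + " (" + (optional ? "optional" : "required") + ")";
    }
  }

  // Hypothetical stand-in for a schema update that can only add optional fields.
  static final class ToySchemaUpdate {
    private final List<Field> fields;
    private int lastColumnId;

    ToySchemaUpdate(List<Field> existing, int lastColumnId) {
      this.fields = new ArrayList<>(existing);
      this.lastColumnId = lastColumnId;
    }

    // Assigns the next column ID and always marks the new field optional;
    // there is no parameter through which a caller could ask for required.
    ToySchemaUpdate addColumn(String name) {
      lastColumnId += 1;
      fields.add(new Field(lastColumnId, name, true));
      return this;
    }

    // Returns a copy of the resulting field list.
    List<Field> apply() {
      return new ArrayList<>(fields);
    }
  }

  public static void main(String[] args) {
    List<Field> existing = new ArrayList<>();
    existing.add(new Field(1, "id", false));

    List<Field> result = new ToySchemaUpdate(existing, 1)
        .addColumn("binary")
        .addColumn("boolean")
        .apply();

    for (Field field : result) {
      System.out.println(field);
    }
  }
}
```

In this model the existing "id" field stays required while "binary" and
"boolean" come out optional with IDs 2 and 3, which is the shape of result
I see from the real API.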

On Tue, Jan 29, 2019 at 9:43 PM Ryan Blue <rb...@netflix.com.invalid> wrote:

> Hi Filip,
>
> Iceberg can add fields to any struct. You can see the test case here:
>
> https://github.com/apache/incubator-iceberg/blob/master/core/src/test/java/com/netflix/iceberg/TestSchemaUpdate.java#L264-L271
>
> rb
>
> On Tue, Jan 29, 2019 at 11:34 AM filip <filip....@gmail.com> wrote:
>
> > Is it by design that the schema evolution API for adding top-level fields
> > will always create an optional field as per SchemaUpdate code [1]?
> >
> > [1]
> > https://github.com/Netflix/iceberg/blob/master/core/src/main/java/com/netflix/iceberg/SchemaUpdate.java#L102
> >
> > --
> > Filip Bocse
> >
>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>


-- 
Filip Bocse
