Thank you for the details, Ryan, but I think my initial question was too vague, so let me rephrase it with more context. Say you have created an Iceberg table with a particular schema, in which top-level primitives can be defined as REQUIRED or OPTIONAL. How can one then evolve that schema with additional REQUIRED top-level primitives?
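
To make the behaviour in question concrete, here is a toy model in plain Java. It is not Iceberg code: the class and method names (Field, Update, evolve) are made up for illustration, and the only thing it is meant to show is an addColumn with no way to express required versus optional, which is how the real API appears to behave.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model only: these classes are hypothetical stand-ins, not Iceberg's API.
final class SchemaEvolutionSketch {

    /** Hypothetical field: an id, a name, and a required/optional flag. */
    static final class Field {
        final int id;
        final String name;
        final boolean required;

        Field(int id, String name, boolean required) {
            this.id = id;
            this.name = name;
            this.required = required;
        }
    }

    /** Hypothetical updater that mimics the behaviour observed with addColumn. */
    static final class Update {
        private final List<Field> fields;
        private int lastColumnId;

        Update(List<Field> base, int lastColumnId) {
            this.fields = new ArrayList<>(base);
            this.lastColumnId = lastColumnId;
        }

        // No required/optional parameter: every new field comes out optional.
        Update addColumn(String name) {
            lastColumnId += 1;
            fields.add(new Field(lastColumnId, name, false));
            return this;
        }

        List<Field> apply() {
            return Collections.unmodifiableList(new ArrayList<>(fields));
        }
    }

    /** Starts from one required field and adds two more through the updater. */
    static List<Field> evolve() {
        List<Field> base = new ArrayList<>();
        base.add(new Field(1, "id", true));
        return new Update(base, 1).addColumn("long").addColumn("string").apply();
    }

    public static void main(String[] args) {
        for (Field f : evolve()) {
            System.out.println(f.id + " " + f.name + " " + (f.required ? "required" : "optional"));
        }
    }
}
```

In this sketch the caller has no handle for asking that "long" or "string" be required, which is the gap the question is about.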

I've put together a small test because it seems that addColumn(String name, Type type) [1] adds every top-level field to the schema as optional. Is there any way I could add or update them as required fields instead? I couldn't see anything in the API for adding top-level fields that accommodates the required/optional aspect, hence my question: was it an explicit API design choice to have top-level fields implicitly added as optional?

[1] https://github.com/Netflix/iceberg/blob/master/core/src/main/java/com/netflix/iceberg/SchemaUpdate.java#L64

This test fails because all fields are added as optional, not required.

    // Assumes: import static com.netflix.iceberg.types.Types.NestedField.required;
    @Test
    public void testAddRequiredTopLevelPrimitives() {
      Schema schema = new Schema(
          required(1, "id", Types.IntegerType.get()));

      // The last assigned column ID is 1, so new columns get IDs 2 through 14.
      Schema result = new SchemaUpdate(schema, 1)
          .addColumn("binary", Types.BinaryType.get())
          .addColumn("boolean", Types.BooleanType.get())
          .addColumn("date", Types.DateType.get())
          .addColumn("decimal", Types.DecimalType.of(38, 5))
          .addColumn("double", Types.DoubleType.get())
          .addColumn("fixed", Types.FixedType.ofLength(12))
          .addColumn("float", Types.FloatType.get())
          .addColumn("long", Types.LongType.get())
          .addColumn("string", Types.StringType.get())
          .addColumn("time", Types.TimeType.get())
          .addColumn("timestamp", Types.TimestampType.withoutZone())
          .addColumn("timestamptz", Types.TimestampType.withZone())
          .addColumn("uuid", Types.UUIDType.get())
          .apply();

      Schema expected = new Schema(
          required(1, "id", Types.IntegerType.get()),
          required(2, "binary", Types.BinaryType.get()),
          required(3, "boolean", Types.BooleanType.get()),
          required(4, "date", Types.DateType.get()),
          required(5, "decimal", Types.DecimalType.of(38, 5)),
          required(6, "double", Types.DoubleType.get()),
          required(7, "fixed", Types.FixedType.ofLength(12)),
          required(8, "float", Types.FloatType.get()),
          required(9, "long", Types.LongType.get()),
          required(10, "string", Types.StringType.get()),
          required(11, "time", Types.TimeType.get()),
          required(12, "timestamp", Types.TimestampType.withoutZone()),
          required(13, "timestamptz", Types.TimestampType.withZone()),
          required(14, "uuid", Types.UUIDType.get())
      );

      Assert.assertEquals("Should add required top level primitives and assign column IDs",
          expected.asStruct(), result.asStruct());
    }

On Tue, Jan 29, 2019 at 9:43 PM Ryan Blue <rb...@netflix.com.invalid> wrote:

> Hi Filip,
>
> Iceberg can add fields to any struct. You can see the test case here:
>
> https://github.com/apache/incubator-iceberg/blob/master/core/src/test/java/com/netflix/iceberg/TestSchemaUpdate.java#L264-L271
>
> rb
>
> On Tue, Jan 29, 2019 at 11:34 AM filip <filip....@gmail.com> wrote:
>
> > Is it by design that the schema evolution API for adding top-level fields
> > will always create an optional field as per SchemaUpdate code [1]?
> >
> > [1]
> > https://github.com/Netflix/iceberg/blob/master/core/src/main/java/com/netflix/iceberg/SchemaUpdate.java#L102
> >
> > --
> > Filip Bocse
>
> --
> Ryan Blue
> Software Engineer
> Netflix

--
Filip Bocse