Max, Can you use the factory methods in Expressions <https://github.com/apache/iceberg/blob/main/api/src/main/java/org/apache/iceberg/expressions/Expressions.java#L304-L318> rather than changing visibility?
Also, I don’t think that making SchemaUpdate public is a good idea. It has a public interface that you can use and I don’t agree with exposing internal implementation classes just to write tests. I would expect test cases to be specific, much like how SchemaUpdate is itself tested. Ryan On Mon, May 19, 2025 at 9:41 AM Maximilian Michels <m...@apache.org> wrote: > Hi, > > The Flink Dynamic Iceberg Sink requires a few iceberg-core class > visibility changes, which I'd like to get your feedback on. > > Here is the diff: > > https://github.com/apache/iceberg/pull/13032/commits/1804b0ac4ff97c3c943463725e91a1e24b0f8c44 > > There are three visibility changes: > > Change 1: Make NamedReference constructor public > Change 2: Make UnboundTransform constructor public > Change 3: Make SchemaUpdate class and constructor public > > Change (1) and (2) both fall under the same feature, which is > rebinding PartitionSpec transforms. > > Background: The Dynamic Iceberg Sink can dynamically evolve Schema and > PartitionSpec of a table. The user provides schema / spec in the input > record. However, to ensure the changes are safe, we don't directly > take the user schema / spec, but we rebind it to one of the existing > schemas / specs. We do that by name-matching the fields in the > user-provided schema with the fields in the table schemas. This allows > us to directly use or evolve an existing schema and to preserve the > internal field ids. Name-based schema evolution is also what the > Iceberg Schema API offers. > > Similarly, user-provided PartitionSpecs need to be rewritten for their > transforms to get bound to the correct source id, which is one of the > field ids of the table schema. The UpdatePartitionSpec API works on > Terms which is what we generate with this code and the class > visibility changes: > > https://github.com/apache/iceberg/blob/61fedb11399ee344f9621362521e8a5071e91c05/flink/v2.0/flink/src/main/java/org/apache/iceberg/flink/sink/dynamic/PartitionSpecEvolution.java#L124 > In order to rebind the provided PartitionSpec to the id of the > name-matching field in the table schema, we need the constructor of > NamedReference and UnboundTransform to be public. > > Change (3) is used to test the schema update code. We want to test the > schema evolution code, which uses SchemaUpdate API to make the > changes. Our tests validate that the applied changes yield the > expected new schema. Example test code: > > https://github.com/apache/iceberg/blob/61fedb11399ee344f9621362521e8a5071e91c05/flink/v2.0/flink/src/test/java/org/apache/iceberg/flink/sink/dynamic/TestEvolveSchemaVisitor.java#L111 > > I hope that makes sense. I'm curious to hear what you think. Any concerns? > > Cheers, > Max >