I'm in favor of 1 since previously these inputs would have thrown an exception that wasn't really that helpful.
@Test public void testDowncastoLongToInt() { Schema currentSchema = new Schema(required(1, "aCol", LongType.get())); Schema newSchema = new Schema(required(1, "aCol", IntegerType.get())); assertThatThrownBy(() -> new SchemaUpdate(currentSchema, 1).unionByNameWith(newSchema).apply()); } I think removing states in which the API would fail is good and although we didn't document it exactly this way before, I'm having a hard time thinking of a case in which throwing an exception here would have been preferable to a noop? Does anyone else have strong feelings around this? On Thu, Oct 31, 2024 at 12:40 PM Rocco Varela <rocco.var...@gmail.com> wrote: > Hi everyone, > > Apologize if this is landing twice, my first attempt got lost somewhere in > transit :) > > I have a PR that attempts to address > https://github.com/apache/iceberg/issues/4849, basically adding logic to > ignore downcasting column types when "mergeSchema" is set when an existing > column type is long and the new schema has an int type for the same column. > > My solution involves updates to UnionByNameVisitor and this may end up > changing the behavior of our public api in a way that hasn't previously > been documented. > > Questions raised during the review is whether we should do one of the > following: > > 1. Update our docs in UpdateSchema.unionByNameWith > <http://updateschema.unionbynamewith/> and callout something like "We > ignore differences in type if the new type is narrower than the existing > type", or > 2. We add a new api UpdateSchema.unionByNameWith(Schema newSchema, > boolean ignoreTypeNarrowing) > > > Any feedback would be appreciated, thanks for your time. > > Cheers, > > --Rocco >