I'm in favor of 1 since previously these inputs would have thrown an
exception that wasn't really that helpful.

@Test
public void testDowncastoLongToInt() {
  Schema currentSchema = new Schema(required(1, "aCol", LongType.get()));
  Schema newSchema = new Schema(required(1, "aCol", IntegerType.get()));

  assertThatThrownBy(() -> new SchemaUpdate(currentSchema,
1).unionByNameWith(newSchema).apply());
}

I think removing states in which the API would fail is good and although we
didn't document it exactly this way before, I'm having a hard time thinking
of a case in which throwing an exception here would have been preferable to
a noop? Does anyone else have strong feelings around this?


On Thu, Oct 31, 2024 at 12:40 PM Rocco Varela <rocco.var...@gmail.com>
wrote:

> Hi everyone,
>
> Apologize if this is landing twice, my first attempt got lost somewhere in
> transit :)
>
> I have a PR that attempts to address
> https://github.com/apache/iceberg/issues/4849, basically adding logic to
> ignore downcasting column types when "mergeSchema" is set when an existing
> column type is long and the new schema has an int type for the same column.
>
> My solution involves updates to UnionByNameVisitor and this may end up
> changing the behavior of our public api in a way that hasn't previously
> been documented.
>
> Questions raised during the review is whether we should do one of the
> following:
>
>    1. Update our docs in UpdateSchema.unionByNameWith
>    <http://updateschema.unionbynamewith/> and callout something like "We
>    ignore differences in type if the new type is narrower than the existing
>    type", or
>    2. We add a new api UpdateSchema.unionByNameWith(Schema newSchema,
>    boolean ignoreTypeNarrowing)
>
>
> Any feedback would be appreciated, thanks for your time.
>
> Cheers,
>
> --Rocco
>

Reply via email to