Thanks Russell and Fokko. I updated my PR with the suggested updates.
Cheers, --Rocco On Fri, Nov 1, 2024 at 3:01 AM Fokko Driesprong <fo...@apache.org> wrote: > Hey Rocco, > > Thanks for raising this. I don't have any strong feelings about this, and > I agree with Russell that it should not throw an exception. > > I guess there was no strong reason behind how it is today, but it's just > because we leverage the UpdateSchema API, which raises an exception when > doing the downcast. > > Also, on the Python side, it will throw an exception when you take a union > of an int and long, but it is pretty straightforward to loosen that > requirement: https://github.com/apache/iceberg-python/pull/1283 > <https://github.com/apache/iceberg-python/pull/1283> > > Kind regards, > Fokko > > Op do 31 okt 2024 om 19:00 schreef Russell Spitzer < > russell.spit...@gmail.com>: > >> I'm in favor of 1 since previously these inputs would have thrown an >> exception that wasn't really that helpful. >> >> @Test >> public void testDowncastoLongToInt() { >> Schema currentSchema = new Schema(required(1, "aCol", LongType.get())); >> Schema newSchema = new Schema(required(1, "aCol", IntegerType.get())); >> >> assertThatThrownBy(() -> new SchemaUpdate(currentSchema, >> 1).unionByNameWith(newSchema).apply()); >> } >> >> I think removing states in which the API would fail is good and although >> we didn't document it exactly this way before, I'm having a hard time >> thinking of a case in which throwing an exception here would have been >> preferable to a noop? Does anyone else have strong feelings around this? >> >> >> On Thu, Oct 31, 2024 at 12:40 PM Rocco Varela <rocco.var...@gmail.com> >> wrote: >> >>> Hi everyone, >>> >>> Apologize if this is landing twice, my first attempt got lost somewhere >>> in transit :) >>> >>> I have a PR that attempts to address >>> https://github.com/apache/iceberg/issues/4849, basically adding logic >>> to ignore downcasting column types when "mergeSchema" is set when an >>> existing column type is long and the new schema has an int type for the >>> same column. >>> >>> My solution involves updates to UnionByNameVisitor and this may end up >>> changing the behavior of our public api in a way that hasn't previously >>> been documented. >>> >>> Questions raised during the review is whether we should do one of the >>> following: >>> >>> 1. Update our docs in UpdateSchema.unionByNameWith >>> <http://updateschema.unionbynamewith/> and callout something like >>> "We ignore differences in type if the new type is narrower than the >>> existing type", or >>> 2. We add a new api UpdateSchema.unionByNameWith(Schema newSchema, >>> boolean ignoreTypeNarrowing) >>> >>> >>> Any feedback would be appreciated, thanks for your time. >>> >>> Cheers, >>> >>> --Rocco >>> >>