Thanks Russell and Fokko.

I updated my PR with the suggested updates.

Cheers,

--Rocco

On Fri, Nov 1, 2024 at 3:01 AM Fokko Driesprong <fo...@apache.org> wrote:

> Hey Rocco,
>
> Thanks for raising this. I don't have any strong feelings about this, and
> I agree with Russell that it should not throw an exception.
>
> I guess there was no strong reason behind how it is today, but it's just
> because we leverage the UpdateSchema API, which raises an exception when
> doing the downcast.
>
> Also, on the Python side, it will throw an exception when you take a union
> of an int and long, but it is pretty straightforward to loosen that
> requirement: https://github.com/apache/iceberg-python/pull/1283
> <https://github.com/apache/iceberg-python/pull/1283>
>
> Kind regards,
> Fokko
>
> Op do 31 okt 2024 om 19:00 schreef Russell Spitzer <
> russell.spit...@gmail.com>:
>
>> I'm in favor of 1 since previously these inputs would have thrown an
>> exception that wasn't really that helpful.
>>
>> @Test
>> public void testDowncastoLongToInt() {
>>   Schema currentSchema = new Schema(required(1, "aCol", LongType.get()));
>>   Schema newSchema = new Schema(required(1, "aCol", IntegerType.get()));
>>
>>   assertThatThrownBy(() -> new SchemaUpdate(currentSchema, 
>> 1).unionByNameWith(newSchema).apply());
>> }
>>
>> I think removing states in which the API would fail is good and although
>> we didn't document it exactly this way before, I'm having a hard time
>> thinking of a case in which throwing an exception here would have been
>> preferable to a noop? Does anyone else have strong feelings around this?
>>
>>
>> On Thu, Oct 31, 2024 at 12:40 PM Rocco Varela <rocco.var...@gmail.com>
>> wrote:
>>
>>> Hi everyone,
>>>
>>> Apologize if this is landing twice, my first attempt got lost somewhere
>>> in transit :)
>>>
>>> I have a PR that attempts to address
>>> https://github.com/apache/iceberg/issues/4849, basically adding logic
>>> to ignore downcasting column types when "mergeSchema" is set when an
>>> existing column type is long and the new schema has an int type for the
>>> same column.
>>>
>>> My solution involves updates to UnionByNameVisitor and this may end up
>>> changing the behavior of our public api in a way that hasn't previously
>>> been documented.
>>>
>>> Questions raised during the review is whether we should do one of the
>>> following:
>>>
>>>    1. Update our docs in UpdateSchema.unionByNameWith
>>>    <http://updateschema.unionbynamewith/> and callout something like
>>>    "We ignore differences in type if the new type is narrower than the
>>>    existing type", or
>>>    2. We add a new api UpdateSchema.unionByNameWith(Schema newSchema,
>>>    boolean ignoreTypeNarrowing)
>>>
>>>
>>> Any feedback would be appreciated, thanks for your time.
>>>
>>> Cheers,
>>>
>>> --Rocco
>>>
>>

Reply via email to