Re: [DISCUSS] Multi-arg transforms

Russell Spitzer Tue, 25 Mar 2025 12:38:03 -0700

1. This makes sense to me; it was the only one requested in the past. It
should allow "IN" as well.
2. What's the suggestion here? To allow both source-id and source-ids in V3
but error out if the two don't match? I'm trying to determine how the
validation would look in both cases.
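
For concreteness, the "accept both, error on mismatch" option could be
sketched in Python roughly as follows. This is only an illustration, not the
actual PyIceberg or PartitionSpecParser.java code: the function name
resolve_source_ids is hypothetical, and it assumes that the two keys "match"
only when source-ids is a one-element list equal to source-id.

```python
from typing import Any, Dict, List


def resolve_source_ids(field: Dict[str, Any]) -> List[int]:
    """Return the source column IDs for one partition field (hypothetical helper).

    Accepts "source-id" (single argument), "source-ids" (list), or both.
    Assumption: when both keys are present they must agree, meaning
    source-ids == [source-id]; otherwise a ValueError is raised.
    """
    has_single = "source-id" in field
    has_multi = "source-ids" in field

    if not has_single and not has_multi:
        raise ValueError("partition field needs source-id or source-ids")

    if has_multi:
        ids = list(field["source-ids"])
        if not ids:
            raise ValueError("source-ids must not be empty")
        if has_single and ids != [field["source-id"]]:
            raise ValueError(
                f"source-id {field['source-id']} does not match source-ids {ids}"
            )
        return ids

    # Only the single-argument key is present.
    return [field["source-id"]]
```

With a reader like this, the parser would not need to know the table version;
only the writer would have to decide which key(s) to emit.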


On Tue, Mar 25, 2025 at 2:04 PM Fokko Driesprong <fo...@apache.org> wrote:

> Hi everyone,
>
> I wanted to draw your attention to some small changes
> <https://github.com/apache/iceberg/pull/12644> to the multi-arg
> transforms that I bumped into while working on the V3 spec for PyIceberg.
>
>    1. Up for debate. The spec does not point out an actual implementation
>    of transforms that accept multiple arguments. From the existing transforms,
>    the only contender is the bucket transform. Should we include this in the
>    V3 spec? It will only allow you to prune metadata if you do an equality
>    expression on all the fields that are part of the transform.
>    2. Along the way, we've removed something that we did not intend.
>    At first, we allowed writing source-id or source-ids based on the number of
>    arguments. This has been changed to only allow source-ids for V3 in a PR
>    that introduces backward compatibility. I think this makes the JSON
>    parsers/producers more complex than needed (specifically PyIceberg). Also,
>    in Java, we would need to plumb down the table version to the
>    PartitionSpecParser.java. I think it would be great to simplify this.
>
> Please let me know what you think so we can tie up the loose ends for V3.
>
> Kind regards,
> Fokko
>
>
>
>
