1. This makes sense to me; the bucket transform was the only one requested in the past. It should allow "IN" as well.
2. What's the suggestion here? To allow both source-id and source-ids in V3, but error out if the two don't match? I'm trying to determine how the validation would look in both cases.
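To make the question in item 2 concrete, here is a rough sketch of how the "accept both, reject a mismatch" validation could look on the reader side. The function name resolve_source_ids is hypothetical and is not part of PyIceberg or PartitionSpecParser.java. It also assumes that "match" means source-ids is exactly the one-element list [source-id], which is my reading and not something the spec states.

```python
from typing import Any, Dict, List


def resolve_source_ids(field: Dict[str, Any]) -> List[int]:
    """Return the source column IDs for one partition-field JSON object.

    Hypothetical helper for discussion only; not an existing PyIceberg
    or Java API.
    """
    single = field.get("source-id")
    multi = field.get("source-ids")

    if single is None and multi is None:
        raise ValueError("partition field needs source-id or source-ids")

    if multi is None:
        # Only the single-argument form was written.
        return [single]

    if not multi:
        raise ValueError("source-ids must not be empty")

    if single is not None and list(multi) != [single]:
        # Assumption: when both keys are present they must describe the
        # same single source column, otherwise the metadata is ambiguous.
        raise ValueError(
            f"source-id {single} does not match source-ids {list(multi)}"
        )

    return list(multi)
```

With this shape the parser does not need to know the table version: it accepts either key and only fails when the two disagree.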

On Tue, Mar 25, 2025 at 2:04 PM Fokko Driesprong <fo...@apache.org> wrote:

> Hi everyone,
>
> I wanted to get your attention to some small changes
> <https://github.com/apache/iceberg/pull/12644> to the multi-arg
> transforms that I've bumped into while working on the V3 spec for PyIceberg.
>
> 1. Up for debate. The spec does not point out an actual implementation
> of transforms that accept multiple arguments. From the existing transforms,
> the only contender is the bucket transform. Should we include this in the
> V3 spec? It will only allow you to prune metadata if you do an equality
> expression on all the fields that are part of the transform.
> 2. Along the way, we've removed something that we did not intend.
> First we allowed to write source-id and source-ids based on the number of
> arguments. This has been changed to only allow source-ids for V3 in a PR
> that introduces backward compatibility. I think this makes the JSON
> parsers/producers more complex than needed (specifically PyIceberg). Also,
> in Java, we would need to plumb down the table version to the
> PartitionSpecParser.java. I think it would be great to simplify this.
>
> Please let me know what you think so we can tie up the loose ends for V3.
>
> Kind regards,
> Fokko