Hi Fokko,

Sorry for the late reply :)

1. It sounds good to me.
2. I started to work on the core to use only source-ids. The Writer writes only source-ids, whereas the Reader detects whether source-id exists and uses it (for backward compatibility).

Using only source-ids is clearly simpler and more consistent.

Regards
JB

On Tue, Mar 25, 2025 at 8:03 PM Fokko Driesprong <fo...@apache.org> wrote:
>
> Hi everyone,
>
> I wanted to get your attention to some small changes to the multi-arg
> transforms that I've bumped into while working on the V3 spec for PyIceberg.
>
> Up for debate. The spec does not point out an actual implementation of
> transforms that accept multiple arguments. From the existing transforms, the
> only contender is the bucket transform. Should we include this in the V3
> spec? It will only allow you to prune metadata if you do an equality
> expression on all the fields that are part of the transform.
> Along the way, we've removed something that we did not intend. First we
> allowed to write source-id and source-ids based on the number of arguments.
> This has been changed to only allow source-ids for V3 in a PR that introduces
> backward compatibility. I think this makes the JSON parsers/producers more
> complex than needed (specifically PyIceberg). Also, in Java, we would need to
> plumb down the table version to the PartitionSpecParser.java. I think it
> would be great to simplify this.
>
> Please let me know what you think so we can tie up the loose ends for V3.
>
> Kind regards,
> Fokko
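
PS: to make point 2 concrete, here is a rough Python sketch of the behaviour I have in mind. It is only an illustration under my own assumptions, not the actual core or PyIceberg code: the function names write_partition_field and read_source_ids are hypothetical, and the partition field is modelled as a plain dict with the JSON keys discussed in this thread.

```python
from typing import Any, Dict, List


def write_partition_field(
    name: str, transform: str, source_ids: List[int], field_id: int
) -> Dict[str, Any]:
    """Hypothetical writer: always emits `source-ids`, never `source-id`."""
    return {
        "name": name,
        "transform": transform,
        "source-ids": list(source_ids),
        "field-id": field_id,
    }


def read_source_ids(field: Dict[str, Any]) -> List[int]:
    """Hypothetical reader: uses `source-id` if present (backward
    compatibility), otherwise falls back to `source-ids`."""
    if "source-id" in field:
        # Older metadata with a single source column.
        return [field["source-id"]]
    if "source-ids" in field:
        return list(field["source-ids"])
    raise ValueError("partition field has neither source-id nor source-ids")
```

With this shape the writer needs no knowledge of the table version, and only the reader carries the compatibility branch.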