Thanks everyone for participating in the discussion!

I created the voting thread: https://lists.apache.org/thread/tfy96bqmz1bmdxr73x17w3xxj3yzs606
A PR is also ready for review: https://github.com/apache/iceberg/pull/14045

Please vote!
Thanks,
Peter

Péter Váry <peter.vary.apa...@gmail.com> wrote (on Sat, Aug 30, 2025, 7:00):

> Sorry for the late reply, I'm on vacation.
>
> Seems like we missed the 1.10.0 release, so we should aim for 1.11.0
> then.
>
> Renjie Liu's suggestion about mentioning it in the implementation notes
> seems like a good idea.
>
> When I'm back, I will start the vote thread.
>
> Thanks,
> Peter
>
> On Fri, Aug 22, 2025, 21:34 Ryan Blue <rdb...@gmail.com> wrote:
>
>> I agree with removing support for writing row values along with position
>> deletes from the Java implementation writing to v2 tables. I don't think it
>> is used anywhere and is no longer allowed in v3.
>>
>> For other implementations, I doubt it makes sense to support writing row
>> values given that they are no longer allowed in v3 and it also no longer
>> makes sense to read them because nothing creates them in the first place.
>>
>> And for the target to remove this support, it doesn't look like there is
>> anything in the API module so we can remove them after they have been
>> deprecated for a minor release. Maybe we should try to get this in right
>> away so we can remove it in 1.11?
>>
>> On Wed, Aug 20, 2025 at 11:13 AM Russell Spitzer <russell.spit...@gmail.com> wrote:
>>
>>> Ah! That I support. Thanks for clarifying, Peter.
>>>
>>> On Wed, Aug 20, 2025 at 1:08 PM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>>
>>>> Let me clarify the proposal:
>>>>
>>>> The Position Deletes With Row (PDWR) feature, introduced in the V2
>>>> Iceberg specification, has been deprecated in the V3 spec. It remains
>>>> implemented in the current Java version of the Parquet/Avro/ORC writers and
>>>> is accessible via the FileAppenderFactory and FileWriterFactory interfaces.
>>>> The feature is also covered by several unit tests.
>>>>
>>>> Although the Java implementation continues to support V2, the *proposal
>>>> is to drop PDWR support in the Java implementation starting with the
>>>> Iceberg 2.0.0 release.*
>>>>
>>>> To evaluate the impact of removing PDWR, I've submitted a pull request:
>>>> https://github.com/apache/iceberg/pull/13870.
>>>>
>>>> You can see that I had to modify the following unit tests:
>>>> - Test(Avro/ORC/Parquet)DeleteWriters
>>>> - TestFileWriterFactory - extended by Generic/Spark/Flink
>>>> - TestAppenderFactory - extended by Generic/Flink
>>>> - TestWriterMetrics - extended by Spark/Flink
>>>> - TestGenericSortedPosDeleteWriter
>>>> - TestRewriteTablePathsAction.testPositionDeleteWithRow
>>>> - TestPositionDeletesTable
>>>>
>>>> Let me know if you think the feature is still used somewhere.
>>>>
>>>> Thanks,
>>>> Peter
>>>>
>>>>
>>>> Renjie Liu <liurenjie2...@gmail.com> wrote (on Wed, Aug 20, 2025, 4:24):
>>>>
>>>>> I think it would make sense to mention that it's deprecated in the
>>>>> implementation notes. Some libraries such as iceberg-rust are currently
>>>>> working on v2 support, and if we have that statement in the spec we could
>>>>> ignore the support of row data in position deletes.
>>>>>
>>>>> On Wed, Aug 20, 2025 at 12:53 AM Russell Spitzer <russell.spit...@gmail.com> wrote:
>>>>>
>>>>>> Sorry! I meant to say also that I am fully in favor of completely
>>>>>> removing/deprecating this. But since we deprecated Position Deletes in V3
>>>>>> we probably already have this covered?
>>>>>>
>>>>>> On Tue, Aug 19, 2025 at 11:06 AM Fokko Driesprong <fo...@apache.org> wrote:
>>>>>>
>>>>>>> PyIceberg doesn't produce it or use it at the planning phase.
>>>>>>> Curious if there is any library that actually uses this.
>>>>>>>
>>>>>>> I do agree with Russell, and maybe deprecating this at the spec
>>>>>>> level makes more sense.
>>>>>>>
>>>>>>> Kind regards,
>>>>>>> Fokko
>>>>>>>
>>>>>>> On Tue, Aug 19, 2025 at 17:54, Russell Spitzer <russell.spit...@gmail.com> wrote:
>>>>>>>
>>>>>>>> I'm not sure we can deprecate the column in a library version
>>>>>>>> update, but currently it is marked as optional
>>>>>>>> and I don't think the Apache Java Library even has a way of writing
>>>>>>>> or reading them.
>>>>>>>>
>>>>>>>> On Tue, Aug 19, 2025 at 10:15 AM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> During the last community sync (30/07), we discussed the current
>>>>>>>>> state of the File Format API proposal [1] and found that implementing
>>>>>>>>> the writers for Positional Deletes where the actual row data is
>>>>>>>>> provided would complicate things quite a bit.
>>>>>>>>>
>>>>>>>>> The current Iceberg V2 spec [2] defines two types of position
>>>>>>>>> delete files:
>>>>>>>>>
>>>>>>>>> 1. Files that store only the file name and row position.
>>>>>>>>> 2. Files that also store the deleted row data.
>>>>>>>>>
>>>>>>>>> The first type of position delete file is widely used. The second
>>>>>>>>> type is defined in the spec and some tests exist in the Iceberg
>>>>>>>>> codebase, but we're not aware of any actual implementation using the
>>>>>>>>> second type (position delete files with row data). Supporting writing
>>>>>>>>> V2 tables via the new File Format API would be simpler if we dropped
>>>>>>>>> support for this feature.
>>>>>>>>>
>>>>>>>>> I would like to hear of any uses of these delete files.
>>>>>>>>> If we cannot find use cases, then *I propose to deprecate position
>>>>>>>>> delete files with embedded row data starting from Iceberg 2.0.*
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Peter
>>>>>>>>>
>>>>>>>>> [1] - https://lists.apache.org/thread/ovyh52m2b6c1hrg4fhw3rx92bzr793n2
>>>>>>>>> [2] - Position Delete File Spec: https://iceberg.apache.org/spec/?h=delete#position-delete-files