Now that I fully understand the situation, I think option 0 as you've written it is probably the best thing to do, as long as PositionDelete is a class. With hindsight, it probably shouldn't have been a class but an interface from the start, so that our internal code could produce rows which implement PositionDelete rather than PositionDelete objects that wrap rows.
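To illustrate the point, a minimal sketch of the interface-based shape described above (hypothetical names; the real org.apache.iceberg.deletes.PositionDelete is a concrete class):

    // Hypothetical: PositionDelete as an interface instead of a class.
    interface PositionDelete<R> {
      CharSequence path();
      long pos();
      R row();
    }

    // An engine-internal row type could then *be* a position delete,
    // instead of being wrapped in one.
    class EngineDeleteRow implements PositionDelete<EngineDeleteRow> {
      private final CharSequence path;
      private final long pos;

      EngineDeleteRow(CharSequence path, long pos) {
        this.path = path;
        this.pos = pos;
      }

      @Override
      public CharSequence path() {
        return path;
      }

      @Override
      public long pos() {
        return pos;
      }

      @Override
      public EngineDeleteRow row() {
        return this;
      }
    }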
On Fri, Sep 12, 2025 at 8:02 AM Péter Váry <peter.vary.apa...@gmail.com> wrote:

> Let me summarize the state a bit:
>
> The FileFormat interface needs to expose two distinct methods:
>
>    - WriteBuilder<InternalRow>
>    - WriteBuilder<PositionDelete<InternalRow>>
>       - After the PDWR deprecation this will be WriteBuilder<PositionDelete>
>       - After V2 deprecation this will not be needed anymore
>
> Based on the file format methods, the Registry must support four builder types:
>
>    - WriteBuilder<InternalRow>
>    - DataWriteBuilder<InternalRow>
>    - EqualityDeleteWriteBuilder<InternalRow>
>    - PositionDeleteWriteBuilder<InternalRow>
>
> *API Design Considerations*
> There is an argument that the two WriteBuilder methods provided by FileFormat are essentially the same, differing only in the writerFunction. While this is technically correct for current implementations, I believe the API should clearly distinguish between the two writer types to highlight the differences.
>
> *Discussed Approaches*
>
> *0. Two Explicit Methods on FormatModel* (removed based on previous comments, but I personally still prefer this)
>
> WriteBuilder<InternalRow> writeBuilder(OutputFile outputFile);
> WriteBuilder<PositionDelete<InternalRow>> positionDeleteWriteBuilder(OutputFile outputFile);
>
> Pros: Clear separation of responsibilities
>
> *1. One Builder + One Converter*
>
> WriteBuilder<InternalRow> writeBuilder(OutputFile outputFile);
> Function<PositionDelete<D>, D> positionDeleteConverter(Schema schema);
>
> Pros: Keeps the interface compact
> Cons: Requires additional documentation and understanding of why the conversion logic is needed
>
> *2. Single Method with Javadoc Clarification* (most similar to the current approach)
>
> WriteBuilder writeBuilder(OutputFile outputFile);
>
> Pros: Minimalistic
> Cons: Least explicit; relies entirely on documentation
>
> *2/b. Single Builder with Type Parameter* (based on Russell's suggestion)
>
> WriteBuilder writeBuilder(OutputFile outputFile);
> // Usage: builder.build(Class<D> inputType)
>
> Pros: Flexible
> Cons: Relies on documentation to clarify the available input types
>
> *Bonus*
> Options 0 and 1 make it easier to phase out PositionDelete filtering once V2 tables are deprecated.
>
> Thanks,
> Peter
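Reading the summary as code: under option 0 the Registry's four builder types would sit on top of two explicitly typed FormatModel methods, roughly as below (signatures copied from the summary above; the wiring comment is an illustrative assumption):

    // Option 0: the format model exposes two explicitly typed methods.
    interface FormatModel<D> {
      WriteBuilder<D> writeBuilder(OutputFile outputFile);

      WriteBuilder<PositionDelete<D>> positionDeleteWriteBuilder(OutputFile outputFile);
    }

    // The registry's writeBuilder, dataWriteBuilder and
    // equalityDeleteWriteBuilder would all delegate to
    // writeBuilder(outputFile), while its positionDeleteWriteBuilder
    // would delegate to positionDeleteWriteBuilder(outputFile).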
>
> On Thu, Sep 11, 2025 at 6:36 PM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>
>> > Wouldn't PositionDelete<InternalRow> also be an InternalRow in this example? I think that's what I'm confused about.
>>
>> With the *second approach*, the WriteBuilder doesn't need to handle PositionDelete objects directly. The conversion layer takes care of that, so the WriteBuilder only needs to work with InternalRow.
>>
>> With the *first approach*, we shift that responsibility to the WriteBuilder, which then has to support both InternalRow and PositionDelete<InternalRow>.
>>
>> In both cases, the FormatModelRegistry API will still expose the more concrete types (PositionDelete / InternalRow). However, under the *second approach*, the lower-level API only needs to handle InternalRow, simplifying its interface.
>>
>> Thanks,
>> Peter
>>
>> On Thu, Sep 11, 2025 at 5:12 PM Russell Spitzer <russell.spit...@gmail.com> wrote:
>>
>>> Wouldn't PositionDelete<InternalRow> also be an InternalRow in this example? I think that's what I'm confused about.
>>>
>>> On Thu, Sep 11, 2025 at 5:35 AM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>>
>>>> Thanks, Russell, for taking a look at this!
>>>>
>>>> We need to expose four methods on the user-facing API (FormatModelRegistry):
>>>>
>>>>    1. *writeBuilder* – for writing arbitrary files without Iceberg metadata. In the Iceberg codebase, this is exposed via the FlinkAppenderFactory and the GenericAppenderFactory for creating FileAppender<RowData> and FileAppender<Record> only.
>>>>    2. *dataWriteBuilder* – for creating and collecting metadata for Iceberg DataFiles.
>>>>    3. *equalityDeleteWriteBuilder* – for creating and collecting metadata for Iceberg EqualityDeleteFiles.
>>>>    4. *positionDeleteWriteBuilder* – for creating and collecting metadata for Iceberg PositionDeleteFiles.
>>>>
>>>> We'd like to implement all four using a single WriteBuilder created by the FormatModels.
>>>>
>>>> Your suggestion is a good one: it helps formalize the requirements for the build method and also surfaces an important design question:
>>>>
>>>> *Who should be responsible for handling the differences between normal rows (InternalRow) and position deletes (PositionDelete<InternalRow>)?*
>>>>
>>>>    - Should we have a more complex WriteBuilder class that can create both a DataFileAppender and a PositionDeleteAppender?
>>>>    - Or should we push this responsibility to the engine-specific code, where we already have some logic (e.g., pathTransformFunc) needed by each engine to create the PositionDeleteAppender?
>>>>
>>>> Thanks,
>>>> Peter
>>>>
>>>> On Thu, Sep 11, 2025 at 12:11 AM Russell Spitzer <russell.spit...@gmail.com> wrote:
>>>>
>>>>> I'm a little confused here. I think Ryan mentioned this in the comment here: https://github.com/apache/iceberg/pull/12774/files#r2254967177
>>>>>
>>>>> From my understanding there are two options:
>>>>>
>>>>> 1) We are producing FormatModels that take a generic row type D and produce writers that all take D and write files.
>>>>>
>>>>> 2) We are creating Iceberg-model-specific writers that take DataFile, PositionDeleteFile, EqualityDeleteFile, etc., and write files.
>>>>>
>>>>> The PositionDelete converter issue seems to stem from attempting to do both: model 1 (being very generic) and model 2, wanting special code to deal with PositionDeleteFile<R> objects.
>>>>>
>>>>> It looks like the code in #12774 is mostly doing model 1, but we are trying to add in a specific converter for 2?
>>>>>
>>>>> Maybe I'm totally lost here, but I was assuming we would do something a little scala-y like:
>>>>>
>>>>> public <T> FileAppender<T> build(Class<T> type) {
>>>>>   if (type == DataFile.class) return (FileAppender<T>) new DataFileAppender();
>>>>>   if (type == DeleteFile.class) return (FileAppender<T>) new DeleteFileAppender();
>>>>>   // ...
>>>>> }
>>>>>
>>>>> So that we only register a single signature, and if a writer-specific implementation needs to do something special it can? I'm trying to catch back up to speed on this PR so it may help to do a quick summary of the current state and intent. (At least for me)
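Russell's sketch above, filled out so it would compile (still a sketch: DataFileAppender and DeleteFileAppender are placeholders, not existing Iceberg classes):

    // Single registered signature; writer-specific implementations can
    // still specialize on the requested type.
    @SuppressWarnings("unchecked")
    public <T> FileAppender<T> build(Class<T> type) {
      if (type == DataFile.class) {
        return (FileAppender<T>) new DataFileAppender();   // placeholder
      } else if (type == DeleteFile.class) {
        return (FileAppender<T>) new DeleteFileAppender(); // placeholder
      }
      throw new IllegalArgumentException("Unsupported input type: " + type.getName());
    }

The unchecked casts are the price of the single signature: the compiler cannot prove that the appender matches the requested Class token, so correctness rests on the dispatch table.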
>>>>>
>>>>> On Tue, Sep 9, 2025 at 3:42 AM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>>>>
>>>>>> Hi Renjie,
>>>>>> Thanks for taking a look!
>>>>>>
>>>>>> Let me clarify a few points:
>>>>>> - The converter API is only required for writing position delete files for V2 tables.
>>>>>> - Currently, there are no plans to support vectorized writing via the Java API.
>>>>>> - Even if we decide to support vectorized writes, I don't think we would want to implement them for positional deletes, which are deprecated in the new spec.
>>>>>> - Also, once the positional deletes which contain the deleted rows are deprecated (as planned), the conversion of the position deletes with only file name and position would be trivial, even for vectorized writes.
>>>>>>
>>>>>> So from my perspective, the converter method exists purely for backward compatibility, and we intend to remove it as soon as possible. Sacrificing good practices for the sake of a deprecated feature doesn't seem worthwhile to me.
>>>>>>
>>>>>> Thanks,
>>>>>> Peter
>>>>>>
>>>>>> On Mon, Sep 8, 2025 at 12:34 PM Renjie Liu <liurenjie2...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi, Peter:
>>>>>>>
>>>>>>> I would vote for the first approach. In spite of the compromises described, the API is still cleaner. Also, I think there are some problems with the converter API. For example, for vectorized implementations such as Comet, which accept columnar batches rather than rows, the converter method would make things more complicated.
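To make the "trivial conversion" point above concrete: once position deletes carry only a file path and a position, a Spark-side converter shrinks to a two-field copy. A sketch, assuming Spark's catalyst row classes (the surrounding API is still under discussion):

    import java.util.function.Function;
    import org.apache.iceberg.deletes.PositionDelete;
    import org.apache.spark.sql.catalyst.InternalRow;
    import org.apache.spark.sql.catalyst.expressions.GenericInternalRow;
    import org.apache.spark.unsafe.types.UTF8String;

    // Maps a (path, pos) position delete onto a two-column InternalRow.
    Function<PositionDelete<InternalRow>, InternalRow> converter =
        delete ->
            new GenericInternalRow(
                new Object[] {
                  UTF8String.fromString(delete.path().toString()), delete.pos()
                });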
>>>>>>>
>>>>>>> On Sat, Aug 30, 2025 at 2:49 PM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>>>>>>
>>>>>>>> I've initiated a discussion thread regarding the deprecation of Position Deletes containing row data. You can follow it here: https://lists.apache.org/thread/8jw6pb2vq3ghmdqf1yvy8n5n6gg1fq5s
>>>>>>>>
>>>>>>>> We can proceed with the discussion about the native reader/writer deprecation once we have decided on the final API, as the chosen design may influence our approach.
>>>>>>>>
>>>>>>>> Since then, one more question has come up - hopefully the last:
>>>>>>>> *How should we handle Position Delete Writers?*
>>>>>>>> The File Format API should return builders for either rows or PositionDelete objects. Currently the method `WriteBuilder.createWriterFunc(Function<MessageType, ParquetValueWriter<?>>)` defines the accepted input parameters for the writer. Users are responsible for ensuring that the writer function and the return type of `WriteBuilder.build()` are compatible. In the new API, we no longer expose writer functions. We still expose FileContent, since writer configurations vary by content type, but we don't expose the types.
>>>>>>>>
>>>>>>>> There are two proposals for handling types for the WriteBuilders:
>>>>>>>>
>>>>>>>>    1. *Implicit Type Definition via FileContent* - the builder parameter for FileContent would implicitly define the input type for the writer returned by build(), or
>>>>>>>>    2. *Engine level conversion* - Engines would convert PositionDelete objects to their native types.
>>>>>>>>
>>>>>>>> In code:
>>>>>>>>
>>>>>>>> - In the 1st proposal, FormatModel.writeBuilder(OutputFile outputFile) can return anything:
>>>>>>>>
>>>>>>>> WriteBuilder builder =
>>>>>>>>     FormatModelRegistry.writeBuilder(PARQUET, InternalRow.class, outputFile);
>>>>>>>> FileAppender<InternalRow> appender =
>>>>>>>>     builder
>>>>>>>>         .schema(table.schema())
>>>>>>>>         .content(FileContent.DATA)
>>>>>>>>         ....
>>>>>>>>         .build();
>>>>>>>>
>>>>>>>> // Exposed, but FormatModelRegistry.positionDeleteWriteBuilder should be used instead
>>>>>>>> WriteBuilder builder =
>>>>>>>>     FormatModelRegistry.writeBuilder(PARQUET, InternalRow.class, outputFile);
>>>>>>>> FileAppender<PositionDelete<InternalRow>> appender =
>>>>>>>>     builder
>>>>>>>>         .schema(table.schema())
>>>>>>>>         .content(FileContent.POSITION_DELETES)
>>>>>>>>         ....
>>>>>>>>         .build();
>>>>>>>>
>>>>>>>> - In the 2nd proposal, the FormatModel needs another method:
>>>>>>>>
>>>>>>>> Function<PositionDelete<D>, D> positionDeleteConverter(Schema schema);
>>>>>>>>
>>>>>>>> example implementation:
>>>>>>>>
>>>>>>>> return delete -> {
>>>>>>>>   deleteRecord.update(0, UTF8String.fromString(delete.path().toString()));
>>>>>>>>   deleteRecord.update(1, delete.pos());
>>>>>>>>   deleteRecord.update(2, delete.row());
>>>>>>>>   return deleteRecord;
>>>>>>>> };
>>>>>>>>
>>>>>>>> // Content is only used for writer property configuration
>>>>>>>> WriteBuilder<InternalRow> builder = sparkFormatModel.writeBuilder(outputFile);
>>>>>>>> FileAppender<InternalRow> appender =
>>>>>>>>     builder
>>>>>>>>         .schema(table.schema())
>>>>>>>>         .content(FileContent.DATA)
>>>>>>>>         ....
>>>>>>>>         .build();
>>>>>>>>
>>>>>>>> Drawbacks:
>>>>>>>>
>>>>>>>>    - Proposal 1:
>>>>>>>>       - Type checking for the FileAppenders occurs only at runtime, so user errors surface late.
>>>>>>>>       - The File Format specification must clearly specify which builder type corresponds to which file content parameter; generics would offer better clarity.
>>>>>>>>       - Inconsistent patterns between WriteBuilder and ReadBuilder, as the latter can define output types via generics.
>>>>>>>>    - Proposal 2:
>>>>>>>>       - Requires FormatModels to implement a converter method to transform PositionDelete<InternalRow> into InternalRow.
>>>>>>>>
>>>>>>>> Since we deprecated writing position delete files in the V3 spec, this extra method in the 2nd proposal will be deprecated too. As a result, in the long run, we will have a nice, clean API.
>>>>>>>> OTOH, if we accept the compromise described in the 1st proposal, the results of our decision will remain, even when the functions are removed.
>>>>>>>>
>>>>>>>> Looking forward to your thoughts.
>>>>>>>> Thanks, Peter
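To make the first Proposal 1 drawback concrete: with the untyped builder sketched above, nothing prevents a caller from pairing FileContent.POSITION_DELETES with a plain-row appender, and the mismatch only surfaces at write time. Roughly (hypothetical API from the proposal, not merged code):

    // Compiles under proposal 1, but is wrong: the declared content and the
    // appender's row type no longer match.
    WriteBuilder builder =
        FormatModelRegistry.writeBuilder(PARQUET, InternalRow.class, outputFile);
    FileAppender<InternalRow> appender =
        builder
            .schema(table.schema())
            .content(FileContent.POSITION_DELETES) // writer expects PositionDelete<InternalRow>
            .build();
    appender.add(row); // fails only here, at runtime, e.g. with a ClassCastException

With generics (option 0 or proposal 2), the same mistake would be a compile-time error.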
>>>>>>>>
>>>>>>>> On Thu, Aug 14, 2025, 14:12 Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Team,
>>>>>>>>>
>>>>>>>>> During yesterday's community sync, we discussed the current state of the File Format API proposal and identified two key questions that require input from the broader community:
>>>>>>>>>
>>>>>>>>> *1. Dropping support for Position Delete files with Row Data*
>>>>>>>>>
>>>>>>>>> The current Iceberg V2 spec [1] defines two types of position delete files:
>>>>>>>>>
>>>>>>>>>    - Files that store only the file name and row position.
>>>>>>>>>    - Files that also store the deleted row data.
>>>>>>>>>
>>>>>>>>> Although this feature is defined in the spec and some tests exist in the Iceberg codebase, we're not aware of any actual implementation using the second type (with row data). Supporting V2 table writing via the new File Format API would be simpler if we dropped support for this feature.
>>>>>>>>> If you know of any use case or reason to retain support for position deletes with row data, please let us know.
>>>>>>>>>
>>>>>>>>> *2. Deprecating Native File Format Readers/Writers in the API*
>>>>>>>>>
>>>>>>>>> The current API contains format-specific readers/writers for Parquet, Avro, and ORC. With the introduction of the InternalData and File Format APIs, Iceberg users can now write files using:
>>>>>>>>>
>>>>>>>>>    - the InternalData API for metadata files (manifest, manifest list, partition stats);
>>>>>>>>>    - the File Format API for data and delete files.
>>>>>>>>>
>>>>>>>>> I propose we deprecate the original format-specific writers and guide users to the new APIs based on the target file type. If you're aware of any use cases that still require the original format-specific writers, please share them.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Peter
>>>>>>>>>
>>>>>>>>> [1] - Position Delete File Spec: https://iceberg.apache.org/spec/?h=delete#position-delete-files
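The two flavors in question 1, sketched with Iceberg's PositionDelete holder (assuming the org.apache.iceberg.deletes.PositionDelete API; the path, position, and deletedRow below are made-up values):

    import org.apache.iceberg.data.Record;
    import org.apache.iceberg.deletes.PositionDelete;

    PositionDelete<Record> delete = PositionDelete.create();

    // Flavor 1: only the data file path and the deleted row's position.
    delete.set("s3://bucket/table/data/file-a.parquet", 42L, null);

    // Flavor 2: the same, plus the deleted row itself. This is the flavor
    // proposed for removal above.
    delete.set("s3://bucket/table/data/file-a.parquet", 42L, deletedRow);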
>>>>>>>>>
>>>>>>>>> On Tue, Jul 22, 2025 at 4:09 PM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> I also put together a solution where the engine-specific format transformation is separated from the writer, and the engines need to take care of it separately.
>>>>>>>>>> This is somewhat complicated on the implementation side (see RowDataTransformer: https://github.com/apache/iceberg/pull/12298/files#diff-562fa4cc369c908a157f59a9235fd3f389096451e7901686fba37c87b53dee08, and InternalRowTransformer: https://github.com/apache/iceberg/pull/12298/files#diff-546f9dc30e3207d1d2bc0a2722976b55f5a04dcf85a22855e4f400500c317140), but it simplifies the API.
>>>>>>>>>>
>>>>>>>>>> @rdblue: Please check the proposed solution. I think this is what you have suggested.
>>>>>>>>>>
>>>>>>>>>> On Mon, Jun 30, 2025 at 6:42 PM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> During the PR review [1], we began exploring what we could use as an intermediate layer to reduce the need for engines and file formats to implement the full matrix of file format - object model conversions.
>>>>>>>>>>>
>>>>>>>>>>> To support this discussion, I've created and run a set of performance benchmarks and compiled a document outlining the potential benefits and trade-offs [2].
>>>>>>>>>>>
>>>>>>>>>>> Feedback is welcome; feel free to comment on the document, the PR, or directly in this thread.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Peter
>>>>>>>>>>>
>>>>>>>>>>> [1] - PR discussion - https://github.com/apache/iceberg/pull/12774#discussion_r2093626096
>>>>>>>>>>> [2] - File Format and engine object model transformation performance - https://docs.google.com/document/d/1GdA8IowKMtS3QVdm8s-0X-ZRYetcHv2bhQ9mrSd3fd4
>>>>>>>>>>>
>>>>>>>>>>> On Wed, May 7, 2025 at 1:15 PM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>> The proposed API part is reviewed and ready to go. See: https://github.com/apache/iceberg/pull/12774
>>>>>>>>>>>> Thanks to everyone who has reviewed it already!
>>>>>>>>>>>>
>>>>>>>>>>>> Many of you wanted to review, but I know that the time constraints are there for everyone. I still very much would like to hear your voices, so I will not merge the PR this week. Please review it if you can.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Peter
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Apr 16, 2025 at 7:02 AM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Renjie,
>>>>>>>>>>>>> The first PR for the proposed new API is here: https://github.com/apache/iceberg/pull/12774
>>>>>>>>>>>>> Thanks, Peter
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Apr 16, 2025, 05:40 Renjie Liu <liurenjie2...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi, Peter:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for the effort. I totally agree with splitting them into smaller PRs to move forward.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm quite interested in this topic, so please ping me on those split PRs and I'll help to review.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Apr 14, 2025 at 11:22 PM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Peter
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Awesome! Thank you so much!
>>>>>>>>>>>>>>> I will do a new pass.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>>> JB
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Apr 11, 2025 at 3:48 PM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > Hi JB,
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > I separated out the proposed interfaces to a new PR: https://github.com/apache/iceberg/pull/12774.
>>>>>>>>>>>>>>> > Reviewers can check that out if they are only interested in how the new API would look.
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > Thanks,
>>>>>>>>>>>>>>> > Peter
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > On Thu, Apr 10, 2025 at 6:25 PM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> Hi Peter
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> Thanks for the ping about the PR.
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> Maybe, to facilitate the review and move forward faster, we should
>>>>>>>>>>>>>>> >> split the PR into smaller PRs:
>>>>>>>>>>>>>>> >> - one with the interfaces (ReadBuilder, AppenderBuilder, ObjectModel,
>>>>>>>>>>>>>>> >> AppenderBuilder, DataWriterBuilder, ...)
>>>>>>>>>>>>>>> >> - one for each file provider (Parquet, Avro, ORC)
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> Thoughts? I can help on the split if needed.
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> Regards
>>>>>>>>>>>>>>> >> JB
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> On Thu, Apr 10, 2025 at 5:16 AM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>> >> > Since the 1.9.0 release candidate has been created, I would like to resurrect this PR: https://github.com/apache/iceberg/pull/12298 to ensure that we have as long a testing period as possible for it.
>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>> >> > To recap, here is what the PR does after the review rounds:
>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>> >> > Created 3 interface classes which are implemented by the file formats:
>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>> >> > ReadBuilder - Builder for reading data from data files
>>>>>>>>>>>>>>> >> > AppenderBuilder - Builder for writing data to data files
>>>>>>>>>>>>>>> >> > ObjectModel - Provides ReadBuilders and AppenderBuilders for the specific data file format and object model pair
>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>> >> > Updated the Parquet, Avro, and ORC implementations for these interfaces, and deprecated the old reader/writer APIs
>>>>>>>>>>>>>>> >> > Created interface classes which will be used by the actual readers/writers of the data files:
>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>> >> > AppenderBuilder - Builder for writing a file
>>>>>>>>>>>>>>> >> > DataWriterBuilder - Builder for generating a data file
>>>>>>>>>>>>>>> >> > PositionDeleteWriterBuilder - Builder for generating a position delete file
>>>>>>>>>>>>>>> >> > EqualityDeleteWriterBuilder - Builder for generating an equality delete file
>>>>>>>>>>>>>>> >> > No ReadBuilder here - the file format reader builder is reused
>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>> >> > Created a WriterBuilder class which implements the interfaces above (AppenderBuilder/DataWriterBuilder/PositionDeleteWriterBuilder/EqualityDeleteWriterBuilder) based on a provided file format specific AppenderBuilder
>>>>>>>>>>>>>>> >> > Created an ObjectModelRegistry which stores the available ObjectModels, and from which engines and users can request the readers (ReadBuilder) and writers (AppenderBuilder/DataWriterBuilder/PositionDeleteWriterBuilder/EqualityDeleteWriterBuilder)
>>>>>>>>>>>>>>> >> > Created the appropriate ObjectModels:
>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>> >> > GenericObjectModels - for reading and writing Iceberg Records
>>>>>>>>>>>>>>> >> > SparkObjectModels - for reading (vectorized and non-vectorized) and writing Spark InternalRow/ColumnarBatch objects
>>>>>>>>>>>>>>> >> > FlinkObjectModels - for reading and writing Flink RowData objects
>>>>>>>>>>>>>>> >> > An Arrow object model is also registered for vectorized reads of Parquet files into Arrow ColumnarBatch objects
>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>> >> > Updated the production code where the reading and writing happens to use the ObjectModelRegistry and the new reader/writer interfaces to access data files
>>>>>>>>>>>>>>> >> > Kept the testing code intact to ensure that the new API/code is not breaking anything
>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>> >> > The original change was not small, and grew substantially during the review rounds. So if you have questions, or if I can do anything to make the review easier, don't hesitate to ask. I am happy to do anything to move this forward.
>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>> >> > Thanks,
>>>>>>>>>>>>>>> >> > Peter
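A sketch of how an engine-side caller might use the ObjectModelRegistry described in the recap. The method and model names below are assumptions for illustration; the PR (https://github.com/apache/iceberg/pull/12298) defines the actual API:

    // Hypothetical usage; exact names may differ from the PR.
    DataWriterBuilder<RowData> builder =
        ObjectModelRegistry.writerBuilder(
            FileFormat.PARQUET, // file format to write
            "flink",            // registered object model name (assumed)
            outputFile);        // destination OutputFile

    DataWriter<RowData> writer =
        builder
            .schema(table.schema()) // Iceberg schema of the data file
            .spec(table.spec())     // partition spec used for the metadata
            .build();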
>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>> >> > On Wed, Mar 26, 2025 at 2:54 PM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>> >> >> Hi everyone,
>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>> >> >> I have updated the File Format API PR (https://github.com/apache/iceberg/pull/12298) based on the answers and review comments.
>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>> >> >> I would like to merge this only after the 1.9.0 release so we have more time to find and solve any issues before this goes into a release for the users.
>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>> >> >> For this I have updated the deprecation comments accordingly.
>>>>>>>>>>>>>>> >> >> I would like to ask you to review the PR, so we iron out any requested changes and are ready to merge as soon as possible after the 1.9.0 release.
>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>> >> >> Thanks,
>>>>>>>>>>>>>>> >> >> Peter
>>>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>>>> >> >> On Fri, Mar 21, 2025 at 2:32 PM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>> >> >>> Hi Renjie,
>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>> >> >>> > 1. File format filters
>>>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>>>> >> >>> > Do the filters include filter expressions from both the user query and the delete filter?
>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>> >> >>> The current discussion is about the filters from the user query.
>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>> >> >>> About the delete filter:
>>>>>>>>>>>>>>> >> >>> Based on the suggestions on the PR, I have moved the delete filter out from the main API. Created a `SupportsDeleteFilter` interface for it, which would allow pushing the filter down to Parquet vectorized readers in Spark, as this is the only place where we have currently implemented this feature.
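One possible shape for that hook, as a sketch (the actual `SupportsDeleteFilter` interface lives in the PR and may differ):

    // Sketch: an optional mix-in on a reader builder. Builders that can
    // apply delete filters implement it; all others simply don't, and the
    // caller falls back to filtering after the read. F is the engine- or
    // format-specific filter representation.
    public interface SupportsDeleteFilter<F> {
      void deleteFilter(F filter);
    }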
>>>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>>>> >> >>> On Fri, Mar 21, 2025 at 2:11 PM Renjie Liu <liurenjie2...@gmail.com> wrote:
>>>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>>>> >> >>>> Hi, Peter:
>>>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>>>> >> >>>> Thanks for the effort on this.
>>>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>>>> >> >>>> 1. File format filters
>>>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>>>> >> >>>> Do the filters include filter expressions from both the user query and the delete filter?
>>>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>>>> >> >>>> For filters from the user query, I agree with you that we should keep the current behavior.
>>>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>>>> >> >>>> For delete filters associated with data files, at first I thought file format readers should not care about this. But now I realize that maybe we need to also push them to the file reader; this is useful when the `IS_DELETED` metadata column is not necessary and we could use these filters (position deletes, etc.) to further prune data.
>>>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>>>> >> >>>> But anyway, I agree that we could postpone it to a follow-up PR.
>>>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>>>> >> >>>> 2. Batch size configuration
>>>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>>>> >> >>>> I'm leaning toward option 2.
>>>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>>>> >> >>>> 3. Spark configuration
>>>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>>>> >> >>>> I'm leaning towards using different configuration objects.
>>>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>>>> >> >>>> On Thu, Mar 20, 2025 at 10:23 PM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>>>>>>>>>>>>>> >> >>>>>
>>>>>>>>>>>>>>> >> >>>>> Hi Team,
>>>>>>>>>>>>>>> >> >>>>> Thanks everyone for the reviews on https://github.com/apache/iceberg/pull/12298!
>>>>>>>>>>>>>>> >> >>>>> I have addressed most of the comments, but a few questions still remain which might merit a bit wider audience:
>>>>>>>>>>>>>>> >> >>>>>
>>>>>>>>>>>>>>> >> >>>>> We should decide on the expected filtering behavior when the filters are pushed down to the readers. Currently the filters are applied on a best-effort basis by the file format readers. Some readers (Avro) just skip them altogether. There was a suggestion on the PR that we might enforce stricter requirements: the readers either reject part of the filters, or they apply them fully.
>>>>>>>>>>>>>>> >> >>>>> Batch sizes are currently parameters of the reader builders, which can be set for non-vectorized readers too; this could be confusing.
>>>>>>>>>>>>>>> >> >>>>> Currently the Spark batch reader uses different configuration objects for ParquetBatchReadConf and OrcBatchReadConf, as requested by the reviewers of the Comet PR. There was a suggestion on the current PR to use a common configuration instead.
>>>>>>>>>>>>>>> >> >>>>>
>>>>>>>>>>>>>>> >> >>>>> I would be interested in hearing your thoughts about these topics.
>>>>>>>>>>>>>>> >> >>>>>
>>>>>>>>>>>>>>> >> >>>>> My current take:
>>>>>>>>>>>>>>> >> >>>>>
>>>>>>>>>>>>>>> >> >>>>> File format filters: I am leaning towards keeping the current lenient behavior, especially since Bloom filters are not able to do full filtering and are often used as a way to filter out unwanted records. Another option would be to implement secondary filtering inside the file formats themselves, which I think would cause extra complexity and possible code duplication. Whatever the decision here, I would suggest moving this out to a next PR, as the current changeset is big enough as it is.
>>>>>>>>>>>>>>> >> >>>>> Batch size configuration: Currently this is the only property which differs between the batch readers and the non-vectorized readers. I see 3 possible solutions:
>>>>>>>>>>>>>>> >> >>>>>
>>>>>>>>>>>>>>> >> >>>>> Create different builders for vectorized and non-vectorized reads - I don't think the current solution is confusing enough to be worth the extra class
>>>>>>>>>>>>>>> >> >>>>> We could put this into the reader configuration property set - This could work, but it would "hide" a configuration option that is valid for both Parquet and ORC readers
>>>>>>>>>>>>>>> >> >>>>> We could keep things as they are now - I would choose this one, but I don't have a strong opinion here
>>>>>>>>>>>>>>> >> >>>>>
>>>>>>>>>>>>>>> >> >>>>> Spark configuration: TBH, I'm open to both solutions and happy to move in the direction the community decides on.
>>>>>>>>>>>>>>> >> >>>>>
>>>>>>>>>>>>>>> >> >>>>> Thanks,
>>>>>>>>>>>>>>> >> >>>>> Peter
>>>>>>>>>>>>>>> >> >>>>>
>>>>>>>>>>>>>>> >> >>>>> On Fri, Mar 14, 2025 at 4:31 PM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>>>>>>>>>>>>>>> >> >>>>>>
>>>>>>>>>>>>>>> >> >>>>>> Hi Peter
>>>>>>>>>>>>>>> >> >>>>>>
>>>>>>>>>>>>>>> >> >>>>>> Thanks for the update. I will do a new pass on the PR.
>>>>>>>>>>>>>>> >> >>>>>>
>>>>>>>>>>>>>>> >> >>>>>> Regards
>>>>>>>>>>>>>>> >> >>>>>> JB
>>>>>>>>>>>>>>> >> >>>>>>
>>>>>>>>>>>>>>> >> >>>>>> On Thu, Mar 13, 2025 at 1:16 PM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>>>>>>>>>>>>>> >> >>>>>> >
>>>>>>>>>>>>>>> >> >>>>>> > Hi Team,
>>>>>>>>>>>>>>> >> >>>>>> > I have rebased the File Format API proposal (https://github.com/apache/iceberg/pull/12298) to include the new changes needed for the Variant types. I would love to hear your feedback, especially Dan and Ryan, as you were the most active during our discussions. If I can help in any way to make the review easier, please let me know.
>>>>>>>>>>>>>>> >> >>>>>> > Thanks,
>>>>>>>>>>>>>>> >> >>>>>> > Peter
>>>>>>>>>>>>>>> >> >>>>>> >
>>>>>>>>>>>>>>> >> >>>>>> > On Fri, Feb 28, 2025 at 5:50 PM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>>>>>>>>>>>>>> >> >>>>>> >>
>>>>>>>>>>>>>>> >> >>>>>> >> Hi everyone,
>>>>>>>>>>>>>>> >> >>>>>> >> Thanks for all of the actionable, relevant feedback on the PR (https://github.com/apache/iceberg/pull/12298).
>>>>>>>>>>>>>>> >> >>>>>> >> Updated the code to address most of them. Please check if you agree with the general approach.
>>>>>>>>>>>>>>> >> >>>>>> >> If there is a consensus about the general approach, I can separate the PR into smaller pieces so we have an easier time reviewing and merging them step-by-step.
>>>>>>>>>>>>>>> >> >>>>>> >> Thanks,
>>>>>>>>>>>>>>> >> >>>>>> >> Peter
>>>>>>>>>>>>>>> >> >>>>>> >>
>>>>>>>>>>>>>>> >> >>>>>> >> On Thu, Feb 20, 2025 at 2:14 PM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>>>>>>>>>>>>>>> >> >>>>>> >>>
>>>>>>>>>>>>>>> >> >>>>>> >>> Hi Peter
>>>>>>>>>>>>>>> >> >>>>>> >>>
>>>>>>>>>>>>>>> >> >>>>>> >>> sorry for the late reply on this.
>>>>>>>>>>>>>>> >> >>>>>> >>>
>>>>>>>>>>>>>>> >> >>>>>> >>> I did a pass on the proposal; it's very interesting and well written.
>>>>>>>>>>>>>>> >> >>>>>> >>> I like the DataFile API, and it is definitely worth discussing all together.
>>>>>>>>>>>>>>> >> >>>>>> >>>
>>>>>>>>>>>>>>> >> >>>>>> >>> Maybe we can schedule a specific meeting to discuss the DataFile API?
>>>>>>>>>>>>>>> >> >>>>>> >>>
>>>>>>>>>>>>>>> >> >>>>>> >>> Thoughts?
>>>>>>>>>>>>>>> >> >>>>>> >>>
>>>>>>>>>>>>>>> >> >>>>>> >>> Regards
>>>>>>>>>>>>>>> >> >>>>>> >>> JB
>>>>>>>>>>>>>>> >> >>>>>> >>>
>>>>>>>>>>>>>>> >> >>>>>> >>> On Tue, Feb 11, 2025 at 5:46 PM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>>>>>>>>>>>>>> >> >>>>>> >>> >
>>>>>>>>>>>>>>> >> >>>>>> >>> > Hi Team,
>>>>>>>>>>>>>>> >> >>>>>> >>> >
>>>>>>>>>>>>>>> >> >>>>>> >>> > As mentioned earlier at our Community Sync, I am exploring the possibility of defining a FileFormat API for accessing different file formats. I have put together a proposal based on my findings.
>>>>>>>>>>>>>>> >> >>>>>> >>> >
>>>>>>>>>>>>>>> >> >>>>>> >>> > -------------------
>>>>>>>>>>>>>>> >> >>>>>> >>> > Iceberg currently supports 3 different file formats: Avro, Parquet, and ORC. With the introduction of the Iceberg V3 specification, many new features are being added to Iceberg. Some of these features, like new column types and default values, require changes at the file format level. The changes are added by individual developers with different focus on the different file formats. As a result, not all of the features are available for every supported file format.
>>>>>>>>>>>>>>> >> >>>>>> >>> > Also, there are emerging file formats like Vortex [1] or Lance [2] which, either by specialization or by applying newer research results, could provide better alternatives for certain use-cases, like random access to data or storing ML models.
>>>>>>>>>>>>>>> >> >>>>>> >>> > -------------------
>>>>>>>>>>>>>>> >> >>>>>> >>> >
>>>>>>>>>>>>>>> >> >>>>>> >>> > Please check the detailed proposal [3] and the google document [4], and comment there or reply on the dev list if you have any suggestions.
>>>>>>>>>>>>>>> >> >>>>>> >>> >
>>>>>>>>>>>>>>> >> >>>>>> >>> > Thanks,
>>>>>>>>>>>>>>> >> >>>>>> >>> > Peter
>>>>>>>>>>>>>>> >> >>>>>> >>> >
>>>>>>>>>>>>>>> >> >>>>>> >>> > [1] - https://github.com/spiraldb/vortex
>>>>>>>>>>>>>>> >> >>>>>> >>> > [2] - https://lancedb.github.io/lance/
>>>>>>>>>>>>>>> >> >>>>>> >>> > [3] - https://github.com/apache/iceberg/issues/12225
>>>>>>>>>>>>>>> >> >>>>>> >>> > [4] - https://docs.google.com/document/d/1sF_d4tFxJsZWsZFCyCL9ZE7YuI7-P3VrzMLIrrTIxds