Thanks, Russell, for taking a look at this!

We need to expose four methods on the user-facing API (FormatModelRegistry):

   1. *writeBuilder* – for writing arbitrary files without Iceberg
   metadata. In the Iceberg codebase, this is exposed via the
   FlinkAppenderFactory and the GenericAppenderFactory for creating
   FileAppender<RowData> and FileAppender<Record> only.
   2. *dataWriteBuilder* – for creating and collecting metadata for Iceberg
   DataFiles.
   3. *equalityDeleteWriteBuilder* – for creating and collecting metadata
   for Iceberg EqualityDeleteFiles.
   4. *positionDeleteWriteBuilder* – for creating and collecting metadata
   for Iceberg PositionDeleteFiles.

We’d like to implement all four using a single WriteBuilder created by the
FormatModels.
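
For reference, here is a rough sketch of how that user-facing surface could
look. The method names follow the PR, but the generics and the WriteBuilder
contract below are illustrative assumptions, not the final signatures:

    import org.apache.iceberg.FileFormat;
    import org.apache.iceberg.Schema;
    import org.apache.iceberg.deletes.EqualityDeleteWriter;
    import org.apache.iceberg.deletes.PositionDeleteWriter;
    import org.apache.iceberg.io.DataWriter;
    import org.apache.iceberg.io.FileAppender;
    import org.apache.iceberg.io.OutputFile;

    // Illustrative only - the real API is the one proposed in the PR.
    public interface FormatModelRegistrySketch {

      // Minimal builder contract shared by the four entry points (assumed shape)
      interface WriteBuilder<W> {
        WriteBuilder<W> schema(Schema schema);
        W build();
      }

      // 1. writeBuilder - arbitrary files, no Iceberg metadata collected
      <D> WriteBuilder<FileAppender<D>> writeBuilder(
          FileFormat format, Class<D> rowType, OutputFile file);

      // 2. dataWriteBuilder - writes a data file and collects DataFile metadata
      <D> WriteBuilder<DataWriter<D>> dataWriteBuilder(
          FileFormat format, Class<D> rowType, OutputFile file);

      // 3. equalityDeleteWriteBuilder - collects metadata for equality delete files
      <D> WriteBuilder<EqualityDeleteWriter<D>> equalityDeleteWriteBuilder(
          FileFormat format, Class<D> rowType, OutputFile file);

      // 4. positionDeleteWriteBuilder - collects metadata for position delete files
      <D> WriteBuilder<PositionDeleteWriter<D>> positionDeleteWriteBuilder(
          FileFormat format, Class<D> rowType, OutputFile file);
    }

The intent is that all four entry points delegate to the single WriteBuilder
produced by the FormatModel, with the registry layering the metadata
collection on top.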

Your suggestion is a good one—it helps formalize the requirements for the
build method and also surfaces an important design question:

*Who should be responsible for handling the differences between normal rows
(InternalRow) and position deletes (PositionDelete<InternalRow>)*?

   - Should we have a more complex WriteBuilder class that can create both
   a DataFileAppender and a PositionDeleteAppender?
   - Or should we push this responsibility to the engine-specific code,
   where we already have some logic (e.g., pathTransformFunc) needed by
   each engine to create the PositionDeleteAppender? A rough sketch of this
   option follows below.
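
For the second option, here is a minimal sketch of what the engine-side
conversion could look like in Spark. The helper name and the
GenericInternalRow layout are assumptions for illustration; the real code
would live next to pathTransformFunc in the engine module:

    import java.util.function.Function;

    import org.apache.iceberg.deletes.PositionDelete;
    import org.apache.spark.sql.catalyst.InternalRow;
    import org.apache.spark.sql.catalyst.expressions.GenericInternalRow;
    import org.apache.spark.unsafe.types.UTF8String;

    // Sketch: the engine turns PositionDelete<InternalRow> into its native row type
    // before handing records to a plain FileAppender<InternalRow>.
    class SparkPositionDeleteConverter {
      static Function<PositionDelete<InternalRow>, InternalRow> converter() {
        return delete -> {
          // Position delete layout: file_path, pos, row (row stays null unless rows are stored)
          GenericInternalRow out = new GenericInternalRow(3);
          out.update(0, UTF8String.fromString(delete.path().toString()));
          out.update(1, delete.pos());
          out.update(2, delete.row());
          return out;
        };
      }
    }

With this, the WriteBuilder only ever sees InternalRow, and the
PositionDelete handling stays entirely in the engine-specific code.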

Thanks,
Peter


Russell Spitzer <russell.spit...@gmail.com> wrote (on Thu, Sep 11, 2025, 0:11):

> I'm a little confused here, I think Ryan mentioned this in the comment
> here https://github.com/apache/iceberg/pull/12774/files#r2254967177
>
> From my understanding there are two options?
>
> 1) We either are producing FormatModels that take a generic row type D and
> produce writers that all take D and write files.
>
> 2) we are creating IcebergModel specific writers that take DataFile,
> PositionDeleteFile, EqualityDeleteFile etc ... and write files
>
> The PositionDelete Converter issue seems to stem from attempting to do
> both model 1 (being very generic) and 2, wanting special code to deal with
> PositionDeleteFile<R> objects.
>
> It looks like the code in #12774 is mostly doing model 1, but we are
> trying to add in a specific converter for 2?
>
> Maybe i'm totally lost here but I was assuming we would do something a
> little scala-y like
>
> public <T> FileAppender<T> build(Class<T> type) {
>   if (type == DataFile.class) return (FileAppender<T>) new DataFileAppender();
>   if (type == DeleteFile.class) return (FileAppender<T>) new DeleteFileAppender();
>   // ...
> }
>
>
> So that we only register a single signature and if writer specific
> implementation needs to do something special it can? I'm trying to catch
> back up to speed on this PR so it may help to do a quick summary of the
> current state and intent. (At least for me)
>
> On Tue, Sep 9, 2025 at 3:42 AM Péter Váry <peter.vary.apa...@gmail.com>
> wrote:
>
>> Hi Renjie,
>> Thanks for taking a look!
>>
>> Let me clarify a few points:
>> - The converter API is only required for writing position delete files
>> for V2 tables
>> - Currently, there are no plans to support vectorized writing via the
>> java API
>> - Even if we decide to support vectorized writes, I don't think we would
>> like to implement it for Positional Deletes, which are deprecated in the
>> new spec.
>> - Also, once the positional deletes - which contain the deleted rows -
>> are deprecated (as planned), the conversion of the Position Deletes with
>> only file name and position would be trivial, even for the vectorized
>> writes.
>>
>> So from my perspective, the converter method exists purely for backward
>> compatibility, and we intend to remove it as soon as possible. Sacrificing
>> good practices for the sake of a deprecated feature doesn’t seem worthwhile
>> to me.
>>
>> Thanks,
>> Peter
>>
>> Renjie Liu <liurenjie2...@gmail.com> wrote (on Mon, Sep 8, 2025, 12:34):
>>
>>> Hi, Peter:
>>>
>>> I would vote for the first approach. In spite of the compromises
>>> described, the api is still cleaner. Also I think there are some problems
>>> with the converter api. For example, for vectorized implementations such as
>>> comet which accepts columnar batch rather than rows, the converter method
>>> would make things more complicated.
>>>
>>> On Sat, Aug 30, 2025 at 2:49 PM Péter Váry <peter.vary.apa...@gmail.com>
>>> wrote:
>>>
>>>> I’ve initiated a discussion thread regarding the deprecation of
>>>> Position Deletes containing row data. You can follow it here:
>>>> https://lists.apache.org/thread/8jw6pb2vq3ghmdqf1yvy8n5n6gg1fq5s
>>>>
>>>> We can proceed with the discussion about the native reader/writer
>>>> deprecation once we have decided on the final API, as the chosen design may
>>>> influence our approach.
>>>>
>>>> Since then, one more question has come up - hopefully the last:
>>>> *How should we handle Position Delete Writers?*
>>>> The File Format API should return builders for either rows or
>>>> PositionDelete objects. Currently the method
>>>> `WriteBuilder.createWriterFunc(Function<MessageType,
>>>> ParquetValueWriter<?>>)` defines the accepted input parameters for the
>>>> writer. Users are responsible for ensuring that the writer function and the
>>>> return type of the `WriteBuilder.build()` are compatible. In the new API,
>>>> we no longer expose writer functions. We still expose FileContent, since
>>>> writer configurations vary by content type, but we don’t expose the types.
>>>>
>>>> There are two proposals for handling types for the WriteBuilders:
>>>>
>>>>    1. *Implicit Type Definition via FileContent* - the builder
>>>>    parameter for FileContent would implicitly define the input type for the
>>>>    writer returned by build(), or
>>>>    2. *Engine level conversion* - Engines would convert PositionDelete
>>>>    objects to their native types.
>>>>
>>>> In code:
>>>>
>>>>    - In the 1st proposal, the FormatModel.writeBuilder(OutputFile
>>>>    outputFile) can return anything:
>>>>
>>>>     WriteBuilder builder =
>>>>         FormatModelRegistry.writeBuilder(PARQUET, InternalRow.class, outputFile);
>>>>     FileAppender<InternalRow> appender = builder
>>>>         .schema(table.schema())
>>>>         .content(FileContent.DATA)
>>>>         ....
>>>>         .build();
>>>>
>>>>     // Exposed, but FormatModelRegistry.positionDeleteWriteBuilder should be used instead
>>>>     WriteBuilder builder =
>>>>         FormatModelRegistry.writeBuilder(PARQUET, InternalRow.class, outputFile);
>>>>     FileAppender<PositionDelete<InternalRow>> appender = builder
>>>>         .schema(table.schema())
>>>>         .content(FileContent.POSITION_DELETES)
>>>>         ....
>>>>         .build();
>>>>
>>>>
>>>>    - In the 2nd proposal, the FormatModel needs another method:
>>>>
>>>>     Function<PositionDelete<D>, D> positionDeleteConverter(Schema schema);
>>>>
>>>>    Example implementation:
>>>>
>>>>     return delete -> {
>>>>       deleteRecord.update(0, UTF8String.fromString(delete.path().toString()));
>>>>       deleteRecord.update(1, delete.pos());
>>>>       deleteRecord.update(2, delete.row());
>>>>       return deleteRecord;
>>>>     };
>>>>
>>>>     // Content is only used for writer property configuration
>>>>     WriteBuilder<InternalRow> builder = sparkFormatModel.writeBuilder(outputFile);
>>>>     FileAppender<InternalRow> appender = builder
>>>>         .schema(table.schema())
>>>>         .content(FileContent.DATA)
>>>>         ....
>>>>         .build();
>>>>
>>>>
>>>> Drawbacks
>>>>
>>>>    - Proposal 1:
>>>>       - Type checking for the FileAppenders occurs only at runtime, so
>>>>       user errors surface late.
>>>>       - The File Format specification must clearly state which builder
>>>>       type corresponds to which file content parameter—generics would offer
>>>>       better clarity.
>>>>       - Inconsistent patterns between WriteBuilder and ReadBuilder, as
>>>>       the latter can define output types via generics.
>>>>    - Proposal 2:
>>>>       - Requires FormatModels to implement a converter method to
>>>>       transform PositionDelete<InternalRow> into InternalRow.
>>>>
>>>> Since we deprecated writing position delete files in the V3 spec, this
>>>> extra method in the 2nd proposal will be deprecated too. As a result, in
>>>> the long run, we will have a nice, clean API.
>>>> OTOH, if we accept the compromise described in the 1st proposal, the
>>>> consequences of that decision will stay in the API even after the deprecated
>>>> functions are removed.
>>>>
>>>> Looking forward to your thoughts.
>>>> Thanks, Peter
>>>>
>>>> On Thu, Aug 14, 2025, 14:12 Péter Váry <peter.vary.apa...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Team,
>>>>>
>>>>> During yesterday’s community sync, we discussed the current state of
>>>>> the File Format API proposal and identified two key questions that require
>>>>> input from the broader community:
>>>>>
>>>>> *1. Dropping support for Position Delete files with Row Data*
>>>>>
>>>>> The current Iceberg V2 spec [1] defines two types of position delete
>>>>> files:
>>>>>
>>>>>    - Files that store only the file name and row position.
>>>>>    - Files that also store the deleted row data.
>>>>>
>>>>> Although this feature is defined in the spec and some tests exist in
>>>>> the Iceberg codebase, we’re not aware of any actual implementation using
>>>>> the second type (with row data). Supporting V2 table writing via the new
>>>>> File Format API would be simpler if we dropped support for this feature.
>>>>> If you know of any use case or reason to retain support for position
>>>>> deletes with row data, please let us know.
>>>>>
>>>>> *2. Deprecating Native File Format Readers/Writers in the API*
>>>>>
>>>>> The current API contains format-specific readers/writers for Parquet,
>>>>> Avro, and ORC. With the introduction of the InternalData and File Format
>>>>> APIs, Iceberg users can now write files using:
>>>>>
>>>>>    - InternalData API for metadata files (manifest, manifest list,
>>>>>    partition stats).
>>>>>    - File Format API for data and delete files.
>>>>>
>>>>> I propose we deprecate the original format-specific writers and guide
>>>>> users to use the new APIs based on the target file type. If you’re aware 
>>>>> of
>>>>> any use cases that still require the original format-specific writers,
>>>>> please share them.
>>>>>
>>>>> Thanks,
>>>>> Peter
>>>>>
>>>>> [1] - Position Delete File Spec:
>>>>> https://iceberg.apache.org/spec/?h=delete#position-delete-files
>>>>>
>>>>> Péter Váry <peter.vary.apa...@gmail.com> wrote (on Tue, Jul 22, 2025, 16:09):
>>>>>
>>>>>> I also put together a solution where the engine-specific format
>>>>>> transformation is separated from the writer, and the engines need to take
>>>>>> care of it separately.
>>>>>> This is somewhat complicated on the implementation side (see:
>>>>>> [RowDataTransformer](
>>>>>> https://github.com/apache/iceberg/pull/12298/files#diff-562fa4cc369c908a157f59a9235fd3f389096451e7901686fba37c87b53dee08),
>>>>>> and [InternalRowTransformer](
>>>>>> https://github.com/apache/iceberg/pull/12298/files#diff-546f9dc30e3207d1d2bc0a2722976b55f5a04dcf85a22855e4f400500c317140)),
>>>>>> but simplifies the API.
>>>>>>
>>>>>> @rdblue: Please check the proposed solution. I think this is what you
>>>>>> have suggested.
>>>>>>
>>>>>> Péter Váry <peter.vary.apa...@gmail.com> wrote (on Mon, Jun 30, 2025, 18:42):
>>>>>>
>>>>>>> During the PR review [1], we began exploring what we could use as an
>>>>>>> intermediate layer to reduce the need for engines and file formats to
>>>>>>> implement the full matrix of file format - object model conversions.
>>>>>>>
>>>>>>> To support this discussion, I’ve created and run a set of
>>>>>>> performance benchmarks and compiled a document outlining the potential
>>>>>>> benefits and trade-offs [2].
>>>>>>>
>>>>>>> Feedback is welcome, feel free to comment on the document, the PR,
>>>>>>> or directly in this thread.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Peter
>>>>>>>
>>>>>>> [1] - PR discussion -
>>>>>>> https://github.com/apache/iceberg/pull/12774#discussion_r2093626096
>>>>>>> [2] - File Format and engine object model transformation performance
>>>>>>> -
>>>>>>> https://docs.google.com/document/d/1GdA8IowKMtS3QVdm8s-0X-ZRYetcHv2bhQ9mrSd3fd4
>>>>>>>
>>>>>>> Péter Váry <peter.vary.apa...@gmail.com> wrote (on Wed, May 7, 2025, 13:15):
>>>>>>>
>>>>>>>> Hi everyone,
>>>>>>>> The proposed API part is reviewed and ready to go. See:
>>>>>>>> https://github.com/apache/iceberg/pull/12774
>>>>>>>> Thanks to everyone who reviewed it already!
>>>>>>>>
>>>>>>>> Many of you wanted to review, but I know that the time constraints
>>>>>>>> are there for everyone. I would still very much like to hear your voices,
>>>>>>>> so I will not merge the PR this week. Please review it if you can.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Peter
>>>>>>>>
>>>>>>>> Péter Váry <peter.vary.apa...@gmail.com> wrote (on Wed, Apr 16, 2025, 7:02):
>>>>>>>>
>>>>>>>>> Hi Renjie,
>>>>>>>>> The first one for the proposed new API is here:
>>>>>>>>> https://github.com/apache/iceberg/pull/12774
>>>>>>>>> Thanks, Peter
>>>>>>>>>
>>>>>>>>> On Wed, Apr 16, 2025, 05:40 Renjie Liu <liurenjie2...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi, Peter:
>>>>>>>>>>
>>>>>>>>>> Thanks for the effort. I totally agree with splitting them into
>>>>>>>>>> smaller prs to move forward.
>>>>>>>>>>
>>>>>>>>>> I'm quite interested in this topic, and please ping me in those
>>>>>>>>>> splitted prs and I'll help to review.
>>>>>>>>>>
>>>>>>>>>> On Mon, Apr 14, 2025 at 11:22 PM Jean-Baptiste Onofré <
>>>>>>>>>> j...@nanthrax.net> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Peter
>>>>>>>>>>>
>>>>>>>>>>> Awesome ! Thank you so much !
>>>>>>>>>>> I will do a new pass.
>>>>>>>>>>>
>>>>>>>>>>> Regards
>>>>>>>>>>> JB
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Apr 11, 2025 at 3:48 PM Péter Váry <
>>>>>>>>>>> peter.vary.apa...@gmail.com> wrote:
>>>>>>>>>>> >
>>>>>>>>>>> > Hi JB,
>>>>>>>>>>> >
>>>>>>>>>>> > Separated out the proposed interfaces to a new PR:
>>>>>>>>>>> https://github.com/apache/iceberg/pull/12774.
>>>>>>>>>>> > Reviewers can check that out if they are only interested in
>>>>>>>>>>> how the new API would look like.
>>>>>>>>>>> >
>>>>>>>>>>> > Thanks,
>>>>>>>>>>> > Peter
>>>>>>>>>>> >
>>>>>>>>>>> > Jean-Baptiste Onofré <j...@nanthrax.net> wrote (on Thu, Apr 10, 2025, 18:25):
>>>>>>>>>>> >>
>>>>>>>>>>> >> Hi Peter
>>>>>>>>>>> >>
>>>>>>>>>>> >> Thanks for the ping about the PR.
>>>>>>>>>>> >>
>>>>>>>>>>> >> Maybe, to facilitate the review and move forward faster, we
>>>>>>>>>>> should
>>>>>>>>>>> >> split the PR in smaller PRs:
>>>>>>>>>>> >> - one with the interfaces (ReadBuilder, AppenderBuilder,
>>>>>>>>>>> ObjectModel,
>>>>>>>>>>> >> AppenderBuilder, DataWriterBuilder, ...)
>>>>>>>>>>> >> - one for each file providers (Parquet, Avro, ORC)
>>>>>>>>>>> >>
>>>>>>>>>>> >> Thoughts ? I can help on the split if needed.
>>>>>>>>>>> >>
>>>>>>>>>>> >> Regards
>>>>>>>>>>> >> JB
>>>>>>>>>>> >>
>>>>>>>>>>> >> On Thu, Apr 10, 2025 at 5:16 AM Péter Váry <
>>>>>>>>>>> peter.vary.apa...@gmail.com> wrote:
>>>>>>>>>>> >> >
>>>>>>>>>>> >> > Since the 1.9.0 release candidate has been created, I would
>>>>>>>>>>> like to resurrect this PR:
>>>>>>>>>>> https://github.com/apache/iceberg/pull/12298 to ensure that we
>>>>>>>>>>> have as long a testing period as possible for it.
>>>>>>>>>>> >> >
>>>>>>>>>>> >> > To recap, here is what the PR does after the review rounds:
>>>>>>>>>>> >> >
>>>>>>>>>>> >> > Created 3 interface classes which are implemented by the
>>>>>>>>>>> file formats:
>>>>>>>>>>> >> >
>>>>>>>>>>> >> > ReadBuilder - Builder for reading data from data files
>>>>>>>>>>> >> > AppenderBuilder - Builder for writing data to data files
>>>>>>>>>>> >> > ObjectModel - Providing ReadBuilders, and AppenderBuilders
>>>>>>>>>>> for the specific data file format and object model pair
>>>>>>>>>>> >> >
>>>>>>>>>>> >> > Updated the Parquet, Avro, ORC implementations for these
>>>>>>>>>>> interfaces, and deprecated the old reader/writer APIs
>>>>>>>>>>> >> > Created interface classes which will be used by the actual
>>>>>>>>>>> readers/writers of the data files:
>>>>>>>>>>> >> >
>>>>>>>>>>> >> > AppenderBuilder - Builder for writing a file
>>>>>>>>>>> >> > DataWriterBuilder - Builder for generating a data file
>>>>>>>>>>> >> > PositionDeleteWriterBuilder - Builder for generating a
>>>>>>>>>>> position delete file
>>>>>>>>>>> >> > EqualityDeleteWriterBuilder - Builder for generating an
>>>>>>>>>>> equality delete file
>>>>>>>>>>> >> > No ReadBuilder here - the file format reader builder is
>>>>>>>>>>> reused
>>>>>>>>>>> >> >
>>>>>>>>>>> >> > Created a WriterBuilder class which implements the
>>>>>>>>>>> interfaces above
>>>>>>>>>>> (AppenderBuilder/DataWriterBuilder/PositionDeleteWriterBuilder/EqualityDeleteWriterBuilder)
>>>>>>>>>>> based on a provided file format specific AppenderBuilder
>>>>>>>>>>> >> > Created an ObjectModelRegistry which stores the available
>>>>>>>>>>> ObjectModels, and from which engines and users can request the readers
>>>>>>>>>>> (ReadBuilder) and writers
>>>>>>>>>>> (AppenderBuilder/DataWriterBuilder/PositionDeleteWriterBuilder/EqualityDeleteWriterBuilder).
>>>>>>>>>>> >> > Created the appropriate ObjectModels:
>>>>>>>>>>> >> >
>>>>>>>>>>> >> > GenericObjectModels - for reading and writing Iceberg
>>>>>>>>>>> Records
>>>>>>>>>>> >> > SparkObjectModels - for reading (vectorized and
>>>>>>>>>>> non-vectorized) and writing Spark InternalRow/ColumnarBatch objects
>>>>>>>>>>> >> > FlinkObjectModels - for reading and writing Flink RowData
>>>>>>>>>>> objects
>>>>>>>>>>> >> > An arrow object model is also registered for vectorized
>>>>>>>>>>> reads of Parquet files into Arrow ColumnarBatch objects
>>>>>>>>>>> >> >
>>>>>>>>>>> >> > Updated the production code where the reading and writing
>>>>>>>>>>> happens to use the ObjectModelRegistry and the new reader/writer 
>>>>>>>>>>> interfaces
>>>>>>>>>>> to access data files
>>>>>>>>>>> >> > Kept the testing code intact to ensure that the new
>>>>>>>>>>> API/code is not breaking anything
>>>>>>>>>>> >> >
>>>>>>>>>>> >> > The original change was not small, and grew substantially
>>>>>>>>>>> during the review rounds. So if you have questions, or I can do 
>>>>>>>>>>> anything to
>>>>>>>>>>> make the review easier, don't hesitate to ask. I am happy to do 
>>>>>>>>>>> anything to
>>>>>>>>>>> move this forward.
>>>>>>>>>>> >> >
>>>>>>>>>>> >> > Thanks,
>>>>>>>>>>> >> > Peter
>>>>>>>>>>> >> >
>>>>>>>>>>> >> > Péter Váry <peter.vary.apa...@gmail.com> wrote (on Wed, Mar 26, 2025, 14:54):
>>>>>>>>>>> >> >>
>>>>>>>>>>> >> >> Hi everyone,
>>>>>>>>>>> >> >>
>>>>>>>>>>> >> >> I have updated the File Format API PR (
>>>>>>>>>>> https://github.com/apache/iceberg/pull/12298) based on the
>>>>>>>>>>> answers and review comments.
>>>>>>>>>>> >> >>
>>>>>>>>>>> >> >> I would like to merge this only after the 1.9.0 release so
>>>>>>>>>>> we have more time finding any issues and solving them before this 
>>>>>>>>>>> goes to a
>>>>>>>>>>> release for the users.
>>>>>>>>>>> >> >>
>>>>>>>>>>> >> >> For this I have updated the deprecation comments
>>>>>>>>>>> accordingly.
>>>>>>>>>>> >> >> I would like to ask you to review the PR, so we iron out
>>>>>>>>>>> any possible requested changes and be ready for the merge as soon as
>>>>>>>>>>> possible after the 1.9.0 release.
>>>>>>>>>>> >> >>
>>>>>>>>>>> >> >> Thanks,
>>>>>>>>>>> >> >> Peter
>>>>>>>>>>> >> >>
>>>>>>>>>>> >> >> Péter Váry <peter.vary.apa...@gmail.com> wrote (on Fri, Mar 21, 2025, 14:32):
>>>>>>>>>>> >> >>>
>>>>>>>>>>> >> >>> Hi Renjie,
>>>>>>>>>>> >> >>>
>>>>>>>>>>> >> >>> > 1. File format filters
>>>>>>>>>>> >> >>> >
>>>>>>>>>>> >> >>> > Do the filters include both filter expressions from
>>>>>>>>>>> both user query and delete filter?
>>>>>>>>>>> >> >>>
>>>>>>>>>>> >> >>> The current discussion is about the filters from the user
>>>>>>>>>>> query.
>>>>>>>>>>> >> >>>
>>>>>>>>>>> >> >>> About the delete filter:
>>>>>>>>>>> >> >>> Based on the suggestions on the PR, I have moved the
>>>>>>>>>>> delete filter out of the main API. Created a `SupportsDeleteFilter`
>>>>>>>>>>> interface for it which would allow pushing the filter down to the
>>>>>>>>>>> Parquet vectorized readers in Spark, as this is the only place where
>>>>>>>>>>> we have currently implemented this feature.
>>>>>>>>>>> implemented this feature.
>>>>>>>>>>> >> >>>
>>>>>>>>>>> >> >>>
>>>>>>>>>>> >> >>>> Renjie Liu <liurenjie2...@gmail.com> wrote (on Fri, Mar 21, 2025, 14:11):
>>>>>>>>>>> >> >>>>
>>>>>>>>>>> >> >>>> Hi, Peter:
>>>>>>>>>>> >> >>>>
>>>>>>>>>>> >> >>>> Thanks for the effort on this.
>>>>>>>>>>> >> >>>>
>>>>>>>>>>> >> >>>> 1. File format filters
>>>>>>>>>>> >> >>>>
>>>>>>>>>>> >> >>>> Do the filters include both filter expressions from both
>>>>>>>>>>> user query and delete filter?
>>>>>>>>>>> >> >>>>
>>>>>>>>>>> >> >>>> For filters from user query, I agree with you that we
>>>>>>>>>>> should keep the current behavior.
>>>>>>>>>>> >> >>>>
>>>>>>>>>>> >> >>>> For delete filters associated with data files, at first
>>>>>>>>>>> I thought file format readers should not care about this. But now I
>>>>>>>>>>> realized that maybe we need to also push them to the file reader; this is
>>>>>>>>>>> useful when the `IS_DELETED` metadata column is not necessary and we could
>>>>>>>>>>> use these filters (position deletes, etc.) to further prune data.
>>>>>>>>>>> >> >>>>
>>>>>>>>>>> >> >>>> But anyway, I agree that we could postpone it to a
>>>>>>>>>>> follow-up PR.
>>>>>>>>>>> >> >>>>
>>>>>>>>>>> >> >>>> 2. Batch size configuration
>>>>>>>>>>> >> >>>>
>>>>>>>>>>> >> >>>> I'm leaning toward option 2.
>>>>>>>>>>> >> >>>>
>>>>>>>>>>> >> >>>> 3. Spark configuration
>>>>>>>>>>> >> >>>>
>>>>>>>>>>> >> >>>> I'm leaning towards using different configuration
>>>>>>>>>>> objects.
>>>>>>>>>>> >> >>>>
>>>>>>>>>>> >> >>>>
>>>>>>>>>>> >> >>>>
>>>>>>>>>>> >> >>>> On Thu, Mar 20, 2025 at 10:23 PM Péter Váry <
>>>>>>>>>>> peter.vary.apa...@gmail.com> wrote:
>>>>>>>>>>> >> >>>>>
>>>>>>>>>>> >> >>>>> Hi Team,
>>>>>>>>>>> >> >>>>> Thanks everyone for the reviews on
>>>>>>>>>>> https://github.com/apache/iceberg/pull/12298!
>>>>>>>>>>> >> >>>>> I have addressed most of the comments, but a few questions
>>>>>>>>>>> still remain which might merit a slightly wider audience:
>>>>>>>>>>> >> >>>>>
>>>>>>>>>>> >> >>>>> We should decide on the expected filtering behavior
>>>>>>>>>>> when the filters are pushed down to the readers. Currently the filters
>>>>>>>>>>> are applied on a best-effort basis by the file format readers. Some
>>>>>>>>>>> readers (Avro) just skip them altogether. There was a suggestion on the
>>>>>>>>>>> PR that we might enforce stricter requirements: the readers would either
>>>>>>>>>>> reject part of the filters, or apply them fully.
>>>>>>>>>>> >> >>>>> Batch sizes are currently parameters for the reader
>>>>>>>>>>> builders and can be set for non-vectorized readers too, which could be
>>>>>>>>>>> confusing.
>>>>>>>>>>> >> >>>>> Currently the Spark batch reader uses different
>>>>>>>>>>> configuration objects for ParquetBatchReadConf and OrcBatchReadConf, as
>>>>>>>>>>> requested by the reviewers of the Comet PR. There was a suggestion on the
>>>>>>>>>>> current PR to use a common configuration instead.
>>>>>>>>>>> >> >>>>>
>>>>>>>>>>> >> >>>>> I would be interested in hearing your thoughts about
>>>>>>>>>>> these topics.
>>>>>>>>>>> >> >>>>>
>>>>>>>>>>> >> >>>>> My current take:
>>>>>>>>>>> >> >>>>>
>>>>>>>>>>> >> >>>>> File format filters: I am leaning towards keeping the
>>>>>>>>>>> current lenient behavior, especially since Bloom filters are not able
>>>>>>>>>>> to do full filtering and are often used as a way to filter out unwanted
>>>>>>>>>>> records. Another option would be to implement a secondary filtering
>>>>>>>>>>> inside the file formats themselves, which I think would cause extra
>>>>>>>>>>> complexity and possible code duplication. Whatever the decision here,
>>>>>>>>>>> I would suggest moving this out to a follow-up PR, as the current
>>>>>>>>>>> changeset is big enough as it is.
>>>>>>>>>>> >> >>>>> Batch size configuration: Currently this is the only
>>>>>>>>>>> property which is different in the batch readers and the 
>>>>>>>>>>> non-vectorized
>>>>>>>>>>> readers. I see 3 possible solutions:
>>>>>>>>>>> >> >>>>>
>>>>>>>>>>> >> >>>>> Create different builders for vectorized and
>>>>>>>>>>> non-vectorized reads - I don't think the current solution is
>>>>>>>>>>> confusing enough to be worth the extra class
>>>>>>>>>>> >> >>>>> We could put this into the reader configuration property
>>>>>>>>>>> set - this could work, but it would "hide" a configuration option
>>>>>>>>>>> which is valid for both Parquet and ORC readers
>>>>>>>>>>> >> >>>>> We could keep things as they are now - I would choose this
>>>>>>>>>>> one, but I don't have a strong opinion here
>>>>>>>>>>> >> >>>>>
>>>>>>>>>>> >> >>>>> Spark configuration: TBH, I'm open to both solutions and
>>>>>>>>>>> happy to move in the direction the community decides on
>>>>>>>>>>> >> >>>>>
>>>>>>>>>>> >> >>>>> Thanks,
>>>>>>>>>>> >> >>>>> Peter
>>>>>>>>>>> >> >>>>>
>>>>>>>>>>> >> >>>>>> Jean-Baptiste Onofré <j...@nanthrax.net> wrote (on Fri, Mar 14, 2025, 16:31):
>>>>>>>>>>> >> >>>>>>
>>>>>>>>>>> >> >>>>>> Hi Peter
>>>>>>>>>>> >> >>>>>>
>>>>>>>>>>> >> >>>>>> Thanks for the update. I will do a new pass on the PR.
>>>>>>>>>>> >> >>>>>>
>>>>>>>>>>> >> >>>>>> Regards
>>>>>>>>>>> >> >>>>>> JB
>>>>>>>>>>> >> >>>>>>
>>>>>>>>>>> >> >>>>>> On Thu, Mar 13, 2025 at 1:16 PM Péter Váry <
>>>>>>>>>>> peter.vary.apa...@gmail.com> wrote:
>>>>>>>>>>> >> >>>>>> >
>>>>>>>>>>> >> >>>>>> > Hi Team,
>>>>>>>>>>> >> >>>>>> > I have rebased the File Format API proposal (
>>>>>>>>>>> https://github.com/apache/iceberg/pull/12298) to include the
>>>>>>>>>>> new changes needed for the Variant types. I would love to hear your
>>>>>>>>>>> feedback, especially Dan and Ryan, as you were the most active 
>>>>>>>>>>> during our
>>>>>>>>>>> discussions. If I can help in any way to make the review easier, 
>>>>>>>>>>> please let
>>>>>>>>>>> me know.
>>>>>>>>>>> >> >>>>>> > Thanks,
>>>>>>>>>>> >> >>>>>> > Peter
>>>>>>>>>>> >> >>>>>> >
>>>>>>>>>>> >> >>>>>> > Péter Váry <peter.vary.apa...@gmail.com> wrote (on Fri, Feb 28, 2025, 17:50):
>>>>>>>>>>> >> >>>>>> >>
>>>>>>>>>>> >> >>>>>> >> Hi everyone,
>>>>>>>>>>> >> >>>>>> >> Thanks for all of the actionable, relevant feedback
>>>>>>>>>>> on the PR (https://github.com/apache/iceberg/pull/12298).
>>>>>>>>>>> >> >>>>>> >> Updated the code to address most of them. Please
>>>>>>>>>>> check if you agree with the general approach.
>>>>>>>>>>> >> >>>>>> >> If there is a consensus about the general approach,
>>>>>>>>>>> I could separate the PR out into smaller pieces so we can have an
>>>>>>>>>>> easier time reviewing and merging those step-by-step.
>>>>>>>>>>> >> >>>>>> >> Thanks,
>>>>>>>>>>> >> >>>>>> >> Peter
>>>>>>>>>>> >> >>>>>> >>
>>>>>>>>>>> >> >>>>>> >> Jean-Baptiste Onofré <j...@nanthrax.net> wrote (on Thu, Feb 20, 2025, 14:14):
>>>>>>>>>>> >> >>>>>> >>>
>>>>>>>>>>> >> >>>>>> >>> Hi Peter
>>>>>>>>>>> >> >>>>>> >>>
>>>>>>>>>>> >> >>>>>> >>> sorry for the late reply on this.
>>>>>>>>>>> >> >>>>>> >>>
>>>>>>>>>>> >> >>>>>> >>> I did a pass on the proposal; it's very
>>>>>>>>>>> interesting and well written.
>>>>>>>>>>> >> >>>>>> >>> I like the DataFile API, and it's definitely worth
>>>>>>>>>>> discussing all together.
>>>>>>>>>>> >> >>>>>> >>>
>>>>>>>>>>> >> >>>>>> >>> Maybe we can schedule a specific meeting to
>>>>>>>>>>> discuss the DataFile API?
>>>>>>>>>>> >> >>>>>> >>>
>>>>>>>>>>> >> >>>>>> >>> Thoughts ?
>>>>>>>>>>> >> >>>>>> >>>
>>>>>>>>>>> >> >>>>>> >>> Regards
>>>>>>>>>>> >> >>>>>> >>> JB
>>>>>>>>>>> >> >>>>>> >>>
>>>>>>>>>>> >> >>>>>> >>> On Tue, Feb 11, 2025 at 5:46 PM Péter Váry <
>>>>>>>>>>> peter.vary.apa...@gmail.com> wrote:
>>>>>>>>>>> >> >>>>>> >>> >
>>>>>>>>>>> >> >>>>>> >>> > Hi Team,
>>>>>>>>>>> >> >>>>>> >>> >
>>>>>>>>>>> >> >>>>>> >>> > As mentioned earlier on our Community Sync I am
>>>>>>>>>>> exploring the possibility to define a FileFormat API for accessing
>>>>>>>>>>> different file formats. I have put together a proposal based on my 
>>>>>>>>>>> findings.
>>>>>>>>>>> >> >>>>>> >>> >
>>>>>>>>>>> >> >>>>>> >>> > -------------------
>>>>>>>>>>> >> >>>>>> >>> > Iceberg currently supports 3 different file
>>>>>>>>>>> formats: Avro, Parquet, ORC. With the introduction of Iceberg V3
>>>>>>>>>>> specification many new features are added to Iceberg. Some of these
>>>>>>>>>>> features like new column types, default values require changes at 
>>>>>>>>>>> the file
>>>>>>>>>>> format level. The changes are added by individual developers with 
>>>>>>>>>>> different
>>>>>>>>>>> focus on the different file formats. As a result not all of the 
>>>>>>>>>>> features
>>>>>>>>>>> are available for every supported file format.
>>>>>>>>>>> >> >>>>>> >>> > Also there are emerging file formats like Vortex
>>>>>>>>>>> [1] or Lance [2] which either by specialization, or by applying 
>>>>>>>>>>> newer
>>>>>>>>>>> research results could provide better alternatives for certain 
>>>>>>>>>>> use-cases
>>>>>>>>>>> like random access for data, or storing ML models.
>>>>>>>>>>> >> >>>>>> >>> > -------------------
>>>>>>>>>>> >> >>>>>> >>> >
>>>>>>>>>>> >> >>>>>> >>> > Please check the detailed proposal [3] and the
>>>>>>>>>>> google document [4], and comment there or reply on the dev list if 
>>>>>>>>>>> you have
>>>>>>>>>>> any suggestions.
>>>>>>>>>>> >> >>>>>> >>> >
>>>>>>>>>>> >> >>>>>> >>> > Thanks,
>>>>>>>>>>> >> >>>>>> >>> > Peter
>>>>>>>>>>> >> >>>>>> >>> >
>>>>>>>>>>> >> >>>>>> >>> > [1] - https://github.com/spiraldb/vortex
>>>>>>>>>>> >> >>>>>> >>> > [2] - https://lancedb.github.io/lance/
>>>>>>>>>>> >> >>>>>> >>> > [3] -
>>>>>>>>>>> https://github.com/apache/iceberg/issues/12225
>>>>>>>>>>> >> >>>>>> >>> > [4] -
>>>>>>>>>>> https://docs.google.com/document/d/1sF_d4tFxJsZWsZFCyCL9ZE7YuI7-P3VrzMLIrrTIxds
>>>>>>>>>>> >> >>>>>> >>> >
>>>>>>>>>>>
>>>>>>>>>>
