> Wouldn't PositionDelete<InternalRow> also be an InternalRow in this example? I think that's what I'm confused about.
To answer the question directly: no - a PositionDelete<InternalRow> is not itself an InternalRow; it is a separate wrapper object holding the file path, the position, and optionally the deleted row.

With the *second approach*, the WriteBuilder doesn't need to handle PositionDelete objects directly. The conversion layer takes care of that, so the WriteBuilder only needs to work with InternalRow. With the *first approach*, we shift that responsibility to the WriteBuilder, which then has to support both InternalRow and PositionDelete<InternalRow>. In both cases, the FormatModelRegistry API will still expose the more concrete types (PositionDelete / InternalRow). However, under the *second approach*, the lower-level API only needs to handle InternalRow, which simplifies its interface.
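To make the contrast concrete, here is a rough sketch of the two WriteBuilder shapes (names are illustrative and simplified, not the exact code from the PR):

import java.util.function.Function;
import org.apache.iceberg.FileContent;
import org.apache.iceberg.Schema;
import org.apache.iceberg.deletes.PositionDelete;
import org.apache.iceberg.io.FileAppender;
import org.apache.iceberg.io.OutputFile;

// First approach (sketch): one builder, the row type is implied by the
// FileContent, so the format-level builder has to handle both InternalRow
// and PositionDelete<InternalRow>, and the cast is only checked at runtime.
interface ContentWriteBuilder {
  ContentWriteBuilder content(FileContent content);
  ContentWriteBuilder schema(Schema schema);
  <T> FileAppender<T> build();
}

// Second approach (sketch): the format-level builder only ever sees the
// engine's row type D; an engine-supplied converter flattens
// PositionDelete<D> into D before the rows reach the appender.
interface TypedWriteBuilder<D> {
  TypedWriteBuilder<D> schema(Schema schema);
  FileAppender<D> build();
}

interface FormatModelSketch<D> {
  TypedWriteBuilder<D> writeBuilder(OutputFile outputFile);
  Function<PositionDelete<D>, D> positionDeleteConverter(Schema schema);
}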
Thanks,
Peter

Russell Spitzer <russell.spit...@gmail.com> wrote (on Thu, Sep 11, 2025, 17:12):

> Wouldn't PositionDelete<InternalRow> also be an InternalRow in this example? I think that's what I'm confused about.
>
> On Thu, Sep 11, 2025 at 5:35 AM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>
>> Thanks, Russell, for taking a look at this!
>>
>> We need to expose four methods on the user-facing API (FormatModelRegistry):
>>
>> 1. *writeBuilder* - for writing arbitrary files without Iceberg metadata. In the Iceberg codebase, this is exposed via FlinkAppenderFactory and GenericAppenderFactory for creating FileAppender<RowData> and FileAppender<Record> only.
>> 2. *dataWriteBuilder* - for creating and collecting metadata for Iceberg DataFiles.
>> 3. *equalityDeleteWriteBuilder* - for creating and collecting metadata for Iceberg EqualityDeleteFiles.
>> 4. *positionDeleteWriteBuilder* - for creating and collecting metadata for Iceberg PositionDeleteFiles.
>>
>> We'd like to implement all four using a single WriteBuilder created by the FormatModels.
>>
>> Your suggestion is a good one - it helps formalize the requirements for the build method and also surfaces an important design question:
>>
>> *Who should be responsible for handling the differences between normal rows (InternalRow) and position deletes (PositionDelete<InternalRow>)?*
>>
>> - Should we have a more complex WriteBuilder class that can create both DataFileAppender and PositionDeleteAppender?
>> - Or should we push this responsibility to the engine-specific code, where we already have some logic (e.g., pathTransformFunc) needed by each engine to create the PositionDeleteAppender?
>>
>> Thanks,
>> Peter
>>
>> Russell Spitzer <russell.spit...@gmail.com> wrote (on Thu, Sep 11, 2025, 0:11):
>>
>>> I'm a little confused here. I think Ryan mentioned this in the comment here: https://github.com/apache/iceberg/pull/12774/files#r2254967177
>>>
>>> From my understanding there are two options:
>>>
>>> 1) We are producing FormatModels that take a generic row type D and produce writers that all take D and write files.
>>>
>>> 2) We are creating Iceberg-model-specific writers that take DataFile, PositionDeleteFile, EqualityDeleteFile, etc., and write files.
>>>
>>> The PositionDelete converter issue seems to stem from attempting to do both: model 1 (being very generic) and model 2 (wanting special code to deal with PositionDeleteFile<R> objects).
>>>
>>> It looks like the code in #12774 is mostly doing model 1, but we are trying to add in a specific converter for 2?
>>>
>>> Maybe I'm totally lost here, but I was assuming we would do something a little scala-y like:
>>>
>>> public <T> FileAppender<T> build(Class<T> type) {
>>>   if (type == DataFile.class) return (FileAppender<T>) new DataFileAppender();
>>>   if (type == DeleteFile.class) return (FileAppender<T>) new DeleteFileAppender();
>>>   // ...
>>> }
>>>
>>> So that we only register a single signature, and if a writer-specific implementation needs to do something special, it can? I'm trying to catch back up to speed on this PR, so it may help to do a quick summary of the current state and intent. (At least for me.)
>>>
>>> On Tue, Sep 9, 2025 at 3:42 AM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>>
>>>> Hi Renjie,
>>>> Thanks for taking a look!
>>>>
>>>> Let me clarify a few points:
>>>> - The converter API is only required for writing position delete files for V2 tables.
>>>> - Currently, there are no plans to support vectorized writing via the Java API.
>>>> - Even if we decide to support vectorized writes, I don't think we would want to implement them for positional deletes, which are deprecated in the new spec.
>>>> - Also, once the positional deletes which contain the deleted rows are deprecated (as planned), the conversion of position deletes carrying only the file name and position would be trivial, even for vectorized writes (see the sketch below).
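>>>> For illustration (a sketch only, not code from the PR; the Spark classes are just an example), a path/pos-only conversion collapses to a stateless two-field mapping:
>>>>
>>>> import java.util.function.Function;
>>>> import org.apache.iceberg.deletes.PositionDelete;
>>>> import org.apache.spark.sql.catalyst.InternalRow;
>>>> import org.apache.spark.sql.catalyst.expressions.GenericInternalRow;
>>>> import org.apache.spark.unsafe.types.UTF8String;
>>>>
>>>> // Without the row payload, no engine-specific state is needed.
>>>> Function<PositionDelete<InternalRow>, InternalRow> converter =
>>>>     delete -> new GenericInternalRow(new Object[] {
>>>>         UTF8String.fromString(delete.path().toString()), // file_path
>>>>         delete.pos()                                     // pos
>>>>     });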
>>>> So from my perspective, the converter method exists purely for backward compatibility, and we intend to remove it as soon as possible. Sacrificing good practices for the sake of a deprecated feature doesn't seem worthwhile to me.
>>>>
>>>> Thanks,
>>>> Peter
>>>>
>>>> Renjie Liu <liurenjie2...@gmail.com> wrote (on Mon, Sep 8, 2025, 12:34):
>>>>
>>>>> Hi, Peter:
>>>>>
>>>>> I would vote for the first approach. In spite of the compromises described, the API is still cleaner. Also, I think there are some problems with the converter API. For example, for vectorized implementations such as Comet, which accept columnar batches rather than rows, the converter method would make things more complicated.
>>>>>
>>>>> On Sat, Aug 30, 2025 at 2:49 PM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>>>>
>>>>>> I've initiated a discussion thread regarding the deprecation of Position Deletes containing row data. You can follow it here: https://lists.apache.org/thread/8jw6pb2vq3ghmdqf1yvy8n5n6gg1fq5s
>>>>>>
>>>>>> We can proceed with the discussion about the native reader/writer deprecation once we have decided on the final API, as the chosen design may influence our approach.
>>>>>>
>>>>>> Since then, one more question has come up - hopefully the last:
>>>>>> *How should we handle Position Delete Writers?*
>>>>>> The File Format API should return builders for either rows or PositionDelete objects. Currently the method `WriteBuilder.createWriterFunc(Function<MessageType, ParquetValueWriter<?>>)` defines the accepted input parameters for the writer. Users are responsible for ensuring that the writer function and the return type of `WriteBuilder.build()` are compatible. In the new API, we no longer expose writer functions. We still expose FileContent, since writer configurations vary by content type, but we don't expose the types.
>>>>>>
>>>>>> There are two proposals for handling types for the WriteBuilders:
>>>>>>
>>>>>> 1. *Implicit Type Definition via FileContent* - the builder parameter for FileContent would implicitly define the input type for the writer returned by build(), or
>>>>>> 2. *Engine level conversion* - engines would convert PositionDelete objects to their native types.
>>>>>>
>>>>>> In code:
>>>>>>
>>>>>> - In the 1st proposal, FormatModel.writeBuilder(OutputFile outputFile) can return anything:
>>>>>>
>>>>>> WriteBuilder builder =
>>>>>>     FormatModelRegistry.writeBuilder(PARQUET, InternalRow.class, outputFile);
>>>>>> FileAppender<InternalRow> appender = builder
>>>>>>     .schema(table.schema())
>>>>>>     .content(FileContent.DATA)
>>>>>>     ....
>>>>>>     .build();
>>>>>>
>>>>>> // Exposed, but FormatModelRegistry.positionDeleteWriteBuilder should be used instead
>>>>>> WriteBuilder builder =
>>>>>>     FormatModelRegistry.writeBuilder(PARQUET, InternalRow.class, outputFile);
>>>>>> FileAppender<PositionDelete<InternalRow>> appender = builder
>>>>>>     .schema(table.schema())
>>>>>>     .content(FileContent.POSITION_DELETES)
>>>>>>     ....
>>>>>>     .build();
>>>>>>
>>>>>> - In the 2nd proposal, the FormatModel needs another method:
>>>>>>
>>>>>> Function<PositionDelete<D>, D> positionDeleteConverter(Schema schema);
>>>>>>
>>>>>> Example implementation:
>>>>>>
>>>>>> return delete -> {
>>>>>>   deleteRecord.update(0, UTF8String.fromString(delete.path().toString()));
>>>>>>   deleteRecord.update(1, delete.pos());
>>>>>>   deleteRecord.update(2, delete.row());
>>>>>>   return deleteRecord;
>>>>>> };
>>>>>>
>>>>>> // Content is only used for writer property configuration
>>>>>> WriteBuilder<InternalRow> builder = sparkFormatModel.writeBuilder(outputFile);
>>>>>> FileAppender<InternalRow> appender = builder
>>>>>>     .schema(table.schema())
>>>>>>     .content(FileContent.DATA)
>>>>>>     ....
>>>>>>     .build();
>>>>>>
>>>>>> Drawbacks:
>>>>>>
>>>>>> - Proposal 1:
>>>>>>   - Type checking for the FileAppenders occurs only at runtime, so user errors surface late (see the sketch below).
>>>>>>   - The File Format specification must clearly specify which builder type corresponds to which file content parameter - generics would offer better clarity.
>>>>>>   - Inconsistent patterns between WriteBuilder and ReadBuilder, as the latter can define output types via generics.
>>>>>> - Proposal 2:
>>>>>>   - Requires FormatModels to implement a converter method to transform PositionDelete<InternalRow> into InternalRow.
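>>>>>> To spell out the runtime-type-checking drawback of the 1st proposal (a sketch, reusing the illustrative API from the snippets above):
>>>>>>
>>>>>> // This compiles, because build() yields whatever appender type the
>>>>>> // caller asks for - but with FileContent.POSITION_DELETES the writer
>>>>>> // underneath expects PositionDelete<InternalRow>, so the mismatch
>>>>>> // only surfaces as a runtime exception on the first add() call.
>>>>>> FileAppender<InternalRow> appender =
>>>>>>     FormatModelRegistry.writeBuilder(PARQUET, InternalRow.class, outputFile)
>>>>>>         .schema(table.schema())
>>>>>>         .content(FileContent.POSITION_DELETES)
>>>>>>         .build();
>>>>>> appender.add(internalRow); // fails at runtime, not at compile time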
>>>>>> Since we deprecated writing position delete files in the V3 spec, this extra method in the 2nd proposal will be deprecated too. As a result, in the long run, we will have a nice, clean API.
>>>>>> OTOH, if we accept the compromise described in the 1st proposal, the consequences of our decision will remain in the API even after the deprecated functions are removed.
>>>>>>
>>>>>> Looking forward to your thoughts.
>>>>>> Thanks, Peter
>>>>>>
>>>>>> On Thu, Aug 14, 2025, 14:12 Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Team,
>>>>>>>
>>>>>>> During yesterday's community sync, we discussed the current state of the File Format API proposal and identified two key questions that require input from the broader community:
>>>>>>>
>>>>>>> *1. Dropping support for Position Delete files with Row Data*
>>>>>>>
>>>>>>> The current Iceberg V2 spec [1] defines two types of position delete files:
>>>>>>>
>>>>>>> - Files that store only the file name and row position.
>>>>>>> - Files that also store the deleted row data.
>>>>>>>
>>>>>>> Although this feature is defined in the spec and some tests exist in the Iceberg codebase, we're not aware of any actual implementation using the second type (with row data). Supporting V2 table writing via the new File Format API would be simpler if we dropped support for this feature. If you know of any use case or reason to retain support for position deletes with row data, please let us know.
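>>>>>>> In code terms (a sketch; the file path and the deletedRow variable are illustrative), the only difference between the two types is whether the row payload is populated:
>>>>>>>
>>>>>>> PositionDelete<Record> delete = PositionDelete.create();
>>>>>>> // Type 1: file name + row position only.
>>>>>>> delete.set("s3://bucket/data/file-a.parquet", 42L, null);
>>>>>>> // Type 2: additionally carries the deleted row itself.
>>>>>>> delete.set("s3://bucket/data/file-a.parquet", 42L, deletedRow);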
>>>>>>> *2. Deprecating Native File Format Readers/Writers in the API*
>>>>>>>
>>>>>>> The current API contains format-specific readers/writers for Parquet, Avro, and ORC. With the introduction of the InternalData and File Format APIs, Iceberg users can now write files using:
>>>>>>>
>>>>>>> - the InternalData API for metadata files (manifest, manifest list, partition stats), and
>>>>>>> - the File Format API for data and delete files.
>>>>>>>
>>>>>>> I propose we deprecate the original format-specific writers and guide users to the new APIs based on the target file type. If you're aware of any use cases that still require the original format-specific writers, please share them.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Peter
>>>>>>>
>>>>>>> [1] - Position Delete File Spec: https://iceberg.apache.org/spec/?h=delete#position-delete-files
>>>>>>>
>>>>>>> Péter Váry <peter.vary.apa...@gmail.com> wrote (on Tue, Jul 22, 2025, 16:09):
>>>>>>>
>>>>>>>> I have also put together a solution where the engine-specific format transformation is separated from the writer, and the engines need to take care of it separately.
>>>>>>>> This is somewhat complicated on the implementation side (see RowDataTransformer (https://github.com/apache/iceberg/pull/12298/files#diff-562fa4cc369c908a157f59a9235fd3f389096451e7901686fba37c87b53dee08) and InternalRowTransformer (https://github.com/apache/iceberg/pull/12298/files#diff-546f9dc30e3207d1d2bc0a2722976b55f5a04dcf85a22855e4f400500c317140)), but it simplifies the API.
>>>>>>>>
>>>>>>>> @rdblue: Please check the proposed solution. I think this is what you have suggested.
>>>>>>>>
>>>>>>>> Péter Váry <peter.vary.apa...@gmail.com> wrote (on Mon, Jun 30, 2025, 18:42):
>>>>>>>>
>>>>>>>>> During the PR review [1], we began exploring what we could use as an intermediate layer to reduce the need for engines and file formats to implement the full matrix of file format - object model conversions.
>>>>>>>>>
>>>>>>>>> To support this discussion, I've created and run a set of performance benchmarks and compiled a document outlining the potential benefits and trade-offs [2].
>>>>>>>>>
>>>>>>>>> Feedback is welcome; feel free to comment on the document, the PR, or directly in this thread.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Peter
>>>>>>>>>
>>>>>>>>> [1] - PR discussion - https://github.com/apache/iceberg/pull/12774#discussion_r2093626096
>>>>>>>>> [2] - File Format and engine object model transformation performance - https://docs.google.com/document/d/1GdA8IowKMtS3QVdm8s-0X-ZRYetcHv2bhQ9mrSd3fd4
>>>>>>>>>
>>>>>>>>> Péter Váry <peter.vary.apa...@gmail.com> wrote (on Wed, May 7, 2025, 13:15):
>>>>>>>>>
>>>>>>>>>> Hi everyone,
>>>>>>>>>> The proposed API part is reviewed and ready to go. See: https://github.com/apache/iceberg/pull/12774
>>>>>>>>>> Thanks to everyone who reviewed it already!
>>>>>>>>>>
>>>>>>>>>> Many of you wanted to review, but I know that the time constraints are there for everyone. I still very much would like to hear your voices, so I will not merge the PR this week. Please review it if you can.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Peter
>>>>>>>>>>
>>>>>>>>>> Péter Váry <peter.vary.apa...@gmail.com> wrote (on Wed, Apr 16, 2025, 7:02):
>>>>>>>>>>
>>>>>>>>>>> Hi Renjie,
>>>>>>>>>>> The first one for the proposed new API is here: https://github.com/apache/iceberg/pull/12774
>>>>>>>>>>> Thanks, Peter
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Apr 16, 2025, 05:40 Renjie Liu <liurenjie2...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi, Peter:
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for the effort. I totally agree with splitting them into smaller PRs to move forward.
>>>>>>>>>>>>
>>>>>>>>>>>> I'm quite interested in this topic, so please ping me on those split PRs and I'll help review.
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Apr 14, 2025 at 11:22 PM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Peter
>>>>>>>>>>>>>
>>>>>>>>>>>>> Awesome! Thank you so much!
>>>>>>>>>>>>> I will do a new pass.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards
>>>>>>>>>>>>> JB
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Apr 11, 2025 at 3:48 PM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > Hi JB,
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > Separated out the proposed interfaces to a new PR: https://github.com/apache/iceberg/pull/12774.
>>>>>>>>>>>>> > Reviewers can check that out if they are only interested in how the new API would look.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > Thanks,
>>>>>>>>>>>>> > Peter
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > Jean-Baptiste Onofré <j...@nanthrax.net> wrote (on Thu, Apr 10, 2025, 18:25):
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >> Hi Peter
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >> Thanks for the ping about the PR.
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >> Maybe, to facilitate the review and move forward faster, we should split the PR into smaller PRs:
>>>>>>>>>>>>> >> - one with the interfaces (ReadBuilder, AppenderBuilder, ObjectModel, AppenderBuilder, DataWriterBuilder, ...)
>>>>>>>>>>>>> >> - one for each file format provider (Parquet, Avro, ORC)
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >> Thoughts? I can help on the split if needed.
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >> Regards
>>>>>>>>>>>>> >> JB
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >> On Thu, Apr 10, 2025 at 5:16 AM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>>>>>>>>>>>> >> >
>>>>>>>>>>>>> >> > Since the 1.9.0 release candidate has been created, I would like to resurrect this PR: https://github.com/apache/iceberg/pull/12298, to ensure that we have as long a testing period as possible for it.
>>>>>>>>>>>>> >> >
>>>>>>>>>>>>> >> > To recap, here is what the PR does after the review rounds:
>>>>>>>>>>>>> >> >
>>>>>>>>>>>>> >> > Created 3 interface classes which are implemented by the file formats:
>>>>>>>>>>>>> >> > - ReadBuilder - builder for reading data from data files
>>>>>>>>>>>>> >> > - AppenderBuilder - builder for writing data to data files
>>>>>>>>>>>>> >> > - ObjectModel - provides ReadBuilders and AppenderBuilders for a specific data file format and object model pair
>>>>>>>>>>>>> >> >
>>>>>>>>>>>>> >> > Updated the Parquet, Avro, and ORC implementations for these interfaces, and deprecated the old reader/writer APIs.
>>>>>>>>>>>>> >> >
>>>>>>>>>>>>> >> > Created interface classes which will be used by the actual readers/writers of the data files:
>>>>>>>>>>>>> >> > - AppenderBuilder - builder for writing a file
>>>>>>>>>>>>> >> > - DataWriterBuilder - builder for generating a data file
>>>>>>>>>>>>> >> > - PositionDeleteWriterBuilder - builder for generating a position delete file
>>>>>>>>>>>>> >> > - EqualityDeleteWriterBuilder - builder for generating an equality delete file
>>>>>>>>>>>>> >> > - No ReadBuilder here - the file format reader builder is reused
>>>>>>>>>>>>> >> >
>>>>>>>>>>>>> >> > Created a WriterBuilder class which implements the interfaces above (AppenderBuilder/DataWriterBuilder/PositionDeleteWriterBuilder/EqualityDeleteWriterBuilder) based on a provided file format specific AppenderBuilder.
>>>>>>>>>>>>> >> >
>>>>>>>>>>>>> >> > Created an ObjectModelRegistry which stores the available ObjectModels; engines and users can request the readers (ReadBuilder) and writers (AppenderBuilder/DataWriterBuilder/PositionDeleteWriterBuilder/EqualityDeleteWriterBuilder) from it.
>>>>>>>>>>>>> >> >
>>>>>>>>>>>>> >> > Created the appropriate ObjectModels:
>>>>>>>>>>>>> >> > - GenericObjectModels - for reading and writing Iceberg Records
>>>>>>>>>>>>> >> > - SparkObjectModels - for reading (vectorized and non-vectorized) and writing Spark InternalRow/ColumnarBatch objects
>>>>>>>>>>>>> >> > - FlinkObjectModels - for reading and writing Flink RowData objects
>>>>>>>>>>>>> >> > - An Arrow object model is also registered for vectorized reads of Parquet files into Arrow ColumnarBatch objects
>>>>>>>>>>>>> >> >
>>>>>>>>>>>>> >> > Updated the production code where the reading and writing happens to use the ObjectModelRegistry and the new reader/writer interfaces to access data files.
>>>>>>>>>>>>> >> >
>>>>>>>>>>>>> >> > Kept the testing code intact to ensure that the new API/code is not breaking anything.
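>>>>>>>>>>>>> >> > As a usage sketch (illustrative only - the method names here are hypothetical simplifications, please check the PR for the exact signatures):
>>>>>>>>>>>>> >> >
>>>>>>>>>>>>> >> > // Hypothetical: engines ask the registry for a builder that pairs a
>>>>>>>>>>>>> >> > // file format with their object model, instead of calling the
>>>>>>>>>>>>> >> > // Parquet/Avro/ORC classes directly.
>>>>>>>>>>>>> >> > ReadBuilder readBuilder =
>>>>>>>>>>>>> >> >     ObjectModelRegistry.readBuilder(FileFormat.PARQUET, "spark", inputFile);
>>>>>>>>>>>>> >> > CloseableIterable<InternalRow> rows =
>>>>>>>>>>>>> >> >     readBuilder.project(requiredSchema).build();
>>>>>>>>>>>>> >> >
>>>>>>>>>>>>> >> > // The writer side follows the same pattern.
>>>>>>>>>>>>> >> > AppenderBuilder appenderBuilder =
>>>>>>>>>>>>> >> >     ObjectModelRegistry.appenderBuilder(FileFormat.PARQUET, "spark", outputFile);
>>>>>>>>>>>>> >> > FileAppender<InternalRow> appender =
>>>>>>>>>>>>> >> >     appenderBuilder.schema(table.schema()).build();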
>>>>>>>>>>>>> >> >
>>>>>>>>>>>>> >> > The original change was not small, and it grew substantially during the review rounds. So if you have questions, or if I can do anything to make the review easier, don't hesitate to ask. I am happy to do anything to move this forward.
>>>>>>>>>>>>> >> >
>>>>>>>>>>>>> >> > Thanks,
>>>>>>>>>>>>> >> > Peter
>>>>>>>>>>>>> >> >
>>>>>>>>>>>>> >> > Péter Váry <peter.vary.apa...@gmail.com> wrote (on Wed, Mar 26, 2025, 14:54):
>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>> >> >> Hi everyone,
>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>> >> >> I have updated the File Format API PR (https://github.com/apache/iceberg/pull/12298) based on the answers and review comments.
>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>> >> >> I would like to merge this only after the 1.9.0 release, so we have more time to find and solve any issues before this goes into a release for the users.
>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>> >> >> For this I have updated the deprecation comments accordingly.
>>>>>>>>>>>>> >> >> I would like to ask you to review the PR, so we iron out any possible requested changes and are ready to merge as soon as possible after the 1.9.0 release.
>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>> >> >> Thanks,
>>>>>>>>>>>>> >> >> Peter
>>>>>>>>>>>>> >> >>
>>>>>>>>>>>>> >> >> Péter Váry <peter.vary.apa...@gmail.com> wrote (on Fri, Mar 21, 2025, 14:32):
>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>> >> >>> Hi Renjie,
>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>> >> >>> > 1. File format filters
>>>>>>>>>>>>> >> >>> >
>>>>>>>>>>>>> >> >>> > Do the filters include filter expressions from both the user query and the delete filter?
>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>> >> >>> The current discussion is about the filters from the user query.
>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>> >> >>> About the delete filter:
>>>>>>>>>>>>> >> >>> Based on the suggestions on the PR, I have moved the delete filter out of the main API. Created a `SupportsDeleteFilter` interface for it, which would allow pushing the filter down to the Parquet vectorized readers in Spark, as this is the only place where we have currently implemented this feature.
>>>>>>>>>>>>> >> >>>
>>>>>>>>>>>>> >> >>> Renjie Liu <liurenjie2...@gmail.com> wrote (on Fri, Mar 21, 2025, 14:11):
>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>> >> >>>> Hi, Peter:
>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>> >> >>>> Thanks for the effort on this.
>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>> >> >>>> 1. File format filters
>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>> >> >>>> Do the filters include filter expressions from both the user query and the delete filter?
>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>> >> >>>> For filters from the user query, I agree with you that we should keep the current behavior.
>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>> >> >>>> For delete filters associated with data files, at first I thought file format readers should not care about them. But now I realize that maybe we need to push them down to the file reader as well; this is useful when the `IS_DELETED` metadata column is not necessary and we could use these filters (position deletes, etc.) to further prune data.
>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>> >> >>>> But anyway, I agree that we could postpone it to a follow-up PR.
>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>> >> >>>> 2. Batch size configuration
>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>> >> >>>> I'm leaning toward option 2.
>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>> >> >>>> 3. Spark configuration
>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>> >> >>>> I'm leaning towards using different configuration objects.
>>>>>>>>>>>>> >> >>>>
>>>>>>>>>>>>> >> >>>> On Thu, Mar 20, 2025 at 10:23 PM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>>>>>>>>>>>> >> >>>>>
>>>>>>>>>>>>> >> >>>>> Hi Team,
>>>>>>>>>>>>> >> >>>>> Thanks everyone for the reviews on https://github.com/apache/iceberg/pull/12298!
>>>>>>>>>>>>> >> >>>>> I have addressed most of the comments, but a few questions remain which might merit a wider audience:
>>>>>>>>>>>>> >> >>>>>
>>>>>>>>>>>>> >> >>>>> 1. We should decide on the expected filtering behavior when the filters are pushed down to the readers. Currently the filters are applied as best effort by the file format readers. Some readers (Avro) just skip them altogether. There was a suggestion on the PR that we might enforce stricter requirements: the readers would either reject part of the filters, or apply them fully.
>>>>>>>>>>>>> >> >>>>> 2. Batch sizes are currently parameters on the reader builders, which can be set for non-vectorized readers too, and that could be confusing.
>>>>>>>>>>>>> >> >>>>> 3. Currently the Spark batch reader uses different configuration objects, ParquetBatchReadConf and OrcBatchReadConf, as requested by the reviewers of the Comet PR. There was a suggestion on the current PR to use a common configuration instead.
>>>>>>>>>>>>> >> >>>>>
>>>>>>>>>>>>> >> >>>>> I would be interested in hearing your thoughts about these topics.
>>>>>>>>>>>>> >> >>>>>
>>>>>>>>>>>>> >> >>>>> My current take:
>>>>>>>>>>>>> >> >>>>>
>>>>>>>>>>>>> >> >>>>> File format filters: I am leaning towards keeping the current lenient behavior, especially since Bloom filters are not able to do full filtering and are often used as a way to filter out unwanted records. Another option would be to implement secondary filtering inside the file formats themselves, which I think would cause extra complexity and possible code duplication. Whatever the decision here, I would suggest moving this out to a follow-up PR, as the current changeset is big enough as it is.
>>>>>>>>>>>>> >> >>>>>
>>>>>>>>>>>>> >> >>>>> Batch size configuration: Currently this is the only property which differs between the batch readers and the non-vectorized readers. I see 3 possible solutions:
>>>>>>>>>>>>> >> >>>>>
>>>>>>>>>>>>> >> >>>>> - Create different builders for vectorized and non-vectorized reads - I don't think the current solution is confusing enough to be worth the extra class.
>>>>>>>>>>>>> >> >>>>> - We could put this into the reader configuration property set (see the sketch below) - this could work, but it would "hide" a configuration option which is valid for both the Parquet and ORC readers.
>>>>>>>>>>>>> >> >>>>> - We could keep things as they are now - I would choose this one, but I don't have a strong opinion here.
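>>>>>>>>>>>>> >> >>>>> To illustrate the second option (a sketch; recordsPerBatch and the property key mirror the existing Parquet reader options, but treat the exact names as illustrative):
>>>>>>>>>>>>> >> >>>>>
>>>>>>>>>>>>> >> >>>>> // Today: a dedicated builder method, visible even for row-based reads.
>>>>>>>>>>>>> >> >>>>> readBuilder.recordsPerBatch(4096);
>>>>>>>>>>>>> >> >>>>> // Option 2: the batch size would travel in the generic property set,
>>>>>>>>>>>>> >> >>>>> // and only vectorized readers would look at it.
>>>>>>>>>>>>> >> >>>>> readBuilder.set("read.parquet.vectorization.batch-size", "4096");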
>>>>>>>>>>>>> >> >>>>>
>>>>>>>>>>>>> >> >>>>> Spark configuration: TBH, I'm open to both solutions and happy to move in the direction the community decides on.
>>>>>>>>>>>>> >> >>>>>
>>>>>>>>>>>>> >> >>>>> Thanks,
>>>>>>>>>>>>> >> >>>>> Peter
>>>>>>>>>>>>> >> >>>>>
>>>>>>>>>>>>> >> >>>>> Jean-Baptiste Onofré <j...@nanthrax.net> wrote (on Fri, Mar 14, 2025, 16:31):
>>>>>>>>>>>>> >> >>>>>>
>>>>>>>>>>>>> >> >>>>>> Hi Peter
>>>>>>>>>>>>> >> >>>>>>
>>>>>>>>>>>>> >> >>>>>> Thanks for the update. I will do a new pass on the PR.
>>>>>>>>>>>>> >> >>>>>>
>>>>>>>>>>>>> >> >>>>>> Regards
>>>>>>>>>>>>> >> >>>>>> JB
>>>>>>>>>>>>> >> >>>>>>
>>>>>>>>>>>>> >> >>>>>> On Thu, Mar 13, 2025 at 1:16 PM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>>>>>>>>>>>> >> >>>>>> >
>>>>>>>>>>>>> >> >>>>>> > Hi Team,
>>>>>>>>>>>>> >> >>>>>> > I have rebased the File Format API proposal (https://github.com/apache/iceberg/pull/12298) to include the new changes needed for the Variant types. I would love to hear your feedback, especially Dan and Ryan, as you were the most active during our discussions. If I can help in any way to make the review easier, please let me know.
>>>>>>>>>>>>> >> >>>>>> > Thanks,
>>>>>>>>>>>>> >> >>>>>> > Peter
>>>>>>>>>>>>> >> >>>>>> >
>>>>>>>>>>>>> >> >>>>>> > Péter Váry <peter.vary.apa...@gmail.com> wrote (on Fri, Feb 28, 2025, 17:50):
>>>>>>>>>>>>> >> >>>>>> >>
>>>>>>>>>>>>> >> >>>>>> >> Hi everyone,
>>>>>>>>>>>>> >> >>>>>> >> Thanks for all of the actionable, relevant feedback on the PR (https://github.com/apache/iceberg/pull/12298).
>>>>>>>>>>>>> >> >>>>>> >> Updated the code to address most of it. Please check whether you agree with the general approach.
>>>>>>>>>>>>> >> >>>>>> >> If there is a consensus about the general approach, I could separate the PR into smaller pieces so we can have an easier time reviewing and merging those step-by-step.
>>>>>>>>>>>>> >> >>>>>> >> Thanks,
>>>>>>>>>>>>> >> >>>>>> >> Peter
>>>>>>>>>>>>> >> >>>>>> >>
>>>>>>>>>>>>> >> >>>>>> >> Jean-Baptiste Onofré <j...@nanthrax.net> wrote (on Thu, Feb 20, 2025, 14:14):
>>>>>>>>>>>>> >> >>>>>> >>>
>>>>>>>>>>>>> >> >>>>>> >>> Hi Peter
>>>>>>>>>>>>> >> >>>>>> >>>
>>>>>>>>>>>>> >> >>>>>> >>> Sorry for the late reply on this.
>>>>>>>>>>>>> >> >>>>>> >>>
>>>>>>>>>>>>> >> >>>>>> >>> I did a pass on the proposal; it's very interesting and well written.
>>>>>>>>>>>>> >> >>>>>> >>> I like the DataFile API, and it is definitely worth discussing all together.
>>>>>>>>>>>>> >> >>>>>> >>>
>>>>>>>>>>>>> >> >>>>>> >>> Maybe we can schedule a specific meeting to discuss the DataFile API?
>>>>>>>>>>>>> >> >>>>>> >>>
>>>>>>>>>>>>> >> >>>>>> >>> Thoughts?
>>>>>>>>>>>>> >> >>>>>> >>>
>>>>>>>>>>>>> >> >>>>>> >>> Regards
>>>>>>>>>>>>> >> >>>>>> >>> JB
>>>>>>>>>>>>> >> >>>>>> >>>
>>>>>>>>>>>>> >> >>>>>> >>> On Tue, Feb 11, 2025 at 5:46 PM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>>>>>>>>>>>>> >> >>>>>> >>> >
>>>>>>>>>>>>> >> >>>>>> >>> > Hi Team,
>>>>>>>>>>>>> >> >>>>>> >>> >
>>>>>>>>>>>>> >> >>>>>> >>> > As mentioned earlier on our community sync, I am exploring the possibility of defining a FileFormat API for accessing different file formats. I have put together a proposal based on my findings.
>>>>>>>>>>>>> >> >>>>>> >>> >
>>>>>>>>>>>>> >> >>>>>> >>> > -------------------
>>>>>>>>>>>>> >> >>>>>> >>> > Iceberg currently supports 3 different file formats: Avro, Parquet, and ORC. With the introduction of the Iceberg V3 specification, many new features are being added to Iceberg. Some of these features, like new column types and default values, require changes at the file format level. The changes are added by individual developers with a different focus on the different file formats. As a result, not all of the features are available for every supported file format.
>>>>>>>>>>>>> >> >>>>>> >>> > Also, there are emerging file formats like Vortex [1] or Lance [2] which, either by specialization or by applying newer research results, could provide better alternatives for certain use-cases, like random access to data, or storing ML models.
>>>>>>>>>>>>> >> >>>>>> >>> > -------------------
>>>>>>>>>>>>> >> >>>>>> >>> >
>>>>>>>>>>>>> >> >>>>>> >>> > Please check the detailed proposal [3] and the Google document [4], and comment there, or reply on the dev list if you have any suggestions.
>>>>>>>>>>>>> >> >>>>>> >>> >
>>>>>>>>>>>>> >> >>>>>> >>> > Thanks,
>>>>>>>>>>>>> >> >>>>>> >>> > Peter
>>>>>>>>>>>>> >> >>>>>> >>> >
>>>>>>>>>>>>> >> >>>>>> >>> > [1] - https://github.com/spiraldb/vortex
>>>>>>>>>>>>> >> >>>>>> >>> > [2] - https://lancedb.github.io/lance/
>>>>>>>>>>>>> >> >>>>>> >>> > [3] - https://github.com/apache/iceberg/issues/12225
>>>>>>>>>>>>> >> >>>>>> >>> > [4] - https://docs.google.com/document/d/1sF_d4tFxJsZWsZFCyCL9ZE7YuI7-P3VrzMLIrrTIxds