Hi, Peter:

Sorry for the late reply. I reviewed the code again and left some
minor comments. Generally I'm fine with the current approach, and I look
forward to seeing it move forward.

If we see success in the Java library, I'm looking forward to introducing
something similar in the iceberg-rust library so that we can adapt to more
formats.

On Thu, Mar 13, 2025 at 8:17 PM Péter Váry <peter.vary.apa...@gmail.com>
wrote:

> Hi Team,
> I have rebased the File Format API proposal (
> https://github.com/apache/iceberg/pull/12298) to include the new changes
> needed for the Variant types. I would love to hear your feedback,
> especially from Dan and Ryan, as you were the most active during our
> discussions. If I can help in any way to make the review easier, please let
> me know.
> Thanks,
> Peter
>
> On Fri, Feb 28, 2025 at 5:50 PM Péter Váry <peter.vary.apa...@gmail.com>
> wrote:
>
>> Hi everyone,
>> Thanks for all of the actionable, relevant feedback on the PR (
>> https://github.com/apache/iceberg/pull/12298).
>> I updated the code to address most of it. Please check if you agree with
>> the general approach.
>> If there is consensus about the general approach, I could split the PR
>> into smaller pieces so that they are easier to review and merge
>> step by step.
>> Thanks,
>> Peter
>>
>> On Thu, Feb 20, 2025 at 2:14 PM Jean-Baptiste Onofré <j...@nanthrax.net>
>> wrote:
>>
>>> Hi Peter
>>>
>>> Sorry for the late reply on this.
>>>
>>> I did a pass on the proposal; it's very interesting and well written.
>>> I like the DataFile API, and it is definitely worth discussing together.
>>>
>>> Maybe we can schedule a dedicated meeting to discuss the DataFile API?
>>>
>>> Thoughts?
>>>
>>> Regards
>>> JB
>>>
>>> On Tue, Feb 11, 2025 at 5:46 PM Péter Váry <peter.vary.apa...@gmail.com>
>>> wrote:
>>> >
>>> > Hi Team,
>>> >
>>> > As mentioned earlier on our Community Sync, I am exploring the
>>> > possibility of defining a FileFormat API for accessing different file
>>> > formats. I have put together a proposal based on my findings.
>>> >
>>> > -------------------
>>> > Iceberg currently supports three file formats: Avro, Parquet, and
>>> > ORC. The Iceberg V3 specification adds many new features to Iceberg.
>>> > Some of these features, such as new column types and default values,
>>> > require changes at the file format level. These changes are added by
>>> > individual developers, each focusing on different file formats. As a
>>> > result, not all of the features are available for every supported
>>> > file format.
>>> > Also, there are emerging file formats like Vortex [1] or Lance [2]
>>> > which, either through specialization or by applying newer research
>>> > results, could provide better alternatives for certain use cases such
>>> > as random access to data or storing ML models.
>>> > -------------------
>>> >
>>> > Please check the detailed proposal [3] and the Google document [4],
>>> > and comment there or reply on the dev list if you have any suggestions.
>>> >
>>> > Thanks,
>>> > Peter
>>> >
>>> > [1] - https://github.com/spiraldb/vortex
>>> > [2] - https://lancedb.github.io/lance/
>>> > [3] - https://github.com/apache/iceberg/issues/12225
>>> > [4] - https://docs.google.com/document/d/1sF_d4tFxJsZWsZFCyCL9ZE7YuI7-P3VrzMLIrrTIxds
>>> >
>>>
>>
