Re: [DISCUSS] FileFormat API proposal

2025-05-07 Thread Péter Váry
Hi everyone, The proposed API part has been reviewed and is ready to go. See: https://github.com/apache/iceberg/pull/12774 Thanks to everyone who reviewed it already! Many of you wanted to review, but I know that everyone has time constraints. I would still very much like to hear your voices,

Re: [DISCUSS] FileFormat API proposal

2025-04-15 Thread Péter Váry
Hi Renjie, The first one for the proposed new API is here: https://github.com/apache/iceberg/pull/12774 Thanks, Peter On Wed, Apr 16, 2025, 05:40 Renjie Liu wrote: > Hi, Peter: > > Thanks for the effort. I totally agree with splitting them into smaller > prs to move forward. > > I'm quite intere

Re: [DISCUSS] FileFormat API proposal

2025-04-15 Thread Renjie Liu
Hi, Peter: Thanks for the effort. I totally agree with splitting them into smaller PRs to move forward. I'm quite interested in this topic; please ping me on those split PRs and I'll help review. On Mon, Apr 14, 2025 at 11:22 PM Jean-Baptiste Onofré wrote: > Hi Peter > > Awesome ! Th

Re: [DISCUSS] FileFormat API proposal

2025-04-14 Thread Jean-Baptiste Onofré
Hi Peter, Awesome! Thank you so much! I will do another pass. Regards JB On Fri, Apr 11, 2025 at 3:48 PM Péter Váry wrote: > > Hi JB, > > Separated out the proposed interfaces to a new PR: > https://github.com/apache/iceberg/pull/12774. > Reviewers can check that out if they are only interested

Re: [DISCUSS] FileFormat API proposal

2025-04-11 Thread Péter Váry
Hi JB, I have separated out the proposed interfaces into a new PR: https://github.com/apache/iceberg/pull/12774. Reviewers can check that out if they are only interested in what the new API would look like. Thanks, Peter Jean-Baptiste Onofré wrote (on Thu, Apr 10, 2025, 18:25): > Hi Peter > >

Re: [DISCUSS] FileFormat API proposal

2025-04-10 Thread Jean-Baptiste Onofré
Hi Peter, Thanks for the ping about the PR. Maybe, to facilitate the review and move forward faster, we should split the PR into smaller PRs: - one with the interfaces (ReadBuilder, AppenderBuilder, ObjectModel, DataWriterBuilder, ...) - one for each file provider (Parquet, Avro, O

Re: [DISCUSS] FileFormat API proposal

2025-04-10 Thread Péter Váry
Since the 1.9.0 release candidate has been created, I would like to resurrect this PR: https://github.com/apache/iceberg/pull/12298 to ensure that we have as long a testing period as possible for it. To recap, here is what the PR does after the review rounds: - *Created 3 interface classes whi

Re: [DISCUSS] FileFormat API proposal

2025-04-05 Thread Péter Váry
Hi Renjie, > *1. File format filters* > Do the filters include filter expressions from both the user query and the delete filter? The current discussion is about the filters from the user query. About the delete filter: Based on the suggestions on the PR, I have moved the delete filter out fr

Re: [DISCUSS] FileFormat API proposal

2025-04-05 Thread Péter Váry
Hi everyone, I have updated the File Format API PR (https://github.com/apache/iceberg/pull/12298) based on the answers and review comments. I would like to merge this only after the 1.9.0 release so we have more time to find and solve any issues before this goes to a release for the users

Re: [DISCUSS] FileFormat API proposal

2025-03-21 Thread Renjie Liu
Hi, Peter: Thanks for the effort on this. *1. File format filters* Do the filters include filter expressions from both the user query and the delete filter? For filters from the user query, I agree with you that we should keep the current behavior. For delete filters associated with data files, at

Re: [DISCUSS] FileFormat API proposal

2025-03-20 Thread Péter Váry
Hi Team, Thanks everyone for the reviews on https://github.com/apache/iceberg/pull/12298! I have addressed most of the comments, but a few questions remain that might merit a wider audience: 1. We should decide on the expected filtering behavior when the filters are pushed down to the

Re: [DISCUSS] FileFormat API proposal

2025-03-14 Thread Jean-Baptiste Onofré
Hi Peter Thanks for the update. I will do a new pass on the PR. Regards JB On Thu, Mar 13, 2025 at 1:16 PM Péter Váry wrote: > > Hi Team, > I have rebased the File Format API proposal > (https://github.com/apache/iceberg/pull/12298) to include the new changes > needed for the Variant types. I

Re: [DISCUSS] FileFormat API proposal

2025-03-14 Thread Renjie Liu
Hi, Peter: Sorry for the late reply. I reviewed the code again and left some minor comments. Generally I'm fine with the current approach and look forward to seeing it move forward. If we see success in the Java library, I'm looking forward to introducing similar things in the iceberg-r

Re: [DISCUSS] FileFormat API proposal

2025-03-13 Thread Péter Váry
Hi Team, I have rebased the File Format API proposal ( https://github.com/apache/iceberg/pull/12298) to include the new changes needed for the Variant types. I would love to hear your feedback, especially Dan and Ryan, as you were the most active during our discussions. If I can help in any way to

Re: [DISCUSS] FileFormat API proposal

2025-02-28 Thread Péter Váry
Hi everyone, Thanks for all of the actionable, relevant feedback on the PR (https://github.com/apache/iceberg/pull/12298). I have updated the code to address most of it. Please check whether you agree with the general approach. If there is a consensus about the general approach, I could separate out the PR

Re: [DISCUSS] FileFormat API proposal

2025-02-20 Thread Jean-Baptiste Onofré
Hi Peter, sorry for the late reply on this. I did a pass on the proposal; it's very interesting and well written. I like the DataFile API, and it is definitely worth discussing together. Maybe we can schedule a dedicated meeting to discuss the DataFile API? Thoughts? Regards JB On Tue, Feb 11,

Re: [DISCUSS] FileFormat API proposal

2025-02-18 Thread Péter Váry
Accidentally force-pushed :( The new links are here: - https://github.com/apache/iceberg/pull/12298/commits/583cccb6e036323ee74a74bf3b06a40bf16f8982 - The API Interface classes - https://github.com/apache/iceberg/pull/12298/commits/217e68caa61667032da3d710401078bb50b0a99f - Mov

Re: [DISCUSS] FileFormat API proposal

2025-02-18 Thread Péter Váry
Hi Renjie, Based on your feedback, I have created a PR which separates out the different logical parts to different commits: https://github.com/apache/iceberg/pull/12298 The following parts are separated: - https://github.com/apache/iceberg/pull/12298/commits/1ad230f67df014b424c3547603831f

Re: [DISCUSS] FileFormat API proposal

2025-02-14 Thread Péter Váry
Hi Renjie, Here is the WIP PR for the readers: https://github.com/apache/iceberg/pull/12069 Here is the WIP PR for the writers: https://github.com/apache/iceberg/pull/12164 If you want to concentrate on the proposed new API, maybe this is the best place to start: https://github.com/apache/iceberg/

Re: [DISCUSS] FileFormat API proposal

2025-02-14 Thread Renjie Liu
Hi, Peter: Thanks for raising this; the proposal sounds quite interesting to me. I've reviewed the doc, but it still seems too abstract to understand. Would you mind submitting a PR so that it is clearer what has changed? On Wed, Feb 12, 2025 at 12:46 AM Péter Váry wrote: > Hi Team, > >