Re: [DISCUSS] PyIceberg 0.7.1 release

2024-08-09 Thread Sung Yun
Agreed Fokko :) I've cherry-picked the commits in the 0.7.1 milestone into a new branch, and created this PR: https://github.com/apache/iceberg-python/pull/1031 If this looks good, I can start the release for 0.7.1. Sung On Fri, Aug 9, 2024 at 5:05 AM Fokko Driesprong wrote: > Hey Sung, > > T

Re: [DISCUSS] PyIceberg 0.7.1 release

2024-08-09 Thread Fokko Driesprong
Hey Sung, That's a great find. I just merged the PR, and it would be good to get the release process rolling to get #1026 out to the users. Kind regards, Fokko On Thu, Aug 8, 2024 at 23:20, Sung Yun wrote: > Thank you for reporting the issues

Re: [DISCUSS] PyIceberg 0.7.1 release

2024-08-08 Thread Sung Yun
Thank you for reporting the issues and putting in the fixes, Fokko and André. We also identified a correctness issue with applying positional deletes on merge-on-read tables that I think must also be included in this release. Here's the PR that resolves the issue: https://github.com/apache/iceber

Re: [DISCUSS] PyIceberg 0.7.1 release

2024-08-08 Thread André Luis Anastácio
I fixed an overwrite error that I think would be good to include in the 0.7.1 release: https://github.com/apache/iceberg-python/pull/1023 André Anastácio On Thursday, August 8th, 2024 at 4:29 AM, Fokko Driesprong wrote: > Thanks everyone for the input here, and I agree that the aforementione

Re: [DISCUSS] PyIceberg 0.7.1 release

2024-08-08 Thread Fokko Driesprong
Thanks everyone for the input here, and I agree that the aforementioned #995 and #997 by Sung, and #526 by André would also be good to includ

Re: [DISCUSS] PyIceberg 0.7.1 release

2024-08-06 Thread André Luis Anastácio
What do you think about adding the fix that excludes PyIceberg support for Python 3.9.7 in the 0.7.1 release?[1] It already doesn't work, so this is just to avoid any new issues. - [1]: https://github.com/apache/iceberg-python/pull/526 André Anastácio On Tuesday, August 6th, 2024 at 4:06 PM,

Re: [DISCUSS] PyIceberg 0.7.1 release

2024-08-06 Thread Sung Yun
Sounds good folks! Thank you for sharing your thoughts. We'll work on getting the patch release out, and continue the discussion on upgrading the PyArrow version to 17.0.0 in time for the 0.8.0 release. Just adding two more fixes that I think we should pull into the patch

Re: [DISCUSS] PyIceberg 0.7.1 release

2024-08-06 Thread Kevin Liu
I'm +1 for getting 0.7.1 out fast with patches, and for targeting the Arrow 17.0.0 upgrade as part of the 0.8.0 release. Thanks, Kevin Liu On Tue, Aug 6, 2024 at 11:43 AM Fokko Driesprong wrote: > The issues above sound like patches to me, fixing issues discovered during the 0.7.0 release. Is there a r

Re: [DISCUSS] PyIceberg 0.7.1 release

2024-08-06 Thread Fokko Driesprong
> The issues above sound like patches to me, fixing issues discovered during the 0.7.0 release. Is there a reason to move to 0.8.0?
That would also allow us to add new features :) I'm also okay with a 0.7.1 release
> +1 on this concern. Is it possible to make the Arrow 17.0.0 upgrade option

Re: [DISCUSS] PyIceberg 0.7.1 release

2024-08-06 Thread Kevin Liu
> Typically we only push patches into the minor versions, we could also go to version 0.8.0 immediately.
The issues above sound like patches to me, fixing issues discovered during the 0.7.0 release. Is there a reason to move to 0.8.0?
> I'm still on the fence regarding 17.0.0 upgrade. There are c

Re: [DISCUSS] PyIceberg 0.7.1 release

2024-08-02 Thread Sung Yun
Hi Fokko, That makes sense, thank you for the suggestion! The issue was severe enough for us that we had to fork the repo and apply a fix ourselves in order to run PyIceberg without our applications going OOM. So I think there will be value in getting the proposed config property out as early as

Re: [DISCUSS] PyIceberg 0.7.1 release

2024-08-02 Thread Fokko Driesprong
Hey Sung, Typically we only push patches into the minor versions; we could also go to version 0.8.0 immediately. Regarding the memory consumption, thanks for putting those numbers together! I would also love to get #929, so we can push down the

Re: [DISCUSS] PyIceberg 0.7.1 release

2024-08-02 Thread Sung Yun
Hi folks, We identified memory usage spikes with the current way of upcasting PyArrow types to large_ types on read, when reading tables with certain characteristics. A detailed set of example benchmarks of this issue is in the Google document linked on PR #986: https://github.com/apache/
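
For context on the large_ upcast mentioned above: in the Arrow columnar format, string and binary arrays keep one 32-bit offset per value (plus one), while large_string and large_binary keep 64-bit offsets. The sketch below estimates only that offsets-buffer difference. It is an illustration of one possible contributing factor, not the analysis from the benchmarks linked on PR #986, and the helper name is hypothetical.

```python
# Hypothetical helper (not part of PyIceberg or PyArrow): estimates the size of
# the offsets buffer of an Arrow variable-length array, ignoring the data and
# validity buffers.
def offsets_buffer_bytes(num_values: int, large: bool) -> int:
    # Arrow stores num_values + 1 offsets; they are int32 for string/binary
    # and int64 for large_string/large_binary.
    offset_width = 8 if large else 4
    return (num_values + 1) * offset_width


num_values = 1_000_000
regular = offsets_buffer_bytes(num_values, large=False)
upcast = offsets_buffer_bytes(num_values, large=True)

# Extra bytes spent on offsets alone after upcasting one column.
print(regular, upcast, upcast - regular)
```

The per-column difference is small on its own; this sketch does not attempt to model copies made during the cast or any other effect measured in the linked benchmarks.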

[DISCUSS] PyIceberg 0.7.1 release

2024-07-30 Thread Sung Yun
Hi folks, We are starting to compile the list of issues to fix and port into the 0.7.1 release. The current list of known issues is as follows:
- Fix pydantic warning on table commit: #972 (thanks for the quick fix ndrluis!)
- Issue when rewriting