> The issues above sound like patches to me, fixing issues discovered during
> the 0.7.0 release. Is there a reason to move to 0.8.0?

That would also allow us to add new features :) I'm also okay with a 0.7.1
release.

> +1 on this concern. Is it possible to make the Arrow 17.0.0 upgrade
> optional first? So that folks who want the upgrade can test it out.

If we go with 0.7.1, how about targeting Arrow 17.0.0 for PyIceberg 0.8.0?

Kind regards,
Fokko

On Tue, Aug 6, 2024 at 20:30, Kevin Liu <kevin.jq....@gmail.com> wrote:

> > Typically we only push patches into the minor versions, we could also
> > go to version 0.8.0 immediately.
>
> The issues above sound like patches to me, fixing issues discovered during
> the 0.7.0 release. Is there a reason to move to 0.8.0?
>
> > I'm still on the fence regarding the 17.0.0 upgrade. There are clear
> > functional upsides, but I feel that constraining PyIceberg to just one
> > published version would make the adoption of PyIceberg difficult for our
> > users.
>
> +1 on this concern. Is it possible to make the Arrow 17.0.0 upgrade
> optional first? So that folks who want the upgrade can test it out.
>
> Thanks,
> Kevin Liu
>
> On Fri, Aug 2, 2024 at 11:33 AM Sung Yun <sun...@apache.org> wrote:
>
>> Hi Fokko,
>>
>> That makes sense, thank you for the suggestion! The issue was severe
>> enough for us that we had to fork the repo and apply a fix ourselves in
>> order to run PyIceberg without our applications going OOM. So I think there
>> will be value in getting the proposed config property out as early as
>> possible for the larger community.
>>
>> I'm still on the fence regarding the 17.0.0 upgrade. There are clear
>> functional upsides, but I feel that constraining PyIceberg to just one
>> published version would make the adoption of PyIceberg difficult for our
>> users. Users writing new applications won't have trouble with it, but users
>> intending to use PyIceberg in an existing application may have to upgrade
>> their PyArrow versions, which could be a deterrent (or a welcome nudge).
>> Would it be worth starting that discussion on a separate thread?
>>
>> Sung
>>
>> On 2024/08/02 17:57:17 Fokko Driesprong wrote:
>> > Hey Sung,
>> >
>> > Typically we only push patches into the minor versions, we could also
>> > go to version 0.8.0 immediately.
>> >
>> > Regarding the memory consumption, thanks for putting those numbers
>> > together! I would also love to get #929
>> > <https://github.com/apache/iceberg-python/pull/929>, so we can push down
>> > the large/small type to PyArrow (only for to_arrow), and apply #986
>> > <https://github.com/apache/iceberg-python/pull/986> on top if you want to
>> > force it to either small or large types.
>> >
>> > WDYT?
>> >
>> > Kind regards,
>> > Fokko
>> >
>> >
>> > On Fri, Aug 2, 2024 at 19:46, Sung Yun <sun...@apache.org> wrote:
>> >
>> > > Hi folks,
>> > >
>> > > We identified inefficient memory usage hikes with the current way of
>> > > upcasting pyarrow types to large_<type> on read, when reading tables with
>> > > certain characteristics. A detailed set of example benchmarks of this issue
>> > > is in the Google document linked on PR #986:
>> > > https://github.com/apache/iceberg-python/pull/986
>> > >
>> > > The proposed solution introduces a config to override this behavior to use
>> > > small types instead, and I'd like to add this into the patch release to
>> > > give users better control over their memory usage.
>> > >
>> > > Also, this is just a gentle reminder that this DISCUSS thread is still
>> > > open for any new issues identified from the 0.7.0 release that we
>> > > should fix in the patch release.
>> > >
>> > > Thank you,
>> > > Sung
>> > >
>> > > On 2024/07/30 23:57:04 Sung Yun wrote:
>> > > > Hi folks,
>> > > >
>> > > > We are starting to compile the list of issues to fix and port into the
>> > > > 0.7.1 release.
>> > > >
>> > > > The current list of known issues is as follows:
>> > > >
>> > > > Fix pydantic warning on table commit: #972
>> > > > <https://github.com/apache/iceberg-python/pull/972> (thanks for the quick
>> > > > fix ndrluis!)
>> > > > Issue when rewriting an unpartitioned table: #979
>> > > > <https://github.com/apache/iceberg-python/issues/979>
>> > > > Issue when evolving and writing in the same transaction: #980
>> > > > <https://github.com/apache/iceberg-python/issues/980>
>> > > >
>> > > > Please feel free to respond to this thread with any issues that should be
>> > > > tracked for the patch release.
>> > > >
>> > > > Thank you!
>> > > > Sung