Hi folks,

We found that the current behavior of upcasting pyarrow types to 
large_<type> on read can cause unnecessary increases in memory usage when 
reading tables with certain characteristics. A detailed set of example 
benchmarks for this issue is in the Google document linked on PR #986: 
https://github.com/apache/iceberg-python/pull/986

The proposed solution introduces a config option that overrides this 
behavior and uses the small types instead. I'd like to include it in the 
patch release to give users better control over their memory usage.
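
For a rough sense of where the extra memory can come from, here is a minimal 
sketch. It relies only on the Arrow layout fact that string/binary arrays 
store N+1 32-bit offsets while large_string/large_binary store N+1 64-bit 
offsets. The helper name and the row count are hypothetical and are not taken 
from the PR or the benchmarks, and this may not be the only factor at play.

```python
# Hypothetical helper: estimates only the offsets buffer of a
# variable-length Arrow array (validity bitmap and data bytes are ignored).
def offsets_buffer_bytes(num_values: int, offset_width_bytes: int) -> int:
    # Arrow stores one offset per value plus one trailing end offset.
    return (num_values + 1) * offset_width_bytes


NUM_VALUES = 1_000_000  # illustrative row count, not from the benchmarks

small_offsets = offsets_buffer_bytes(NUM_VALUES, 4)  # string: int32 offsets
large_offsets = offsets_buffer_bytes(NUM_VALUES, 8)  # large_string: int64 offsets
extra_bytes = large_offsets - small_offsets

print(f"string offsets:       {small_offsets} bytes")
print(f"large_string offsets: {large_offsets} bytes")
print(f"extra per column:     {extra_bytes} bytes")
```

The offsets overhead doubles regardless of how short the values are, so 
columns with many short strings would be affected the most in relative terms.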

Also, a gentle reminder that this DISCUSS thread is still open for any new 
issues identified in the 0.7.0 release that we should fix in the patch 
release.

Thank you,
Sung

On 2024/07/30 23:57:04 Sung Yun wrote:
> Hi folks,
> 
> We are starting to compile the list of issues to fix and port into the
> 0.7.1 release.
> 
> The current list of known issues is as follows:
> 
> Fix pydantic warning on table commit: #972
> <https://github.com/apache/iceberg-python/pull/972> (thanks for the quick
> fix ndrluis!)
> Issue when rewriting an unpartitioned table: #979
> <https://github.com/apache/iceberg-python/issues/979>
> Issue when evolving and writing in the same transaction: #980
> <https://github.com/apache/iceberg-python/issues/980>
> 
> Please feel free to respond to this thread with any issues that should be
> tracked for the patch release.
> 
> Thank you!
> Sung
> 
