[ https://issues.apache.org/jira/browse/ARROW-18164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joris Van den Bossche resolved ARROW-18164.
-------------------------------------------
    Fix Version/s: 11.0.0
       Resolution: Fixed

Issue resolved by pull request 14516
[https://github.com/apache/arrow/pull/14516]

> [Python] Dataset scanner does not follow default memory pool setting
> --------------------------------------------------------------------
>
>                 Key: ARROW-18164
>                 URL: https://issues.apache.org/jira/browse/ARROW-18164
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>            Reporter: Joris Van den Bossche
>            Assignee: Joris Van den Bossche
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 11.0.0
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Even if I set the system memory pool as default, it still uses the jemalloc
> one (running this on Ubuntu where jemalloc is the default if not set by the
> user):
> {code}
> import pyarrow as pa
> import pyarrow.dataset as ds
> import pyarrow.parquet as pq
> pq.write_table(pa.table({'a': [1, 2, 3]}), "test.parquet")
> In [2]: pa.set_memory_pool(pa.system_memory_pool())
> In [3]: pa.total_allocated_bytes()
> Out[3]: 0
> In [4]: table = ds.dataset("test.parquet").to_table()
> In [5]: pa.total_allocated_bytes()
> Out[5]: 0
> In [6]: pa.set_memory_pool(pa.jemalloc_memory_pool())
> In [7]: pa.total_allocated_bytes()
> Out[7]: 128
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)