Re: pyarrow Table.from_pylist doesn't release memory

2023-06-15 Thread Weston Pace
Note that you can ask pyarrow how much memory it thinks it is using with the pyarrow.total_allocated_bytes[1] function. This can be very useful for tracking memory leaks. I see that memory-profiler now has support for different backends. Sadly, it doesn't look like you can register a custom backe

Re: pyarrow Table.from_pylist doesn't release memory

2023-06-15 Thread Antoine Pitrou
Hi Alex, I think you're misinterpreting the results. Yes, the RSS memory (as reported by memory_profiler) doesn't seem to decrease. No, it doesn't mean that Arrow doesn't release memory. It's actually common for memory allocators (such as jemalloc, or the system allocator) to keep deallocat

Re: pyarrow Table.from_pylist doesn't release memory

2023-06-15 Thread Jerald Alex
Hi Experts, I have come across the memory pool configuration via the environment variable *ARROW_DEFAULT_MEMORY_POOL*, and I tried it out. I could observe improvements on macOS with the *system* memory pool but no change on Linux. I have captured more details on GH is
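Because *ARROW_DEFAULT_MEMORY_POOL* is read when Arrow is loaded, it has to be set before `pyarrow` is imported. The sketch below starts a fresh interpreter with the variable set and prints which backend the default pool ended up using. The helper name `backend_with_env` is hypothetical and exists only for this example.

```python
import os
import subprocess
import sys

# One-liner run in the child interpreter: report the default pool's backend.
CHILD_CODE = "import pyarrow as pa; print(pa.default_memory_pool().backend_name)"


def backend_with_env(pool_name: str) -> str:
    """Run a new Python process with ARROW_DEFAULT_MEMORY_POOL set.

    Hypothetical helper for illustration; not part of pyarrow.
    """
    env = dict(os.environ, ARROW_DEFAULT_MEMORY_POOL=pool_name)
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(backend_with_env("system"))
```

Checking `backend_name` this way confirms that the setting actually took effect before comparing memory behaviour across platforms.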

pyarrow Table.from_pylist doesn't release memory

2023-06-14 Thread Jerald Alex
Hi Experts, PyArrow *Table.from_pylist* does not release memory until the program terminates. I created a sample script to highlight the issue. I have also tried setting `pa.jemalloc_set_decay_ms(0)` but it didn't help much. Could you please check this and let me know if there are potential iss