Hi
This is C++-specific, but IMO the question applies more broadly.
I understood that the rationale for stats in compressed and encoded formats
like Parquet is that computing those stats has a high cost (I/O + decompress
+ decode + aggregate). This motivates the materialization of aggregates.
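To make the cost argument concrete, here is a minimal sketch of a reader that answers part of a query from materialized min/max values and only scans the chunks that might match. All of the names (ChunkStats, Chunk, MakeChunk, CountGreaterThan) are hypothetical and are not Arrow or Parquet APIs; the in-memory vector simply stands in for data that would be expensive to read, decompress and decode.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical types for illustration only; not Arrow or Parquet APIs.
struct ChunkStats {
  int64_t min;
  int64_t max;
};

struct Chunk {
  std::vector<int64_t> values;  // stands in for data that is costly to decode
  ChunkStats stats;             // aggregates materialized once, at write time
};

// Compute min/max once, when the chunk is written.
Chunk MakeChunk(std::vector<int64_t> values) {
  assert(!values.empty());
  auto [lo, hi] = std::minmax_element(values.begin(), values.end());
  ChunkStats stats{*lo, *hi};
  return Chunk{std::move(values), stats};
}

// Count values greater than `threshold`. Chunks whose materialized max
// rules out any match are skipped without touching their values.
// `scanned` reports how many chunks actually had to be scanned.
int64_t CountGreaterThan(const std::vector<Chunk>& chunks, int64_t threshold,
                         int* scanned) {
  int64_t count = 0;
  *scanned = 0;
  for (const Chunk& chunk : chunks) {
    if (chunk.stats.max <= threshold) continue;  // answered from stats alone
    ++*scanned;
    count += std::count_if(chunk.values.begin(), chunk.values.end(),
                           [threshold](int64_t v) { return v > threshold; });
  }
  return count;
}
```

With two chunks holding {1, 2, 3} and {10, 20, 30} and a threshold of 5, only the second chunk is scanned. The saving matters in proportion to how expensive the skipped scan would have been.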
In arro
In general, I think this is a good idea. It has been proposed before, but I
don't think we ever managed to make progress on a design.
On Sun, Jun 2, 2024 at 7:17 PM Sutou Kouhei wrote:
> Hi,
>
> Related GitHub issue:
> https://github.com/apache/arrow/issues/41909
>
> How about adding arrow::ArrayStatisti
I want to start off by acknowledging that all of the pros you listed
for mimalloc are accurate.
I also want to point out the times that people have been caught off guard
by the perceived increase in memory allocation with mimalloc compared to the
alternatives:
E.g. https://github.com/microsoft/mim
Hello,
Arrow C++ features a MemoryPool abstraction that allows using different
allocators interchangeably. Several MemoryPool implementations are
provided with Arrow C++ (though one can also build their own):
- a jemalloc-based implementation, currently the default on Linux
- a mimalloc-bas