Hi,
I’m experiencing a problem reading parquet files written with the
`use_dictionary=[]` option in pyarrow 2.0.0. If I write a parquet file in
2.0.0, reading it in 8.0.0 gives:
>>> pd.read_parquet('dataset.parq')
>
Traceback (most recent call last):
>
File "<stdin>", line 1, in <module>
>
File
> "/Library/Fra
With 6 binding +1s (and 2 non-binding), we've approved marking the C stream
interface as stable. I will move forward with the pull requests to update
the documentation.
On Thu, Jun 9, 2022 at 2:19 PM Neal Richardson
wrote:
> +1
>
> On Wed, Jun 8, 2022 at 7:44 PM Sutou Kouhei wrote:
>
> > +1
> >
Hello,
This comment is regarding installation with `apt` on ubuntu 18.04 ...
`libarrow-dev/bionic,now 8.0.0-1 amd64`
I'm a bit confused about the memory pool situation:
* I run with `ARROW_DEFAULT_MEMORY_POOL=system` and check that
`arrow::default_memory_pool()->backend_name() ==
arrow::system_m
A code review has demonstrated that Arrow uses posix_memalign ... I do
believe mimalloc preload is "catching" this but I didn't tool it with my
customization. Still interested in any guidance on the other points
raised, and sorry for some of this being noise.
-John
On Tue, Jun 14, 2022 at 9:06 A
I can try and give a more detailed answer later in the week but the
gist of it is that Arrow manages all "buffer allocations" with a
memory pool. These are the allocations for the actual data in the
arrays. These are the allocations that use the memory pool configured
by ARROW_DEFAULT_MEMORY_POOL
Sorry, that should have said "when Arrow builds jemalloc". Here is
the command we send down (from ThirdPartyToolchain.cmake):
```
JEMALLOC_CONFIGURE_COMMAND
"--prefix=${JEMALLOC_PREFIX}"
"--libdir=${JEMALLOC_LIB_DIR}"
"--with-jemalloc-prefix=je_arrow_"
"--with-private-namespace=je_arrow_private_"
```
I take that back... the preload is not intercepting memory_pool.cc
-> SystemAllocator -> AllocateAligned -> posix_memalign (if indeed this is
the system allocator path), although it is intercepting posix_memalign from
a different .so
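The call at the end of that path can be exercised on its own. The following is only a minimal sketch, not code from Arrow or from this thread: it assumes a POSIX system (Linux or macOS), calls the C library's `posix_memalign` through `ctypes`, and checks the alignment of the returned address. The helper name `aligned_alloc_probe` and the 64-byte default alignment are illustrative assumptions. The sketch does not verify whether a preloaded allocator actually receives the call.

```python
import ctypes

# Assumption: POSIX system. CDLL(None) looks symbols up through the
# process's global scope rather than a specific library file.
libc = ctypes.CDLL(None)

libc.posix_memalign.argtypes = [
    ctypes.POINTER(ctypes.c_void_p),  # void **memptr
    ctypes.c_size_t,                  # size_t alignment
    ctypes.c_size_t,                  # size_t size
]
libc.posix_memalign.restype = ctypes.c_int
libc.free.argtypes = [ctypes.c_void_p]
libc.free.restype = None


def aligned_alloc_probe(size: int, alignment: int = 64) -> int:
    """Allocate `size` bytes at `alignment`, free them, return the address.

    Hypothetical helper for illustration; the 64-byte default is an
    assumption, not a value taken from this thread.
    """
    ptr = ctypes.c_void_p()
    rc = libc.posix_memalign(ctypes.byref(ptr), alignment, size)
    if rc != 0:
        # posix_memalign reports failure via its return value, not errno.
        raise OSError(rc, "posix_memalign failed")
    address = ptr.value
    libc.free(ptr)
    return address


if __name__ == "__main__":
    addr = aligned_alloc_probe(1024)
    print(hex(addr), "aligned to 64 bytes:", addr % 64 == 0)
```

Running the same script with and without `LD_PRELOAD` set, alongside whatever tracing the preloaded allocator offers, is one way to compare the two cases.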
On Tue, Jun 14, 2022 at 10:27 AM John Muehlhausen wrote:
> A c
I'm using ARROW_DEFAULT_MEMORY_POOL=system
Based on a review of memory_pool.cc I expect this to become posix_memalign
calls on Linux
When I call posix_memalign in a .so that I created and linked with my app,
using LD_PRELOAD=/usr/local/lib/libmimalloc.so to run the app, these calls
get forwarded
My best guess at this moment is that the Arrow lib I'm using was built with
a compiler that had something like __builtin_posix_memalign in effect ??
I say this because deploying __builtin_malloc has the same deleterious
effect on my own .so
On Tue, Jun 14, 2022 at 10:53 AM John Muehlhausen wrote
A minimal build using the following seems to have solved my problem. The
various no-builtin params are guesswork based largely on alloc-override.c
from mimalloc. It would be nice if someone documented somewhere how to
turn off classes of builtins for each popular compiler or if this received
comp
Hi,
posix_memalign() in memory_pool.cc of libarrow-dev uses
jemalloc's posix_memalign() (je_posix_memalign()). Because
it's built with ARROW_JEMALLOC=ON (default) and
JEMALLOC_MANGLE
https://github.com/apache/arrow/blob/master/cpp/src/arrow/memory_pool.cc#L53
. So we can't use mimalloc with LD_PRE
Hi,
There is no objection. I'll remove
cpp/src/arrow/dbi/hiveserver2/:
https://issues.apache.org/jira/browse/ARROW-16832
Thanks,
--
kou
In <20220607.145634.286204450295433958@clear-code.com>
"Re: [C++] Can we remove cpp/src/arrow/dbi/hiveserver2?" on Tue, 07 Jun 2022
14:56:34 +0900 (JS
Thanks for the reply. I had disabled jemalloc
via ARROW_DEFAULT_MEMORY_POOL so that was not the issue.
The issue was (I think) that the arrow lib I was using was built with
compiler builtins (such as __builtin_posix_memalign) so that even the
system default allocator wasn't able to be intercepted
Hi,
I think that compiler builtins aren't related. Could you try
only with -DARROW_JEMALLOC=OFF?
Thanks,
--
kou
In
"Re: Custom default C++ memory pool on Linux, and/or interception/auditing of
system pool" on Tue, 14 Jun 2022 18:32:00 -0500,
John Muehlhausen wrote:
> Thanks for the reply
Hi,
Could you try https://github.com/apache/arrow/pull/13373 ?
This will work with -DARROW_JEMALLOC=ON because it doesn't
override posix_memalign() in the system memory pool even
when -DARROW_JEMALLOC=ON is specified.
Thanks,
--
kou
In <20220615.083854.1117478143326800670@clear-code.com>
Hi all,
I drafted a second PR [1] with a design for storing parsed information
obtained from a struct ArrowSchema (i.e., parsing the format string into
usable C structures). There are some unsolved problems that could use a
fresh perspective...all comments welcome!
[1] https://github.com/pale
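To make "parsing the format string" concrete, here is a rough sketch of what such a parser might look like. It is not the design in the PR, which targets C structures; Python is used only for brevity. `ParsedFormat` and `parse_format` are hypothetical names, and only a small subset of the Arrow C data interface format strings is handled (primitives, `w:`, `d:`, and `ts?:`).

```python
from dataclasses import dataclass
from typing import Optional

# Single-character primitive formats from the Arrow C data interface.
_PRIMITIVES = {
    "n": "null", "b": "bool",
    "c": "int8", "C": "uint8", "s": "int16", "S": "uint16",
    "i": "int32", "I": "uint32", "l": "int64", "L": "uint64",
    "e": "float16", "f": "float32", "g": "float64",
    "z": "binary", "Z": "large_binary", "u": "utf8", "U": "large_utf8",
}

# Third character of a timestamp format ("tss:", "tsm:", "tsu:", "tsn:").
_TIME_UNITS = {"s": "s", "m": "ms", "u": "us", "n": "ns"}


@dataclass
class ParsedFormat:
    """Hypothetical container for the parsed pieces of a format string."""
    type_name: str
    byte_width: Optional[int] = None   # fixed-size binary
    precision: Optional[int] = None    # decimal
    scale: Optional[int] = None        # decimal
    bit_width: Optional[int] = None    # decimal
    unit: Optional[str] = None         # timestamp
    timezone: Optional[str] = None     # timestamp


def parse_format(fmt: str) -> ParsedFormat:
    """Parse a subset of format strings; raise ValueError for the rest."""
    if fmt in _PRIMITIVES:
        return ParsedFormat(_PRIMITIVES[fmt])
    if fmt.startswith("w:"):
        return ParsedFormat("fixed_size_binary", byte_width=int(fmt[2:]))
    if fmt.startswith("d:"):
        parts = [int(p) for p in fmt[2:].split(",")]
        if len(parts) not in (2, 3):
            raise ValueError(f"malformed decimal format: {fmt!r}")
        # An omitted bit width means decimal128.
        bit_width = parts[2] if len(parts) == 3 else 128
        return ParsedFormat("decimal", precision=parts[0],
                            scale=parts[1], bit_width=bit_width)
    if (fmt.startswith("ts") and len(fmt) >= 4
            and fmt[2] in _TIME_UNITS and fmt[3] == ":"):
        # Everything after the colon is the timezone; it may be empty.
        return ParsedFormat("timestamp", unit=_TIME_UNITS[fmt[2]],
                            timezone=fmt[4:] or None)
    raise ValueError(f"unsupported format string: {fmt!r}")
```

For example, `parse_format("d:19,10")` yields precision 19, scale 10, and a bit width of 128, while nested types such as `+l` are deliberately left unsupported in this sketch.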