Wes McKinney created ARROW-1142:
---
Summary: [C++] Move over compression library toolchain from
parquet-cpp
Key: ARROW-1142
URL: https://issues.apache.org/jira/browse/ARROW-1142
Project: Apache Arrow
Kevin Grealish created ARROW-1141:
---
Summary: on import get libjemalloc.so.2: cannot allocate memory in
static TLS block
Key: ARROW-1141
URL: https://issues.apache.org/jira/browse/ARROW-1141
Project: Apache Arrow
Phillip Cloud created ARROW-1140:
Summary: [C++] Allow optional build of plasma
Key: ARROW-1140
URL: https://issues.apache.org/jira/browse/ARROW-1140
Project: Apache Arrow
Issue Type: Improvement
Phillip Cloud created ARROW-1139:
Summary: dlmalloc doesn't allow arrow to be built with clang 4 or
gcc 7.1.1
Key: ARROW-1139
URL: https://issues.apache.org/jira/browse/ARROW-1139
Project: Apache Arrow
Hello,
I am writing some code that interacts with the Arrow Java library in Apache
Spark and am trying to understand the best way to use and manage buffer
allocators. I am wondering:
(1) Is the buffer allocator thread-safe?
(2) Should I create a root allocator (maybe one per JVM?) to allocate all
memory?
T
If you want to use pure Python, you should probably just use the s3fs
package. We should be able to get better throughput using C++ (using
multithreading to make multiple requests for larger reads) -- the AWS
C++ SDK probably has everything we need to build a really strong
implementation.
I am using pa.PythonFile() to wrap the file-like object provided by the s3fs
package. I am able to write Parquet files directly to S3 this way. I am not
reading with pyarrow (I am reading gzipped CSVs with plain Python), but I
imagine it would work much the same.