Re: Flight benchmark question

2020-06-16 Thread Yibo Cai
Found a way to achieve reasonable benchmark results with multiple threads. Diff pasted below for a quick review or trial. Tested on E5-2650 with this change: num_threads = 1, speed = 1996; num_threads = 2, speed = 3555; num_threads = 4, speed = 5828. When running `arrow_flight_benchmark`, I find there

Re: C++ Write Schema with RecordBatchStreamWriter

2020-06-16 Thread Rares Vernica
Thanks a lot, Wes! That was the issue. Good catch! On Tue, Jun 16, 2020 at 9:39 AM Wes McKinney wrote: > It looks like on Python 2.7 that the open_stream/open_file functions > are treating the file name that you are passing as a binary buffer > rather than a file path (inferring from the fact th

Re: C++ Write Schema with RecordBatchStreamWriter

2020-06-16 Thread Wes McKinney
It looks like on Python 2.7 the open_stream/open_file functions are treating the file name that you are passing as a binary buffer rather than a file path (inferring from the fact that '1' is one byte in Py2.7 and 'foo' is 3 bytes). Try passing an open file handle instead. On Tue, Jun 16, 2020
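The ambiguity described in this message can be illustrated with a minimal sketch. It assumes a reader that chooses its behavior from the argument type; `open_source` is a hypothetical stand-in written for this illustration and is not PyArrow's implementation. On Python 2.7, `str` and `bytes` are the same type, so a path string can fall into the "buffer" branch. The sketch simulates that on Python 3 by passing the path as `bytes`.

```python
import io
import os
import tempfile


def open_source(source):
    """Hypothetical dispatcher (not PyArrow's API): decide how to read `source`.

    bytes         -> treated as an in-memory buffer holding the data itself
    str           -> treated as a file path
    anything else -> assumed to be an already-open file-like object
    """
    if isinstance(source, bytes):
        return io.BytesIO(source)
    if isinstance(source, str):
        return open(source, "rb")
    return source


# Create a small file to read back.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "foo")
with open(path, "wb") as f:
    f.write(b"payload")

# Python 2.7 behavior, simulated: the path arrives as bytes, so the
# "data" that gets read is the file name itself, not the file contents.
as_buffer = open_source(path.encode()).read()

# An open file handle is unambiguous regardless of the Python version.
with open(path, "rb") as handle:
    as_handle = open_source(handle).read()
```

Here `as_buffer` holds the bytes of the path and `as_handle` holds the file contents, which is consistent with the observation that a name such as 'foo' was seen as a 3-byte input.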

Re: C++ Write Schema with RecordBatchStreamWriter

2020-06-16 Thread Rares Vernica
Thank you for your help in getting to the bottom of this. It seems that the problem is not in the C++ code but in the PyArrow/Python 2.7 combination. Here are more details. I have two C++ programs writing two Arrow files. The first one is the bigger plugin I'm attempting to port and the second o

Re: [DISCUSS] [C++] custom allocator for large objects

2020-06-16 Thread Rémi Dettai
Hi Antoine and all! Sorry for the delay, I wanted to understand things a bit better before getting back to you. As discussed, I focussed on the Parquet case. I've looked into parquet/encoding.cc to see what could be done to improve memory reservation for ByteArrays. Along the way, I noti

[NIGHTLY] Arrow Build Report for Job nightly-2020-06-16-0

2020-06-16 Thread Crossbow
Arrow Build Report for Job nightly-2020-06-16-0 All tasks: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-06-16-0 Failed Tasks: - test-conda-python-3.7-dask-latest: URL: https://github.com/ursa-labs/crossbow/branches/all?query=nightly-2020-06-16-0-github-test-conda-py