Thanks a lot, Wes! That was the issue. Good catch!
On Tue, Jun 16, 2020 at 9:39 AM Wes McKinney wrote:
It looks like on Python 2.7 that the open_stream/open_file functions
are treating the file name that you are passing as a binary buffer
rather than a file path (inferring from the fact that '1' is one byte
in Py2.7 and 'foo' is 3 bytes). Try passing an open file handle
instead
On Tue, Jun 16, 2020
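To illustrate the inference above, here is a minimal sketch of how a loader that accepts paths, buffers, and file handles can confuse a path with a buffer on Python 2.7, where `str` and `bytes` are the same type. The function `classify_source` is hypothetical and written only for this illustration; it is not PyArrow's actual dispatch code.

```python
import io


def classify_source(source):
    """Hypothetical dispatch, for illustration only (not PyArrow's code)."""
    # On Python 2.7, str is bytes, so a plain 'foo' path lands in this branch
    # and is treated as a 3-byte buffer. On Python 3 only real bytes do.
    if isinstance(source, bytes):
        return "buffer of %d bytes" % len(source)
    # An open file handle is unambiguous on both Python versions.
    if hasattr(source, "read"):
        return "file handle"
    return "file path"


buffer_kind = classify_source(b"foo")
handle_kind = classify_source(io.BytesIO(b""))
path_kind = classify_source(u"/tmp/foo")
```

Passing an already opened handle, for example the result of `open('/tmp/foo', 'rb')`, to `pyarrow.ipc.open_stream` avoids the ambiguity, which is the workaround suggested above.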
Thank you for your help in getting to the bottom of this. It seems that
the problem is not in the C++ code but in the PyArrow/Python 2.7
combination.
Here are more details. I have two C++ programs writing two Arrow files. The
first one is the bigger plugin I'm attempting to port and the second o
Hi Rares,
This last issue sounds like you are trying to write data from version
0.16.0 of the library and read it from a pre-0.15.0 version of the Python
library. If you want to do this, you need to set "bool
write_legacy_ipc_format" to true on IpcWriterOptions/IpcOptions object and
construct the
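One way to check which framing a stream file uses is to look at its first bytes. Since Arrow 0.15.0, each encapsulated IPC message starts with a 32-bit continuation marker 0xFFFFFFFF followed by a 32-bit little-endian metadata length, while the legacy format starts directly with the length. The sketch below uses only the standard library; the helper name `describe_ipc_prefix` is made up for this example, and it assumes the stream format (not the "ARROW1" file format).

```python
import struct

CONTINUATION_MARKER = 0xFFFFFFFF


def describe_ipc_prefix(data):
    """Classify the first bytes of an Arrow IPC stream (illustrative helper)."""
    if len(data) < 4:
        return "too short"
    (first,) = struct.unpack_from("<I", data, 0)
    if first == CONTINUATION_MARKER:
        # 0.15.0 and later: marker first, then the metadata length.
        if len(data) < 8:
            return "too short"
        (size,) = struct.unpack_from("<i", data, 4)
        return "0.15+ framing, metadata size %d" % size
    # Pre-0.15.0 (legacy): the first four bytes are the metadata length.
    return "legacy framing, metadata size %d" % first
```

For a real file, something like `describe_ipc_prefix(open('/tmp/foo', 'rb').read(8))` would report which framing the writer produced.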
With open_stream I get a different error:
> python -c "import pyarrow; pyarrow.ipc.open_stream('/tmp/foo')"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/pyarrow/ipc.py", line 137, in open_stream
    return RecordBatchStreamReader(source)
On Mon, Jun 15, 2020 at 11:24 PM Rares Vernica wrote:
I was able to reproduce my issue in a small, fully contained program. Here
is the source code:
#include
#include
#include
#include
arrow::Status foo() {
std::shared_ptr arrowStream;
std::shared_ptr arrowWriter;
std::shared_ptr arrowBatch;
std::shared_ptr arrowReader;
std::vector>
This is the compiler:
> g++ --version
g++ (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
And this is how I compile the code:
g++ -W -Wextra -Wall -Wno-unused-parameter -Wno-variadic-macros
-Wno-strict-aliasing -Wno-long-long -Wno-unused -fPIC -D_STDC_FORMAT_MACROS
-Wno-system-headers -O3 -g -DN
What compiler are you using?
In 0.16.0 (what you said you were targeting, though it would be better
for you to upgrade to 0.17.1) schema is written in the CheckStarted
function here
https://github.com/apache/arrow/blob/apache-arrow-0.16.0/cpp/src/arrow/ipc/writer.cc#L972
Status CheckStarted() {
Sure, here is briefly what I'm doing:
bool append = false;
std::shared_ptr<arrow::io::FileOutputStream> arrowStream;
auto arrowResult = arrow::io::FileOutputStream::Open(fileName, append);
arrowStream = arrowResult.ValueOrDie();
std::shared_ptr<arrow::ipc::RecordBatchWriter> arrowWriter;
std::shared_ptr<arrow::RecordBatch> arrowBatch;
std::shared_
Can you show the code you are writing? The first thing the stream writer
does before writing any record batch is write the schema. It sounds like
you are using arrow::ipc::WriteRecordBatch somewhere.
On Sun, Jun 14, 2020, 11:44 PM Rares Vernica wrote:
Hello,
I have a RecordBatch that I would like to write to a file. I'm using
FileOutputStream::Open to open the file and RecordBatchStreamWriter::Open
to open the stream. I write a record batch with WriteRecordBatch. Finally,
I close the RecordBatchWriter and OutputStream.
The resulting file size