hi Rares,

Like I said, the files should be forward compatible. Can you open a
JIRA issue and give code to reproduce the issue?

Thanks

On Mon, May 13, 2019 at 7:43 AM Rares Vernica <rvern...@gmail.com> wrote:
>
> Hi Wes,
>
> Thanks for your answer. I finally got to test this out. To recap, I'm
> writing Arrow files from C++ using Arrow 0.9.0.
>
> Then, I'm trying to read these files from Python. I tried Python 2.7.15 and
> PyArrow 0.10.0 to 0.13.0. In all these cases I get an error. (PyArrow 0.9.0
> works fine, as expected.)
>
> > python2 -c "import pyarrow; pyarrow.ipc.open_stream('/tmp/foo').read_all()"
> Traceback (most recent call last):
>   File "<string>", line 1, in <module>
>   File "/home/foo/.local/lib/python2.7/site-packages/pyarrow/ipc.py", line 123, in open_stream
>     return RecordBatchStreamReader(source)
>   File "/home/foo/.local/lib/python2.7/site-packages/pyarrow/ipc.py", line 58, in __init__
>     self._open(source)
>   File "pyarrow/ipc.pxi", line 312, in pyarrow.lib._RecordBatchStreamReader._open
>   File "pyarrow/error.pxi", line 81, in pyarrow.lib.check_status
> pyarrow.lib.ArrowInvalid: Expected to read 1886221359 metadata bytes, but only read 8
>
> > python2 -c "import pyarrow; pyarrow.RecordBatchStreamReader('/tmp/foo').read_all()"
> Traceback (most recent call last):
>   File "<string>", line 1, in <module>
>   File "/home/foo/.local/lib/python2.7/site-packages/pyarrow/ipc.py", line 58, in __init__
>     self._open(source)
>   File "pyarrow/ipc.pxi", line 312, in pyarrow.lib._RecordBatchStreamReader._open
>   File "pyarrow/error.pxi", line 81, in pyarrow.lib.check_status
> pyarrow.lib.ArrowInvalid: Expected to read 1886221359 metadata bytes, but only read 8
>
> On the other hand, on Python 3 all these cases work fine.
>
> Thanks!
> Rares
>
>
> On Mon, Mar 11, 2019 at 7:16 AM Wes McKinney <wesmck...@gmail.com> wrote:
>
> > hi Rares -- IPC messages produced by 0.9.0 should be forward
> > compatible. I opened https://issues.apache.org/jira/browse/ARROW-921
> > some time ago about adding some tools to integration test one version
> > versus another to obtain hard proof of this, but this work has not
> > been completed yet (any takers?).
> >
> > Have you encountered any problems?
> >
> > Thanks,
> > Wes
> >
> > On Sun, Mar 10, 2019 at 11:49 PM Rares Vernica <rvern...@gmail.com> wrote:
> > >
> > > Hello,
> > >
> > > I have a C++ library using Arrow 0.9.0 to serialize data. The code looks
> > > like this:
> > >
> > > std::shared_ptr<arrow::RecordBatch> arrowBatch;
> > > arrowBatch = arrow::RecordBatch::Make(_arrowSchema, nCells, _arrowArrays);
> > >
> > > std::shared_ptr<arrow::PoolBuffer> arrowBuffer(new arrow::PoolBuffer(_arrowPool));
> > > arrow::io::BufferOutputStream arrowStream(arrowBuffer);
> > >
> > > std::shared_ptr<arrow::ipc::RecordBatchWriter> arrowWriter;
> > > arrow::ipc::RecordBatchStreamWriter::Open(&arrowStream, _arrowSchema, &arrowWriter);
> > >
> > > arrowWriter->WriteRecordBatch(*arrowBatch);
> > > ...
> > > reinterpret_cast<const char*>(arrowBuffer->data()), arrowBuffer->size())
> > > ...
> > >
> > > The output bytes are then read in Python using pyarrow:
> > >
> > > pyarrow.RecordBatchStreamReader(pyarrow.BufferReader(buf)).read_pandas()
> > >
> > > Since the C++ side uses Arrow 0.9.0, I have been using pyarrow==0.9.0.
> > > When using Python 3.7, getting pyarrow=0.9.0 is not easy since there are
> > > no pre-compiled .whl packages on PyPI.
> > >
> > > I wonder if I could use newer pyarrow versions to parse the Arrow 0.9.0
> > > output? Is the format compatible?
> > >
> > > Thanks!
> > > Rares
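
A note on the Python 2 traceback quoted above, offered as one possible reading rather than a confirmed diagnosis: the odd-looking length 1886221359 in the ArrowInvalid message can be decoded with nothing but the standard library. The sketch below assumes the reader took the first four bytes of its input as a little-endian 32-bit length prefix; the variable names are illustrative only.

```python
import struct

# Length reported in the ArrowInvalid message from the traceback.
reported_length = 1886221359

# The argument that was passed to open_stream / RecordBatchStreamReader.
path = b"/tmp/foo"

# Re-encode the reported length as a little-endian 32-bit integer
# (assumption: this is how the 4-byte length prefix was interpreted).
prefix = struct.pack("<i", reported_length)
print(prefix)

# The same four bytes are the start of the path string itself.
matches_path = struct.unpack("<i", path[:4])[0] == reported_length
print(matches_path, len(path))
```

If the decoded prefix is the start of the path string, that would suggest that on Python 2, where `str` is a byte string, the path was treated as the stream contents rather than as a filename; the string is also 8 bytes long, which appears consistent with "only read 8". In that case the files themselves may be fine, and passing an already opened file (for example `open('/tmp/foo', 'rb')` or `pyarrow.OSFile('/tmp/foo')`) instead of the bare string might be worth trying. This has not been tested against the versions mentioned in the thread.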