hi Rares,

Like I said, the files should be forward compatible. Can you open a
JIRA issue and include code to reproduce the problem?
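
One detail in the traceback quoted below may help narrow this down: 1886221359 is 0x706D742F, which is the four ASCII bytes "/tmp" read as a little-endian 32-bit integer, and "/tmp/foo" is 8 bytes long. That suggests (as a guess, not something confirmed in this thread) that under Python 2 the str argument is being treated as the stream contents rather than as a file path. A minimal sketch of the arithmetic, using only the standard library; the helper name length_prefix is hypothetical and only for illustration:

```python
import struct


def length_prefix(data):
    """Interpret the first 4 bytes of `data` as a little-endian int32.

    This mimics how a length-prefixed stream reader would interpret
    the start of its input. It is an illustration only, not pyarrow code.
    """
    return struct.unpack("<i", data[:4])[0]


# The path from the failing command, taken as raw bytes instead of a path.
path_bytes = b"/tmp/foo"

print(length_prefix(path_bytes))  # matches the "metadata bytes" in the error
print(len(path_bytes))            # matches the "only read 8" in the error
```

If that is what is happening, passing an already opened file (for example pyarrow.OSFile('/tmp/foo')) instead of the bare string may sidestep it; that has not been verified against these versions.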

Thanks

On Mon, May 13, 2019 at 7:43 AM Rares Vernica <rvern...@gmail.com> wrote:
>
> Hi Wes,
>
> Thanks for your answer. I finally got to test this out. To recap, I'm
> writing Arrow files from C++ using Arrow 0.9.0.
>
> Then, I'm trying to read these files from Python. I tried Python 2.7.15 and
> PyArrow 0.10.0 to 0.13.0. In all these cases I get an error. (PyArrow 0.9.0
> works fine, as expected)
>
> > python2 -c "import pyarrow; pyarrow.ipc.open_stream('/tmp/foo').read_all()"
> Traceback (most recent call last):
>   File "<string>", line 1, in <module>
>   File "/home/foo/.local/lib/python2.7/site-packages/pyarrow/ipc.py", line 123, in open_stream
>     return RecordBatchStreamReader(source)
>   File "/home/foo/.local/lib/python2.7/site-packages/pyarrow/ipc.py", line 58, in __init__
>     self._open(source)
>   File "pyarrow/ipc.pxi", line 312, in pyarrow.lib._RecordBatchStreamReader._open
>   File "pyarrow/error.pxi", line 81, in pyarrow.lib.check_status
> pyarrow.lib.ArrowInvalid: Expected to read 1886221359 metadata bytes, but only read 8
>
> > python2 -c "import pyarrow; pyarrow.RecordBatchStreamReader('/tmp/foo').read_all()"
> Traceback (most recent call last):
>   File "<string>", line 1, in <module>
>   File "/home/foo/.local/lib/python2.7/site-packages/pyarrow/ipc.py", line 58, in __init__
>     self._open(source)
>   File "pyarrow/ipc.pxi", line 312, in pyarrow.lib._RecordBatchStreamReader._open
>   File "pyarrow/error.pxi", line 81, in pyarrow.lib.check_status
> pyarrow.lib.ArrowInvalid: Expected to read 1886221359 metadata bytes, but only read 8
>
> On the other hand, on Python 3 all these cases work fine.
>
> Thanks!
> Rares
>
>
> On Mon, Mar 11, 2019 at 7:16 AM Wes McKinney <wesmck...@gmail.com> wrote:
>
> > hi Rares -- IPC messages produced by 0.9.0 should be forward
> > compatible. I opened https://issues.apache.org/jira/browse/ARROW-921
> > some time ago about adding some tools to integration test one version
> > versus another to obtain hard proof of this, but this work has not
> > been completed yet (any takers?).
> >
> > Have you encountered any problems?
> >
> > Thanks,
> > Wes
> >
> > On Sun, Mar 10, 2019 at 11:49 PM Rares Vernica <rvern...@gmail.com> wrote:
> > >
> > > Hello,
> > >
> > > I have a C++ library using Arrow 0.9.0 to serialize data. The code looks
> > > like this:
> > >
> > > std::shared_ptr<arrow::RecordBatch> arrowBatch;
> > > arrowBatch = arrow::RecordBatch::Make(_arrowSchema, nCells, _arrowArrays);
> > >
> > > std::shared_ptr<arrow::PoolBuffer> arrowBuffer(new arrow::PoolBuffer(_arrowPool));
> > > arrow::io::BufferOutputStream arrowStream(arrowBuffer);
> > >
> > > std::shared_ptr<arrow::ipc::RecordBatchWriter> arrowWriter;
> > > arrow::ipc::RecordBatchStreamWriter::Open(&arrowStream, _arrowSchema, &arrowWriter);
> > >
> > > arrowWriter->WriteRecordBatch(*arrowBatch);
> > > ...
> > > reinterpret_cast<const char*>(arrowBuffer->data()), arrowBuffer->size())
> > > ...
> > >
> > > The output bytes are then read in Python using pyarrow:
> > >
> > > pyarrow.RecordBatchStreamReader(pyarrow.BufferReader(buf)).read_pandas()
> > >
> > > Since the C++ side uses Arrow 0.9.0, I have been using pyarrow==0.9.0.
> > > When using Python 3.7, getting pyarrow==0.9.0 is not easy since there
> > > are no pre-compiled .whl packages on PyPI.
> > >
> > > I wonder if I could use newer pyarrow versions to parse the Arrow 0.9.0
> > > output? Is the format compatible?
> > >
> > > Thanks!
> > > Rares
> >
