Thanks Wes,

I confirmed this is fixed in master. In the future, I'll check against 
master first if we come across anything. We would certainly be interested in 
using the nightly builds. We do not use conda at the moment, so it may be best 
for us to become more familiar with it. I had to wait until I got home to use 
my Mac, because I could not get the build working properly on either CentOS or 
Fedora, possibly because I used pip. It failed when I tried to run py.test.

Also, thanks very much for the planning document posted in December. That has 
been an excellent resource.

Best, Ryan


-----Original Message-----
From: Wes McKinney [mailto:wesmck...@gmail.com] 
Sent: Tuesday, January 8, 2019 3:16 PM
To: dev@arrow.apache.org
Subject: Re: RecordBatchFile with no batches, Error: Pyarrow.lib.ArrowInvalid: 
File is smaller than indicated metadata size.

I think I fixed this in master. Are you able to build from source to try it out?

I am hopeful that sometime this year my team and I can provide a conda
channel with nightly Arrow builds to help with testing and development.

On Tue, Jan 8, 2019 at 1:49 PM White4, Ryan (STATCAN)
<ryan.whi...@canada.ca> wrote:
>
> Hi,
>
> I get an error when writing a file with no record batches. I came across this 
> when implementing a simple way to spill the buffer to disk automatically 
> (this is potentially coming in release 0.12?).
>
> I'm using pyarrow 0.11.
> Is there a JIRA related to this, or is there a problem in the simple example 
> below?
>
> import pyarrow as pa
>
> my_schema = pa.schema([('field0', pa.int32())])
> sink = pa.BufferOutputStream()
> writer = pa.RecordBatchFileWriter(sink, my_schema)
> writer.close()
> buf = sink.getvalue()
>
> reader = pa.open_file(buf)
> print(reader.schema)
> print(reader.num_record_batches)
>
> Traceback...
> reader = pa.open_file(buf)
> pyarrow/ipc.py, line 142, in open_file
> return RecordBatchFileReader(source, footer_offset=footer_offset)
> pyarrow/ipc.py, line 89, in __init__
> self._open(source, footer_offset=footer_offset)
> pyarrow/ipc.pxi, line 352
> pyarrow/error.pxi, line 81
> pyarrow.lib.ArrowInvalid: File is smaller than indicated metadata size.
>
> Thanks,
> Ryan
>
>
> Ryan Mackenzie White, Ph. D.
>
> Senior Research Analyst - Administrative Data Division, Analytical Studies, 
> Methodology and Statistical Infrastructure Field
> Statistics Canada / Government of Canada
> ryan.whi...@canada.ca<mailto:ryan.whi...@canada.ca> / Tel: 613-608-0015
>
> Analyste principal de recherche- Division des données administratives, 
> Secteur des études analytiques, de la méthodologie et de l'infrastructure 
> statistique
> Statistique Canada / Gouvernement du Canada
> ryan.whi...@canada.ca<mailto:ryan.whi...@canada.ca> / Tél. : 613-608-0015
>
>
>
>
