Hi, I get an error when reading back a file that was written with no record batches. I came across this while implementing a simple way to spill the buffer to disk automatically (is this potentially coming in release 0.12?).
I'm using pyarrow 0.11. Is there a JIRA related to this, or is there a problem in the simple example below?

    my_schema = pa.schema([('field0', pa.int32())])
    sink = pa.BufferOutputStream()
    writer = pa.RecordBatchFileWriter(sink, my_schema)
    writer.close()
    buf = sink.getvalue()
    reader = pa.open_file(buf)
    print(reader.schema)
    print(reader.num_record_batches)

Traceback (most recent call last):
  ...
    reader = pa.open_file(buf)
  File "pyarrow/ipc.py", line 142, in open_file
    return RecordBatchFileReader(source, footer_offset=footer_offset)
  File "pyarrow/ipc.py", line 89, in __init__
    self._open(source, footer_offset=footer_offset)
  File "pyarrow/ipc.pxi", line 352
  File "pyarrow/error.pxi", line 81
pyarrow.lib.ArrowInvalid: File is smaller than indicated metadata size

Thanks,
Ryan

Ryan Mackenzie White, Ph.D.
Senior Research Analyst - Administrative Data Division, Analytical Studies, Methodology and Statistical Infrastructure Field
Statistics Canada / Government of Canada
ryan.whi...@canada.ca / Tel: 613-608-0015