Chase Slater created ARROW-1067:
-----------------------------------

             Summary: Write to parquet with InMemoryOutputStream
                 Key: ARROW-1067
                 URL: https://issues.apache.org/jira/browse/ARROW-1067
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 0.4.0
         Environment: Debian 8.5, Anaconda Python 3.6, pyarrow 0.4.0a0
            Reporter: Chase Slater
            Priority: Minor
When I run the following (taken from the docs), Python crashes during the pq.write_table call. How would I go about writing a parquet file to an in-memory buffer (e.g. for use with Azure Data Lake)?

{code}
import pyarrow as pa
import pyarrow.parquet as pq

table = pa.Table.from_pandas(df, timestamps_to_ms=True)

with adl.open(my_file_path, 'wb') as f:
    output = pa.InMemoryOutputStream()
    pq.write_table(table, output)  # crashes here
    f.write(output.get_result().to_pybytes())
{code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
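For reference, a minimal self-contained sketch of the intended in-memory write, using the API names from later pyarrow releases (where InMemoryOutputStream became BufferOutputStream and get_result() became getvalue()); the reporter's adl file object is replaced here with a hypothetical plain local file for illustration:

{code}
import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq

df = pd.DataFrame({"a": [1, 2, 3]})
table = pa.Table.from_pandas(df)

# Write the table into an in-memory sink rather than a file path.
sink = pa.BufferOutputStream()
pq.write_table(table, sink)

# getvalue() returns a pa.Buffer holding the finished Parquet file;
# to_pybytes() copies it out as plain Python bytes for any file-like API.
data = sink.getvalue().to_pybytes()

# Hypothetical destination; stands in for the adl file object in the report.
with open("example.parquet", "wb") as f:
    f.write(data)
{code}

The key point is that write_table accepts any sink, so the bytes can be produced entirely in memory and then handed to whatever remote-filesystem client is in use.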