[ https://issues.apache.org/jira/browse/ARROW-4774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17661796#comment-17661796 ]
Rok Mihevc commented on ARROW-4774:
-----------------------------------

This issue has been migrated to [issue #21293|https://github.com/apache/arrow/issues/21293] on GitHub. Please see the [migration documentation|https://github.com/apache/arrow/issues/14542] for further details.

> [C++][Parquet] Call Table::Validate when writing a table
> --------------------------------------------------------
>
>                 Key: ARROW-4774
>                 URL: https://issues.apache.org/jira/browse/ARROW-4774
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>    Affects Versions: 0.11.1, 0.12.1
>         Environment: Windows 10 16299.431, Python 3.6.4 64-bit, pyarrow 0.11.1
> Windows Linux (WSL) Ubuntu 18.04.1, Python 3.6.5 64-bit, pyarrow 0.12.1
>            Reporter: Stephen Gallagher
>            Assignee: Francois Saint-Jacques
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 0.13.0
>
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> When writing a table to a parquet file that contains both flat arrays of
> different leng
> Reproducer:
> {code:python}
> import pyarrow as pa
> import pyarrow.parquet as pq
> import numpy as np
> array1 = np.array([0, 1, 2], dtype=np.uint8)
> array2 = np.array([[0,1,2], [3, 4, 5]], dtype=np.uint8).T
> t1 = pa.uint8()
> t2 = pa.list_(pa.uint8())
> fields = [
>     pa.field('a1', t1),
>     pa.field('a2', t2)
> ]
> myschema = pa.schema(fields)
> mytable = pa.Table.from_arrays([
>     pa.array(array1, type=t1),
>     pa.array([array2[:,0], array2[:,1]], type=t2)],
>     schema=myschema)
> pq.write_table(mytable, 'example.parquet')
> {code}
> Windows 10 (Python 3.6.4 64-bit, pyarrow 0.11.1) crash code:
> {code:bash}
> Process finished with exit code -1073741819 (0xC0000005)
> {code}
> WSL (Python 3.6.5 64-bit, pyarrow 0.12.1) Crash code:
> {code:bash}
> Segmentation fault (core dumped)
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
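
Note on the reproducer in the quoted issue: column 'a1' is built from 3 values, while column 'a2' is built from 2 list values, so the two columns passed to Table.from_arrays differ in length. The following is a minimal sketch of a caller-side length check, written in plain Python so that it does not depend on any particular pyarrow version. The helper name check_equal_lengths is hypothetical; it is not part of pyarrow and is not the fix that was applied in Arrow.

```python
def check_equal_lengths(names, columns):
    """Raise ValueError unless every column has the same length.

    Hypothetical helper for illustration only; not a pyarrow API.
    """
    lengths = {name: len(col) for name, col in zip(names, columns)}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"columns have different lengths: {lengths}")
    return lengths


# Same shapes as the reproducer: 'a1' holds 3 values, 'a2' holds 2 lists.
a1 = [0, 1, 2]
a2 = [[0, 1, 2], [3, 4, 5]]

try:
    check_equal_lengths(["a1", "a2"], [a1, a2])
    message = "ok"
except ValueError as exc:
    # The mismatch is reported as an ordinary Python exception.
    message = str(exc)
```

Running a check like this before building and writing the table would turn the length mismatch into a readable error rather than leaving it to be discovered by the Parquet writer.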