Jeff Reback created ARROW-1286:
----------------------------------

             Summary: PYTHON: support Categorical serialization to/from parquet
                 Key: ARROW-1286
                 URL: https://issues.apache.org/jira/browse/ARROW-1286
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Python
            Reporter: Jeff Reback

Related to https://issues.apache.org/jira/browse/ARROW-439

pandas Categorical types are not implemented for parquet writing.

Minimal example, with pandas 0.20.3 & pyarrow 0.5.0:

{code}
In [1]: df = pd.DataFrame({'a': pd.Categorical(list('abc'))})

In [2]: df.dtypes
Out[2]:
a    category
dtype: object

In [4]: import pyarrow

In [5]: import pyarrow.parquet

In [6]: table = pyarrow.Table.from_pandas(df, timestamps_to_ms=True)
   ...: pyarrow.parquet.write_table(
   ...:     table, 'foo.pq')
   ...:
   ...:
---------------------------------------------------------------------------
ArrowNotImplementedError                  Traceback (most recent call last)
<ipython-input-6-4512e9a2e15e> in <module>()
      1 table = pyarrow.Table.from_pandas(df, timestamps_to_ms=True)
      2 pyarrow.parquet.write_table(
----> 3     table, 'foo.pq')
      4

/Users/jreback/miniconda3/envs/pandas/lib/python3.6/site-packages/pyarrow/parquet.py in write_table(table, where, row_group_size, version, use_dictionary, compression, use_deprecated_int96_timestamps, **kwargs)
    770                 version=version,
    771                 use_deprecated_int96_timestamps=use_deprecated_int96_timestamps)
--> 772         writer = ParquetWriter(where, table.schema, **options)
    773         writer.write_table(table, row_group_size=row_group_size)
    774         writer.close()

_parquet.pyx in pyarrow._parquet.ParquetWriter.__cinit__()

error.pxi in pyarrow.lib.check_status()

ArrowNotImplementedError: NotImplemented: unhandled type
{code}
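
Until this is supported, one possible stopgap is to cast categorical columns to plain object dtype before building the Arrow table. The sketch below uses only pandas; `decategorize` is a hypothetical helper name, not part of pandas or pyarrow, and it has not been checked against pyarrow 0.5.0. It gives up the categorical dtype, so a round trip through parquet would come back as ordinary values and would not preserve the categories.

```python
import pandas as pd


def decategorize(df):
    """Return a copy of df with every categorical column cast to object.

    Hypothetical helper for illustration only.
    """
    out = df.copy()
    # select_dtypes picks out only the columns whose dtype is 'category'
    for col in out.select_dtypes(include=['category']).columns:
        out[col] = out[col].astype(object)
    return out


df = pd.DataFrame({'a': pd.Categorical(list('abc'))})
plain = decategorize(df)

# The original frame is left untouched; the copy holds plain Python strings.
print(df.dtypes)
print(plain.dtypes)
```

The resulting `plain` frame could then be passed to `pyarrow.Table.from_pandas` in place of `df` in the example above.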