[ https://issues.apache.org/jira/browse/ARROW-18099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Uwe Korn reassigned ARROW-18099:
--------------------------------

    Assignee: Damian Barabonkov

> [Python] Cannot create pandas categorical from table only with nulls
> --------------------------------------------------------------------
>
>                 Key: ARROW-18099
>                 URL: https://issues.apache.org/jira/browse/ARROW-18099
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 9.0.0
>         Environment: OSX 12.6
> M1 silicon
>            Reporter: Damian Barabonkov
>            Assignee: Damian Barabonkov
>            Priority: Minor
>              Labels: pull-request-available, python-conversion
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> A pyarrow Table column that contains only null values cannot be converted
> to a pandas DataFrame with that column as a categorical. However, pandas
> does support "empty" categoricals. A simple workaround is therefore to load
> the pa.Table with the column as object dtype first and then convert it,
> once in pandas, to a categorical, which will be empty. That does not,
> however, fix the pyarrow bug at its root.
>
> Sample reproducible example
> {code:java}
> import pyarrow as pa
> pylist = [{'x': None, '__index_level_0__': 2}, {'x': None,
> '__index_level_0__': 3}]
> tbl = pa.Table.from_pylist(pylist)
>
> # Errors
> df_broken = tbl.to_pandas(categories=["x"])
>
> # Works
> df_works = tbl.to_pandas()
> df_works = df_works.astype({"x": "category"})
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)