Uwe L. Korn created ARROW-2428:
----------------------------------
Summary: [Python] Support ExtensionArrays in to_pandas conversion
Key: ARROW-2428
URL: https://issues.apache.org/jira/browse/ARROW-2428
Project: Apache Arrow
Issue Type: Improvement
Components: Python
Reporter: Uwe L. Korn
Fix For: 1.0.0
With the next release of Pandas, it will be possible to define custom column
types that back a {{pandas.Series}}. As a result, the {{to_pandas}} conversion
cannot cover all possible column types by default, because we won't be aware
of every extension array.
To enable users to create {{ExtensionArray}} instances from Arrow columns in
the {{to_pandas}} conversion, we should provide a hook in the {{to_pandas}}
call where they can override the default conversion routines with ones that
produce their {{ExtensionArray}} instances.
This should avoid the additional copies we incur today, where we first
convert the Arrow column into a default Pandas column (probably of object type)
and the user afterwards converts it to a more efficient
{{ExtensionArray}}. The hook will be especially useful when building
{{ExtensionArray}} implementations whose storage is backed by Arrow.
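To make the idea concrete, here is a minimal pure-Python sketch of such a hook. It only illustrates the shape of the proposal and is not an existing pyarrow or Pandas API: {{Column}}, {{IPArray}}, {{to_pandas_sketch}} and the {{converters}} argument are all hypothetical names, and plain lists stand in for Arrow buffers and Pandas columns.

```python
from typing import Callable, Dict, List, Optional


class Column:
    """Hypothetical stand-in for an Arrow column: a type name plus its values."""

    def __init__(self, name: str, type_name: str, values: list):
        self.name = name
        self.type_name = type_name
        self.values = values


class IPArray:
    """Hypothetical ExtensionArray-like container that wraps the column storage."""

    def __init__(self, storage: list):
        self.storage = storage  # kept by reference, no copy


def default_convert(column: Column) -> list:
    # Stand-in for the default path: build a new generic container
    # (comparable to an object-dtype column), which copies the values.
    return list(column.values)


Converter = Callable[[Column], object]


def to_pandas_sketch(
    columns: List[Column],
    converters: Optional[Dict[str, Converter]] = None,
) -> Dict[str, object]:
    """Convert columns, letting user-supplied converters override the default."""
    converters = converters or {}
    result = {}
    for column in columns:
        # The hook: look up a user converter by type, else use the default.
        convert = converters.get(column.type_name, default_convert)
        result[column.name] = convert(column)
    return result


table = [
    Column("id", "int64", [1, 2, 3]),
    Column("addr", "ip", ["10.0.0.1", "10.0.0.2"]),
]

# Without the hook, "addr" falls back to the default (copying) conversion.
plain = to_pandas_sketch(table)

# With the hook, "addr" is wrapped directly in the user's array type.
frame = to_pandas_sketch(table, converters={"ip": lambda col: IPArray(col.values)})
```

In this sketch the user-supplied converter receives the column itself, so it can wrap the underlying storage instead of going through the default representation first. How the real hook would be keyed (by Arrow type, column name, or something else) is left open here.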
The meta-issue that tracks the implementation inside of Pandas is:
https://github.com/pandas-dev/pandas/issues/19696