[ https://issues.apache.org/jira/browse/ARROW-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rok Mihevc updated ARROW-3911:
------------------------------
    External issue URL: https://github.com/apache/arrow/issues/20524

> [Python] Deduplicate datetime.date objects in Table.to_pandas internals
> -----------------------------------------------------------------------
>
>                 Key: ARROW-3911
>                 URL: https://issues.apache.org/jira/browse/ARROW-3911
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>            Reporter: Wes McKinney
>            Assignee: Wes McKinney
>            Priority: Major
>             Fix For: 0.12.0
>
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/python/arrow_to_pandas.cc#L631
> In Python 3, {{datetime.date}} objects are 32 bytes each, in addition to the
> {{PyObject*}} that points to them. So when a column contains many repeated
> dates, deduplicating the objects will save a lot of memory in large
> DataFrame objects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
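
The deduplication idea described in the issue can be sketched in pure Python. This is only an illustration under stated assumptions, not the C++ code in arrow_to_pandas.cc: it assumes the dates arrive as days since the UNIX epoch (as in Arrow's date32 type), and the names days_to_dates_naive and days_to_dates_dedup are hypothetical helpers invented for this example.

```python
import datetime
import sys

EPOCH = datetime.date(1970, 1, 1)


def days_to_dates_naive(days):
    """Hypothetical helper: build a fresh datetime.date for every value."""
    return [EPOCH + datetime.timedelta(days=d) for d in days]


def days_to_dates_dedup(days):
    """Hypothetical helper: build one datetime.date per distinct value and reuse it."""
    cache = {}  # maps days-since-epoch -> the single shared date object
    out = []
    for d in days:
        obj = cache.get(d)
        if obj is None:
            obj = EPOCH + datetime.timedelta(days=d)
            cache[d] = obj
        out.append(obj)
    return out


# Assumed sample data: two distinct dates repeated many times.
days = [17800, 17801, 17800, 17800, 17801] * 1000

naive = days_to_dates_naive(days)
dedup = days_to_dates_dedup(days)

# Count how many distinct Python objects each result list references.
distinct_naive = len({id(o) for o in naive})
distinct_dedup = len({id(o) for o in dedup})

# Rough estimate of the object memory avoided (excludes the list's pointers).
bytes_saved = (distinct_naive - distinct_dedup) * sys.getsizeof(naive[0])

print("distinct objects, naive:", distinct_naive)
print("distinct objects, dedup:", distinct_dedup)
print("approx. bytes saved:", bytes_saved)
```

Both lists compare equal element by element; the only difference is that the deduplicated list holds many references to a few shared objects. Because datetime.date is immutable, sharing the objects this way is safe.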