Extension types will round-trip correctly through Parquet so long as the storage type can be round-tripped (as Micah pointed out, support for reading all nested types is not yet available).
Note for reinforcement that Feather V2 is exactly an Arrow IPC file -- so IPC files could already do this prior to 0.17.0. People seem to like the name so I figured there wasn't much reason to discard the "brand" which already has a good reputation in the community.

On Fri, Apr 24, 2020 at 1:26 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
>
> Hi Bryan,
> Extension types isn't explicitly called out but
> https://issues.apache.org/jira/browse/ARROW-1644 (and related subtasks)
> might be a good place to track this.
>
> Thanks,
> Micah
>
> On Fri, Apr 24, 2020 at 11:13 AM Bryan Cutler <cutl...@gmail.com> wrote:
>
> > I've been trying out IO with Arrow's extension types and I was able write a
> > parquet file but reading it back causes an error:
> > "pyarrow.lib.ArrowInvalid: Unsupported nested type: ...". Looking at the
> > code for the parquet reader, it checks nested types and only allows a few
> > specific ones. Is this a known limitation? I couldn't find a JIRA but I'll
> > make one if it is. Alternatively, I was able to convert my extension array
> > to/from a Pandas DataFrame and read/write to a Feather file, which is
> > awesome - nice work!
> >
> > Thanks,
> > Bryan
> >