Re: Passing user-defined "extension" types in the Arrow protocol

2019-02-26 Thread Wes McKinney
hi Paul,

On Tue, Feb 26, 2019 at 1:16 PM Paul Taylor wrote:
> An alternative that's worked for us is (ab)using single-child
> SparseUnions to represent custom types. We have an enum of "well-known"
> typeIds (UUID, vec2's, IP addresses, etc), whose data is stored in one
> of the known Arrow typ

Re: Passing user-defined "extension" types in the Arrow protocol

2019-02-26 Thread Paul Taylor
An alternative that's worked for us is (ab)using single-child SparseUnions to represent custom types. We have an enum of "well-known" typeIds (UUID, vec2's, IP addresses, etc), whose data is stored in one of the known Arrow types, as you've done. Pros are the typeIds buffer is tiny, and doesn'
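
A minimal, standard-library-only sketch of the idea described above: a small per-row type-id buffer tags values that are physically stored in an ordinary child column. It does not use the Arrow library. The names `WellKnownType` and `SingleChildSparseUnion`, the specific id values, and the choice of unsigned 32-bit integers as storage for IPv4 addresses are all assumptions made for illustration, not details from the thread.

```python
from array import array
from enum import IntEnum
import ipaddress


class WellKnownType(IntEnum):
    """Hypothetical enum of "well-known" type ids (values are made up)."""
    UUID = 0
    IPV4 = 1


class SingleChildSparseUnion:
    """Toy model: one int8 type id per row plus a single child column.

    In a sparse union the child has the same length as the union itself,
    so row i of the union reads row i of the child.
    """

    def __init__(self, type_id, child):
        # One signed byte per row, so the tagging buffer stays small.
        self.type_ids = array("b", [int(type_id)] * len(child))
        self.child = list(child)

    def __len__(self):
        return len(self.child)

    def logical_type(self, i):
        return WellKnownType(self.type_ids[i])

    def value(self, i):
        raw = self.child[i]
        # Assumed decoding rule: IPv4 addresses are stored as unsigned ints.
        if self.logical_type(i) is WellKnownType.IPV4:
            return ipaddress.IPv4Address(raw)
        return raw


ips = SingleChildSparseUnion(WellKnownType.IPV4, [3232235777])
print(ips.logical_type(0).name, ips.value(0))
```

A reader that does not know the id can still consume the child column as plain integers; only the interpretation of the values depends on the tag.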

Re: Passing user-defined "extension" types in the Arrow protocol

2019-02-25 Thread Wes McKinney
On Mon, Feb 25, 2019 at 5:36 PM Antoine Pitrou wrote:
> Does it also roundtrip through e.g. Pandas conversion?

No. Any Arrow metadata is lost when you call to_pandas() (because pandas objects don't have the ability to preserve any column-level metadata, only the physical data type). The metadat

Re: Passing user-defined "extension" types in the Arrow protocol

2019-02-25 Thread Antoine Pitrou
On 26/02/2019 at 00:32, Wes McKinney wrote:
> hi folks,
>
> I recently wrote a patch to propose a C++ API for user-defined "extension" types
>
> https://github.com/apache/arrow/pull/3694
>
> The idea is that an extension type wraps a pre-existing Arrow type.
> For example a UUIDType can b
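
The "extension type wraps a pre-existing type" idea quoted above can be sketched without the Arrow library. This is a standard-library-only illustration, not the C++ API proposed in the pull request. The class names, the `extension_name` string, and the choice of 16-byte fixed-width binary storage for UUIDs are assumptions; the quoted message is cut off before it says how a UUIDType would be stored.

```python
import uuid


class FixedSizeBinaryStorage:
    """Stand-in for a fixed-width binary column: one contiguous buffer."""

    def __init__(self, width, data=b""):
        if len(data) % width != 0:
            raise ValueError("buffer length must be a multiple of width")
        self.width = width
        self.data = data

    def __len__(self):
        return len(self.data) // self.width

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        return self.data[i * self.width:(i + 1) * self.width]


class UuidColumn:
    """Hypothetical extension column: logical UUIDs over binary storage."""

    extension_name = "example.uuid"  # made-up identifier for illustration

    def __init__(self, storage):
        if storage.width != 16:
            raise TypeError("UUID storage must be 16 bytes wide")
        # The wrapper adds meaning only; the bytes live in plain storage.
        self.storage = storage

    @classmethod
    def from_uuids(cls, values):
        data = b"".join(v.bytes for v in values)
        return cls(FixedSizeBinaryStorage(16, data))

    def __len__(self):
        return len(self.storage)

    def __getitem__(self, i):
        return uuid.UUID(bytes=self.storage[i])


column = UuidColumn.from_uuids([uuid.UUID(int=1), uuid.UUID(int=2)])
print(len(column), column[1])
```

Code that knows nothing about `UuidColumn` can still read `column.storage` as ordinary fixed-width binary data, which is the point of wrapping an existing type rather than defining a new physical layout.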