On 26/02/2019 at 00:32, Wes McKinney wrote:
> hi folks,
>
> I recently wrote a patch to propose a C++ API for user-defined "extension"
> types
>
> https://github.com/apache/arrow/pull/3694
>
> The idea is that an extension type wraps a pre-existing Arrow type.
> For example, a UUIDType can be represented as FixedSizeBinary(16). The
> intent is that Arrow consumers which are not aware of an extension
> type can ignore the additional type metadata and still interact with
> the raw storage.
>
> One question is how to permit such metadata to be preserved through
> IPC / RPC messages (i.e., Schema.fbs) and how other languages can
> interact with it. There are a couple of options:
>
> * What I implemented in my patch: use the Field-level custom_metadata
> field with known key names "arrow_extension_name" and
> "arrow_extension_data" for the type's unique identifier and serialized
> form, respectively. If we opt for this, then we should add a section
> to the specification to codify the convention used.
>
> * Add a new field to the Field table in Schema.fbs
>
> The former is attractive in the sense that consumers who don't have
> special handling for an extension type will carry along the Field
> metadata in their schema, so it can be passed on in subsequent IPC
> messages without writing any extra code.

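
To make the pass-through argument concrete, here is a rough sketch of the first option in plain Python. It is not the Arrow API: the Field class, the helper functions, the "uuid" identifier and the empty serialized payload are all hypothetical stand-ins chosen for illustration. Only the two key names come from the proposal quoted above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Key names taken from the proposal quoted above.
EXT_NAME_KEY = "arrow_extension_name"
EXT_DATA_KEY = "arrow_extension_data"


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for a schema field (NOT the real Arrow class)."""
    name: str
    storage_type: str
    custom_metadata: Dict[str, str] = field(default_factory=dict)


def make_uuid_field(name: str) -> Field:
    # Assumption: the extension is identified as "uuid" and needs no
    # serialized parameters, hence the empty payload.
    return Field(
        name,
        "fixed_size_binary[16]",
        {EXT_NAME_KEY: "uuid", EXT_DATA_KEY: ""},
    )


def unaware_consumer_rename(f: Field, new_name: str) -> Field:
    """A consumer with no extension support: it only sees the storage type
    and copies the metadata verbatim, without interpreting it."""
    return Field(new_name, f.storage_type, dict(f.custom_metadata))


def extension_name(f: Field) -> Optional[str]:
    """An extension-aware consumer looks for the well-known key."""
    return f.custom_metadata.get(EXT_NAME_KEY)


original = make_uuid_field("id")
forwarded = unaware_consumer_rename(original, "user_id")
```

In this sketch the unaware consumer never looks at the two keys, yet an extension-aware reader further downstream can still recover the identifier from `forwarded`. Whether a real conversion path keeps field-level metadata intact is a separate matter.
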
Does it also round-trip through, e.g., a Pandas conversion?

Regards

Antoine.