Re: [DISCUSS] Formalizing "extension type" metadata in the Arrow binary protocol

2019-06-07 Thread Wes McKinney
Great, thanks Jacques. I'll kick off a vote thread so we can hopefully get this approved. On Fri, Jun 7, 2019 at 3:27 PM Jacques Nadeau wrote: > > I'm good with this. The consistent separator is a good improvement. > > On Thu, Jun 6, 2019 at 1:06 PM Wes McKinney wrote: > > > hey Jacques, > > > >

Re: [DISCUSS] Formalizing "extension type" metadata in the Arrow binary protocol

2019-06-07 Thread Jacques Nadeau
I'm good with this. The consistent separator is a good improvement. On Thu, Jun 6, 2019 at 1:06 PM Wes McKinney wrote: > hey Jacques, > > On Thu, Jun 6, 2019 at 12:53 PM Jacques Nadeau wrote: > > > > Thanks for pushing this along. I think it is important. Sorry I'm coming > > late to the conver

Re: [DISCUSS] Formalizing "extension type" metadata in the Arrow binary protocol

2019-06-06 Thread Wes McKinney
hey Jacques, On Thu, Jun 6, 2019 at 12:53 PM Jacques Nadeau wrote: > > Thanks for pushing this along. I think it is important. Sorry I'm coming > late to the conversation. Couple thoughts: > > - Should we reconsider having this be an independent optional field as > opposed to overloading customer

Re: [DISCUSS] Formalizing "extension type" metadata in the Arrow binary protocol

2019-06-06 Thread Jacques Nadeau
Thanks for pushing this along. I think it is important. Sorry I'm coming late to the conversation. Couple thoughts: - Should we reconsider having this be an independent optional field as opposed to overloading custom_metadata? It avoids having the weird string prefixing behavior - I'd be incline

Re: [DISCUSS] Formalizing "extension type" metadata in the Arrow binary protocol

2019-06-03 Thread Wes McKinney
hi Micah, I have just updated my PR per your comments with more examples of extension types. https://github.com/apache/arrow/pull/4332 Are there more comments about this? I can start a vote in a couple of days absent further opinions. Can someone volunteer to review David's Java PR? I would lik

Re: [DISCUSS] Formalizing "extension type" metadata in the Arrow binary protocol

2019-05-18 Thread Micah Kornfield
Hi Wes, Like I said I think this approach looks good, I think what I'm looking for is a little more documentation/examples on how additional types would be handled. I think Tensor would be a good example, we also had questions about INET addresses previously, maybe this would be another good ill

Re: [DISCUSS] Formalizing "extension type" metadata in the Arrow binary protocol

2019-05-18 Thread Wes McKinney
On Sat, May 18, 2019, 1:58 PM Wes McKinney wrote: > Hi Micah, > > The use cases I'm aware of are mostly coming from proprietary > applications. My idea was for the extension metadata to be as unobtrusive > as possible. The only alternative as I see it would be to have an Extension > value in the

Re: [DISCUSS] Formalizing "extension type" metadata in the Arrow binary protocol

2019-05-18 Thread Wes McKinney
Hi Micah, The use cases I'm aware of are mostly coming from proprietary applications. My idea was for the extension metadata to be as unobtrusive as possible. The only alternative as I see it would be to have an Extension value in the Type union which would be more intrusive to applications handli

Re: [DISCUSS] Formalizing "extension type" metadata in the Arrow binary protocol

2019-05-18 Thread Micah Kornfield
Hi Wes, This approach seems reasonable to me. I'm a little concerned we haven't validated many use-cases against the approach (but I don't see any obvious flaws). Thanks, Micah On Fri, May 17, 2019 at 5:16 AM Wes McKinney wrote: > As Micah brought up, as part of this we would like to formalize

Re: [DISCUSS] Formalizing "extension type" metadata in the Arrow binary protocol

2019-05-17 Thread Wes McKinney
As Micah brought up, as part of this we would like to formalize the use of "ARROW:" as a reserved metadata key prefix. This is similar to Apache Avro which uses "avro." as a reserved prefix [1]. If someone has a different idea about what the prefix should be I'm open to other ideas [1] : https://a

[DISCUSS] Formalizing "extension type" metadata in the Arrow binary protocol

2019-05-16 Thread Wes McKinney
hi folks, In a prior mailing list thread from February [1] I brought up some work I'd done in C++ to create an API to define custom data types that can be embedded in built-in Arrow logical types. These are serialized through IPC by adding special fields to the `custom_metadata` member of Field in
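
For readers skimming the thread, the mechanism discussed above (extension type information carried as key/value pairs in a field's `custom_metadata`, under the reserved "ARROW:" prefix) can be sketched roughly as follows. This is a minimal illustration, not the Arrow implementation: the `Field` class and the helper functions are hypothetical stand-ins, and the two key names are an assumption based on the Arrow columnar format documentation rather than anything stated in these excerpts.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Reserved prefix proposed in the thread.
RESERVED_PREFIX = "ARROW:"
# Assumed key names (not given in the excerpts above).
EXT_NAME_KEY = "ARROW:extension:name"
EXT_METADATA_KEY = "ARROW:extension:metadata"


@dataclass
class Field:
    """Hypothetical stand-in for a schema field; not the pyarrow class."""
    name: str
    storage_type: str  # the built-in type that physically stores the values
    custom_metadata: Dict[str, str] = field(default_factory=dict)


def tag_extension(f: Field, ext_name: str, serialized: str = "") -> Field:
    """Return a copy of `f` annotated as an extension type."""
    merged = dict(f.custom_metadata)
    merged[EXT_NAME_KEY] = ext_name
    merged[EXT_METADATA_KEY] = serialized
    return Field(f.name, f.storage_type, merged)


def read_extension(f: Field) -> Optional[Tuple[str, str]]:
    """Return (extension name, serialized metadata), or None if untagged."""
    name = f.custom_metadata.get(EXT_NAME_KEY)
    if name is None:
        return None
    return name, f.custom_metadata.get(EXT_METADATA_KEY, "")


def user_keys(f: Field) -> Dict[str, str]:
    """Application metadata only, with reserved-prefix keys filtered out."""
    return {k: v for k, v in f.custom_metadata.items()
            if not k.startswith(RESERVED_PREFIX)}


plain = Field("id", "fixed_size_binary(16)", {"owner": "app"})
tagged = tag_extension(plain, "uuid")
```

A reader that does not recognize the extension name can ignore the reserved keys and treat the column as its storage type, which fits the stated goal of keeping the extension metadata as unobtrusive as possible.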