Re: [DISCUSS][Arrow] Extension metadata encoding design

2023-08-16 Thread Jeremy Leibs
I realize it's a not-insignificant change, and I'm not (yet) proposing such a change without more discussion and thought about the consequences. But I don't think this would actually break any protocol, so I don't want to prematurely preclude this as a possible future direction. My understanding

Re: [DISCUSS][Arrow] Extension metadata encoding design

2023-08-16 Thread Antoine Pitrou
Hmm, you're right that letting the extension type peek at the entire metadata values would have been another solution. That said, for protocol compatibility reasons, we cannot easily change this anymore. Regards, Antoine. On 16/08/2023 at 17:48, Jeremy Leibs wrote: Thanks for the con

Re: [DISCUSS][Arrow] Extension metadata encoding design

2023-08-16 Thread Jeremy Leibs
Thanks for the context, Antoine. However, even in those examples, I don't really see how coercing the metadata to a single string makes much of a difference. I believe the main difference in what I'm proposing would be that the ExtensionType::Deserialize interface: https://github.com/apache/arrow/

Re: [DISCUSS][Arrow] Extension metadata encoding design

2023-08-16 Thread Antoine Pitrou
Hi Jeremy, A single key makes it easier for generic code to recreate extension types it does not know about. Here is an example in the C++ IPC layer: https://github.com/apache/arrow/blob/641201416c1075edfd05d78b539275065daac31d/cpp/src/arrow/ipc/metadata_internal.cc#L823-L845 Here is simila
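
To make the single-key argument concrete, here is a minimal Python sketch of how generic code might carry an extension type it does not recognize. It is not the C++ IPC code linked above. The two key names are the ones the Arrow columnar format reserves for extension types; the helper names `split_extension_metadata` and `reattach_extension_metadata` are hypothetical and exist only for this illustration, and field metadata is modeled as a plain dict of strings.

```python
# Key names reserved by the Arrow columnar format for extension types.
EXT_NAME_KEY = "ARROW:extension:name"
EXT_METADATA_KEY = "ARROW:extension:metadata"


def split_extension_metadata(field_metadata):
    """Hypothetical helper: separate extension entries from other metadata.

    Returns ((name, serialized), rest) for an extension field, or
    (None, rest) when the field carries no extension name.
    """
    name = field_metadata.get(EXT_NAME_KEY)
    if name is None:
        return None, dict(field_metadata)
    # The serialized payload is opaque here: generic code never parses it.
    serialized = field_metadata.get(EXT_METADATA_KEY, "")
    rest = {
        key: value
        for key, value in field_metadata.items()
        if key not in (EXT_NAME_KEY, EXT_METADATA_KEY)
    }
    return (name, serialized), rest


def reattach_extension_metadata(extension, rest):
    """Hypothetical helper: rebuild field metadata for an unknown extension."""
    rebuilt = dict(rest)
    if extension is not None:
        name, serialized = extension
        rebuilt[EXT_NAME_KEY] = name
        rebuilt[EXT_METADATA_KEY] = serialized
    return rebuilt


field_metadata = {
    EXT_NAME_KEY: "my.uuid",          # made-up extension name
    EXT_METADATA_KEY: "version=1",    # made-up opaque payload
    "unit": "none",
}
extension, rest = split_extension_metadata(field_metadata)
round_tripped = reattach_extension_metadata(extension, rest)
```

Because the extension state lives under one name key and one payload key, the generic code only needs to know those two keys to preserve the type. If an extension were free to spread its state across arbitrary metadata keys, code that does not know the extension could not tell which entries belong to it.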