The C-interface representation is probably slightly less readable than the JSON representation, if I understand the Flatbuffers-to-JSON conversion properly. But as Antoine pointed out, it depends on the use case.
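To make the readability comparison concrete, here is a rough sketch (not Arrow code) that pairs a few format strings from the Arrow C data interface with the JSON one might expect Flatbuffers to emit for the same field, based on the tables in Schema.fbs. The helper names `field_to_json` and `schema_to_json` are made up for illustration, and the exact JSON layout (in particular the `type_type`/`type` pair for the union) is an assumption about the Flatbuffers text output, not something checked against a real build.

```python
import json

# A small subset of Arrow C data interface format strings, mapped to the
# Schema.fbs table name and fields they would correspond to.
# The JSON shape on the right-hand side is an assumption, not verified output.
C_FORMAT_TO_JSON_TYPE = {
    "b": ("Bool", {}),
    "i": ("Int", {"bitWidth": 32, "is_signed": True}),
    "l": ("Int", {"bitWidth": 64, "is_signed": True}),
    "g": ("FloatingPoint", {"precision": "DOUBLE"}),
    "u": ("Utf8", {}),
}


def field_to_json(name, c_format, nullable=True):
    """Hypothetical helper: describe one field in a Flatbuffers-JSON-like shape."""
    type_name, type_body = C_FORMAT_TO_JSON_TYPE[c_format]
    return {
        "name": name,
        "nullable": nullable,
        "type_type": type_name,  # union discriminator (assumed naming)
        "type": type_body,
    }


def schema_to_json(fields):
    """Hypothetical helper: render (name, c_format[, nullable]) tuples as JSON text."""
    return json.dumps({"fields": [field_to_json(*f) for f in fields]}, indent=2)


if __name__ == "__main__":
    # C-interface view: terse format strings per column.
    columns = [("id", "l", False), ("name", "u"), ("score", "g")]
    print([fmt for _, fmt, *_ in columns])
    # JSON view: more verbose, but self-describing.
    print(schema_to_json(columns))
```

The C-interface strings are compact but need a lookup table to read, while the JSON form spells out each type, which is roughly the trade-off being discussed.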
FWIW, flatbuffers maintainers indicated forward/backward compatibility is intended to be preserved in the JSON encoding as well.

On Sat, Jan 4, 2020 at 2:16 PM Jacques Nadeau <jacq...@apache.org> wrote:

> What do people think about using the C interface representation?
>
> On Sun, Dec 29, 2019 at 12:42 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
>
>> I opened https://github.com/google/flatbuffers/issues/5688 to try to get some clarity.
>>
>> On Tue, Dec 24, 2019 at 12:13 PM Wes McKinney <wesmck...@gmail.com> wrote:
>>
>> > On Tue, Dec 24, 2019 at 2:47 AM Micah Kornfield <emkornfi...@gmail.com> wrote:
>> > >
>> > >> If we were to make the same kinds of forward/backward compatibility guarantees as with Flatbuffers it could create a lot of work for maintainers.
>> > >
>> > > Does it pay to follow-up with the flatbuffer project to understand if the forward/backward compatibility guarantees the flatbuffers provide extend to their JSON format?
>> >
>> > I spent a few minutes looking at the Flatbuffers codebase and documentation and did not find anything, so this seems like useful information to have regardless.
>> >
>> > > On Sun, Dec 15, 2019 at 11:17 AM Wes McKinney <wesmck...@gmail.com> wrote:
>> > >>
>> > >> I'd be open to looking at a proposal for a human-readable text representation, but I'm definitely wary about making any kind of cross-version compatibility guarantees (beyond "we will do our best"). If we were to make the same kinds of forward/backward compatibility guarantees as with Flatbuffers it could create a lot of work for maintainers.
>> > >>
>> > >> On Thu, Dec 12, 2019 at 12:43 AM Micah Kornfield <emkornfi...@gmail.com> wrote:
>> > >> >
>> > >> > > With these two together, it would seem not too difficult to create a text representation for Arrow schemas that (at some point) has some compatibility guarantees, but maybe I'm missing something?
>> > >> >
>> > >> > I think the main risk is if somehow flatbuffers JSON parsing doesn't handle backward compatible changes to the arrow schema message. Given the way the documentation is describing the JSON functionality I think this would be considered a bug.
>> > >> >
>> > >> > The one downside to calling the "schema" canonical is the flatbuffers JSON functionality only appears to be available in C++ and Java via JNI, so it wouldn't have cross language support. I think this issue is more one of semantics though (i.e. does the JSON description become part of the "Arrow spec" or does it live as a C++/Python only feature).
>> > >> >
>> > >> > -Micah
>> > >> >
>> > >> > On Tue, Dec 10, 2019 at 10:51 AM Christian Hudon <chr...@elementai.com> wrote:
>> > >> >
>> > >> > > Micah: I didn't know that Flatbuffers supported serialization to/from JSON, thanks. That seems like a very good start, at least. I'll aim to create a draft pull request that at least wires everything up in Arrow so we can load/save a Schema.fbs instance from/to JSON. At least it'll make it easier for me to see how Arrow schemas would look in JSON with that.
>> > >> > >
>> > >> > > Otherwise, I'm still gathering requirements internally here. For example, one thing that would be nice would be to be able to output a JSON Schema from at least a subset of the Arrow schema.
>> > >> > > (That way our users could start by passing around JSON with a given schema, and transition pieces of a workflow to Arrow as they're ready.) But that part can also be done outside of the Arrow code, if deemed not relevant to have in the Arrow codebase itself.
>> > >> > >
>> > >> > > One core requirement for us, however, would be eventual compatibility between Arrow versions for a given text representation of a schema. Meaning, if you have a text description of a given Arrow schema, you can load it into different versions of Arrow and it creates a valid Schema Flatbuffer description that Arrow can use. Wes, were you thinking of that, or of something else, when you wrote "only makes sense if it is offered without any backward/forward compatibility guarantees"?
>> > >> > >
>> > >> > > For now, for me, assuming the JSON serialization done by the Flatbuffers libraries is usable, it seems we have all the pieces to make this happen:
>> > >> > > 1) The binary Schema.fbs data structures have to be compatible between different versions of Arrow, otherwise two processes with different Arrow versions won't be able to interoperate, no?
>> > >> > > 2) The Flatbuffer <-> JSON serialization supplied by the Flatbuffers library also has to be compatible between different versions of the Flatbuffers library, since the main use case seems to be storing Flatbuffers assets into version control. Breaking changes there will also be painful to their users.
>> > >> > >
>> > >> > > With these two together, it would seem not too difficult to create a text representation for Arrow schemas that (at some point) has some compatibility guarantees, but maybe I'm missing something?
>> > >> > >
>> > >> > > Thanks,
>> > >> > >
>> > >> > > Christian
>> > >> > >
>> > >> > > On Mon, Dec 9, 2019 at 7:00 AM Wes McKinney <wesmck...@gmail.com> wrote:
>> > >> > >
>> > >> > > > The only "canonical" representation of schemas at the moment is the Flatbuffers data structure [1]
>> > >> > > >
>> > >> > > > Having a human-readable/parseable text representation I think only makes sense if it is offered without any backward/forward compatibility guarantees.
>> > >> > > >
>> > >> > > > Note I had previously opened https://issues.apache.org/jira/browse/ARROW-3730 where I noted that there's no way (aside from generating the Flatbuffers messages) to generate a schema representation that can be used later to reconstruct a schema in a program. If such a representation were human readable/editable that seems beneficial.
>> > >> > > >
>> > >> > > > [1]: https://github.com/apache/arrow/blob/master/format/Schema.fbs
>> > >> > > >
>> > >> > > > On Sat, Dec 7, 2019 at 11:56 AM Maarten Ballintijn <maart...@xs4all.nl> wrote:
>> > >> > > > >
>> > >> > > > > Is there a syntax specified for schemas?
>> > >> > > > >
>> > >> > > > > Cheers,
>> > >> > > > > Maarten.
>> > >> > > > >
>> > >> > > > > > On Dec 6, 2019, at 5:01 PM, Micah Kornfield <emkornfi...@gmail.com> wrote:
>> > >> > > > > >
>> > >> > > > > > Hi Christian,
>> > >> > > > > > As far as I know no-one is working on a canonical text representation for schemas.
>> > >> > > > > > A JSON serializer exists for integration test purposes, but IMO it shouldn't be relied upon as canonical.
>> > >> > > > > >
>> > >> > > > > > It looks like Flatbuffers supports serialization to/from JSON [1], using that functionality might be a promising avenue to pursue for a human readable schema. I could see adding a helper method someplace under IPC for this. Would that meet your needs? I think if there are other requirements, then a proposal would be welcome. Ideally, a solution would not require additional build/runtime dependencies.
>> > >> > > > > >
>> > >> > > > > > Thanks,
>> > >> > > > > > Micah
>> > >> > > > > >
>> > >> > > > > > [1] See Text & schema parsing https://google.github.io/flatbuffers/flatbuffers_guide_use_cpp.html
>> > >> > > > > >
>> > >> > > > > > On Fri, Dec 6, 2019 at 1:26 PM Christian Hudon <chr...@elementai.com> wrote:
>> > >> > > > > >
>> > >> > > > > >> Hi,
>> > >> > > > > >>
>> > >> > > > > >> For the uses I would like to make of Arrow, I would need a human-readable and -writable version of an Arrow Schema, that could be converted to and from the Arrow Schema C++ object. Going through the doc for 0.15.1, I don't see anything to that effect, with the closest being the ToString() method on DataType instances, but which is meant for debugging only.
>> > >> > > > > >> (I need an expression of an Arrow Schema that people can read, and that can live outside of the code for a particular operation.)
>> > >> > > > > >>
>> > >> > > > > >> Is a text representation of an Arrow Schema something that is being worked on now? If not, would you folks be interested in me putting up an initial proposal for discussion? Any design constraints I should pay attention to, then?
>> > >> > > > > >>
>> > >> > > > > >> Thanks,
>> > >> > > > > >>
>> > >> > > > > >> Christian
>> > >> > > > > >> --
>> > >> > > > > >> │ Christian Hudon
>> > >> > > > > >> │ Applied Research Scientist
>> > >> > > > > >> Element AI, 6650 Saint-Urbain #500
>> > >> > > > > >> Montréal, QC, H2S 3G9, Canada
>> > >> > > > > >> Elementai.com