The only "canonical" representation of schemas at the moment is the
Flatbuffers data structure [1].

I think a human-readable/parseable text representation only makes
sense if it is offered without any backward/forward compatibility
guarantees.

Note that I had previously opened
https://issues.apache.org/jira/browse/ARROW-3730 where I noted that
there's no way (aside from generating the Flatbuffers messages) to
generate a schema representation that can be used later to reconstruct
a schema in a program. If such a representation were also
human-readable/editable, that would be beneficial.
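
To make the round trip concrete, here is a minimal sketch in plain
Python. It deliberately does not use Arrow's API: Field, Schema,
schema_to_text and schema_from_text are hypothetical stand-ins, and the
JSON layout is made up for illustration only, not a proposed format and
certainly not one with any compatibility guarantees.

```python
import json
from dataclasses import dataclass


# Hypothetical stand-ins for illustration; these are NOT Arrow classes.
@dataclass(frozen=True)
class Field:
    name: str
    type: str              # type name kept as a plain string, e.g. "int64"
    nullable: bool = True


@dataclass(frozen=True)
class Schema:
    fields: tuple


def schema_to_text(schema: Schema) -> str:
    """Write the schema as an editable JSON document (assumed layout)."""
    doc = {
        "fields": [
            {"name": f.name, "type": f.type, "nullable": f.nullable}
            for f in schema.fields
        ]
    }
    return json.dumps(doc, indent=2)


def schema_from_text(text: str) -> Schema:
    """Rebuild a Schema from text produced (or hand-edited) in that layout."""
    doc = json.loads(text)
    return Schema(
        tuple(
            Field(d["name"], d["type"], d.get("nullable", True))
            for d in doc["fields"]
        )
    )


original = Schema((Field("id", "int64", nullable=False), Field("label", "utf8")))
text = schema_to_text(original)       # can live outside the program
restored = schema_from_text(text)     # reconstructed later from the text
```

The point is only the shape of the workflow: the text can be stored or
edited outside the program, and parsing it yields a schema equal to the
one that was written out.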



[1]: https://github.com/apache/arrow/blob/master/format/Schema.fbs

On Sat, Dec 7, 2019 at 11:56 AM Maarten Ballintijn <maart...@xs4all.nl> wrote:
>
>
> Is there a syntax specified for schemas?
>
> Cheers,
> Maarten.
>
>
> > On Dec 6, 2019, at 5:01 PM, Micah Kornfield <emkornfi...@gmail.com> wrote:
> >
> > Hi Christian,
> > As far as I know no-one is working on a canonical text representation for
> > schemas.  A JSON serializer exists for integration test purposes, but
> > IMO it shouldn't be relied upon as canonical.
> >
> > It looks like Flatbuffers supports serialization to/from JSON [1]. Using
> > that functionality might be a promising avenue to pursue for a
> > human-readable schema. I could see adding a helper method someplace under IPC for
> > this.  Would that meet your needs?  I think if there are other
> > requirements, then a proposal would be welcome.  Ideally, a solution would
> > not require additional build/runtime dependencies.
> >
> >
> > Thanks,
> > Micah
> >
> > [1] See Text & schema parsing
> > https://google.github.io/flatbuffers/flatbuffers_guide_use_cpp.html
> >
> > On Fri, Dec 6, 2019 at 1:26 PM Christian Hudon <chr...@elementai.com> wrote:
> >
> >> Hi,
> >>
> >> For the uses I would like to make of Arrow, I would need a human-readable
> >> and -writable version of an Arrow Schema, that could be converted to and
> >> from the Arrow Schema C++ object. Going through the doc for 0.15.1, I don't
> >> see anything to that effect, with the closest being the ToString() method
> >> on DataType instances, but which is meant for debugging only. (I need an
> >> expression of an Arrow Schema that people can read, and that can live
> >> outside of the code for a particular operation.)
> >>
> >> Is a text representation of an Arrow Schema something that is being worked
> >> on now? If not, would you folks be interested in me putting up an initial
> >> proposal for discussion? Any design constraints I should pay attention to,
> >> then?
> >>
> >> Thanks,
> >>
> >>  Christian
> >> --
> >>
> >>
> >> │ Christian Hudon
> >>
> >> │ Applied Research Scientist
> >>
> >>   Element AI, 6650 Saint-Urbain #500
> >>
> >>   Montréal, QC, H2S 3G9, Canada
> >>   Elementai.com
> >>
>
