>
> I just thought of one other requirement: the format needs to support
> arbitrary byte sequences.
>
Can you clarify why this is needed? Is it that custom_metadata maps should
allow byte sequences as values?

On Fri, Aug 13, 2021 at 10:00 AM Phillip Cloud <cpcl...@gmail.com> wrote:

> On Fri, Aug 13, 2021 at 11:43 AM Antoine Pitrou <anto...@python.org>
> wrote:
>
> >
> > Le 13/08/2021 à 17:35, Phillip Cloud a écrit :
> > >
> > > >> I.e. make the ability to read and write by humans be more important than
> > > >> speed of validation.
> > >
> > > > I think I differ on whether the IR should be easy to read and write by
> > > > humans. IR is going to be predominantly read and written by machines,
> > > > though of course we will need a way to inspect it for debugging.
> >
> > But the code executed by machines is written by humans.  I think that's
> > mostly where the contention resides: is it easy to code, in any given
> > language, the routines required to produce or consume the IR?
> >
>
> Definitely not for flatbuffers, since flatbuffers is IMO annoying to use in
> any language except C++, and it's borderline annoying there too. Protobuf is
> similar (less annoying in Rust, but still annoying in Python and C++ IMO),
> though I think any binary format is going to be less human-friendly, by
> construction.
>
> If we were to use something like JSON or msgpack, can someone sketch out the
> interaction between the IR and the rest of arrow's type system?
>
> Would we need a JSON-encoded-arrow-type -> in-memory representation for an
> Arrow type in a given language?
>
> I just thought of one other requirement: the format needs to support
> arbitrary byte sequences. JSON doesn't support untransformed byte sequences,
> though it's not uncommon to base64-encode a byte sequence. IMO that adds an
> unnecessary layer of complexity, which is another tradeoff to consider.
>
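
To make the base64 tradeoff above concrete, here is a minimal sketch in Python using only the standard library. The `custom_metadata` map and the key name `key` are hypothetical stand-ins for illustration, not an actual IR layout; the point is only that raw bytes need an extra encode step on write and a decode step on read.

```python
import base64
import json

# Hypothetical metadata value containing bytes that are not valid UTF-8.
raw = b"\x00\xff\xfe arbitrary bytes"

# json.dumps cannot serialize a bytes object directly; it raises TypeError.
try:
    json.dumps({"custom_metadata": {"key": raw}})
    direct_ok = True
except TypeError:
    direct_ok = False

# Workaround: base64-encode to ASCII text on write...
encoded = json.dumps(
    {"custom_metadata": {"key": base64.b64encode(raw).decode("ascii")}}
)

# ...and base64-decode on read. The reader has to know, out of band,
# which string fields are really base64-encoded bytes.
decoded = base64.b64decode(json.loads(encoded)["custom_metadata"]["key"])
```

The round trip is lossless, but both producer and consumer need to agree on which fields carry encoded bytes, which is the extra layer of complexity mentioned above.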
