Hi Patrick,

The JSON representation of schemas wasn't intended as a public API.
Can you use the pyarrow Schema directly? I'm not sure I would advise
using the JSON for building any kind of production software.

That said, I'm not opposed to exposing this functionality in Python,
with the clear caveat that the JSON representation is not to be used
for persistence. We designed it only for use in integration testing.

See:

https://github.com/apache/arrow/blob/master/cpp/src/arrow/ipc/json.h
https://github.com/apache/arrow/blob/master/cpp/src/arrow/ipc/json-internal.h

I just created https://issues.apache.org/jira/browse/ARROW-2857. If
someone wants to submit a patch I will be happy to take a look.

Thanks
Wes

On Fri, Jul 13, 2018 at 10:17 AM, Patrick Surry <patr...@hopper.com> wrote:
> Feels like I’m missing something obvious, but is there an easy way to
> read/write arrow schema objects as json in pyarrow?  It looks like java api
> has toJSON methods but can’t see if/how they’re exposed in python api.
>
> wesmckinn (via slack) said: we haven't exposed JSON functionality in Python
> yet afaik.
>
> In the github, it looked like there might be some route via the pyarrow
> jvm, e.g.
> https://github.com/apache/arrow/blob/4481b070c9eca4140aaa3a2470ede920411598a0/python/pyarrow/tests/test_jvm.py#L139
> but import pyarrow.jvm as pa_jvm doesn't work for me either, so now stuck :(
>
> I'm on
>
>>>> pa.__version__
>
> '0.9.0.post1'
> Hoping to use pyarrow schema as a way to explicitly declare layout of some
> pandas dataframes for validation and maybe type coercion for edge cases
> like a numeric column which is entirely null and gets inferred by pandas as
> a different type.
>
> Thanks,
> Patrick
> --
> [image: hopper.com] <http://www.hopper.com> Patrick Surry
> Chief Data Scientist
> (857) 919 1700 | @patricksurry
