Re: [DISCUSS] Approach to generic schema representation

2024-07-08 Thread Jeremy Leibs
obj = "date32" > > > elif obj == "int": > > > # default int to int32 > > > obj = "int32" > > > obj = pa.type_for_alias(obj) > > > return obj

Re: [DISCUSS] Approach to generic schema representation

2024-07-08 Thread Jeremy Leibs
_convert_to_arrow_schema(fields_dict): > > """ > > > > :param fields_dict: > > :returns: pyarrow schema > > > > """ > > columns = [] > > for field, typ in fields_dict.items(): > > pa

Re: [DISCUSS] Approach to generic schema representation

2024-07-08 Thread Aldrin
arrow_type(field, typ) > columns.append(pa.field(field, pa_type)) > schema = pa.schema(columns) > return schema > > -Original Message- > From: Lee, David (PAG) <david@blackrock.com.INVALID> > Sent: Monday, July 8, 2024 11:58 AM > To:

Re: [DISCUSS] Approach to generic schema representation

2024-07-08 Thread Weston Pace
items) > > else: > > if isinstance(obj, str): > > obj = pa.type_for_alias(obj) > > return obj > > > > > > def _convert_to_arrow_schema(fields_dict): > > """ > > > > :param fields_dic

Re: [DISCUSS] Approach to generic schema representation

2024-07-08 Thread Ian Cook
ram fields_dict: > :returns: pyarrow schema > > """ > columns = [] > for field, typ in fields_dict.items(): > pa_type = _convert_to_arrow_type(field, typ) > columns.append(pa.field(field, pa_type)) > schema = pa.schema(columns)

RE: [DISCUSS] Approach to generic schema representation

2024-07-08 Thread Lee, David (PAG)
umns.append(pa.field(field, pa_type)) schema = pa.schema(columns) return schema -Original Message- From: Lee, David (PAG) Sent: Monday, July 8, 2024 11:58 AM To: dev@arrow.apache.org Subject: RE: [DISCUSS] Approach to generic schema representation External Email: Use caution wit

RE: [DISCUSS] Approach to generic schema representation

2024-07-08 Thread Lee, David (PAG)
"date32" elif typ == "int": # default int to int32 typ = "int32" pa_type = _convert_to_arrow_type(field, typ) columns.append(pa.field(field, pa_type)) schema = pa.schema(columns) return schema -----Original Message- From:

Re: [DISCUSS] Approach to generic schema representation

2024-07-08 Thread Jorge Cardoso Leitão
Hi, So, something like a human and computer readable standard for arrow schemas, e.g. via yaml or a json schema. We kind of do this in our integration tests / golden tests, where we have a non-official json representation of an arrow schema. The ask here is to standardize such a format in some

Re: [DISCUSS] Approach to generic schema representation

2024-07-08 Thread Jeremy Leibs
That handles questions of machine-to-machine coordination, and lets me do things like validation, but it doesn't address questions of the kind of user-facing API documentation someone would need to practically form and/or process data when integrating a library into their code. I want to be able

Re: [DISCUSS] Approach to generic schema representation

2024-07-08 Thread Weston Pace
+1 for empty stream/file as schema serialization. I have used this approach myself on more than one occasion and it works well. It can even be useful for transmitting schemas between different arrow-native libraries in the same language (e.g. rust->rust) since it allows the different libraries to

Re: [DISCUSS] Approach to generic schema representation

2024-07-08 Thread Matt Topol
Hey Jeremy, Currently the first message of an IPC stream is a Schema message, which consists solely of a flatbuffer message and is defined in the Schema.fbs file of the Arrow repo. All of the libraries that can read Arrow IPC should be able to also handle converting a single IPC schema message back in