Hi all,
Assume I have a JSON file named 'my_data.json' with the contents below.
{"a": [1, 2], "b": {"c": true, "d": "1991-02-03"}}
{"a": [3, 4, 5], "b": {"c": false, "d": "2019-04-01"}}
If I need to join on the nested attribute d, can I do it directly on the
Arrow struct column, or is there a more efficient alternative?
Also, how are nested attributes from the JSON mapped into buffers once the
data is converted to Arrow format? (The example below is taken from the
documentation.)
>>> from pyarrow import json
>>> table = json.read_json("my_data.json")
>>> table
pyarrow.Table
a: list<item: int64>
  child 0, item: int64
b: struct<c: bool, d: timestamp[s]>
  child 0, c: bool
  child 1, d: timestamp[s]
>>> table.to_pandas()
           a                                       b
0     [1, 2]   {'c': True, 'd': 1991-02-03 00:00:00}
1  [3, 4, 5]  {'c': False, 'd': 2019-04-01 00:00:00}
Thank you.