No, a struct array is not naturally castable to a map. The conversion cannot be done zero-copy, and I don't think anyone has encountered this need before. Let me make sure I understand.
The goal is to go from a type of STRUCT<N1: T, N2: T, N3: T, ... NZ: T>, where
every key in the struct has the same type, to a MAP<STRING: T>, where each
record will have Z map entries?

This seems like it could be expressed as a compute function. I don't think it
would be very natural as a cast since it has a pretty strict requirement that
all fields in the struct have the same type and so it will be pretty limited.
I think you could have a compute function as well that went the opposite
direction.

I do agree with Alenka, if there is any way to create your original input data
as a map then that will have better performance.

On Wed, May 3, 2023 at 4:58 AM Jerald Alex <vminf...@gmail.com> wrote:

> Hi Alenka,
>
> Great! Thank you so much for your inputs.
>
> I have indeed tried to use schema when creating a table from a pylist and
> it worked but in my use case, I wouldn't know the table schema beforehand
> especially for the other columns - I need to do transformations before I
> can cast it to the expected schema. Please let me know if you have any
> other thoughts.
>
> Regards,
> Infant Alex
>
> On Wed, May 3, 2023 at 9:43 AM Alenka Frim <ale...@voltrondata.com.invalid>
> wrote:
>
> > Hi Alex,
> >
> > passing the schema to from_pylist() method on the Table should work for
> > your example (not sure if it solves your initial problem?)
> >
> > import pyarrow as pa
> >
> > table_schema = pa.schema([pa.field("id", pa.int32()),
> >     pa.field("names", pa.map_(pa.string(), pa.string()))])
> >
> > table_data = [{"id": 1,"names": {"first_name": "Tyler", "last_name":"Brady"}},
> >     {"id": 2,"names": {"first_name": "Walsh", "last_name": "Weaver"}}]
> >
> > pa.Table.from_pylist(table_data, schema=table_schema)
> > # pyarrow.Table
> > # id: int32
> > # names: map<string, string>
> > #   child 0, entries: struct<key: string not null, value: string> not null
> > #       child 0, key: string not null
> > #       child 1, value: string
> > # ----
> > # id: [[1,2]]
> > # names:
> > # [[keys:["first_name","last_name"]values:["Tyler","Brady"],keys:["first_name","last_name"]values:["Walsh","Weaver"]]]
> >
> > Best, Alenka
> >
> > On Wed, May 3, 2023 at 9:13 AM Jerald Alex <vminf...@gmail.com> wrote:
> >
> > > Any inputs on this please?
> > >
> > > On Tue, May 2, 2023 at 10:03 AM Jerald Alex <vminf...@gmail.com> wrote:
> > >
> > > > Hi Experts,
> > > >
> > > > Can anyone please highlight if it is possible to cast struct to map
> > > > type?
> > > >
> > > > I tried the following but it seems to be producing an error as below.
> > > >
> > > > pyarrow.lib.ArrowNotImplementedError: Unsupported cast from
> > > > struct<first_name: string, last_name: string> to map using function
> > > > cast_map
> > > >
> > > > Note: Snippet is just an example to show the problem.
> > > >
> > > > Code Snippet:
> > > >
> > > > table_schema = pa.schema([pa.field("id", pa.int32()),
> > > >     pa.field("names", pa.map_(pa.string(), pa.string()))])
> > > >
> > > > table_data = [{"id": 1,"names": {"first_name": "Tyler", "last_name": "Brady"}},
> > > >     {"id": 2,"names": {"first_name": "Walsh", "last_name": "Weaver"}}]
> > > >
> > > > tbl = pa.Table.from_pylist(table_data)
> > > > print(tbl)
> > > > print(tbl.cast(table_schema))
> > > > print(tbl)
> > > >
> > > > Error :
> > > >
> > > > id: int64
> > > > names: struct<first_name: string, last_name: string>
> > > >   child 0, first_name: string
> > > >   child 1, last_name: string
> > > > ----
> > > > id: [[1,2]]
> > > > names: [
> > > >   -- is_valid: all not null
> > > >   -- child 0 type: string
> > > > ["Tyler","Walsh"]
> > > >   -- child 1 type: string
> > > > ["Brady","Weaver"]]
> > > > Traceback (most recent call last):
> > > >   File "/Users/infant.a...@cognitedata.com/Documents/Github/HubOcean/demo/pyarrow_types.py", line 220, in <module>
> > > >     print(tbl.cast(table_schema))
> > > >   File "pyarrow/table.pxi", line 3489, in pyarrow.lib.Table.cast
> > > >   File "pyarrow/table.pxi", line 523, in pyarrow.lib.ChunkedArray.cast
> > > >   File "/Users/infant.a...@cognitedata.com/Library/Caches/pypoetry/virtualenvs/demo-LzMA3Hsd-py3.10/lib/python3.10/site-packages/pyarrow/compute.py", line 391, in cast
> > > >     return call_function("cast", [arr], options)
> > > >   File "pyarrow/_compute.pyx", line 560, in pyarrow._compute.call_function
> > > >   File "pyarrow/_compute.pyx", line 355, in pyarrow._compute.Function.call
> > > >   File "pyarrow/error.pxi", line 144, in pyarrow.lib.pyarrow_internal_check_status
> > > >   File "pyarrow/error.pxi", line 121, in pyarrow.lib.check_status
> > > > pyarrow.lib.ArrowNotImplementedError: Unsupported cast from
> > > > struct<first_name: string, last_name: string> to map using function
> > > > cast_map
> > > >
> > > > Regards,
> > > > Alex Vincent