Hi Li,

You want to use StructArray.from_arrays:

import pyarrow as pa

a_values = pa.array([1])
b_values = pa.array([2])
struct_values = pa.StructArray.from_arrays(['a', 'b'], [a_values, b_values])

In [4]: struct_values
Out[4]:
<pyarrow.lib.StructArray object at 0x7f06b9d1edb8>
[
  {'b': 2, 'a': 1}
]

You can find other examples in the Python test suite. I would very much
appreciate contributions to the Python documentation about this.

I also want to be able to construct struct type data from sequences of
Python dictionaries. This is likely to be a good amount of work to do
in generality (particularly if you want to support type inference):
https://issues.apache.org/jira/browse/ARROW-1705
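
Until that exists, one workaround is to pivot the dictionaries into one
list per field yourself and pass explicit types, so no type inference is
needed. The following is only a minimal sketch: rows_to_columns is a
hypothetical helper, not part of pyarrow, and the final call assumes a
newer pyarrow release in which from_arrays takes the arrays first and the
field names as the names keyword (the call shown earlier in this message
uses the argument order of the release current at the time of writing).

```python
def rows_to_columns(rows, field_names):
    """Pivot a sequence of dicts into one list of values per field.

    Hypothetical helper for illustration; not part of pyarrow.
    """
    # A key missing from a row becomes None, i.e. a null in that column.
    return {name: [row.get(name) for row in rows] for name in field_names}


rows = [{'a': 1.0, 'b': 2.0}, {'a': 3.0}]
columns = rows_to_columns(rows, ['a', 'b'])
print(columns)

try:
    import pyarrow as pa
except ImportError:
    pa = None  # the pivot above does not depend on pyarrow

if pa is not None:
    names = list(columns)
    # Explicit types sidestep type inference entirely.
    arrays = [pa.array(columns[name], type=pa.float64()) for name in names]
    # Assumption: newer signature, arrays first and names as a keyword.
    struct_values = pa.StructArray.from_arrays(arrays, names=names)
    print(struct_values.type)
```

This only covers flat dictionaries whose field names and types are known
up front. Nested structs and inferred types are the harder part.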

- Wes

On Mon, Dec 18, 2017 at 5:44 PM, Li Jin <ice.xell...@gmail.com> wrote:
> Also tried this:
>
>>>> pa.array([{'a': 1.0, 'b': 2.0}], pa.struct([pa.field('a',
> pa.float64()), pa.field('b', pa.float64())]))
>
> Traceback (most recent call last):
>
>   File "<stdin>", line 1, in <module>
>
>   File "array.pxi", line 56, in pyarrow.lib.array
>
>   File "error.pxi", line 85, in pyarrow.lib.check_status
>
> pyarrow.lib.ArrowNotImplementedError: No type converter implemented for
> struct<a: double, b: double>
>
>
>
>
> On Mon, Dec 18, 2017 at 5:35 PM, Li Jin <ice.xell...@gmail.com> wrote:
>
>> Hey folks,
>>
>> What's the best way to create a pyarrow.Array of struct? I tried to create a
>> pyarrow.Array from a pd.Series of dict but it doesn't seem to work (0.7.1):
>>
>> >>> s
>>
>> 0    {'a': 1, 'b': 2}
>>
>> Name: stats, dtype: object
>>
>> >>> pa.Array.from_pandas(s)
>>
>> Traceback (most recent call last):
>>
>>   File "<stdin>", line 1, in <module>
>>
>>   File "array.pxi", line 225, in pyarrow.lib.Array.from_pandas
>>
>>   File "error.pxi", line 77, in pyarrow.lib.check_status
>>
>> pyarrow.lib.ArrowInvalid: Error inferring Arrow type for Python object
>> array. Got Python object of type dict but can only handle these types:
>> string, bool, float, int, date, time, decimal, list, array
>>
>> >>> pa.Array.from_pandas(df)
>>
>> Traceback (most recent call last):
>>
>>   File "<stdin>", line 1, in <module>
>>
>>   File "array.pxi", line 225, in pyarrow.lib.Array.from_pandas
>>
>>   File "error.pxi", line 77, in pyarrow.lib.check_status
>>
>> pyarrow.lib.ArrowInvalid: Error inferring Arrow type for Python object
>> array. Got Python object of type dict but can only handle these types:
>> string, bool, float, int, date, time, decimal, list, array
>>
>>
>> What's the correct way to do this?
>>
>>
>>
>>
>>
