Hive doesn't know it needs to skip your square brackets, so your array
elements are really "[1", "2", and "3]". "[1" and "3]" cannot be parsed as
numbers, so they become null.

I think you should declare the second column as type string, so that it is
read as "[1,2,3]". Then you can remove the brackets and use a UDF (write your
own if there isn't one) to generate an integer array from the stripped string.
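
If you end up writing your own UDF, here is a rough sketch of the parsing
logic it could wrap. This is only an illustration under my own assumptions:
the class name BracketArrayParser and the method parse are made up, and the
code deliberately has no Hive dependency. In a real UDF you would call
something like this from the evaluate() method of your UDF class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hive: the parsing logic a custom UDF
// could delegate to from its evaluate() method.
final class BracketArrayParser {

    // Turns a string such as "[1,2,3]" into the list [1, 2, 3].
    // Returns null for null input, so a NULL column stays NULL.
    static List<Long> parse(String raw) {
        if (raw == null) {
            return null;
        }
        String s = raw.trim();
        // Drop one leading '[' and one trailing ']' if present.
        if (s.startsWith("[")) {
            s = s.substring(1);
        }
        if (s.endsWith("]")) {
            s = s.substring(0, s.length() - 1);
        }
        List<Long> result = new ArrayList<>();
        if (s.trim().isEmpty()) {
            return result; // "[]" becomes an empty list
        }
        for (String token : s.split(",")) {
            try {
                result.add(Long.parseLong(token.trim()));
            } catch (NumberFormatException e) {
                // An element that is not a number becomes null,
                // matching what you saw for "[1" and "3]".
                result.add(null);
            }
        }
        return result;
    }
}
```

With this, BracketArrayParser.parse("[11,12,13]") gives the three values 11,
12 and 13, which the UDF can then return as the array column.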


On Mon, Sep 22, 2014 at 6:12 PM, Ankita Bakshi <ankita.bak...@gmail.com>
wrote:

> Hi,
>
> I have a '|' delimited file where arrays are serialized with square
> brackets. I am trying to create a Hive table to parse this file.
>
> Example:
>
> first|[1,2,3]|100
>
> second|[11,12,13]|200
>
>
> Create External Table H_histoTest(dim1 string, hist ARRAY<BIGINT>,
> measure1 bigint)
>
> ROW FORMAT DELIMITED FIELDS
>
> TERMINATED BY '|'
>
> COLLECTION ITEMS TERMINATED BY ','
>
> LINES TERMINATED BY '\n'
>
> LOCATION '/user/ankita/hive/histoTest';
>
>
> hive> select * from H_histoTest;
>
> first [null,2,null] 100
>
> second [null,12,null] 200
>
>
>
> If I remove the square brackets then the array is parsed correctly.
>
> first|1,2,3|100
>
> second|11,12,13|200
>
>
> hive> select * from H_histoTest;
>
> first [1,2,3] 100
>
> second [11,12,13] 200
>
>
> Let me know if I am missing something.
>
>
> Thanks,
> Ankita
>
