I've never seen nested collections in Hive, so I'm not sure how "array
<struct <a string, b int, c map<string, string>>>" would be handled.

In my case, my arrays and maps always contain values of primitive
types, such as string, int, bigint, etc.
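
For that primitive-only case, here is a minimal Python sketch of how one
row could be written. It assumes a hypothetical table with the columns
(f1 int, f2 string, arr array<string>, m map<string,string>) and the same
delimiters as the quoted DDL ('|' for fields, ',' for collection items,
'=' for map keys). The function name to_hive_line is made up for
illustration, and the sketch does not cover the nested f3 case.

```python
def to_hive_line(f1, f2, items, mapping):
    """Build one text row for a hypothetical table
    (f1 int, f2 string, arr array<string>, m map<string,string>)
    using FIELDS '|', COLLECTION ITEMS ',' and MAP KEYS '='."""
    # Array elements are joined with the collection-items delimiter.
    array_text = ",".join(items)
    # Map entries are joined with the collection-items delimiter;
    # each key and value is joined with the map-keys delimiter.
    map_text = ",".join(f"{key}={value}" for key, value in mapping.items())
    # Top-level columns are joined with the field delimiter.
    return "|".join([str(f1), f2, array_text, map_text])


line = to_hive_line(1, "foo", ["a", "b", "c"], {"k1": "v1", "k2": "v2"})
print(line)  # 1|foo|a,b,c|k1=v1,k2=v2
```

Note that the values themselves must not contain any of the delimiter
characters, since this sketch does no escaping.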

On Fri, Sep 28, 2012 at 1:01 PM, Sadananda Hegde <saduhe...@gmail.com> wrote:
> How does "collection items terminated by" work on a nested structure? Say
> the table is created with the DDL:
>
> CREATE TABLE table_1(f1 int, f2 string, f3  array <struct <a string, b int,
> c map<string, string>>>)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '|'
> COLLECTION ITEMS TERMINATED BY ','
> MAP KEYS TERMINATED BY '='
> LINES TERMINATED BY '\n'
> STORED AS TEXTFILE;
>
> I guess the comma separator will be used for the items in the outermost
> structure (i.e. the array). Is that true?
>  1. What would be the separator character between a, b and c (struct
> elements)?
>  2. What would be the separator for map elements?
>  3. Is there a way to explicitly specify those ITEMS separators rather than
> using the default ones like ^B, ^C, etc. (like multiple collection items)?
>
> The original data is in XML format (a complex one with many nested levels)
> and we are planning to parse that XML using a Java parser into a delimited
> text file which can be used to load the Hive table. My question is:
>      "How should we be representing f3-like structures in the data
> file?"
>
> The actual file has many more fields, with quite a few complex types like f3
> above, but I guess the logic would be the same.
>
> Thanks for your help.
>
> Regards,
> Sadu