I have never used nested collections in Hive, so I'm not sure how "array<struct<a string, b int, c map<string, string>>>" would behave.
In my case, my arrays and maps always contain values of primitive types, such as string, int, bigint, etc.

On Fri, Sep 28, 2012 at 1:01 PM, Sadananda Hegde <saduhe...@gmail.com> wrote:

> How does "collection items terminated by" work on a nested structure? Say
> the table is created with the DDL:
>
> CREATE TABLE table_1(f1 int, f2 string, f3 array<struct<a string, b int,
> c map<string, string>>>)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '|'
> COLLECTION ITEMS TERMINATED BY ','
> MAP KEYS TERMINATED BY '='
> LINES TERMINATED BY '\n'
> STORED AS TEXTFILE;
>
> I guess the comma separator will be used for the items in the outermost
> structure (i.e. the array). Is that true?
> 1. What would be the separator character between a, b and c (the struct
> elements)?
> 2. What would be the separator for the map elements?
> 3. Is there a way to explicitly specify those ITEMS separators rather than
> using the default ones like ^B, ^C, etc. (like multiple collection items)?
>
> The original data is in XML format (a complex one with many nested levels),
> and we are planning to parse that XML using a Java parser into a delimited
> text file which can be used to load the Hive table. My question is:
> "How should we be representing f3-like structures in the data file?"
>
> The actual file has many more fields, with quite a few complex types like f3
> above, but I guess the logic would be the same.
>
> Thanks for your help.
>
> Regards,
> Sadu
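
For reference, here is a rough Java sketch of how a row of table_1 might be written out. It rests on an assumption that should be verified against your Hive version: that the delimited SerDe assigns separators by nesting depth, so the FIELDS delimiter is used at the top level, the COLLECTION ITEMS delimiter one level down, the MAP KEYS delimiter at the next level, and the control characters ^D and ^E at the levels after that. The class and method names are hypothetical, not part of Hive.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not a Hive API): formats one row of table_1 as text.
// The level-to-delimiter mapping below is an ASSUMPTION about how Hive's
// delimited SerDe treats nested types; test it on a small file first.
final class Table1RowWriter {
    static final char FIELD = '|';          // level 0: between f1, f2 and f3 (from the DDL)
    static final char ARRAY_ITEM = ',';     // level 1: between elements of the f3 array (from the DDL)
    static final char STRUCT_FIELD = '=';   // level 2: between a, b and c (assumed: MAP KEYS delimiter is reused)
    static final char MAP_ENTRY = '\u0004'; // level 3: between map entries (assumed default, ^D)
    static final char MAP_KV = '\u0005';    // level 4: between a key and its value (assumed default, ^E)

    // Encodes the map field c; iteration order follows the map passed in.
    static String encodeMap(Map<String, String> c) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : c.entrySet()) {
            if (sb.length() > 0) {
                sb.append(MAP_ENTRY);
            }
            sb.append(e.getKey()).append(MAP_KV).append(e.getValue());
        }
        return sb.toString();
    }

    // Encodes one struct<a string, b int, c map<string,string>> element.
    static String encodeStruct(String a, int b, Map<String, String> c) {
        return a + STRUCT_FIELD + b + STRUCT_FIELD + encodeMap(c);
    }

    // Encodes a full row; f3Items holds already-encoded struct elements.
    static String encodeRow(int f1, String f2, List<String> f3Items) {
        return String.valueOf(f1) + FIELD + f2 + FIELD
                + String.join(String.valueOf(ARRAY_ITEM), f3Items);
    }
}
```

Under those assumptions, a row with f1 = 7, f2 = "name" and two f3 elements would come out as `7|name|x=1=k1^Ev1^Dk2^Ev2,y=2=k3^Ev3`, where ^D and ^E stand for the bytes 0x04 and 0x05. The sketch does no escaping, so it only works if the values never contain any of the delimiter characters.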