Thanks Manish. It's a good article, but it's still not clear to me how you define the separators when the column is of a nested type (like array of maps, maps of arrays, etc.).
Just a clarification on item 2 below.

> 2. What would be the separator for map elements?
> For Map element separator is "="

'=' is the MAP key separator. What I mean is the item separator when the map contains multiple key/value pairs, like (Key1=Value1; Key2=Value2; Key3=Value3....). Here '=' is the key separator and ';' is the item separator. I can handle the above example with COLLECTION ITEMS TERMINATED BY ';' and MAP KEYS TERMINATED BY '=' if the element is of type MAP.

COLLECTION ITEMS TERMINATED BY ',' works on all three data types (maps, arrays, structs) when they are by themselves. The problem is defining the separators for nested structures, because we need multiple separators: one separator for array items, a different separator for map items defined within that array, etc.

The default Hive delimiters work just fine. The delimiters in that case will be '^A' for level 1, '^B' for level 2, '^C' for level 3, etc. What I am trying to do is to explicitly define them. The COLLECTION ITEMS TERMINATED BY ',' statement addresses the first level (^A), but I don't know how to define the separators for the other levels (to use instead of ^B, ^C, etc.).

Thanks,
Sadu

On Fri, Sep 28, 2012 at 1:28 AM, Manish.Bhoge <manish.bh...@target.com> wrote:

> Hi Sadu,
>
> See my answer below.
>
> Also this will help you to understand in detail about collection, MAP and Array.
>
> http://datumengineering.wordpress.com/2012/09/27/agility-in-hive-map-array-score-for-hive/
>
> From: Sadananda Hegde [mailto:saduhe...@gmail.com]
> Sent: Friday, September 28, 2012 10:31 AM
> To: user@hive.apache.org
> Subject: Defining collection items terminated by for a nested data type
>
> How does "collection items terminated by" work on a nested structure?
> Say the table is created with the DDL:
>
> CREATE TABLE table_1(f1 int, f2 string, f3 array<struct<a string, b int, c map<string, string>>>)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '|'
> COLLECTION ITEMS TERMINATED BY ','
> MAP KEYS TERMINATED BY '='
> LINES TERMINATED BY '\n'
> STORED AS TEXTFILE;
>
> I guess the comma separator will be used for the items in the outermost structure (i.e. the array). Is that true?
>
> Yes. Right, comma is a separator for array.
>
> 1. What would be the separator character between a, b and c (struct elements)?
>
> I think it is \n. Not very sure about this.
>
> 2. What would be the separator for map elements?
>
> For Map element separator is "="
>
> 3. Is there a way to explicitly specify those ITEMS separators rather than using the default ones like ^B, ^C, etc. (like multiple collection items)?
>
> You can define the custom separator. But multiple collection seems infeasible.
>
> The original data is in XML format (a complex one with many nested levels) and we are planning to parse that XML using a Java parser into a delimited text file which can be used to load the Hive table. My question is:
>
> "How should we be representing the f3-like structures in the data file?"
>
> The actual file has many more fields with quite a few complex types like f3 above, but I guess the logic would be the same.
>
> --- For this either you need to write custom input reader in MAP-REDUCE or use custom serde.
>
> Thanks for your help.
>
> Regards,
> Sadu
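
For the question of how f3-like values could be laid out in the data file, here is a minimal sketch in Python (the same logic would port to a Java parser). It rests on an assumption that should be verified against the Hive version in use: that the delimited text SerDe picks its separator by nesting depth, with depth 0 taken from FIELDS TERMINATED BY, depth 1 from COLLECTION ITEMS TERMINATED BY, depth 2 from MAP KEYS TERMINATED BY, and deeper levels falling back to control characters (^D, ^E, ...). A map is assumed to use two consecutive levels, one between entries and one between key and value. The names `SEPARATORS`, `encode`, and `encode_row`, and the sample row, are hypothetical.

```python
# ASSUMPTION: separators are chosen by nesting depth. Depths 0-2 are the
# delimiters from the DDL in this thread; deeper levels are assumed to fall
# back to control characters. Verify this against your Hive version.
SEPARATORS = ["|", ",", "=", "\x04", "\x05", "\x06", "\x07", "\x08"]


def encode(value, depth):
    """Serialize a nested value, picking the separator by nesting depth."""
    if isinstance(value, dict):
        # A map uses two levels: one between entries, one between key and value.
        item_sep, key_sep = SEPARATORS[depth], SEPARATORS[depth + 1]
        return item_sep.join(
            encode(k, depth + 2) + key_sep + encode(v, depth + 2)
            for k, v in value.items()
        )
    if isinstance(value, (list, tuple)):
        # Arrays (lists) and structs (tuples) use one level each.
        return SEPARATORS[depth].join(encode(v, depth + 1) for v in value)
    # Scalars are written as-is; this sketch does no escaping.
    return str(value)


def encode_row(columns):
    """Join top-level columns with the depth-0 separator; columns start at depth 1."""
    return SEPARATORS[0].join(encode(c, 1) for c in columns)


# Hypothetical row for table_1:
# f1 int, f2 string, f3 array<struct<a string, b int, c map<string,string>>>
row = [1, "x", [("a1", 10, {"k1": "v1", "k2": "v2"}), ("a2", 20, {"k3": "v3"})]]
line = encode_row(row)
print(repr(line))
```

Under that assumption, the struct members inside the array would end up separated by the MAP KEYS character ('='), and the entries of the inner map by control characters, which is why a single COLLECTION ITEMS TERMINATED BY clause cannot describe every level. Values that contain a separator character would need escaping, which this sketch omits.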