;) I actually thought it was a clever choice on Hive's part. There's no real need for the 2nd tier separators, despite the nested collections!
However, it's still tricky to know what Hive expects when you're generating table data with other apps.

dean

On Thu, Jun 20, 2013 at 9:34 PM, Stephen Sprague <sprag...@gmail.com> wrote:

> look at it the other way around if you want. knowing an array of a two
> element struct is topologically the same as a map - they darn well better
> be the same. :)
>
> On Thu, Jun 20, 2013 at 7:00 PM, Dean Wampler <deanwamp...@gmail.com> wrote:
>
>> It's not as "simple" as it seems, as I discovered yesterday, to my
>> surprise. I created a table like this:
>>
>> CREATE TABLE t (
>>   name STRING,
>>   stuff ARRAY<STRUCT<foo:String, bar:INT>>);
>>
>> I then used an insert statement to see how Hive would store the records,
>> so I could populate the real table with another process. Hive used ^A for
>> the field separator, ^B for the collection separator (in this case, to
>> separate structs in the array), and ^C to separate the elements in each
>> struct, e.g.:
>>
>> Dean Wampler^Afirst^C1^Bsecond^C2^Bthird^C3
>>
>> In other words, the structure you would expect for this table:
>>
>> CREATE TABLE t (
>>   name STRING,
>>   stuff MAP<String, INT>);
>>
>> We should have covered the permutations of nested structures in our book,
>> but we didn't. It would be great to document them, for realz, somewhere.
>>
>> dean
>>
>> On Thu, Jun 20, 2013 at 9:56 AM, Stephen Sprague <sprag...@gmail.com> wrote:
>>
>>> you only get three. field separator, array elements separator (aka
>>> collection delimiter), and map key/value separator (aka map key
>>> delimiter).
>>>
>>> when you nest deeper then you gotta use the default '^D', '^E' etc for
>>> each level. At least that's been my experience which i've found has worked
>>> successfully.
>>>
>>> On Thu, Jun 20, 2013 at 7:45 AM, neha <ms.nehato...@gmail.com> wrote:
>>>
>>>> Thanks a lot for your reply, Stephen.
>>>> To answer your question - I was not aware of the fact that we could use
>>>> the delimiter (in my example, '|') for the first level of nesting. I tried
>>>> now and it worked fine.
>>>>
>>>> My next question - Is there any way to provide a delimiter in DDL for
>>>> the second level of nesting?
>>>> Thanks again!!
>>>>
>>>> On Thu, Jun 20, 2013 at 8:02 PM, Stephen Sprague <sprag...@gmail.com> wrote:
>>>>
>>>>> it's all there in the documentation under "create table" and it seems
>>>>> you got everything right too except one little thing - in your second
>>>>> example there for 'sample data loaded' - instead of '^B' change that to
>>>>> '|' and you should be good. That's the delimiter that separates your two
>>>>> array elements - ie collections.
>>>>>
>>>>> i guess the real question for me is when you say 'since there is no
>>>>> way to use given delimiter "|" ' what did you mean by that?
>>>>>
>>>>> On Thu, Jun 20, 2013 at 1:42 AM, neha <ms.nehato...@gmail.com> wrote:
>>>>>
>>>>>> Hi All,
>>>>>>
>>>>>> I have 2 questions about complex data types in nested composition.
>>>>>>
>>>>>> 1 >> I did not find a way to provide delimiter information in DDL if
>>>>>> one or more columns have a nested array/struct. In this case, the
>>>>>> default delimiter has to be used for the complex type column.
>>>>>> Please let me know if this is a limitation as of now or I am missing
>>>>>> something.
>>>>>>
>>>>>> e.g.:
>>>>>> *DDL*:
>>>>>> hive> create table example(col1 int, col2
>>>>>> array<struct<st1:int,st2:string>>) row format delimited fields terminated
>>>>>> by ',';
>>>>>> OK
>>>>>> Time taken: 0.226 seconds
>>>>>>
>>>>>> *Sample data loaded:*
>>>>>> 1,1^Cstring1^B2^Cstring2
>>>>>>
>>>>>> *O/P:*
>>>>>> hive> select * from example;
>>>>>> OK
>>>>>> 1 [{"st1":1,"st2":"string1"},{"st1":2,"st2":"string2"}]
>>>>>> Time taken: 0.288 seconds
>>>>>>
>>>>>> 2 >> For the same DDL given above, if we provide the clause *collection
>>>>>> items terminated by '|'* and still use default delimiters (since
>>>>>> there is no way to use given delimiter '|') then the select query shows
>>>>>> incorrect data.
>>>>>> Please let me know if this is something expected.
>>>>>>
>>>>>> e.g.
>>>>>> *DDL*:
>>>>>> hive> create table example(col1 int, col2
>>>>>> array<struct<st1:int,st2:string>>) row format delimited fields terminated
>>>>>> by ',' collection items terminated by '|';
>>>>>> OK
>>>>>> Time taken: 0.175 seconds
>>>>>>
>>>>>> *Sample data loaded:*
>>>>>> 1,1^Cstring1^B2^Cstring2
>>>>>>
>>>>>> *O/P:*
>>>>>> hive> select * from example;
>>>>>> OK
>>>>>> 1 [{"st1":1,"st2":"string1\u00022"}]
>>>>>> Time taken: 0.141 seconds
>>>>>>
>>>>>> Thanks & Regards.
>>
>> --
>> Dean Wampler, Ph.D.
>> @deanwampler
>> http://polyglotprogramming.com

--
Dean Wampler, Ph.D.
@deanwampler
http://polyglotprogramming.com
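
For anyone generating table data from another process, here is a minimal Python sketch of the depth-based default separators described in this thread (^A for fields, ^B for collection items, ^C for the next level down, then ^D, ^E, and so on). The names `SEPARATORS`, `encode`, and `encode_row` are hypothetical, and using a tuple for a struct and a list for an array is just a convention for this example. The map branch, which uses one level for entries and the next for key/value, is an assumption consistent with the observation above that an array of two-element structs is stored the same way as a map; check it against your Hive version before relying on it.

```python
# Control characters used as default separators, indexed by nesting depth:
# depth 0 = ^A (fields), 1 = ^B (collection items), 2 = ^C, 3 = ^D, ...
SEPARATORS = [chr(i) for i in range(1, 9)]


def encode(value, depth):
    """Serialize one value; `depth` is the separator level used for its items."""
    if isinstance(value, dict):
        # Map (assumption): entries split at `depth`, key/value at `depth + 1`.
        key_value_sep = SEPARATORS[depth + 1]
        return SEPARATORS[depth].join(
            encode(k, depth + 2) + key_value_sep + encode(v, depth + 2)
            for k, v in value.items()
        )
    if isinstance(value, (list, tuple)):
        # Array (list) or struct (tuple): items split at `depth`.
        return SEPARATORS[depth].join(encode(v, depth + 1) for v in value)
    # Primitive: plain text. NULLs and booleans are not handled in this sketch.
    return str(value)


def encode_row(columns):
    """Join top-level columns with ^A; nested items start at depth 1 (^B)."""
    return SEPARATORS[0].join(encode(c, 1) for c in columns)


# The row from Dean's example: STRING plus ARRAY<STRUCT<foo:String, bar:INT>>.
row = encode_row(["Dean Wampler", [("first", 1), ("second", 2), ("third", 3)]])
print(repr(row))
```

Passing a dict such as `{"first": 1, "second": 2, "third": 3}` as the second column produces the same bytes under this sketch, which mirrors the array-of-struct versus map equivalence noted above. A table created with custom `fields terminated by` or `collection items terminated by` clauses would need the first entries of `SEPARATORS` swapped accordingly.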