look at it the other way around if you want. knowing that an array of two-element structs is topologically the same as a map - they darn well better be stored the same. :)
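To make the equivalence concrete, here is a rough Python sketch of the delimiter-per-nesting-level idea discussed in this thread. It is a simplified model and not Hive's actual SerDe code: the names `DEFAULT_SEPARATORS`, `encode`, and `encode_row` are made up for illustration, and the eight-entry separator list simply extends the ^A, ^B, ^C, ^D, ^E pattern mentioned in the quoted messages.

```python
# Simplified model of "one delimiter per nesting level" text encoding.
# NOT Hive's real implementation; all names here are hypothetical.
DEFAULT_SEPARATORS = [chr(i) for i in range(1, 9)]  # ^A, ^B, ^C, ^D, ^E, ...


def encode(value, level, seps=DEFAULT_SEPARATORS):
    """Serialize a nested value, using seps[level] at this nesting level."""
    if isinstance(value, dict):
        # A map uses one level between entries and the next level
        # between each key and its value.
        return seps[level].join(
            encode(k, level + 2, seps) + seps[level + 1] + encode(v, level + 2, seps)
            for k, v in value.items()
        )
    if isinstance(value, (list, tuple)):
        # Arrays (lists) and structs (tuples) use one level between
        # their elements or fields.
        return seps[level].join(encode(v, level + 1, seps) for v in value)
    return str(value)


def encode_row(columns, seps=DEFAULT_SEPARATORS):
    """Top-level columns are separated by the level-0 (field) delimiter."""
    return seps[0].join(encode(c, 1, seps) for c in columns)


# ARRAY<STRUCT<foo:STRING, bar:INT>> versus MAP<STRING, INT>
as_array = encode_row(["Dean Wampler", [("first", 1), ("second", 2), ("third", 3)]])
as_map = encode_row(["Dean Wampler", {"first": 1, "second": 2, "third": 3}])
print(as_array == as_map)  # True: both layouts produce the same bytes

# Overriding the first two levels, as in
# "fields terminated by ',' collection items terminated by '|'"
custom = [",", "|"] + DEFAULT_SEPARATORS[2:]
print(repr(encode_row([1, [(1, "string1"), (2, "string2")]], custom)))
```

In this model the struct fields inside the array sit at the third level, which is why they end up separated by the same delimiter a map would use between key and value.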
On Thu, Jun 20, 2013 at 7:00 PM, Dean Wampler <deanwamp...@gmail.com> wrote:

> It's not as "simple" as it seems, as I discovered yesterday, to my
> surprise. I created a table like this:
>
> CREATE TABLE t (
>   name STRING,
>   stuff ARRAY<STRUCT<foo:String, bar:INT>>);
>
> I then used an insert statement to see how Hive would store the records,
> so I could populate the real table with another process. Hive used ^A for
> the field separator, ^B for the collection separator (in this case, to
> separate structs in the array), and ^C to separate the elements in each
> struct, e.g.:
>
> Dean Wampler^Afirst^C1^Bsecond^C2^Bthird^C3
>
> In other words, the structure you would expect for this table:
>
> CREATE TABLE t (
>   name STRING,
>   stuff MAP<String, INT>);
>
> We should have covered the permutations of nested structures in our book,
> but we didn't. It would be great to document them, for realz, somewhere.
>
> dean
>
> On Thu, Jun 20, 2013 at 9:56 AM, Stephen Sprague <sprag...@gmail.com> wrote:
>
>> you only get three: field separator, array elements separator (aka
>> collection delimiter), and map key/value separator (aka map key
>> delimiter).
>>
>> when you nest deeper then you gotta use the default '^D', '^E' etc for
>> each level. At least that's been my experience, which i've found has
>> worked successfully.
>>
>>
>> On Thu, Jun 20, 2013 at 7:45 AM, neha <ms.nehato...@gmail.com> wrote:
>>
>>> Thanks a lot for your reply, Stephen.
>>> To answer your question - I was not aware of the fact that we could use
>>> a delimiter (in my example, '|') for the first level of nesting. I tried
>>> now and it worked fine.
>>>
>>> My next question - Is there any way to provide a delimiter in DDL for
>>> the second level of nesting?
>>> Thanks again!!
>>>
>>>
>>> On Thu, Jun 20, 2013 at 8:02 PM, Stephen Sprague <sprag...@gmail.com> wrote:
>>>
>>>> it's all there in the documentation under "create table" and it seems
>>>> you got everything right too except one little thing - in your second
>>>> example there for 'sample data loaded' - instead of '^B' change that to
>>>> '|' and you should be good. That's the delimiter that separates your two
>>>> array elements - i.e. collections.
>>>>
>>>> i guess the real question for me is when you say 'since there is no way
>>>> to use given delimiter "|"' what did you mean by that?
>>>>
>>>>
>>>>
>>>> On Thu, Jun 20, 2013 at 1:42 AM, neha <ms.nehato...@gmail.com> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> I have 2 questions about complex data types in nested composition.
>>>>>
>>>>> 1) I did not find a way to provide delimiter information in DDL if
>>>>> one or more columns have a nested array/struct. In this case, the
>>>>> default delimiter has to be used for the complex type column.
>>>>> Please let me know if this is a limitation as of now or I am missing
>>>>> something.
>>>>>
>>>>> e.g.:
>>>>> *DDL*:
>>>>> hive> create table example(col1 int, col2
>>>>> array<struct<st1:int,st2:string>>) row format delimited fields terminated
>>>>> by ',';
>>>>> OK
>>>>> Time taken: 0.226 seconds
>>>>>
>>>>> *Sample data loaded:*
>>>>> 1,1^Cstring1^B2^Cstring2
>>>>>
>>>>> *O/P:*
>>>>> hive> select * from example;
>>>>> OK
>>>>> 1 [{"st1":1,"st2":"string1"},{"st1":2,"st2":"string2"}]
>>>>> Time taken: 0.288 seconds
>>>>>
>>>>> 2) For the same DDL given above, if we provide the clause *collection
>>>>> items terminated by '|'* and still use default delimiters (since
>>>>> there is no way to use given delimiter '|') then the select query shows
>>>>> incorrect data.
>>>>> Please let me know if this is something expected.
>>>>>
>>>>> e.g.
>>>>> *DDL*:
>>>>> hive> create table example(col1 int, col2
>>>>> array<struct<st1:int,st2:string>>) row format delimited fields terminated
>>>>> by ',' collection items terminated by '|';
>>>>> OK
>>>>> Time taken: 0.175 seconds
>>>>>
>>>>> *Sample data loaded:*
>>>>> 1,1^Cstring1^B2^Cstring2
>>>>>
>>>>> *O/P:*
>>>>> hive> select * from example;
>>>>> OK
>>>>> 1 [{"st1":1,"st2":"string1\u00022"}]
>>>>> Time taken: 0.141 seconds
>>>>>
>>>>> Thanks & Regards.
>
> --
> Dean Wampler, Ph.D.
> @deanwampler
> http://polyglotprogramming.com
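
For the second case in the original question (table declared with collection items terminated by '|', data still using ^B), the output can be reproduced with a small Python sketch of a delimited-text reader. This is only a model under stated assumptions, not Hive's actual parsing code: `parse_array_of_structs` is a hypothetical helper, values are left as strings, and dropping fields beyond the declared struct width is assumed because it matches the output shown in the question.

```python
# Rough model of reading array<struct<st1:int,st2:string>> from delimited text.
# NOT Hive's real parser; parse_array_of_structs is a hypothetical helper.
def parse_array_of_structs(text, item_sep, field_sep, field_names):
    structs = []
    for item in text.split(item_sep):
        # Assumed behavior: fields beyond the declared ones are dropped.
        values = item.split(field_sep)[: len(field_names)]
        structs.append(dict(zip(field_names, values)))
    return structs


data = "1\x03string1\x022\x03string2"  # 1^Cstring1^B2^Cstring2
names = ["st1", "st2"]

# Default delimiters: ^B separates the two structs, ^C their fields.
with_defaults = parse_array_of_structs(data, "\x02", "\x03", names)

# Collection delimiter declared as '|': ^B is now ordinary data, so the
# whole column is read as a single struct.
with_pipe = parse_array_of_structs(data, "|", "\x03", names)

print(with_defaults)
print(with_pipe)
```

Under these assumptions the second call yields a single struct whose `st2` value still contains the raw ^B character, which corresponds to the `"string1\u00022"` seen in the select output.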