No problem. I had something different in mind: I wanted to split this complete string into separate columns to simplify the queries, such as ASP.NET_SessionId, Rviewd, UserId, UserType, and LastLogin.
Now let me try your approach. I have seen this DDL in the Hive tutorial but wasn't sure whether I should use both array and map together. Let me try this out and I will let you know the result. Thanks for the clarification.

Thank You,
Manish

From: Bejoy KS [mailto:bejoy...@yahoo.com]
Sent: Friday, September 21, 2012 1:16 PM
To: Manish.Bhoge; user@hive.apache.org; user
Subject: Re: Map issue in Hive.

Hey Manish

Sorry if my post was not clear. You need to use either Array or Map for that, based on the data it holds.

Looking at your sample data

ASP.NET_SessionId=bzqgdenuhxxyqmc2vv5tvrdw;+Rviewd=;+UserId=%7bb5cecc61-cd09-4aa6-bc92-cae367f1753b%7d;+UserType=G;+LastLogin=9/11/2012+12:00:01+AM

I assume it needs to be split like this, where each pair has the format key '=' value and the key-value pairs are separated by ';':

ASP.NET_SessionId=bzqgdenuhxxyqmc2vv5tvrdw;
+Rviewd=;
+UserId=%7bb5cecc61-cd09-4aa6-bc92-cae367f1753b%7d;
+UserType=G;
+LastLogin=9/11/2012+12:00:01+AM

So you can have Map as the column data type, and the DDL should be like

COLLECTION ITEMS TERMINATED BY ';'
MAP KEYS TERMINATED BY '='

Hope it is clear now :)

Regards,
Bejoy KS

________________________________
From: Manish.Bhoge <manish.bh...@target.com>
To: "user@hive.apache.org" <user@hive.apache.org>; 'Bejoy KS' <bejoy...@yahoo.com>; user <u...@hadoop.apache.org>
Sent: Friday, September 21, 2012 1:01 PM
Subject: RE: Map issue in Hive.

Thanks Bejoy,

So you mean to say that in the scenario below we have to have both collection and map together? Do I need to define Array and Map together for the same column? As I understand from your mail, this column holds not only a Map but a collection of Maps. Is this assumption right?

Thank You,
Manish.

-----Original Message-----
From: Bejoy KS [mailto:bejoy...@yahoo.com]
Sent: Friday, September 21, 2012 10:50 AM
To: user@hive.apache.org; user
Subject: Re: Map issue in Hive.
Hi Manish

A couple of things to keep in mind here. If you have column data like this

"key1:value1;key2:value2;key3:value3;"

and this column has to be handled by a map data type, then the DDL should look like

FIELDS TERMINATED BY '<any char>'
COLLECTION ITEMS TERMINATED BY ';'
MAP KEYS TERMINATED BY ':'

i.e. when you have key-value pairs, the separator between pairs is specified using 'COLLECTION ITEMS TERMINATED BY' and the separator between the key and the value within each pair is specified using 'MAP KEYS TERMINATED BY'.

If your column is just a collection of elements rather than key-value pairs, you can use an Array data type instead. Here, just specify the delimiter between values using 'COLLECTION ITEMS TERMINATED BY'.

Regards,
Bejoy KS

________________________________
From: Manish <manishbh...@rocketmail.com>
To: user <u...@hadoop.apache.org>
Cc: user <user@hive.apache.org>
Sent: Friday, September 21, 2012 10:04 AM
Subject: Map issue in Hive.

Hivers,

I have a web log which I need to load into a single table. One column holds a complete string of important data, and I want to extract all of the information from that column for further analysis. The issue is that after giving ';' as a delimiter I was expecting a map entry for every occurrence of ';'. Instead, only the first delimiter (;) is considered and the rest of the string ends up in the value.

This is what the data in that column looks like:

ASP.NET_SessionId=bzqgdenuhxxyqmc2vv5tvrdw;+Rviewd=;+UserId=%7bb5cecc61-cd09-4aa6-bc92-cae367f1753b%7d;+UserType=G;+LastLogin=9/11/2012+12:00:01+AM

It is getting stored as below.

{"ASP.NET_SessionId":"bzqgdenuhxxyqmc2vv5tvrdw;+Rviewd=;+UserId=%7bb5cecc61-cd09-4aa6-bc92-cae367f1753b%7d;+UserType=G;+LastLogin=9/11/2012+12:00:01+AM"}

Below is the DDL.
CREATE external TABLE page_view_tmp_2 (
  C_0 STRING,
  C_1 MAP<STRING,STRING>,
  C_2 STRING,
  C_3 STRING,
  C_41 STRING)
COMMENT 'Page View'
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ' '
  MAP KEYS TERMINATED BY ';'
STORED AS TEXTFILE
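
For reference, here is a rough sketch of how this DDL might look with the delimiters suggested in the replies applied (';' between pairs, '=' between key and value). The table name page_view_tmp_3 and the column aliases are hypothetical; everything else is copied from the DDL above. One assumption to verify against real data: since each delimiter is a single character, the '+' that follows every ';' in the sample row would most likely stay attached to the following key, so the lookups below use keys such as '+UserId'.

```sql
-- Sketch only: same columns as the original DDL, with the item and key
-- delimiters from the replies added. The table name is hypothetical.
CREATE EXTERNAL TABLE page_view_tmp_3 (
  C_0  STRING,
  C_1  MAP<STRING,STRING>,
  C_2  STRING,
  C_3  STRING,
  C_41 STRING)
COMMENT 'Page View'
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ' '
  COLLECTION ITEMS TERMINATED BY ';'   -- separates one key=value pair from the next
  MAP KEYS TERMINATED BY '='           -- separates a key from its value
STORED AS TEXTFILE;

-- Read individual map entries as separate columns.
-- Assumption: keys after the first keep the leading '+' from the raw string.
SELECT
  C_1['ASP.NET_SessionId'] AS session_id,
  C_1['+UserId']           AS user_id,
  C_1['+UserType']         AS user_type,
  C_1['+LastLogin']        AS last_login
FROM page_view_tmp_3;
```

If the keys turn out not to carry the '+' prefix, drop it from the lookups. Running SELECT C_1 FROM page_view_tmp_3 LIMIT 1 first should show the actual key names.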