Hi Mahsa
      It is possible to store unstructured data in Hive if the records follow a 
constant pattern, as log files do. You need to use a SerDe for this. A good 
approach is to parse the text line by line with a regular expression, and you 
can use RegexSerDe for that. In the SerDe properties, define:
input.regex - the regular expression; each capturing group maps to one column
output.format.string - the format string that lays the column values out again when rows are written (%1$s is the first column, and so on)

An example of Apache web log analytics is given in the Hive wiki:

https://cwiki.apache.org/Hive/gettingstarted.html#GettingStarted-ApacheWeblogData 


add jar ../build/contrib/hive_contrib.jar;

-- Keep each SERDEPROPERTIES value on a single line; a line break inside the
-- quoted string would become part of the regex.
CREATE TABLE apachelog (
  host STRING, identity STRING, user STRING, time STRING, request STRING,
  status STRING, size STRING, referer STRING, agent STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  "input.regex" = "([^ ]*) ([^ ]*) ([^ ]*) (-|\\[[^\\]]*\\]) ([^ \"]*|\"[^\"]*\") (-|[0-9]*) (-|[0-9]*)(?: ([^ \"]*|\".*\") ([^ \"]*|\".*\"))?",
  "output.format.string" = "%1$s %2$s %3$s %4$s %5$s %6$s %7$s %8$s %9$s"
)
STORED AS TEXTFILE;
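
If you want to check how the regex splits a log line into the nine columns before creating the table, a quick sketch like the one below may help. It is only an approximation under a few assumptions: it uses Python's re module rather than the Java regex engine that Hive runs (this pattern uses only syntax common to both), it assumes the character classes are written with a space inside them ("[^ ]*"), and the names parse_line and COLUMNS and the sample log line are made up for illustration. The pattern is written the way the regex engine sees it, after the HiveQL string escapes (\\ and \") have been resolved.

```python
import re

# The input.regex value after HiveQL string unescaping (\\[ becomes \[, \" becomes ").
# Assumption: the first three groups are "[^ ]*" (anything except a space).
APACHE_LOG = re.compile(
    r'([^ ]*) ([^ ]*) ([^ ]*) (-|\[[^\]]*\]) ([^ "]*|"[^"]*") '
    r'(-|[0-9]*) (-|[0-9]*)(?: ([^ "]*|".*") ([^ "]*|".*"))?'
)

# Column names from the CREATE TABLE statement, in group order.
COLUMNS = ["host", "identity", "user", "time", "request",
           "status", "size", "referer", "agent"]


def parse_line(line):
    """Return a column -> value dict, or None if the whole line does not match."""
    m = APACHE_LOG.fullmatch(line)
    if m is None:
        return None
    # Groups from the optional trailing part come back as None when absent.
    return dict(zip(COLUMNS, m.groups()))


if __name__ == "__main__":
    # Hypothetical sample line in combined log format.
    sample = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
              '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
              '"http://www.example.com/start.html" "Mozilla/4.08"')
    for column, value in parse_line(sample).items():
        print(column, "=", value)
```

Note that the quoted fields (request, referer, agent) keep their surrounding double quotes, because the quotes sit inside the capturing groups. The sketch requires the whole line to match; a line that does not fit the pattern yields None here, and as far as I know Hive gives you NULL columns for such rows rather than an error.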


Regards
Bejoy KS



________________________________
 From: mahsa mofidpoor <mofidp...@gmail.com>
To: user@hive.apache.org 
Sent: Thursday, March 1, 2012 2:25 AM
Subject: Hive and unstructured data
 

Hello

I am curious to know how Hive maps real-world unstructured data (like 
Facebook logs) to its own structures (tables). In other words, if there is a 
concept of building a table over unstructured data in Hive, then how exactly 
is the structure defined?

Thank you in advance for your response.
Mahsa  
