On Fri, Jul 1, 2011 at 2:16 PM, Yichuan (William) Hu <huyich...@gmail.com> wrote:
> Hi,
>
> I am doing some simple tests to create table, load data using Hive. I
> am working on the VM provided by cloudera
> (https://ccp.cloudera.com/display/SUPPORT/Cloudera%27s+Hadoop+Demo+VM).
>
> I have a text file with each line containing an IP address and a name,
> e.g.,
>
> 123.45.67.89 tom
> 123.45.67.92 mark
>
> I create a table using following command:
>
> CREATE TABLE ip_name(
> ip STRING,
> name STRING
> )
> ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
> WITH SERDEPROPERTIES(
> "input.regex" = "^([\d.]+) ([a-z]+)",
> "output.format.string" = "%1$s %2$s"
> )
> STORED AS TEXTFILE;
>
> Then, I use the following command to load data into the table:
>
> LOAD DATA LOCAL INPATH '/home/cloudera/test.txt' OVERWRITE INTO TABLE
> ip_name;
>
> Table was successfully created and file was also loaded, but all are
> NULL (the number of rows in the table is the same as the number of
> rows in the file). What could be the problem?
>
> Thanks a lot!
>
> William
>

You do not need the regex serde for this. Specify the table normally and use
space as the delimiter.

CREATE EXTERNAL TABLE logdata(
xxx STRING,
yyy STRING,
...
z_t)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\040'
STORED AS TEXTFILE;

http://www.mail-archive.com/common-user@hadoop.apache.org/msg11178.html
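
Applied to the ip_name table from the question, the suggestion might look
like the sketch below. It assumes the two fields are always separated by
exactly one space, and it keeps the managed table and LOAD DATA step from the
original post rather than the EXTERNAL table shown in the reply. '\040' is
the octal escape for the space character (ASCII 32).

```sql
-- Sketch only: same columns and file path as in the question.
-- Dropping first is an assumption, so the old RegexSerDe definition is replaced.
DROP TABLE IF EXISTS ip_name;

CREATE TABLE ip_name (
  ip   STRING,
  name STRING
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\040'  -- octal 040 = a single space
STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH '/home/cloudera/test.txt' OVERWRITE INTO TABLE ip_name;

-- If the delimiter matches the file, this should return the address and
-- the name for each line instead of NULLs.
SELECT ip, name FROM ip_name;
```

If a name can itself contain a space, a single-character delimiter would no
longer line up with the two columns, and a different row format would be
needed.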