On Fri, Jul 1, 2011 at 2:16 PM, Yichuan (William) Hu <huyich...@gmail.com> wrote:

> Hi,
>
> I am running some simple tests that create a table and load data using
> Hive. I am working on the VM provided by Cloudera
> (https://ccp.cloudera.com/display/SUPPORT/Cloudera%27s+Hadoop+Demo+VM).
>
> I have a text file with each line containing an IP address and a name,
> e.g.,
>
> 123.45.67.89 tom
> 123.45.67.92 mark
>
> I create a table using the following command:
>
> CREATE TABLE ip_name(
> ip STRING,
> name STRING
> )
> ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
> WITH SERDEPROPERTIES(
> "input.regex" = "^([\d.]+) ([a-z]+)",
> "output.format.string" = "%1$s %2$s"
> )
> STORED AS TEXTFILE;
>
> Then, I use the following command to load data into the table:
>
> LOAD DATA LOCAL INPATH '/home/cloudera/test.txt' OVERWRITE INTO TABLE
> ip_name;
>
> The table was created successfully and the file was loaded, but every
> value is NULL (the number of rows in the table is the same as the number
> of rows in the file). What could be the problem?
>
> Thanks a lot!
>
> William
>

You do not need the RegexSerDe for this. Define the table as an ordinary
delimited text table and use a space as the field delimiter.

CREATE EXTERNAL TABLE logdata(
     xxx STRING,
     yyy STRING,
     ...
     z_t)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\040'
STORED AS TEXTFILE;

http://www.mail-archive.com/common-user@hadoop.apache.org/msg11178.html
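
Applied to the two-column file in the question, the template above might
look like the sketch below. It reuses the table name, columns, and file path
from the original post. It assumes exactly one space separates the two
fields on every line, and the final SELECT is only a suggested check, not
part of the original commands.

```sql
-- Sketch only: table name, columns, and path are taken from the question.
-- Assumes each input line is "<ip><single space><name>".
CREATE TABLE ip_name(
     ip STRING,
     name STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\040'
STORED AS TEXTFILE;

-- Same load statement as in the question.
LOAD DATA LOCAL INPATH '/home/cloudera/test.txt' OVERWRITE INTO TABLE ip_name;

-- Suggested check: inspect a few rows to see whether both columns are populated.
SELECT ip, name FROM ip_name LIMIT 10;
```

'\040' is the octal escape for the space character. If the RegexSerDe is
kept instead, one thing worth checking is whether the backslash in \d needs
to be doubled (\\d) inside the quoted "input.regex" value, as the RegexSerDe
examples in the Hive documentation do; with a single backslash the pattern
may no longer match digits, which would be consistent with every column
coming back NULL.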
