Thank you, Vijay. I was beginning to understand things that way myself, and you made it perfectly clear.
Sincerely,
Mark

On Tue, Sep 27, 2011 at 11:18 PM, Vijay <tec...@gmail.com> wrote:
> There are a couple of problems. First of all, input.regex needs to be
> "(\\w+)". Please note the case.
> The bigger problem though, is that, with this (and most) serdes, you
> can only expect one row per line of input. So multiple words within
> the text cannot generate multiple rows. The best option is to probably
> parse the text file and generate a different file with each word on a
> separate line and then load it into hive.
>
> Hope that helps,
> Vijay
>
> On Tue, Sep 27, 2011 at 6:45 PM, Mark Kerzner <mark.kerz...@shmsoft.com>
> wrote:
> > Hi, Hive experts,
> >
> > Would you see what I am doing wrong? For a simple test of breaking a text
> > into words and putting these words into a table, I am doing this
> >
> > CREATE EXTERNAL TABLE books1
> > (
> > words string
> > )
> > ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
> > WITH SERDEPROPERTIES ("input.regex" = "\\W")
> > STORED AS TextFile;
> >
> > LOAD DATA INPATH '/test-data/ch1/moby-dick.txt' OVERWRITE INTO TABLE
> > books1;
> >
> > This SerDe works in Java code, but in Hive I am getting all nulls in the
> > books1 table.
> >
> > Thank you,
> > Mark
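
For anyone finding this thread later: the preprocessing step Vijay describes (rewrite the text so that each word sits on its own line, then load that file) might look roughly like the sketch below. It is only an illustration, not something posted in the thread. The function names and the choice of Python are assumptions, and Python's \w is Unicode-aware, so on non-ASCII text it may match more characters than the Java regex used by the SerDe.

```python
import re

# Same idea as the "(\\w+)" pattern suggested above: a run of word characters.
# Assumption: in Python 3, \w on str also matches non-ASCII letters and digits.
WORD = re.compile(r"\w+")


def words_per_line(lines):
    """Yield every word found in an iterable of text lines, in order."""
    for line in lines:
        # findall returns all non-overlapping matches in the line;
        # lines without any word characters contribute nothing.
        yield from WORD.findall(line)


def rewrite_file(src_path, dst_path):
    """Write each word of src_path to dst_path, one word per line."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for word in words_per_line(src):
            dst.write(word + "\n")
```

Calling rewrite_file("moby-dick.txt", "moby-dick-words.txt") (both file names are placeholders) would produce a file with one word per line. Once that file is copied to HDFS, every line becomes exactly one row when loaded, which is the one-row-per-line behaviour the SerDe expects.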