Thank you, Vijay.

I was beginning to understand things that way myself, and you made it
perfectly clear.

Sincerely,
Mark

On Tue, Sep 27, 2011 at 11:18 PM, Vijay <tec...@gmail.com> wrote:

> There are a couple of problems. First of all, input.regex needs to be
> "(\\w+)". Note the case: lowercase \w matches word characters, while
> uppercase \W matches non-word characters. The parentheses form the
> capture group that fills the column.
> The bigger problem, though, is that with this (and most) SerDes you
> can expect only one row per line of input, so multiple words within
> a line cannot generate multiple rows. The best option is probably to
> parse the text file, generate a different file with each word on a
> separate line, and then load that file into Hive.
>
> Hope that helps,
> Vijay
>
> On Tue, Sep 27, 2011 at 6:45 PM, Mark Kerzner <mark.kerz...@shmsoft.com>
> wrote:
> > Hi, Hive experts,
> >
> > Could you see what I am doing wrong? As a simple test of breaking a text
> > into words and putting these words into a table, I am doing this:
> >
> > CREATE EXTERNAL TABLE books1
> > (
> >   words string
> > )
> > ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
> > WITH SERDEPROPERTIES ("input.regex" = "\\W")
> > STORED AS TextFile;
> >
> > LOAD DATA INPATH '/test-data/ch1/moby-dick.txt'  OVERWRITE INTO TABLE
> > books1;
> >
> > This SerDe works in Java code, but in Hive I am getting all nulls in the
> > books1 table.
> >
> > Thank you,
> > Mark
> >
>
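
The preprocessing step Vijay suggests could look roughly like the sketch below. It is only an illustration, not code from the thread: the function name write_words and the file names in the usage note are hypothetical, and it assumes the input is UTF-8 text small enough to process line by line on one machine. It writes every \w+ match to its own line, so each word becomes one input row.

```python
import re

# Same pattern as the suggested input.regex: a run of word characters.
WORD = re.compile(r"\w+")


def write_words(src_path, dst_path):
    """Copy each word found in src_path to dst_path, one word per line.

    Returns the number of words written.
    """
    count = 0
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            # findall returns every non-overlapping match in the line.
            for word in WORD.findall(line):
                dst.write(word + "\n")
                count += 1
    return count
```

Calling write_words("moby-dick.txt", "moby-dick-words.txt") (both names are placeholders) would produce a file that can be copied to HDFS and loaded with LOAD DATA. Because every line then holds exactly one word, a plain single-column text table should be enough and the regex SerDe is no longer needed.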
