Hi Jack, could it be that your source file does not use \001 as delimiter but the character sequence "^A"? I get this feeling when looking at your cat printout and also when you say you can do a search and replace. If I do a print out of a file with the same delimiter I get the following
2014-04-097846478510 2014-04-107851178558 These are actually three columns but you cannot see the ^A because it is not a printable character. Check your file in a hex editor or something to verify that you have the correct delimiter. Br, Petter 2014-05-18 6:40 GMT+02:00 Jack Yang <j...@uow.edu.au>: > Hi All, > > I have a local file called mytest.txt (restored in hdfs already). The > content is like this: > > $ cat -A HDFSLOAD_DIR/mytest.txt > > 49139801^A25752451^Aunknown$ > > 49139801^A24751754^Aunknown$ > > 49139801^A2161696^Anice$ > > > > To load this raw data above, I then defined the table like this in HQL: > > > > create table my_test( > > userid BIGINT, > > movieId BIGINT, > > comment STRING > > ) > > ROW FORMAT DELIMITED > > FIELDS TERMINATED BY '\001' > > STORED AS TEXTFILE; > > > > My problem is that when I “SELECT * FROM my_test;” , I got this: > > > > NULL NULL NULL > > NULL NULL NULL > > NULL NULL NULL > > > > I then replace “^A” with “^” in mytest.txt, and also re-defined my table > structure by using : > > FIELDS TERMINATED BY '^’ > > > > So when I select all, I got the correct results. > > > > Any thoughts??? Thanks in advance. > > > > > > Best regards, > > Jack > > > > >