Thanks for your help.

2011/4/28 Loren Siebert <lo...@siebert.org>
> You have the file type as sequence file, but you are trying to load a GZip
> file. Won’t that only work if the table is defined as a text file?
>

I had thought a sequence file was the same thing as a gzip file, and now I
realize it's not. It works when the table is defined as a text file.

>
> Hive isn’t doing anything on your behalf when you do LOAD DATA. It’s
> syntactic sugar for copying a file into a HDFS location. From there, if you
> want a RCFile table or a sequence file table or whatever, you can select
> from the raw_logs table into the new table (e.g., raw_logs_rcfile) that you
> have defined in the different format.
>

So, is this the only way I can put data into a table defined as a sequence
file? Can I generate the RCFile with a Unix command or some other tool?

>
> On Apr 27, 2011, at 9:33 PM, wd wrote:
>
> hi,
>
> I've tried to load gzip files into hive to save disk space, but failed.
>
> hive> load data local inpath 'tmp_b.20110426.gz' into table raw_logs
> partition ( dt=20110426 );
> Copying data from file:/home/wd/t/tmp_b.20110426.gz
> Copying file: file:/home/wd/t/tmp_b.20110426.gz
> Loading data to table default.raw_logs partition (dt=20110426)
> Failed with exception Wrong file format. Please check the file's format.
> FAILED: Execution Error, return code 1 from
> org.apache.hadoop.hive.ql.exec.MoveTask
>
> The raw_logs table is created by:
> create table raw_logs ( ............) partitioned by ( dt int ) STORED AS
> SEQUENCEFILE;
>
> Is there something wrong? The error is same both in hive 0.5 and 0.7.
>
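
For reference, the approach described above (load the gzip file into a text table, then select from it into a table defined in the other format) might look roughly like the sketch below. The table names raw_logs_text and raw_logs_seq and the single column line STRING are placeholders I made up, since the real column list is elided in the original CREATE TABLE. The two SET lines are optional and only ask Hive to compress what it writes.

```sql
-- Hypothetical staging table; "line STRING" stands in for the real columns,
-- which are elided in the original CREATE TABLE statement.
CREATE TABLE raw_logs_text (line STRING)
PARTITIONED BY (dt INT)
STORED AS TEXTFILE;

-- LOAD DATA only copies the file into place; a .gz file can be read
-- from a table that is defined as TEXTFILE.
LOAD DATA LOCAL INPATH 'tmp_b.20110426.gz'
INTO TABLE raw_logs_text PARTITION (dt=20110426);

-- Hypothetical target table in the desired format.
CREATE TABLE raw_logs_seq (line STRING)
PARTITIONED BY (dt INT)
STORED AS SEQUENCEFILE;

-- Optional: have Hive compress the files it writes.
SET hive.exec.compress.output=true;
SET io.seqfile.compression.type=BLOCK;

-- Rewrite the rows through a Hive job, so the output files are
-- written by Hive in the target table's format.
INSERT OVERWRITE TABLE raw_logs_seq PARTITION (dt=20110426)
SELECT line FROM raw_logs_text WHERE dt = 20110426;
```

The same pattern should apply to an RCFile target by changing the STORED AS clause on the second table. Once the insert has finished and been checked, the staging partition can be dropped.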