Your table is declared as a sequence file, but you are trying to load a gzip file into it. Won’t that only work if the table is defined as a text file?
Hive doesn’t convert anything on your behalf when you run LOAD DATA. It’s syntactic sugar for copying a file into an HDFS location, so the file has to already be in the format the table declares. From there, if you want an RCFile table or a sequence file table or whatever, you can select from the raw_logs table (defined as a text file) into a new table (e.g., raw_logs_rcfile) that you have defined in the different format.

On Apr 27, 2011, at 9:33 PM, wd wrote:

> hi,
>
> I've tried to load gzip files into hive to save disk space, but failed.
>
> hive> load data local inpath 'tmp_b.20110426.gz' into table raw_logs
> partition ( dt=20110426 );
> Copying data from file:/home/wd/t/tmp_b.20110426.gz
> Copying file: file:/home/wd/t/tmp_b.20110426.gz
> Loading data to table default.raw_logs partition (dt=20110426)
> Failed with exception Wrong file format. Please check the file's format.
> FAILED: Execution Error, return code 1 from
> org.apache.hadoop.hive.ql.exec.MoveTask
>
> The raw_logs table is created by:
> create table raw_logs ( ............) partitioned by ( dt int ) STORED AS
> SEQUENCEFILE;
>
> Is there something wrong? The error is same both in hive 0.5 and 0.7.
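
For reference, the staging approach described in the reply could look roughly like the sketch below. It is only an illustration under assumptions: the table names raw_logs_text and raw_logs_seq are made up, and the single `line` column is a placeholder for the real column list, which is elided in the original message. The two SET lines are optional and only matter if you want the sequence file output compressed.

```sql
-- Hypothetical staging table stored as plain text. The single `line`
-- column stands in for the real (elided) column list of raw_logs.
CREATE TABLE raw_logs_text (line STRING)
PARTITIONED BY (dt INT)
STORED AS TEXTFILE;

-- LOAD DATA only copies the file into place. A gzip file is fine here
-- because the table is a text table, and Hive decompresses it on read.
LOAD DATA LOCAL INPATH 'tmp_b.20110426.gz'
INTO TABLE raw_logs_text PARTITION (dt=20110426);

-- Hypothetical target table in the format you actually want.
CREATE TABLE raw_logs_seq (line STRING)
PARTITIONED BY (dt INT)
STORED AS SEQUENCEFILE;

-- Optional: write block-compressed sequence files to save disk space.
SET hive.exec.compress.output=true;
SET io.seqfile.compression.type=BLOCK;

-- This INSERT is where the format conversion actually happens.
INSERT OVERWRITE TABLE raw_logs_seq PARTITION (dt=20110426)
SELECT line FROM raw_logs_text WHERE dt=20110426;
```

Once the insert has finished you can drop the text partition if you no longer need it. Keep in mind that a gzipped text file is not splittable, so each .gz file in the staging table is read by a single mapper.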