You have defined the table as STORED AS SEQUENCEFILE, but you are trying to 
load a gzip file. Won’t that only work if the table is defined as a text file? 

Hive doesn’t convert anything on your behalf when you run LOAD DATA. It’s 
syntactic sugar for copying a file into an HDFS location. From there, if you 
want an RCFile table or a sequence file table or whatever, you can select from 
the raw_logs table (defined as a text file) into a new table (e.g., 
raw_logs_rcfile) that you have defined in the other format.
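
As a rough sketch of that two-step approach (not tested against your setup): 
the column list in your CREATE TABLE is elided, so the single "line STRING" 
column below is only a placeholder, and raw_logs here is assumed to be 
re-created as a text file table. Swap RCFILE for SEQUENCEFILE if that is the 
format you want.

```sql
-- Staging table stored as plain text, so a gzip'd text file can be loaded as-is.
-- "line STRING" is a hypothetical placeholder for the real (elided) columns.
CREATE TABLE raw_logs (line STRING)
PARTITIONED BY (dt INT)
STORED AS TEXTFILE;

-- LOAD DATA just copies the .gz file into the partition directory.
LOAD DATA LOCAL INPATH 'tmp_b.20110426.gz'
INTO TABLE raw_logs PARTITION (dt=20110426);

-- Target table in the format you actually want to keep.
CREATE TABLE raw_logs_rcfile (line STRING)
PARTITIONED BY (dt INT)
STORED AS RCFILE;

-- Optional: ask Hive to compress the files written by the query below.
SET hive.exec.compress.output=true;

-- The SELECT reads the gzip'd text and the INSERT rewrites it in the new format.
INSERT OVERWRITE TABLE raw_logs_rcfile PARTITION (dt=20110426)
SELECT line FROM raw_logs WHERE dt=20110426;
```

Once the insert has finished and you have checked the new table, you can drop 
the partition from the text table to reclaim the space.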


On Apr 27, 2011, at 9:33 PM, wd wrote:

> hi,
> 
> I've tried to load gzip files into hive to save disk space, but failed.
> 
> hive> load data local inpath 'tmp_b.20110426.gz' into table raw_logs 
> partition ( dt=20110426 );
> Copying data from file:/home/wd/t/tmp_b.20110426.gz
> Copying file: file:/home/wd/t/tmp_b.20110426.gz
> Loading data to table default.raw_logs partition (dt=20110426)
> Failed with exception Wrong file format. Please check the file's format.
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.MoveTask
> 
> The raw_logs table is created by:
> create table raw_logs ( ............)  partitioned by ( dt int ) STORED AS 
> SEQUENCEFILE;
> 
> Is there something wrong? The error is same both in hive 0.5 and 0.7.
