Hi Stephen,

hive> show create table facts520_normal_text;
OK
CREATE TABLE facts520_normal_text(
  fact_key bigint,
  products_key int,
  retailers_key int,
  suppliers_key int,
  time_key int,
  units int)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
  LINES TERMINATED BY '\n'
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  'hdfs://aana1.ird.com/user/hive/warehouse/facts_520.db/facts520_normal_text'
TBLPROPERTIES (
  'numPartitions'='0',
  'numFiles'='1',
  'transient_lastDdlTime'='1369395430',
  'numRows'='0',
  'totalSize'='545216508',
  'rawDataSize'='0')
Time taken: 0.353 seconds

The syserror log shows this:

java.lang.IllegalArgumentException: Compression codec org.apache.hadoop.io.compress.GZipCodec was not found.
    at org.apache.hadoop.mapred.FileOutputFormat.getOutputCompressorClass(FileOutputFormat.java:85)
    at org.apache.hadoop.hive.ql.exec.Utilities.getFileExtension(Utilities.java:934)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:469)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:543)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:546)
    at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.io.compress.GZipCodec not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1493)
    at org.apache.hadoop.mapred.FileOutputFormat.getOutputCompressorClass(FileOutputFormat.java:82)
    ... 21 more
java.lang.IllegalArgumentException: Compression codec org.apache.hadoop.io.compress.GZipCodec was not found.
    at org.apache.hadoop.mapred.FileOutputFormat.getOutputCompressorClass(FileOutputFormat.java:85)
    at org.apache.hadoop.hive.ql.exec.Utilities.getFileExtension(Utilities.java:934)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:469)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:739)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:558)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:567)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:567)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:567)
    at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:193)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.io.compress.GZipCodec not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1493)
    at org.apache.hadoop.mapred.FileOutputFormat.getOutputCompressorClass(FileOutputFormat.java:82)
    ... 16 more
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: Compression codec org.apache.hadoop.io.compress.GZipCodec was not found.
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:479)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:739)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:558)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:567)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:567)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:567)
    at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:193)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.IllegalArgumentException: Compression codec org.apache.hadoop.io.compress.GZipCodec was not found.
    at org.apache.hadoop.mapred.FileOutputFormat.getOutputCompressorClass(FileOutputFormat.java:85)
    at org.apache.hadoop.hive.ql.exec.Utilities.getFileExtension(Utilities.java:934)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:469)
    ... 14 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.io.compress.GZipCodec not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1493)
    at org.apache.hadoop.mapred.FileOutputFormat.getOutputCompressorClass(FileOutputFormat.java:82)
    ... 
16 more

It says that GZipCodec is not found. Aren't the Snappy, GZip and BZip codecs available on Hadoop by default?

Thank you,
Sachin

On Wed, Jun 5, 2013 at 11:58 PM, Stephen Sprague <sprag...@gmail.com> wrote:

> well... the hiveException has the word "metadata" in it. maybe that's a
> hint or a red herring. :) Let's try the following:
>
> 1. show create table facts520_normal_text;
>
> 2. anything useful at this URL?
> http://aana1.ird.com:50030/taskdetails.jsp?jobid=job_201306051948_0010&tipid=task_201306051948_0010_m_000002
> or is it just the same stack dump?
>
> On Wed, Jun 5, 2013 at 3:17 AM, Sachin Sudarshana <sachin.had...@gmail.com> wrote:
>
>> Hi,
>>
>> I have hive 0.10 + (CDH 4.2.1 patches) installed on my cluster.
>>
>> I have a table facts520_normal_text stored as a textfile. I'm trying to
>> create a compressed table from this table using GZip codec.
>>
>> hive> SET hive.exec.compress.output=true;
>> hive> SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GZipCodec;
>> hive> SET mapred.output.compression.type=BLOCK;
>>
>> hive>
>>     > Create table facts520_gzip_text
>>     > (fact_key BIGINT,
>>     > products_key INT,
>>     > retailers_key INT,
>>     > suppliers_key INT,
>>     > time_key INT,
>>     > units INT)
>>     > ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
>>     > LINES TERMINATED BY '\n'
>>     > STORED AS TEXTFILE;
>>
>> hive> INSERT OVERWRITE TABLE facts520_gzip_text SELECT * from facts520_normal_text;
>>
>> When I run the above queries, the MR job fails.
>>
>> The error that the Hive CLI itself shows is the following:
>>
>> Total MapReduce jobs = 3
>> Launching Job 1 out of 3
>> Number of reduce tasks is set to 0 since there's no reduce operator
>> Starting Job = job_201306051948_0010, Tracking URL = http://aana1.ird.com:50030/jobdetails.jsp?jobid=job_201306051948_0010
>> Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_201306051948_0010
>> Hadoop job information for Stage-1: number of mappers: 3; number of reducers: 0
>> 2013-06-05 21:09:42,281 Stage-1 map = 0%, reduce = 0%
>> 2013-06-05 21:10:11,446 Stage-1 map = 100%, reduce = 100%
>> Ended Job = job_201306051948_0010 with errors
>> Error during job, obtaining debugging information...
>> Job Tracking URL: http://aana1.ird.com:50030/jobdetails.jsp?jobid=job_201306051948_0010
>> Examining task ID: task_201306051948_0010_m_000004 (and more) from job job_201306051948_0010
>> Examining task ID: task_201306051948_0010_m_000001 (and more) from job job_201306051948_0010
>>
>> Task with the most failures(4):
>> -----
>> Task ID:
>>   task_201306051948_0010_m_000002
>>
>> URL:
>>   http://aana1.ird.com:50030/taskdetails.jsp?jobid=job_201306051948_0010&tipid=task_201306051948_0010_m_000002
>> -----
>> Diagnostic Messages for this Task:
>> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"fact_key":7549094,"products_key":205,"retailers_key":304,"suppliers_key":402,"time_key":103,"units":23}
>>     at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
>>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:418)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
>>     at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at javax.security.auth.Subject.doAs(Subject.java:415)
>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>>     at org.apache.hadoop.mapred.Child.main(Child.java:262)
>> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"fact_key":7549094,"products_key":205,"retailers_key":304,"suppliers_key":402,"time_key":103,"units":23}
>>     at org.apach
>>
>> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
>> MapReduce Jobs Launched:
>> Job 0: Map: 3 HDFS Read: 0 HDFS Write: 0 FAIL
>> Total MapReduce CPU Time Spent: 0 msec
>>
>> I'm unable to figure out why this is happening. It looks like the data
>> cannot be copied properly.
>> Or is it that the GZip codec is not supported on textfiles?
>>
>> Any help with this issue is greatly appreciated!
>>
>> Thank you,
>> Sachin
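
On the codec question at the top of this thread: the gzip codec class that ships with Hadoop is named org.apache.hadoop.io.compress.GzipCodec, with a lowercase "z". Java class names are case-sensitive, so the GZipCodec spelling used in the SET statement would be enough to produce the ClassNotFoundException in the log. Below is a minimal sketch of the same session with only that spelling changed. It assumes the casing is the only problem and has not been run against this cluster.

```sql
SET hive.exec.compress.output=true;
-- Assumption: the only change needed is "GzipCodec" (lowercase z) in place of "GZipCodec".
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
SET mapred.output.compression.type=BLOCK;

-- Same insert as in the original message; the target table definition is unchanged.
INSERT OVERWRITE TABLE facts520_gzip_text SELECT * FROM facts520_normal_text;
```

If the job still fails after this change, the task log from the same URL would show whether a different class or a missing native library is involved.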