Hi, I am running a Terasort on a cluster of 8 nodes. The map phase completes, but when the reduce phase reaches around 68-70% I get the following error.

12/08/10 11:02:36 INFO mapred.JobClient: Task Id : attempt_201208101018_0001_r_000027_0, Status : FAILED
java.lang.RuntimeException: problem advancing post rec#38320220
        at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1214)
        at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.moveToNext(ReduceTask.java:249)
        at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:245)
        at org.apache.hadoop.mapred.lib.IdentityReducer.reduce(IdentityReducer.java:40)
        at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:519)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:416)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.fs.ChecksumException: Checksum Error
        at org.apache.hadoop.mapred.IFileInputStream.doRead(IFileInputStream.java:164)
        at org.apache.hadoop.mapred.IFileInputStream.read(IFileInputStream.java:101)
        at org.apache.hadoop.mapred.IFile$Reader.readData(IFile.java:328)
        at org.apache.hadoop.mapred.IFile$Reader.rejigData(IFile.java:358)
        at org.apache.hadoop.mapred.IFile$Reader.readNextBlock(IFile.java:342)
        at org.apache.hadoop.mapred.IFile$Reader.next(IFile.java:374)
        at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:220)
        at org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:330)
        at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:350)
        at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$RawKVIteratorReader.next(ReduceTask.java:2531)
        at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:220)
        at org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:330)
        at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:350)
        at org.apache.hadoop.mapred.Task$ValuesIterator.readNextKey(Task.java:1253)
        at org.apache.hadoop.mapred.Task$ValuesIterator.next(Task.java:1212)
        ... 10 more

I came across someone facing the same issue in the mail archives
<http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201001.mbox/%3c1c802db51001280427j5b8e57dai4a8d0fdd038...@mail.gmail.com%3E>,
and he seemed to resolve it by listing the hostnames in the /etc/hosts file. All my nodes already have the correct hostname information in /etc/hosts, but these reducers still throw the error.

Any help regarding this issue is appreciated. Thanks

--
--With Regards
Pavan Kulkarni