Hi,

I had the same error a few days back.

The difficulty now is finding which gz file is corrupt. The file may not
actually be corrupt, but Hadoop treats it as such. If you created the file
on Windows and then transferred it to Hadoop, that can cause this error. To
find the bad file, run a SELECT COUNT query and watch the JobTracker for
the error; it shows the name of the gz file currently being processed, so
when the task fails you can identify and remove that file. Then re-gzip it
on a Linux machine and upload it again, and it should work.
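As a quicker alternative to watching the JobTracker, you could test each
file's gzip stream before Hadoop touches it (e.g. pipe `hadoop fs -cat`
through `gzip -t`). A minimal local sketch, with hypothetical file names,
showing how `gzip -t` flags the kind of truncation that produces this
EOFException:

```shell
# Simulate a truncated gzip, as an interrupted or text-mode Windows
# transfer can produce, and check both files with gzip -t.
printf 'line1\nline2\n' > sample.txt
gzip -c sample.txt > sample.txt.gz       # intact archive
head -c 20 sample.txt.gz > broken.gz     # cut off the gzip trailer
for f in sample.txt.gz broken.gz; do
  if gzip -t "$f" 2>/dev/null; then
    echo "OK:      $f"
  else
    echo "CORRUPT: $f"
  fi
done
```

On HDFS, the same check would be `hadoop fs -cat "$f" | gzip -t` for each
file, printing the names that fail.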

Thanks,

-----------
Sent from Mobile, short and crisp.
On 29-Aug-2012 12:23 AM, "Kiwon Lee" <kiwoni....@gmail.com> wrote:

> Hi
>
> I have a lot of compressed gzip files on hdfs.
> An exception has occurred at TaskTracker, during processing of MR.
> If any file is invalid, may I know that?
>
>
> 2012-08-28 09:17:56,320 INFO ExecMapper: ExecMapper: processed 0 rows:
> used memory = 125190136
> 2012-08-28 09:17:56,324 INFO org.apache.hadoop.mapred.TaskLogsTruncater:
> Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
>  2012-08-28 09:17:56,326 ERROR
> org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
> as:ubuntu (auth:SIMPLE) cause:java.io.IOException: java.io.EOFException:
> Unexpected end of input stream
> 2012-08-28 09:17:56,326 WARN org.apache.hadoop.mapred.Child: Error running
> child
> java.io.IOException: java.io.EOFException: Unexpected end of input stream
>         at
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>         at
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>         at
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:275)
>         at
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
>         at
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
>         at
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:108)
>         at
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:210)
>         at
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:195)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:393)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
>         at org.apache.hadoop.mapred.Child.main(Child.java:264)
> Caused by: java.io.EOFException: Unexpected end of input stream
>         at
> org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:143)
>         at
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:83)
>         at java.io.InputStream.read(InputStream.java:82)
>         at
> org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:209)
>         at org.apache.hadoop.util.LineReader.readLine(LineReader.java:173)
>         at
> org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:160)
>         at
> org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:38)
>         at
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:273)
>         ... 13 more
>
>
> --
>
> *Best Regards.** Ethan (Kiwon Lee)*
>    kiwoni....@gmail.com
>
>
>
