it with
Exception on the entire job.
I like SPARK-6593, since it can cover also additional cases, not just in
case of corrupted zip files.

From: Dale Richardson
To: "dev@spark.apache.org"
Date: 29/03/2015 11:48 PM
Subject: One corrupt gzip in a directory of 100s

Recently had an incident reported to me where somebody was analysing a
directory of gzipped log files, and was struggling to load them into Spark
because one of the files was corrupted - calling
sc.textFile('hdfs:///logs/*.gz') caused an IOException on the particular
executor that was reading