Have you seen this thread?
http://stackoverflow.com/questions/24402737/how-to-read-gz-files-in-spark-using-wholetextfiles
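
In case it helps, here is one possible workaround. It is a sketch of my own, not something taken from that thread. As far as I know, wholeTextFiles() picks the decompression codec from the file extension, so files without ".gz" are read as plain text. Reading them as raw bytes with binaryFiles() and decompressing them yourself avoids that. The helper names below are hypothetical, and `sc` is assumed to be an existing SparkContext.

```python
import gzip


def gunzip_to_text(raw: bytes, encoding: str = "utf-8") -> str:
    """Decompress gzip-compressed bytes and decode them as text."""
    return gzip.decompress(raw).decode(encoding)


def whole_gzip_files(sc, path, encoding="utf-8"):
    """Hypothetical helper: similar to wholeTextFiles(), but it does not
    rely on the file extension to detect gzip compression.

    sc   -- an existing SparkContext (assumed to be created elsewhere)
    path -- a file, directory, or glob, for example an S3 prefix
    """
    # binaryFiles() yields one (file_path, raw_bytes) pair per file.
    return sc.binaryFiles(path).mapValues(
        lambda raw: gunzip_to_text(raw, encoding)
    )
```

You would call it as whole_gzip_files(sc, "<your S3 path>") and get (path, content) pairs back. Like wholeTextFiles(), this loads each file fully into memory, so it only suits files that fit comfortably in an executor.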
On Tue, Feb 16, 2016 at 2:17 AM, Deepak Gopalakrishnan wrote:
> Hello,
>
> I'm reading S3 files using wholeTextFiles(). My files are in gzip format but
> the names of the fil
Hello,
I'm reading S3 files using wholeTextFiles(). My files are in gzip format, but
the names of the files do not end with ".gz", and I cannot force them to.
Is there a way to specify the InputFormat as Gzip when using
wholeTextFiles()?
--
Regards,
*Deepak Go