Hello,

There seems to be very little documentation on the use of newAPIHadoopFile, and even less on using it to open LZO-compressed files. I've hit a wall with some unexpected behavior that I don't know how to interpret.
This is a test program I'm running in an effort to get this working, after finding previous threads on the subject. The job runs on a YARN cluster, and the input is the path of a decidedly non-empty LZO file sitting in HDFS, which I can manually decompress and read as a text file, with a count of roughly 3 million lines. What I don't know how to interpret is that the code runs without complaints and prints 0. I would appreciate some guidance on where to go from here; there are no error messages to point me anywhere, just an empty RDD.

Thanks,
Kevin

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Empty-RDD-after-LzoTextInputFormat-in-newAPIHadoopFile-tp10873.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
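For reference, the call I'm making is along these lines (a minimal sketch, not the exact program; the app name and input path are placeholders, and it assumes the hadoop-lzo jar, which provides com.hadoop.mapreduce.LzoTextInputFormat, is on the executor classpath):

```scala
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.spark.{SparkConf, SparkContext}
import com.hadoop.mapreduce.LzoTextInputFormat

object LzoCountTest {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("LzoCountTest"))

    // Path to the LZO-compressed text file in HDFS (placeholder).
    val input = args(0)

    // LzoTextInputFormat is a mapreduce (new-API) InputFormat,
    // so it goes through newAPIHadoopFile rather than hadoopFile.
    val rdd = sc.newAPIHadoopFile(
      input,
      classOf[LzoTextInputFormat],
      classOf[LongWritable],
      classOf[Text])

    // This prints 0 for me, even though the file is non-empty.
    println(rdd.count())
    sc.stop()
  }
}
```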