Re: Reading sequencefile

2014-03-11 Thread Jaonary Rabarisoa
Thank you. I forgot the classOf[*] arguments.

On Tue, Mar 11, 2014 at 10:46 AM, Shixiong Zhu wrote:
> Hi Jaonary,
>
> You can use "sc.sequenceFile" to load your file. E.g.,
>
> scala> import org.apache.hadoop.io._
> import org.apache.hadoop.io._
>
> scala> val rdd = sc.sequenceFile("path_to_fil

Re: Reading sequencefile

2014-03-11 Thread Shixiong Zhu
Hi Jaonary,

You can use "sc.sequenceFile" to load your file. E.g.,

scala> import org.apache.hadoop.io._
import org.apache.hadoop.io._

scala> val rdd = sc.sequenceFile("path_to_file", classOf[Text], classOf[BytesWritable])
rdd: org.apache.spark.rdd.RDD[(org.apache.hadoop.io.Text, org.apache.hadoo
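
For readers who want the call above outside the REPL, here is a minimal sketch rather than the poster's own code. It assumes Spark and Hadoop are on the classpath; the object name `SequenceFileImages` and the helpers `toPlain` and `load` are hypothetical names introduced for illustration. The copy step is there because Hadoop record readers reuse Writable instances and because `BytesWritable.getBytes` returns a backing array that may be longer than the valid data.

```scala
import org.apache.hadoop.io.{BytesWritable, Text}
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Hypothetical helper object, not part of Spark or Hadoop.
object SequenceFileImages {

  // Copy one (Text, BytesWritable) record into plain JVM types.
  // Only the first getLength bytes of the backing array are valid data.
  def toPlain(key: Text, value: BytesWritable): (String, Array[Byte]) =
    (key.toString, java.util.Arrays.copyOf(value.getBytes, value.getLength))

  // Load a sequence file of (file name, file contents) pairs.
  // Passing the key and value classes explicitly mirrors the reply above.
  def load(sc: SparkContext, path: String): RDD[(String, Array[Byte])] =
    sc.sequenceFile(path, classOf[Text], classOf[BytesWritable])
      .map { case (k, v) => toPlain(k, v) }
}
```

With an existing SparkContext `sc`, `SequenceFileImages.load(sc, "path_to_file")` would give an RDD of file names paired with their raw bytes, which can then be cached or collected safely.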

Reading sequencefile

2014-03-11 Thread Jaonary Rabarisoa
Hi all, I'm trying to read a sequenceFile that represents a set of JPEG images generated using this tool: http://stuartsierra.com/2008/04/24/a-million-little-files . According to the documentation: "Each key is the name of a file (a Hadoop “Text”), the value is the binary contents of the file (a B