Hi
Basically, you need to convert it to a serializable format before doing the
collect. Hadoop Writable types such as Text implement Writable but not
java.io.Serializable, so the file reads fine lazily, but the job fails as
soon as collect() tries to ship the raw Writables back to the driver.
You can fire up a spark shell and paste this:
val sFile = sc.sequenceFile[LongWritable, Text]("/home/akhld/sequence/sigmoid")
  .map(_._2.toString)
sFile.take(5).foreach(println)
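Roughly the same thing in Java (a sketch, assuming the file was written with
LongWritable keys and Text values, and that ctx is your JavaSparkContext):

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;

// Read with the Writable types the file was written with...
JavaPairRDD<LongWritable, Text> pairs =
    ctx.sequenceFile("/home/akhld/sequence/sigmoid", LongWritable.class, Text.class);

// ...then convert the values to plain (serializable) Strings
// before anything is sent back to the driver.
JavaRDD<String> values = pairs.map(t -> t._2().toString());
values.take(5).forEach(System.out::println);

The point is that String is serializable while Text is not, so the conversion
has to happen before take() or collect().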
I am not doing anything special.
Here is the code:
SparkConf sparkConf = new SparkConf().setAppName("JavaSequenceFile");
JavaSparkContext ctx = new JavaSparkContext(sparkConf);
JavaPairRDD<String, Byte> seqFiles = ctx.sequenceFile(args[0], String.class, Byte.class);
// The following statement is giving the exception
If you can share the complete code and a sample file, maybe I can try to
reproduce it on my end.
Thanks
Best Regards
On Wed, May 20, 2015 at 7:00 AM, Tapan Sharma wrote:
The problem is still there. The exception does not come at the time of
reading, and the count of the JavaPairRDD is as expected. It is when we call
the collect() or toArray() methods that the exception comes. It is something
to do with the Text class, even though I haven't used it in the program.
Regards
Tapan
Thanks. I will try and let you know. But what exactly is the issue? Any
pointers?
Regards
Tapan
On Tue, May 19, 2015 at 6:26 PM, Akhil Das wrote:
Try something like:
JavaPairRDD<IntWritable, Text> output = sc.newAPIHadoopFile(inputDir,
    org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat.class,
    IntWritable.class, Text.class, new Job().getConfiguration());

with whichever input format and key/value types your file requires.
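If you then want to collect the records to the driver, one way (a sketch,
using the IntWritable/Text types assumed above) is to map them to plain Java
types first:

import java.util.List;
import scala.Tuple2;

// Convert each (IntWritable, Text) pair to a serializable
// (Integer, String) pair before collect(), so Spark never has to
// serialize the Writables themselves.
List<Tuple2<Integer, String>> rows = output
    .mapToPair(t -> new Tuple2<>(t._1().get(), t._2().toString()))
    .collect();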
Thanks
Best Regards