Hi
Basically, you need to convert it to a serializable format before doing the
collect. Hadoop Writable types such as Text implement Writable but not
java.io.Serializable, so the file reads fine lazily, but the job fails as
soon as collect() tries to ship the raw Writables back to the driver.
You can fire up a spark shell and paste this:
val sFile = sc.sequenceFile[LongWritable, Text]("/home/akhld/sequence/sigmoid")
  .map(_._2.toString)
sFile.take(5).foreach(println)
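Roughly the same thing in Java (a sketch, assuming the file was written with
LongWritable keys and Text values, and that ctx is your JavaSparkContext):

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;

// Read with the Writable types the file was written with...
JavaPairRDD<LongWritable, Text> pairs =
    ctx.sequenceFile("/home/akhld/sequence/sigmoid", LongWritable.class, Text.class);

// ...then convert the values to plain (serializable) Strings
// before anything is sent back to the driver.
JavaRDD<String> values = pairs.map(t -> t._2().toString());
values.take(5).forEach(System.out::println);

The point is that String is serializable while Text is not, so the conversion
has to happen before take() or collect().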
I am not doing anything special.
Here is the code:
SparkConf sparkConf = new SparkConf().setAppName("JavaSequenceFile");
JavaSparkContext ctx = new JavaSparkContext(sparkConf);
JavaPairRDD<String, Byte> seqFiles = ctx.sequenceFile(args[0], String.class, Byte.class);
// The following statement is giving the exception
If you can share the complete code and a sample file, maybe I can try to
reproduce it on my end.
Thanks
Best Regards
On Wed, May 20, 2015 at 7:00 AM, Tapan Sharma wrote:
The problem is still there. The exception does not come at the time of
reading, and the count of the JavaPairRDD is as expected. It is when we call
the collect() or toArray() methods that the exception comes. It is something
to do with the Text class, even though I haven't used it in the program.
Regards
Tapan
Thanks. I will try and let you know. But what exactly is the issue? Any
pointers?
Regards
Tapan
On Tue, May 19, 2015 at 6:26 PM, Akhil Das wrote:
Try something like:
JavaPairRDD<IntWritable, Text> output = sc.newAPIHadoopFile(inputDir,
    org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat.class,
    IntWritable.class, Text.class, new Job().getConfiguration());

with whichever input format and key/value types your file requires.
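If you then want to collect the records to the driver, one way (a sketch,
using the IntWritable/Text types assumed above) is to map them to plain Java
types first:

import java.util.List;
import scala.Tuple2;

// Convert each (IntWritable, Text) pair to a serializable
// (Integer, String) pair before collect(), so Spark never has to
// serialize the Writables themselves.
List<Tuple2<Integer, String>> rows = output
    .mapToPair(t -> new Tuple2<>(t._1().get(), t._2().toString()))
    .collect();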
Thanks
Best Regards