Going by the exception, it looks like there is a mismatch between the sequence file's actual value type and the one you provide in your code. Change BytesWritable to *LongWritable* and see if the execution succeeds.
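For illustration, a rough sketch of what that could look like (this assumes hdfsPath points at a single sequence file and that its value class really is LongWritable, as the exception suggests; the SequenceFile.Reader part is only there as a way to confirm the actual key/value classes before changing the RDD declaration):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.spark.api.java.JavaPairRDD;

    // Optional sanity check: ask the sequence file itself which types it holds.
    // Assumes hdfsPath refers to a single sequence file, not a directory/glob.
    Configuration conf = new Configuration();
    try (SequenceFile.Reader reader = new SequenceFile.Reader(conf,
            SequenceFile.Reader.file(new Path(hdfsPath)))) {
        System.out.println("key class   = " + reader.getKeyClassName());
        System.out.println("value class = " + reader.getValueClassName());
    }

    // Then declare the RDD with the types the file actually contains, e.g.:
    JavaPairRDD<IntWritable, LongWritable> hdfsContent =
            sparkContext.sequenceFile(hdfsPath, IntWritable.class, LongWritable.class);

Whatever the reader prints is what you should pass as the key and value classes to sequenceFile().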
-Umesh

On Wed, Oct 7, 2015 at 2:41 PM, Vinoth Sankar <vinoth9...@gmail.com> wrote:
> I'm just reading data from HDFS through Spark. It throws
> *java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be
> cast to org.apache.hadoop.io.BytesWritable* at line no 6. I never used
> LongWritable in my code, no idea how the data was in that format.
>
> Note : I'm not using MapReduce Concepts and also I'm not creating Jobs
> explicitly. So I can't use job.setMapOutputKeyClass and
> job.setMapOutputValueClass.
>
> JavaPairRDD<IntWritable, BytesWritable> hdfsContent =
>     sparkContext.sequenceFile(hdfsPath, IntWritable.class, BytesWritable.class);
> JavaRDD<FileData> lines = hdfsContent.map(
>     new Function<Tuple2<IntWritable, BytesWritable>, FileData>()
> {
>     public FileData call(Tuple2<IntWritable, BytesWritable> tuple2)
>         throws InvalidProtocolBufferException
>     {
>         byte[] bytes = tuple2._2().getBytes();
>         return FileData.parseFrom(bytes);
>     }
> });