Going by the exception, it looks like there is a mismatch between the sequence file's actual value type and the one you provide in your code. Change BytesWritable to *LongWritable* and see if the execution succeeds.
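For illustration, a rough sketch of what that could look like (this assumes hdfsPath points at a single sequence file and that its value class really is LongWritable, as the exception suggests; the SequenceFile.Reader part is only there as a way to confirm the actual key/value classes before changing the RDD declaration):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.spark.api.java.JavaPairRDD;

    // Optional sanity check: ask the sequence file itself which types it holds.
    // Assumes hdfsPath refers to a single sequence file, not a directory/glob.
    Configuration conf = new Configuration();
    try (SequenceFile.Reader reader = new SequenceFile.Reader(conf,
            SequenceFile.Reader.file(new Path(hdfsPath)))) {
        System.out.println("key class   = " + reader.getKeyClassName());
        System.out.println("value class = " + reader.getValueClassName());
    }

    // Then declare the RDD with the types the file actually contains, e.g.:
    JavaPairRDD<IntWritable, LongWritable> hdfsContent =
            sparkContext.sequenceFile(hdfsPath, IntWritable.class, LongWritable.class);

Whatever the reader prints is what you should pass as the key and value classes to sequenceFile().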
-Umesh

On Wed, Oct 7, 2015 at 2:41 PM, Vinoth Sankar <vinoth9...@gmail.com> wrote:
> I'm just reading data from HDFS through Spark. It throws
> *java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be
> cast to org.apache.hadoop.io.BytesWritable* at line no 6. I never used
> LongWritable in my code, no idea how the data was in that format.
>
> Note : I'm not using MapReduce Concepts and also I'm not creating Jobs
> explicitly. So I can't use job.setMapOutputKeyClass and
> job.setMapOutputValueClass.
>
> JavaPairRDD<IntWritable, BytesWritable> hdfsContent =
>     sparkContext.sequenceFile(hdfsPath, IntWritable.class, BytesWritable.class);
> JavaRDD<FileData> lines = hdfsContent.map(
>     new Function<Tuple2<IntWritable, BytesWritable>, FileData>()
> {
>     public FileData call(Tuple2<IntWritable, BytesWritable> tuple2)
>         throws InvalidProtocolBufferException
>     {
>         byte[] bytes = tuple2._2().getBytes();
>         return FileData.parseFrom(bytes);
>     }
> });