Yes, that's what the objectFile javadoc says as well: it expects a SequenceFile with NullWritable keys and BytesWritable values containing the serialized objects. Your reading of the code looks correct to me.
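
FWIW, the easy way to read it back is SparkContext.objectFile rather than pointing an ObjectInputStream at the raw file. Here's a rough sketch of both the supported call and the equivalent manual read; MyRecord and the output path are hypothetical placeholders:

    import java.io.{ByteArrayInputStream, ObjectInputStream}
    import org.apache.hadoop.io.{BytesWritable, NullWritable}

    // The supported route: Spark undoes the SequenceFile wrapping itself.
    val records = sc.objectFile[MyRecord]("/path/to/output")

    // Sketch of the equivalent manual read. Each BytesWritable value holds
    // a serialized Array[MyRecord] of up to 10 elements, because
    // saveAsObjectFile batches records in groups of 10 before serializing.
    val manual = sc
      .sequenceFile("/path/to/output", classOf[NullWritable], classOf[BytesWritable])
      .flatMap { case (_, bytes) =>
        val in = new ObjectInputStream(new ByteArrayInputStream(bytes.copyBytes()))
        in.readObject().asInstanceOf[Array[MyRecord]]
      }

So a plain ObjectInputStream over the file won't work: you have to strip the SequenceFile framing first, and even then each value deserializes to an array of elements, not a single element.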
On Tue, Jan 13, 2015 at 8:39 AM, Kevin Burton <bur...@spinn3r.com> wrote:

> This is interesting.
>
> I’m using ObjectInputStream to try to read a file written as
> saveAsObjectFile… but it’s not working.
>
> The documentation says:
>
> "Write the elements of the dataset in a simple format using Java
> serialization, which can then be loaded using SparkContext.objectFile().”
>
> … but that’s not right.
>
> def saveAsObjectFile(path: String) {
>   this.mapPartitions(iter => iter.grouped(10).map(_.toArray))
>     .map(x => (NullWritable.get(), new BytesWritable(Utils.serialize(x))))
>     .saveAsSequenceFile(path)
> }
>
> .. am I correct to assume that each entry is a serialized object BUT that
> the entire thing is wrapped as a sequence file?
>
> --
>
> Founder/CEO Spinn3r.com
> Location: San Francisco, CA
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile
> <https://plus.google.com/102718274791889610666/posts>
> <http://spinn3r.com>