Hi.
I am not sure, but you might be able to pull it off by writing your own
custom serialization class and using saveAsObjectFile() on the RDD.
Here is an article on how to create a custom serializer to support Kryo
serialization to and from disk.
http://blog.madhukaraphatak.com/kryo-disk-ser
As I wrote before, the result of my pipeline is binary objects, which I
want to write directly as raw bytes, without serializing them again.
Is it possible?
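One way this could be sketched in PySpark (not something proposed in the thread; the helper name, output directory, and the assumption that each record is already a bytes object are all illustrative):

```python
# Hypothetical sketch: write each partition's records straight to disk as
# raw bytes, bypassing any serialization format. OUTPUT_DIR and
# write_partition are illustrative names, not Spark API.
import os

OUTPUT_DIR = "/tmp/raw-output"  # assumed output location

def write_partition(index, records):
    """Append every record's raw bytes to one file per partition."""
    os.makedirs(OUTPUT_DIR, exist_ok=True)
    path = os.path.join(OUTPUT_DIR, "part-%05d.bin" % index)
    with open(path, "wb") as f:
        for rec in records:
            f.write(rec)  # rec is assumed to already be a bytes object
    return [path]

# With a SparkContext `sc`, this could be wired up as:
#   paths = rdd.mapPartitionsWithIndex(write_partition).collect()
# Each executor writes to its own local filesystem, so on a real cluster
# OUTPUT_DIR would have to point at shared storage instead.
```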
On Sat, Jul 25, 2015 at 11:28 AM, Akhil Das wrote:
It's been added since Spark 1.1.0, I guess:
https://issues.apache.org/jira/browse/SPARK-1161
Thanks
Best Regards
On Sat, Jul 25, 2015 at 12:06 AM, Oren Shpigel wrote:
Sorry, I didn't mention I'm using the Python API, which doesn't have the
saveAsObjectFile method.
Is there any alternative from Python?
Also, I want to write the raw bytes of my objects into files on disk,
rather than using some serialization format to be read back into Spark.
Is it possible?
Any
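As an aside on the Python-alternative question: the closest PySpark analogue is RDD.saveAsPickleFile, but it pickles each record, which is exactly the extra serialization layer being avoided here. A local illustration of that overhead (the payload value is made up for the example):

```python
# saveAsPickleFile wraps every record in pickle framing, so the bytes
# written to disk are no longer identical to the original object.
import pickle

payload = b"\x00\x01binary-object"  # illustrative binary record
pickled = pickle.dumps(payload)

# The pickled form differs from the raw payload, though it round-trips.
assert pickled != payload
assert pickle.loads(pickled) == payload

# With a SparkContext `sc`, the call itself would look like:
#   rdd.saveAsPickleFile("hdfs:///some/output/dir")
```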
You can look into .saveAsObjectFile
Thanks
Best Regards
On Thu, Jul 23, 2015 at 8:44 PM, Oren Shpigel wrote:
> Hi,
> I use Spark to read binary files using SparkContext.binaryFiles(), and then
> do some calculations, processing, and manipulations to get new objects
> (also binary).
> The nex