You could also try saveAsHadoopFile with a custom output format that
writes the raw bytes, for example:
https://github.com/amutu/tdw/blob/master/qe/contrib/src/java/org/apache/hadoop/hive/contrib/fileformat/protobuf/mapred/ProtobufOutputFormat.java
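
A minimal sketch of that idea, in case it helps. This is not the
ProtobufOutputFormat linked above; RawBytesOutputFormat is a hypothetical
class name, and the 4-byte length prefix in front of each record is my own
assumption (protobuf messages are not self-delimiting, so a reader needs
some way to find record boundaries). It uses the old mapred API, which is
what saveAsHadoopFile expects.

```scala
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.io.{BytesWritable, NullWritable}
import org.apache.hadoop.mapred.{FileOutputFormat, JobConf, RecordWriter, Reporter}
import org.apache.hadoop.util.Progressable
import org.apache.spark.rdd.RDD

// Hypothetical output format: writes each value as raw bytes, with no
// text conversion or escaping. The key is ignored.
class RawBytesOutputFormat extends FileOutputFormat[NullWritable, BytesWritable] {
  override def getRecordWriter(
      ignored: FileSystem,
      job: JobConf,
      name: String,
      progress: Progressable): RecordWriter[NullWritable, BytesWritable] = {
    // Resolve the per-task output file (e.g. part-00000) and open it.
    val path = FileOutputFormat.getTaskOutputPath(job, name)
    val out = path.getFileSystem(job).create(path, progress)

    new RecordWriter[NullWritable, BytesWritable] {
      override def write(key: NullWritable, value: BytesWritable): Unit = {
        // Assumption: big-endian 4-byte length prefix, then the payload.
        out.writeInt(value.getLength)
        out.write(value.getBytes, 0, value.getLength)
      }
      override def close(reporter: Reporter): Unit = out.close()
    }
  }
}

// Helper named after the function asked for in the question.
def saveAsBinaryFile(rdd: RDD[Array[Byte]], path: String): Unit =
  rdd
    .map(bytes => (NullWritable.get(), new BytesWritable(bytes)))
    .saveAsHadoopFile[RawBytesOutputFormat](path)
```

With this, saveAsBinaryFile(rdd, "/some/output/dir") should produce one
part file per partition. If each record should be written exactly as-is
with no framing, drop the writeInt call.
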
On Thu, 16 Jan 2020 at 09:34, Duan,Bing wrote:
> Hi all:
>
> I read binary data (protobuf format) from the filesystem into an
> RDD[Array[Byte]] with the binaryFiles function, and that works fine. But
> when I save it back to the filesystem with saveAsTextFile, the quotation
> marks are escaped like this:
> "\"20192_1\"",1,24,0,2,"\"S66.000x001\"", which should
> be "20192_1",1,24,0,2,"S66.000x001".
>
> Could anyone give me a tip on implementing a function
> like saveAsBinaryFile to persist the RDD[Array[Byte]]?
>
> Best,
>
> Bing
>