Hello, I have a DataFrame with numerical data in Spark 3.1.1 (Java) that should be converted to a binary file. My idea is to create a UDF that generates a byte array from the numerical values, apply this function to each row of the DataFrame, and then get a new column with row-wise binary byte data. Once that is done, I would like to write this column as a continuous byte stream to a file stored in an S3 bucket.
So my questions are: is the UDF approach a good idea, and is it possible to write this continuous byte stream directly to S3 / is there any built-in functionality for that? What is a good strategy to do this? Thanks for your help.
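For context, a minimal sketch of the row-to-bytes packing logic I have in mind, assuming each row holds `double` values (the class name `RowPacker` and the little-endian layout are my own illustrative choices, not anything Spark prescribes). In Spark this method body would sit inside a UDF registered with return type `DataTypes.BinaryType`:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class RowPacker {

    // Pack an array of doubles into a single little-endian byte array.
    // In Spark, this logic would live inside a UDF1<WrappedArray<Double>, byte[]>
    // (or similar) registered with DataTypes.BinaryType as the return type.
    static byte[] pack(double[] values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Double.BYTES)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        for (double v : values) {
            buf.putDouble(v);
        }
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] bytes = pack(new double[]{1.0, 2.5});
        // Two doubles at 8 bytes each -> 16 bytes total.
        System.out.println(bytes.length);
    }
}
```

For the write step, as far as I know Spark has no built-in sink that concatenates a binary column into one raw file (the `binaryFile` data source is read-only), so one option I am considering is iterating over the byte-array column on the driver and streaming it through a single `OutputStream` opened on an `s3a://` path via the Hadoop `FileSystem` API.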