Hello,

I have a DataFrame with numerical data in Spark 3.1.1 (Java) that should be
converted to a binary file.
My idea is to create a UDF that generates a byte array from the numerical
values, so I can apply this function to each row of the DataFrame and get a
new column with row-wise binary byte data.
Once that is done, I would like to write this column as a continuous byte
stream to a file stored in an S3 bucket.
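To make the first step concrete, here is a minimal sketch of the per-row
conversion the UDF would perform, in plain Java (class and method names are
my own invention; the Spark wiring is omitted). It packs a row's doubles
into a fixed-width byte array with java.nio.ByteBuffer:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class RowPacker {

    // Pack a row's numeric values into a fixed-width byte array:
    // 8 bytes per double, little-endian here -- use whichever byte
    // order the target binary format expects.
    public static byte[] pack(double[] values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Double.BYTES)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        for (double v : values) {
            buf.putDouble(v);
        }
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] bytes = pack(new double[]{1.0, 2.5});
        System.out.println(bytes.length); // 16
    }
}
```

In Spark this logic would presumably be wrapped in a UDF returning
DataTypes.BinaryType and applied with withColumn, but I am not sure that is
the best approach, hence the question below.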

So my questions are: is the UDF approach a good idea, and is it possible to
write this continuous byte stream directly to S3? Is there any built-in
functionality for this?
What would be a good strategy?

Thanks for your help
---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org