Hello, I have a DataFrame with numerical data in Spark 3.1.1 (Java) that should be converted to a binary file. My idea is to create a UDF that generates a byte array from the numerical values, apply this function to each row of the DataFrame, and then get a new column with row-wise binary byte data. Once that is done, I would like to write this column as a continuous byte stream to a file stored in an S3 bucket.
So my questions are: is the UDF approach a good idea, and is it possible to write this continuous byte stream directly to S3 / is there any built-in functionality for that? What is a good strategy to do this? Thanks for your help.
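For context, a minimal sketch of the row-to-bytes packing logic I have in mind, assuming each row holds `double` values (the class name `RowPacker` and the little-endian layout are my own illustrative choices, not anything Spark prescribes). In Spark this method body would sit inside a UDF registered with return type `DataTypes.BinaryType`:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class RowPacker {

    // Pack an array of doubles into a single little-endian byte array.
    // In Spark, this logic would live inside a UDF1<WrappedArray<Double>, byte[]>
    // (or similar) registered with DataTypes.BinaryType as the return type.
    static byte[] pack(double[] values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Double.BYTES)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        for (double v : values) {
            buf.putDouble(v);
        }
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] bytes = pack(new double[]{1.0, 2.5});
        // Two doubles at 8 bytes each -> 16 bytes total.
        System.out.println(bytes.length);
    }
}
```

For the write step, as far as I know Spark has no built-in sink that concatenates a binary column into one raw file (the `binaryFile` data source is read-only), so one option I am considering is iterating over the byte-array column on the driver and streaming it through a single `OutputStream` opened on an `s3a://` path via the Hadoop `FileSystem` API.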