That's for strings, but it still doesn't address what is desired w.r.t. writing a binary column.
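FWIW, here is a rough Java sketch of the UDF idea described below, just to make it concrete. Everything specific in it is made up for illustration: the input path, the output path, and the assumption of two double columns "x" and "y". To get one continuous byte stream it collects the rows to the driver and writes them through the Hadoop FileSystem API against an s3a:// path, which only works if the data fits in driver memory; as far as I know Spark has no built-in writer for a raw byte stream, so something like this (or a custom output format) is needed for the final step.

import java.nio.ByteBuffer;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.api.java.UDF2;
import org.apache.spark.sql.types.DataTypes;

import static org.apache.spark.sql.functions.callUDF;
import static org.apache.spark.sql.functions.col;

public class BinaryExport {
  public static void main(String[] args) throws Exception {
    SparkSession spark = SparkSession.builder().appName("binary-export").getOrCreate();

    // Hypothetical input: a frame with two numeric columns "x" and "y" (placeholder path).
    Dataset<Row> df = spark.read().parquet("s3a://my-bucket/input/");

    // UDF that packs the numeric values of one row into a byte array
    // (two big-endian doubles here; nulls are not handled in this sketch).
    spark.udf().register("toBytes",
        (UDF2<Double, Double, byte[]>) (x, y) -> {
          ByteBuffer buf = ByteBuffer.allocate(2 * Double.BYTES);
          buf.putDouble(x);
          buf.putDouble(y);
          return buf.array();
        },
        DataTypes.BinaryType);

    Dataset<Row> withBytes = df.withColumn("bytes", callUDF("toBytes", col("x"), col("y")));

    // A single continuous stream needs a single writer, so the binary column is
    // collected to the driver (only viable for modest data sizes) and written via
    // the Hadoop FileSystem API, which handles s3a:// when the S3A connector is
    // on the classpath and credentials are configured.
    List<Row> rows = withBytes.select("bytes").collectAsList();
    Path out = new Path("s3a://my-bucket/output/data.bin");  // placeholder path
    Configuration conf = spark.sparkContext().hadoopConfiguration();
    try (FSDataOutputStream os = FileSystem.get(out.toUri(), conf).create(out, true)) {
      for (Row r : rows) {
        os.write((byte[]) r.get(0));
      }
    }
    spark.stop();
  }
}

If the data is too large to collect, the byte-packing UDF still works, but the single-stream write would have to be replaced by something like writing per-partition objects and concatenating them afterwards (e.g. with S3 multipart upload), since the ordering and the single output file are otherwise hard to guarantee.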
On Fri, Apr 8, 2022 at 10:31 AM Bjørn Jørgensen <bjornjorgen...@gmail.com> wrote:

> In the new Spark 3.3 there will be an SQL function
> https://github.com/apache/spark/commit/25dd4254fed71923731fd59838875c0dd1ff665a
> Hope this can help you.
>
> On Fri, Apr 8, 2022, 17:14 Philipp Kraus <philipp.kraus.flashp...@gmail.com> wrote:
>
>> Hello,
>>
>> I have got a data frame with numerical data in Spark 3.1.1 (Java) which
>> should be converted to a binary file.
>> My idea is to create a UDF that generates a byte array based on the
>> numerical values, so I can apply this function to each row of the data
>> frame and then get a new column with row-wise binary byte data.
>> Once this is done, I would like to write this column as a continuous
>> byte stream to a file stored in an S3 bucket.
>>
>> So my questions are: is the UDF idea a good one, and is it possible to
>> write this continuous byte stream directly to S3 / is there any built-in
>> functionality for that?
>> What is a good strategy to do this?
>>
>> Thanks for your help
>> ---------------------------------------------------------------------
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org