binaryFile write

2022-04-09 Thread Philipp Kraus
Hello, I’m using Spark 3.1.1 and cannot yet update to a newer version. I have a data frame with a single column of DataTypes.BinaryType, where each row contains a byte array of generated binary data. I am now trying to write this data to a single file with mydataframe.coalesce(1).write

Fwd: Spark Write BinaryType Column as continues file to S3

2022-04-08 Thread Philipp Kraus
Hello, > On 08.04.2022 at 17:34, Lalwani, Jayesh wrote: > > What format are you writing the file to? Are you planning on your own custom format, or are you planning to use standard formats like parquet? I’m dealing with geo-spatial data (Apache Sedona), so I have got a data frame with such

Fwd: Spark Write BinaryType Column as continues file to S3

2022-04-08 Thread Philipp Kraus
<https://en.wikipedia.org/wiki/LAS_file_format> Thanks a lot > > On Fri, Apr 8, 2022 at 10:14 AM Philipp Kraus > <mailto:philipp.kraus.flashp...@gmail.com>> wrote: > Hello, > > I have got a data frame with numerical data in Spark 3.1.1 (Java) which > should b

Spark Write BinaryType Column as continues file to S3

2022-04-08 Thread Philipp Kraus
Hello, I have got a data frame with numerical data in Spark 3.1.1 (Java) which should be converted to a binary file. My idea is to create a UDF that generates a byte array from the numerical values, so I can apply this function to each row of the data frame and then get a new
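
The per-row idea in this message (turn the numerical values of one row into a byte array, then join all rows into one continuous file) can be sketched outside Spark. This is a minimal illustration in plain Python, not the poster's code and not Spark API: the names row_to_bytes and rows_to_blob and the record layout (three little-endian 64-bit floats per row) are assumptions made up for the example, and the layout is not the LAS format.

```python
import struct

# Hypothetical record layout: three little-endian 64-bit floats per row.
RECORD = struct.Struct("<3d")


def row_to_bytes(x: float, y: float, z: float) -> bytes:
    """Encode one row of numerical values as a fixed-size byte array."""
    return RECORD.pack(x, y, z)


def rows_to_blob(rows) -> bytes:
    """Concatenate the per-row byte arrays, in order, into one continuous blob."""
    return b"".join(row_to_bytes(*row) for row in rows)


# Two example rows standing in for the rows of the data frame.
rows = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)]
blob = rows_to_blob(rows)
```

Because every record has the same size here, the blob can be split back into rows by slicing at multiples of RECORD.size. In a distributed setting, row order across partitions would also have to be fixed before concatenating.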

Spark Executor dies in K8 cluster

2021-05-19 Thread Philipp Kraus
Hello, I have got the following first testing setup: Kubernetes Cluster 1.20 (4 nodes, each node with 120 GB hard disk, 4 CPUs, 40 GB memory) Spark installation by Bitnami Helm Charts https://artifacthub.io/packages/helm/bitnami/spark (Chart Version 5.4.2 / Spark 3.1.1) using GeoSpark versi