[CONNECT] Why Can't We Specify Cluster Deploy Mode for Spark Connect?

2024-09-08 Thread Nagatomi Yasukazu
Hi All, why is it not possible to specify "cluster" as the deploy mode for Spark Connect? As discussed in the following thread, there appears to be an "arbitrary decision" within spark-submit that "Cluster mode is not applicable" to Spark Connect. GitHub Issue Comment: https://github.com/kube
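
For context, a Spark Connect client does not go through spark-submit's deploy-mode machinery at all: it attaches over gRPC to a Connect server that is already running somewhere. A minimal sketch of the client side (the sc:// endpoint below is illustrative; substitute your server's host and port):

    # Minimal sketch: a PySpark client attaching to a running Spark Connect server.
    # SparkSession.builder.remote() is the Connect client entry point (Spark 3.4+).
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.remote("sc://localhost:15002").getOrCreate()

    # This executes on the remote Connect server, not in the client process.
    spark.range(5).show()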

Spark 3.2.1 vs Spark 3.5.2

2024-09-08 Thread Stephen Coy
Hi everyone, we are migrating our ETL tasks from Spark 3.2.1 (Java 11) to Spark 3.5.2 (Java 17). One of these applications, which works fine on 3.2, completely kills our cluster on 3.5.2. The clusters consist of five 256 GB workers and a 256 GB master. The task is run with "--executor-memory 200G".
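
One detail worth checking in a setup like this: the container request per executor is the executor memory plus overhead. Under Spark's default overhead rule for YARN/Kubernetes (max of 384 MiB and 10% of executor memory, per spark.executor.memoryOverheadFactor), 200G of executor memory requests roughly 220 GiB per container, leaving little headroom on a 256 GB worker. A hedged sketch of the arithmetic (the helper name is illustrative, not a Spark API):

    # Sketch: approximate per-executor container request under Spark's
    # default overhead rule: max(384 MiB, 0.10 * executor memory).
    def executor_container_mib(executor_memory_mib: int,
                               overhead_factor: float = 0.10) -> int:
        overhead_mib = max(384, int(executor_memory_mib * overhead_factor))
        return executor_memory_mib + overhead_mib

    request = executor_container_mib(200 * 1024)  # --executor-memory 200G
    print(f"{request} MiB ~= {request / 1024:.0f} GiB")  # ~220 GiB per executor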