That's fine. The other points that I mentioned still apply.
On Thu, Apr 11, 2019 at 4:52 PM V0lleyBallJunki3 wrote:
> I am not using pyspark. The job is written in Scala
You will probably need to do a couple of things. First, you will probably need
to increase the "spark.sql.broadcastTimeout" setting. As well, when you
broadcast a variable it gets replicated once per executor, not once per
machine, so you will need to increase your executor size and allow more cores
per executor.
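A minimal sketch of those two settings at session creation (both config keys
are standard Spark settings, but the values are illustrative assumptions sized
against the numbers in this thread, not tuned recommendations):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("LargeBroadcastJob")
      .config("spark.sql.broadcastTimeout", "3600") // seconds; default is 300
      .config("spark.executor.memory", "200g")      // must exceed the broadcast's in-memory size
      .config("spark.executor.cores", "16")         // fewer, larger executors per machine
      .getOrCreate()

Note that executor memory cannot change after the executors launch, so these
would more commonly be passed as --conf flags to spark-submit.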
I am using spark.sparkContext.broadcast() to broadcast. Is it really true that
a 70 GB variable can't be broadcast when our machines have 244 GB of memory
each, even with a fast network?
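For reference, a minimal self-contained sketch of that call (the object and
value names here are hypothetical stand-ins, not from the actual job; the
stand-in value is tiny rather than 70 GB). The key behavior it illustrates is
that every executor holds a full deserialized copy, read via bc.value:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.broadcast.Broadcast

    object BroadcastSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("BroadcastSketch").getOrCreate()
        // Tiny stand-in for the large driver-side value discussed in the thread.
        val lookup: Map[String, Long] = Map("a" -> 1L, "b" -> 2L)
        val bc: Broadcast[Map[String, Long]] = spark.sparkContext.broadcast(lookup)
        val keys = spark.sparkContext.parallelize(Seq("a", "b", "c"))
        // Each task reads the executor-local copy through bc.value.
        keys.map(k => k -> bc.value.getOrElse(k, -1L)).collect().foreach(println)
        spark.stop()
      }
    }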
--
From: V0lleyBallJunki3
Sent: 10 April 2019 10:06
To: user@spark.apache.org
Subject: Unable to broadcast a very large variable
Hello,
I have a 110-node cluster where each executor has 50 GB of memory and each
machine has 244 GB of memory, and I want to broadcast a 70 GB variable. I am
having difficulty doing that. I was wondering at what size it becomes unwise
to broadcast a variable. Is there a general rule of thumb?