Thank you for the replies. That makes sense for Scala/Java, but in Python the JVM is only launched when the SparkContext is initialised, so it should be possible to set the driver memory at that point, I assume.
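For what it's worth, this is roughly the kind of thing I had in mind from a plain Python script. It is completely untested and rests on an assumption on my side: that the PYSPARK_SUBMIT_ARGS environment variable which bin/pyspark itself uses is also honoured when the gateway launches the driver JVM (on some versions the value may also need to end with "pyspark-shell"). Examples of the spark-defaults.conf / --driver-memory routes are sketched at the end of this message.

    import os
    from pyspark import SparkConf, SparkContext

    # Assumption: the gateway reads PYSPARK_SUBMIT_ARGS when it launches the
    # driver JVM, so it has to be set before the first SparkContext is created.
    os.environ["PYSPARK_SUBMIT_ARGS"] = "--driver-memory 1G"

    conf = (SparkConf()
            .setMaster("yarn-client")
            .setAppName("test")
            .set("spark.executor.memory", "1G"))
    sc = SparkContext(conf=conf)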
On 1 Oct 2014 18:25, "Andrew Or-2 [via Apache Spark User List]" <ml-node+s1001560n15510...@n3.nabble.com> wrote:

> Hi Tamas,
>
> Yes, Marcelo is right. The reason why it doesn't make sense to set
> "spark.driver.memory" in your SparkConf is because your application code,
> by definition, *is* the driver. This means that by the time you get to the
> code that initializes your SparkConf, your driver JVM has already started
> with some heap size, and you can't easily change the size of the JVM once
> it has started. Note that this is true regardless of the deploy mode
> (client or cluster).
>
> Alternatives to set this include the following: (1) you can set
> "spark.driver.memory" in your `spark-defaults.conf` on the node that
> submits the application, or (2) you can use the --driver-memory command
> line option if you are using Spark submit (bin/pyspark goes through this
> path, as you have discovered on your own).
>
> Does that make sense?
>
>
> 2014-10-01 10:17 GMT-07:00 Tamas Jambor <[hidden email]>:
>
>> When you say "respective backend code to launch it", I thought this was
>> the way to do that.
>>
>> thanks,
>> Tamas
>>
>> On Wed, Oct 1, 2014 at 6:13 PM, Marcelo Vanzin <[hidden email]> wrote:
>> > Because that's not how you launch apps in cluster mode; you have to do
>> > it through the command line, or by calling the respective backend code
>> > directly to launch it.
>> >
>> > (That being said, it would be nice to have a programmatic way of
>> > launching apps that handled all this - this has been brought up in a
>> > few different contexts, but I don't think there's an "official"
>> > solution yet.)
>> >
>> > On Wed, Oct 1, 2014 at 9:59 AM, Tamas Jambor <[hidden email]> wrote:
>> >> thanks Marcelo.
>> >>
>> >> What's the reason it is not possible in cluster mode, either?
>> >>
>> >> On Wed, Oct 1, 2014 at 5:42 PM, Marcelo Vanzin <[hidden email]> wrote:
>> >>> You can't set the driver memory programmatically in client mode. In
>> >>> that mode, the same JVM is already running the driver, so you can't
>> >>> modify its command line options anymore when initializing the
>> >>> SparkContext.
>> >>>
>> >>> (And you can't really start cluster mode apps that way, so the only
>> >>> way to set this is through the command line / config files.)
>> >>>
>> >>> On Wed, Oct 1, 2014 at 9:26 AM, jamborta <[hidden email]> wrote:
>> >>>> Hi all,
>> >>>>
>> >>>> I cannot figure out why this command is not setting the driver
>> >>>> memory (it is setting the executor memory):
>> >>>>
>> >>>> conf = (SparkConf()
>> >>>>         .setMaster("yarn-client")
>> >>>>         .setAppName("test")
>> >>>>         .set("spark.driver.memory", "1G")
>> >>>>         .set("spark.executor.memory", "1G")
>> >>>>         .set("spark.executor.instances", 2)
>> >>>>         .set("spark.executor.cores", 4))
>> >>>> sc = SparkContext(conf=conf)
>> >>>>
>> >>>> whereas if I run the PySpark shell:
>> >>>>
>> >>>> ./bin/pyspark --driver-memory 1G
>> >>>>
>> >>>> it sets it correctly. Seemingly they both generate the same commands
>> >>>> in the logs.
>> >>>>
>> >>>> thanks a lot,
>> >>>
>> >>> --
>> >>> Marcelo
>> >
>> > --
>> > Marcelo
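For reference, the two non-programmatic routes Andrew describes would look roughly like this (the config file lives on the machine that submits the application, and the application file name is just a placeholder):

    # conf/spark-defaults.conf on the submitting machine
    spark.driver.memory    1g

    # or on the command line; bin/pyspark accepts the same option
    ./bin/spark-submit --master yarn-client --driver-memory 1g my_app.py
    ./bin/pyspark --driver-memory 1g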