Because you have a large broadcast variable, it has to be loaded into the
Python worker for every task if the worker is not reused.
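For reference, a rough sketch of how that setting can be changed from a
standalone PySpark script (assuming the SparkContext is built from a
SparkConf; the key name is the one discussed below in this thread):

    from pyspark import SparkConf, SparkContext

    # Disable Python worker reuse so each task gets a fresh worker
    # (and therefore reloads any broadcast variables it uses).
    conf = SparkConf().set("spark.python.worker.reuse", "false")
    sc = SparkContext(conf=conf)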

We would really appreciate it if you could provide a short script to
reproduce the freeze, so that we can investigate the root cause and fix
it. Also, please file a JIRA for it, thanks!
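Something along the lines of the rough skeleton below would already help
(the broadcast size, partition count, and loop length are only placeholder
guesses, not taken from your job -- adjust them to whatever triggers the
freeze for you):

    from pyspark import SparkConf, SparkContext
    import numpy as np

    conf = SparkConf().set("spark.python.worker.reuse", "true")
    sc = SparkContext(conf=conf)

    # Placeholder ~50 MB broadcast standing in for the Theano function.
    big = sc.broadcast(np.zeros(50 * 1024 * 1024 // 8))
    rdd = sc.parallelize(range(1000), 20)

    for i in range(200):
        # Each task reads the broadcast, mimicking the gradient step.
        res = rdd.map(lambda x: float(big.value[x % big.value.size]) + x) \
                 .collect()
        print(i, len(res))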

On Wed, Jan 21, 2015 at 4:56 PM, Tassilo Klein <tjkl...@gmail.com> wrote:
> I set spark.python.worker.reuse = false and now it seems to run longer than
> before (it has not crashed yet). However, it is very, very slow. How should I
> proceed?
>
> On Wed, Jan 21, 2015 at 2:21 AM, Davies Liu <dav...@databricks.com> wrote:
>>
>> Could you try disabling the new worker-reuse feature by setting:
>> spark.python.worker.reuse = false
>>
>> On Tue, Jan 20, 2015 at 11:12 PM, Tassilo Klein <tjkl...@bwh.harvard.edu>
>> wrote:
>> > Hi,
>> >
>> > It's a rather long script that runs some deep learning training, so it
>> > is hard to condense into a short example.
>> >
>> > Essentially I have a loop in which a gradient is computed on each node
>> > and collected (this is where it freezes at some point).
>> >
>> >   grads = zipped_trainData.map(distributed_gradient_computation).collect()
>> >
>> >
>> > The distributed_gradient_computation function mainly wraps a Theano-derived
>> > function. The Theano function itself is a broadcast variable.
>> >
>> > Let me know if you need more information.
>> >
>> > Best,
>> >  Tassilo
>> >
>> > On Wed, Jan 21, 2015 at 1:17 AM, Davies Liu <dav...@databricks.com>
>> > wrote:
>> >>
>> >> Could you provide a short script to reproduce this issue?
>> >>
>> >> On Tue, Jan 20, 2015 at 9:00 PM, TJ Klein <tjkl...@gmail.com> wrote:
>> >> > Hi,
>> >> >
>> >> > I recently tried to migrate from Spark 1.1 to Spark 1.2 using PySpark.
>> >> > Initially I was delighted to see that Spark 1.2 is much faster than
>> >> > Spark 1.1. However, the joy faded quickly when I noticed that my jobs
>> >> > no longer terminate successfully. With Spark 1.1 everything still works
>> >> > perfectly fine, though.
>> >> > Specifically, execution just freezes without any error output at one
>> >> > point, when calling a combined map() and collect() statement (after it
>> >> > has been called many times successfully in a loop).
>> >> >
>> >> > Any clue? Or do I have to wait for the next version?
>> >> >
>> >> > Best,
>> >> >  Tassilo
>> >> >
>> >> >
>> >> >
>> >
>> >
>>
>>
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
