I set spark.python.worker.reuse = false, and it now runs longer than before (it has not crashed yet). However, it is very, very slow. How should I proceed?
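For reference, a minimal sketch of one way to apply that setting when building the SparkContext (the app name is illustrative, not from this thread):

    from pyspark import SparkConf, SparkContext

    conf = (SparkConf()
            .setAppName("gradient-training")  # illustrative name
            # Spark 1.2 reuses Python worker processes by default.
            # Disabling reuse forks a fresh worker per task, which
            # avoids the hang but adds per-task startup overhead.
            .set("spark.python.worker.reuse", "false"))
    sc = SparkContext(conf=conf)

The same flag can also be passed on the command line via spark-submit --conf spark.python.worker.reuse=false, which avoids touching the script.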
On Wed, Jan 21, 2015 at 2:21 AM, Davies Liu <dav...@databricks.com> wrote:
> Could you try disabling the new worker-reuse feature by setting:
>
>     spark.python.worker.reuse = false
>
> On Tue, Jan 20, 2015 at 11:12 PM, Tassilo Klein <tjkl...@bwh.harvard.edu> wrote:
> > Hi,
> >
> > It's a fairly long script that runs some deep-learning training, so it
> > is hard to condense into a short example.
> >
> > Essentially, I have a loop in which a gradient is computed on each node
> > and collected (this is where it freezes at some point):
> >
> >     grads = zipped_trainData.map(distributed_gradient_computation).collect()
> >
> > distributed_gradient_computation mainly calls a Theano-derived
> > function. The Theano function itself is a broadcast variable.
> >
> > Let me know if you need more information.
> >
> > Best,
> > Tassilo
> >
> > On Wed, Jan 21, 2015 at 1:17 AM, Davies Liu <dav...@databricks.com> wrote:
> >>
> >> Could you provide a short script to reproduce this issue?
> >>
> >> On Tue, Jan 20, 2015 at 9:00 PM, TJ Klein <tjkl...@gmail.com> wrote:
> >> > Hi,
> >> >
> >> > I just recently tried to migrate from Spark 1.1 to Spark 1.2, using
> >> > PySpark. Initially I was delighted, noticing that Spark 1.2 is much
> >> > faster than Spark 1.1. The joy faded quickly, though, when I noticed
> >> > that my jobs no longer terminate successfully; on Spark 1.1 they
> >> > still work perfectly fine.
> >> > Specifically, execution freezes without any error output at the
> >> > point where a combined map() and collect() statement is called
> >> > (after it has been called many times successfully in a loop).
> >> >
> >> > Any clue? Or do I have to wait for the next version?
> >> >
> >> > Best,
> >> > Tassilo
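Since Davies asked for a short reproduction, here is a minimal sketch of the loop pattern described above, with a toy gradient standing in for the compiled Theano function (all names and values are illustrative, not the poster's actual code):

    from pyspark import SparkContext

    sc = SparkContext(appName="grad-loop-sketch")  # illustrative name

    # Toy stand-in for the broadcast variable; the real code
    # broadcasts the compiled Theano function instead.
    weights = sc.broadcast([0.5] * 10)

    def distributed_gradient_computation(x):
        # Toy gradient; the Theano-derived function goes here.
        return sum(w * x for w in weights.value)

    data = sc.parallelize(range(1000), 8)

    # Repeated map()/collect() calls in a loop -- the pattern that
    # eventually freezes on Spark 1.2 when worker reuse is enabled.
    for step in range(100):
        grads = data.map(distributed_gradient_computation).collect()

    sc.stop()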