Seems like it is a bug rather than a feature.
I filed a bug report: https://issues.apache.org/jira/browse/SPARK-5363
What do you suggest? Should I send you the script so you can run it
yourself?
Yes, my broadcast variables are fairly large (1.7 MBytes).
On Wed, Jan 21, 2015 at 8:20 PM, Davies Liu wrote:
> Because you have a large broadcast variable, it needs to be loaded into
> the Python worker for each task if the worker is not reused.
Because you have a large broadcast variable, it needs to be loaded into
the Python worker for each task if the worker is not reused.
We would really appreciate it if you could provide a short script to
reproduce the freeze; then we can investigate the root cause and fix
it. Also, please file a JIRA for it. Thanks.
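For context, a minimal sketch of the pattern under discussion, assuming a
broadcast lookup table of roughly the size mentioned in this thread (all
names here are illustrative, not from the actual script):

    from pyspark import SparkContext

    sc = SparkContext(appName="broadcast-reuse-demo")

    # Roughly 1-2 MB of pickled lookup data, comparable to the 1.7 MB
    # broadcast discussed in this thread
    big_table = {i: i * i for i in range(100000)}
    bc = sc.broadcast(big_table)

    def lookup(x):
        # bc.value is deserialized inside the Python worker; with
        # spark.python.worker.reuse = false that cost is paid once per task
        return bc.value.get(x, 0)

    print(sc.parallelize(range(1000), 10).map(lookup).collect()[:5])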
I set spark.python.worker.reuse = false, and now it seems to run longer
than before (it has not crashed yet). However, it is very, very slow. How
should I proceed?
On Wed, Jan 21, 2015 at 2:21 AM, Davies Liu wrote:
> Could you try disabling the new reused-worker feature by setting:
> spark.python.worker.reuse = false
We have not run into this issue ourselves, so we are not sure whether
there are bugs related to the reused worker or not.
Could you provide more details about it?
On Wed, Jan 21, 2015 at 2:27 AM, critikaled wrote:
> I'm also facing the same issue.
> Is this a bug?
I'm also facing the same issue.
Is this a bug?
Could you try disabling the new reused-worker feature by setting:

spark.python.worker.reuse = false
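For anyone following along, a sketch of how this setting can be applied,
either in code via SparkConf or on the command line (the script name is a
placeholder):

    from pyspark import SparkConf, SparkContext

    # Disable Python worker reuse (it defaults to true in Spark 1.2)
    conf = SparkConf().set("spark.python.worker.reuse", "false")
    sc = SparkContext(conf=conf)

or:

    spark-submit --conf spark.python.worker.reuse=false your_script.py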
On Tue, Jan 20, 2015 at 11:12 PM, Tassilo Klein wrote:
> Hi,
>
> It's a rather long script that runs some deep-learning training, so it
> is a bit hard to wrap up easily.
>
> Essentially I have a loop in which a gradient is computed on each node
> and collected (this is where it freezes at some point).
Hi,
It's a rather long script that runs some deep-learning training, so it is
a bit hard to wrap up easily.
Essentially I have a loop in which a gradient is computed on each node and
collected (this is where it freezes at some point):
grads = zipped_trainData.map(distributed_gr
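To make the shape of that loop concrete, here is a rough, runnable
reconstruction; the real function name after "distributed_gr" is cut off
in the archive, so distributed_grad, update, and the toy data below are
all placeholders:

    import numpy as np
    from pyspark import SparkContext

    sc = SparkContext(appName="grad-loop-sketch")

    # Toy stand-ins for the real gradient and update functions
    def distributed_grad(batch, w):
        x, y = batch
        return (np.dot(w, x) - y) * x  # gradient of a squared-error loss

    def update(w, grads, lr=0.01):
        return w - lr * sum(grads) / len(grads)

    data = [(np.random.rand(10), 1.0) for _ in range(100)]
    zipped_trainData = sc.parallelize(data, 4)

    w = np.zeros(10)
    for epoch in range(5):
        bc_w = sc.broadcast(w)  # re-broadcast current weights each epoch
        grads = zipped_trainData.map(
            lambda b: distributed_grad(b, bc_w.value)).collect()
        # collect() is the step where the freeze reportedly occurs
        w = update(w, grads)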
Could you provide a short script to reproduce this issue?
On Tue, Jan 20, 2015 at 9:00 PM, TJ Klein wrote:
> Hi,
>
> I just recently tried to migrate from Spark 1.1 to Spark 1.2 - using
> PySpark. Initially, I was super glad, noticing that Spark 1.2 is way faster
> than Spark 1.1. However, the in