Hi Spark devs,
I'm seeing a stacktrace where the classloader that reads from the REPL is
hung, blocking all progress on that executor. Below is that hung
thread's stacktrace, and also the stacktrace of another hung thread.
I thought maybe there was an issue with the REPL's JVM on the other s...
There's a related discussion at
https://issues.apache.org/jira/browse/SPARK-2815
------------------ Original Message ------------------
From: "Chester Chen";
Date: Thu, Aug 21, 2014, 7:42
To: "dev";
Subject: Re: is Branch-1.1 SBT build broken for yarn-alpha ?
Just tried on master branch, and the master branch works fine for yarn-alpha
On Wed, Aug 20, 2014 at 4:39 PM, Chester Chen wrote:
> I just updated today's build and tried branch-1.1 for both yarn and
> yarn-alpha.
>
> For yarn build, this command seems to work fine.
>
> sbt/sbt -Pyarn -Dhadoop.version=2.3.0-cdh5.0.1 projects
I just updated today's build and tried branch-1.1 for both yarn and
yarn-alpha.
For yarn build, this command seems to work fine.
sbt/sbt -Pyarn -Dhadoop.version=2.3.0-cdh5.0.1 projects
For yarn-alpha:
sbt/sbt -Pyarn-alpha -Dhadoop.version=2.0.5-alpha projects
I got the following: ...
Any ideas?
Chester
Yeah, that's the one we discussed... sorry, I pointed to a different one that
I was reading...
On Wed, Aug 20, 2014 at 3:28 PM, DB Tsai wrote:
> To be specific, I was discussing this PR with Debasish which reduces
> lots of issues when sending big objects to executors without using
> broadcast explicitly.
To be specific, I was discussing this PR with Debasish which reduces
lots of issues when sending big objects to executors without using
broadcast explicitly.
Broadcast RDD object once per TaskSet (instead of sending it for every task)
https://issues.apache.org/jira/browse/SPARK-2521
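For reference, here is a rough sketch of the explicit-broadcast workaround
being discussed, as you might type it into spark-shell (the lookup table and
its size are made up purely for illustration):

// A large read-only lookup table we want available on every executor.
val bigTable: Map[Int, Double] = (1 to 1000000).map(i => i -> i.toDouble).toMap

// Without an explicit broadcast, bigTable is serialized into every task's
// closure; sc.broadcast ships it to each executor only once.
val bigTableBc = sc.broadcast(bigTable)

// Tasks read from the broadcast copy instead of a per-task serialized copy.
val scored = sc.parallelize(1 to 100).map(k => bigTableBc.value.getOrElse(k, 0.0))
println(scored.sum())

SPARK-2521 aims to do the equivalent automatically for the RDD object itself,
once per TaskSet instead of once per task.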
Sincerely,
D
Hi Patrick,
Over the last few days I came across some bugs that got exposed by ALS runs on
large-scale data... although they were not related to the akka changes,
during the debugging I came across some akka-related changes that might have
an impact on overall performance... one example is the following:
While jobs are hung, I see the following in mesos master logs:
> >
> > I0820 19:28:02.651296 24666 master.cpp:2282] Sending 7 offers to
> framework 20140820-170154-1315739402-5050-24660-0020
> > I0820 19:28:02.654502 24668 master.cpp:1578] Processing reply for
> offers: [ 20140820-
Hey Deb,
Can you be specific about which changes you are mentioning? We have not, to my
knowledge, made major architectural changes around akka use.
I think in general we don't want people to be using Spark's actor system
directly - it is an internal communication component in Spark and could
e.g. be re...
...trivial (e.g. parallelize 1 to 1 and sum). Killing one of
the jobs typically allows the others to start proceeding.
While jobs are hung, I see the following in mesos master logs:
I0820 19:28:02.651296 24666 master.cpp:2282] Sending 7 offers to framework
20140820-170154-1315739402-5050-24660-0020
Hi,
There have been some recent changes in the way akka is used in Spark, and I
feel they are major changes...
Is there a design document / JIRA / experiment on large datasets that
highlights the impact of the changes (1.0 vs 1.1)? Basically it will be great
to understand where akka is used in the code...
Hi Debasish,
The fix is to raise spark.yarn.executor.memoryOverhead until this goes
away. This controls the buffer between the JVM heap size and the amount of
memory requested from YARN (JVMs can take up memory beyond their heap
size). You should also make sure that, in the YARN NodeManager
configuration...
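For example, one way to raise it from application code (a rough sketch: the
app name is hypothetical and the 1024 MB figure is only an illustration; keep
raising it until the container kills stop):

import org.apache.spark.{SparkConf, SparkContext}

// Ask YARN to reserve extra memory on top of the executor heap.
// The overhead value is a plain number of megabytes.
val conf = new SparkConf()
  .setAppName("als-job")                              // hypothetical app name
  .set("spark.executor.memory", "8g")
  .set("spark.yarn.executor.memoryOverhead", "1024")
val sc = new SparkContext(conf)

The same setting can also be passed on the spark-submit command line with
--conf spark.yarn.executor.memoryOverhead=1024.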
I could reproduce the issue in both 1.0 and 1.1 using YARN... so this is
definitely a YARN-related problem...
At least for me, right now the only possible deployment option is standalone...
On Tue, Aug 19, 2014 at 11:29 PM, Xiangrui Meng wrote:
> Hi Deb,
>
> I think this may be the same issue as de...