Largest Spark Cluster

2014-04-04 Thread Parviz Deyhim
Spark community, what's the size of the largest Spark cluster ever deployed? I've heard Yahoo is running Spark on several hundred nodes, but I don't know the actual number. Can someone share? Thanks

Re: JMX with Spark

2014-04-15 Thread Parviz Deyhim
Home directory or $home/conf directory? It works for me with metrics.properties hosted under the conf dir. On Tue, Apr 15, 2014 at 6:08 PM, Paul Schooss wrote: > Has anyone got this working? I have enabled the properties for it in the > metrics.conf file and ensure that it is placed under spark's home

Re: Spark Streaming source from Amazon Kinesis

2014-04-21 Thread Parviz Deyhim
It is possible, Nick. Please take a look here: https://aws.amazon.com/articles/Elastic-MapReduce/4926593393724923 The source code is here as a pull request: https://github.com/apache/spark/pull/223 Let me know if you have any questions. On Mon, Apr 21, 2014 at 1:00 PM, Nicholas Chammas < nichola

Re: Spark Streaming source from Amazon Kinesis

2014-04-21 Thread Parviz Deyhim
Sorry, Matei. I will definitely start working on making the changes soon :) On Mon, Apr 21, 2014 at 1:10 PM, Matei Zaharia wrote: > There was a patch posted a few weeks ago ( > https://github.com/apache/spark/pull/223), but it needs a few changes in > packaging because it uses a license that isn’t

Re: spark-0.9.1 compiled with Hadoop 2.3.0 doesn't work with S3?

2014-04-21 Thread Parviz Deyhim
I ran into the same issue. The problem seems to be with the jets3t library version that Spark declares in project/SparkBuild.scala. Change this: "net.java.dev.jets3t" % "jets3t" % "0.7.1" to this: "net.java.dev.jets3t" % "jets3t" % "0.9.0". "0.7.1" is not the right version of jets3t

Re: ERROR TaskSchedulerImpl: Lost an executor

2014-04-23 Thread Parviz Deyhim
You need to set SPARK_MEM or SPARK_EXECUTOR_MEMORY (for Spark 1.0) to the amount of memory your application needs on each node. Try setting those variables (example: export SPARK_MEM=10g) or set it via SparkConf.set as suggested by jholee. On Tue, Apr 22, 2014 at 4:25 PM, jaeholee wrote:
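
A minimal sketch of the SparkConf.set route mentioned in this reply, assuming PySpark is installed. The helper names `executor_memory_settings` and `build_conf` are hypothetical and not part of Spark. `spark.executor.memory` is the Spark property for per-executor memory, and the size check is a deliberately simplified assumption that may be narrower than what Spark itself accepts.

```python
import re

# Simplified check for JVM-style memory strings such as "512m" or "10g".
# Assumption: illustrative only, narrower than what Spark may accept.
_MEMORY_PATTERN = re.compile(r"^[1-9]\d*[kmgt]$", re.IGNORECASE)


def executor_memory_settings(memory):
    """Return the Spark property that sets executor memory (hypothetical helper)."""
    if not _MEMORY_PATTERN.match(memory):
        raise ValueError(f"unrecognized memory size: {memory!r}")
    return {"spark.executor.memory": memory}


def build_conf(app_name, memory):
    """Build a SparkConf with the executor memory applied (needs pyspark)."""
    # Imported lazily so the helper above can be used without Spark installed.
    from pyspark import SparkConf

    conf = SparkConf().setAppName(app_name)
    for key, value in executor_memory_settings(memory).items():
        conf.set(key, value)
    return conf
```

With PySpark available, `build_conf("my-app", "10g")` would return a configuration that can be passed to a SparkContext. The value "10g" mirrors the example in the reply and should be sized to what each node can actually spare.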

Re: ERROR TaskSchedulerImpl: Lost an executor

2014-04-23 Thread Parviz Deyhim
It means you're out of disk space. Check whether you have enough free disk space left on your node(s). On Wed, Apr 23, 2014 at 2:08 PM, jaeholee wrote: > After doing that, I ran my code once with a smaller example, and it worked. > But ever since then, I get the "No space left on device" message
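
One way to run the free-space check suggested in this reply is sketched below, using only the Python standard library. The function name `free_space_report` and the 1 GiB default threshold are assumptions for illustration. The path to inspect should be whichever directory Spark writes its scratch data to on that node, which depends on your configuration (for example the `spark.local.dir` setting).

```python
import shutil


def free_space_report(path=".", min_free_gib=1.0):
    """Return (free_gib, enough) for the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    free_gib = usage.free / 2**30
    # `enough` is True when free space meets the (assumed) threshold.
    return free_gib, free_gib >= min_free_gib
```

Running this on each worker against the scratch directory shows which node has filled up. It reports free bytes only, so other causes of the same error message are not covered by this sketch.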