Python 2.7 + numpy break sortByKey()

2014-03-01 Thread nicholas.chammas
Unexpected behavior. Here's the repro:
1. Launch an EC2 cluster with spark-ec2; 1 slave, default instance type.
2. Upgrade the cluster to Python 2.7 using the instructions here
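For reference, step 1 of the repro corresponds to a spark-ec2 invocation along these lines (a sketch: the key pair, identity file, and cluster name are placeholders, and the flag spellings are assumed from the 0.9-era spark-ec2 script):

```
# Launch a cluster with 1 slave and the default instance type
# (mykey / mykey.pem / py27-test are placeholders)
./spark-ec2 -k mykey -i mykey.pem -s 1 launch py27-test
```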

OutOfMemoryError when loading input file

2014-03-01 Thread Yonathan Perez
Hello, I'm trying to run a simple test program that loads a large file (~12.4GB) into the memory of a single many-core machine. The machine I'm using has more than enough memory (1TB RAM) and 64 cores (of which I want to use 16 for worker threads). Even though I set both the executor memory (spark.executor.memory
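One hedged observation, based on 0.9-era behavior: in local[N] mode the whole application runs inside a single JVM, so spark.executor.memory (which sizes executors launched by a cluster manager) may not be what limits the heap; the driver JVM's own heap setting is. A spark-env.sh sketch, with the env var name assumed from the scripts of that era:

```
# spark-env.sh (sketch): raise the single JVM's heap for local[N] runs.
# SPARK_MEM is the 0.9-era variable; 200g is an illustrative value < physical RAM.
SPARK_MEM=200g
export SPARK_MEM
```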

Re: spark-ec2 login expects at least 1 slave

2014-03-01 Thread Nicholas Chammas
Voila. On Sun, Mar 2, 2014 at 12:01 AM, Nan Zhu wrote:
> Yes, I think opening an issue in JIRA is good, and I volunteer to help fix it.
> Best,
> -- Nan Zhu
> On Saturday, March 1, 2014 at 11:49 PM, Nicholas Chammas wrote:

Re: spark-ec2 login expects at least 1 slave

2014-03-01 Thread Nan Zhu
Yes, I think opening an issue in JIRA is good, and I volunteer to help fix it. Best, -- Nan Zhu. On Saturday, March 1, 2014 at 11:49 PM, Nicholas Chammas wrote:
> Should I open an issue in JIRA to track this as a minor bug?
> On Sat, Mar 1, 2014 at 8:07 PM, Josh Rosen

Re: spark-ec2 login expects at least 1 slave

2014-03-01 Thread Nicholas Chammas
Should I open an issue in JIRA to track this as a minor bug? On Sat, Mar 1, 2014 at 8:07 PM, Josh Rosen wrote:
> This seems like a bug; I think we should change the script to allow you to sign into a cluster without workers.
> Imagine that I launch a cluster using spot workers for the instances

NoSuchMethodError in KafkaReceiver

2014-03-01 Thread venki-kratos
I am trying to use code similar to the following:
public JavaPairDStream openStream() {
  HashMap kafkaParams = Maps.newHashMap();
  kafkaParams.put(ZK_CONNECT, kafkaConfig.getString(ZK_CONNECT));
  kafkaParams.put(CONSUMER_GRP_ID, kafkaConfig.getString(CONSUMER_GRP_ID));
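A NoSuchMethodError at runtime (rather than a compile error) usually means the classes on the cluster's classpath come from a different version than the one the job was compiled against. A hedged first check is that the streaming and Kafka artifacts match the deployed Spark version exactly; a build.sbt sketch, with versions purely illustrative:

```
// build.sbt sketch -- version strings are assumptions; align them with the
// Spark version actually running on the cluster.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-streaming" % "0.9.0-incubating" % "provided",
  "org.apache.spark" %% "spark-streaming-kafka" % "0.9.0-incubating"
)
```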

Re: spark-ec2 login expects at least 1 slave

2014-03-01 Thread Nicholas Chammas
In my case, I was troubleshooting some error messages about Java socket exceptions and reset connections, so I thought I'd start up a one-node cluster to rule out any potential problems with intra-cluster communication. Nick. On Saturday, March 1, 2014, Josh Rosen wrote:
> This seems like a bug; I think

Re: spark-ec2 login expects at least 1 slave

2014-03-01 Thread Aureliano Buendia
On Sun, Mar 2, 2014 at 1:07 AM, Josh Rosen wrote:
> This seems like a bug; I think we should change the script to allow you to sign into a cluster without workers.
> Imagine that I launch a cluster using spot workers for the instances; if all of my workers die, I still want to be able to sign

Re: spark-ec2 login expects at least 1 slave

2014-03-01 Thread Josh Rosen
This seems like a bug; I think we should change the script to allow you to sign into a cluster without workers. Imagine that I launch a cluster using spot workers for the instances; if all of my workers die, I still want to be able to sign into the master (a non-spot instance) to retrieve job results

Re: spark-ec2 login expects at least 1 slave

2014-03-01 Thread Patrick Wendell
Yep, currently it only supports running with at least 1 slave. On Sat, Mar 1, 2014 at 4:47 PM, nicholas.chammas wrote:
> I successfully launched a Spark EC2 "cluster" with 0 slaves using spark-ec2. When trying to log in to the master node with spark-ec2 login, I get the following:
> Searching for

spark-ec2 login expects at least 1 slave

2014-03-01 Thread nicholas.chammas
I successfully launched a Spark EC2 "cluster" with 0 slaves using spark-ec2. When trying to log in to the master node with spark-ec2 login, I get the following:
Searching for existing cluster test-blah...
Found 1 master(s), 0 slaves
ERROR: Could not find slaves in group test-blah-slaves
Is this

Re: Where does println output go?

2014-03-01 Thread Aureliano Buendia
On Sat, Mar 1, 2014 at 9:49 PM, David Thomas wrote:
> So I have this code:
> rdd.foreach(p => { print(p) })
The above closure is executed on the workers; you need to look at the worker logs to see the output.
> Where can I see this output? Currently I'm running my spark

Where does println output go?

2014-03-01 Thread David Thomas
So I have this code:
rdd.foreach(p => {
  print(p)
})
Where can I see this output? Currently I'm running my Spark program on a cluster. When I run the jar using sbt run, I see only INFO logs on the console. Where should I check to see the application's stdout output?
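The mechanics behind the answer above can be sketched without Spark at all: the closure passed to foreach runs inside worker processes, so whatever it prints goes to each worker's own stdout, which the standalone deploy redirects into per-executor log files. The plain-Python sketch below (not the Spark API; all names are illustrative) mimics two workers whose print output lands in per-worker log files rather than on the "driver" console:

```python
# Conceptual sketch: each "worker" runs the closure (print each element) in its
# own process, with stdout redirected to a per-worker log file -- mimicking how
# executor stdout ends up in log files on the worker machines, not the driver.
import os
import subprocess
import sys
import tempfile

def run_worker(worker_id: int, partition: list, log_dir: str) -> str:
    """Print each element of the partition in a child process; stdout -> log file."""
    log_path = os.path.join(log_dir, f"worker-{worker_id}.stdout")
    body = "\n".join(f"print({item!r})" for item in partition)
    with open(log_path, "w") as log:
        subprocess.run([sys.executable, "-c", body], stdout=log, check=True)
    return log_path

log_dir = tempfile.mkdtemp()
# Two "workers", each holding one partition of the data [1, 2, 3, 4].
logs = [run_worker(0, [1, 2], log_dir), run_worker(1, [3, 4], log_dir)]
for path in logs:
    with open(path) as f:
        print(path, "->", f.read().split())
```

On a real cluster the analogous place to look is the executor stdout files on each worker machine (in the 0.9-era standalone mode, under the worker's work directory, also linked from the web UI); alternatively, collecting the RDD to the driver first (rdd.collect().foreach(println)) makes the output appear on the driver console.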

Spark properties setting doesn't take effect

2014-03-01 Thread hequn8128
Hi! I wrote a standalone-cluster app in Scala, and I did some property setting:
System.setProperty("spark.akka.frameSize", "100")
System.setProperty("spark.executor.memory", "3g")
val sc = new SparkContext(...)
It is strange that "spark.executor.memory" has taken effect, but "spark.akka.frameSize"
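One plausible explanation, hedged and based on 0.9-era configuration: spark.akka.frameSize needs to be known to every JVM (driver and executors) when Akka starts, so System.setProperty on the driver alone may not reach the executors. A common workaround of that era was to pass the property as a JVM option on all nodes via spark-env.sh; a sketch, with the property name and value taken from the message above and the mechanism assumed from the 0.9-era docs:

```
# spark-env.sh on every node (sketch): pass the property to each JVM at startup.
SPARK_JAVA_OPTS="-Dspark.akka.frameSize=100"
export SPARK_JAVA_OPTS
```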