SnappyCompressionCodec on the master

2015-07-08 Thread nizang
hi, I'm running a spark standalone cluster (1.4.0). I have some applications that a scheduler runs every hour. I found that on one of the executions, the job was marked FINISHED after just a few seconds (instead of ~5 minutes), and in the logs on the master, I can see the following exception: org.apa
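If the truncated exception above points at the Snappy codec failing to load, a minimal workaround sketch is to switch the I/O compression codec away from snappy (the class name and jar below are hypothetical):

    # switch block I/O compression from snappy to lz4 to rule the codec out
    spark/bin/spark-submit \
      --conf spark.io.compression.codec=lz4 \
      --class com.example.MyApp \
      my-app.jar   # hypothetical application jar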

cores and resource management

2015-07-05 Thread nizang
hi, We're running spark 1.4.0 on ec2, with 6 machines, 4 cores each. We're trying to run an application with a given number of total-executor-cores, but we want it to run on as few machines as possible (e.g. with total-executor-cores=4, we'd want a single machine; with total-executor-cores=12, we'll w
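A minimal sketch of the standalone setting that controls this, assuming the master reads conf/spark-defaults.conf: spark.deploy.spreadOut defaults to true (spread executors across nodes), and setting it to false packs each application onto as few workers as possible.

    # on the master node: consolidate executors instead of spreading them
    echo "spark.deploy.spreadOut false" >> spark/conf/spark-defaults.conf
    # restart the master so it picks up the setting
    spark/sbin/stop-master.sh
    spark/sbin/start-master.sh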

Re: Spark standalone cluster - resource management

2015-06-22 Thread nizang
to give a bit more data on what I'm trying to get - I have many tasks I want to run in parallel, so I want each task to grab a small part of the cluster (only a limited share of my 20 cores in the cluster). I have important tasks that I want to get 10 cores, and I have small tasks that I want
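A sketch of one way to carve up the 20 cores per application at submit time (the class names and jars are hypothetical):

    # important job: allow up to 10 of the cluster's 20 cores
    spark/bin/spark-submit --total-executor-cores 10 \
      --class com.example.BigJob big-job.jar
    # small job: cap at 2 cores so it cannot starve the important one
    spark/bin/spark-submit --total-executor-cores 2 \
      --class com.example.SmallJob small-job.jar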

Spark standalone cluster - resource management

2015-06-22 Thread nizang
hi, I'm running a spark standalone cluster with 5 slaves, each with 4 cores. When I run a job with the following configuration: /root/spark/bin/spark-submit -v --total-executor-cores 20 --executor-memory 22g --executor-cores 4 --class com.windward.spark.apps.MyApp --name dev-app --properties-fil
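A hypothetical completion of the truncated command above (the properties-file path and application jar are assumptions); with these numbers, 20 total cores at 4 cores per executor should yield one executor on each of the 5 slaves:

    /root/spark/bin/spark-submit -v \
      --total-executor-cores 20 \
      --executor-memory 22g \
      --executor-cores 4 \
      --class com.windward.spark.apps.MyApp \
      --name dev-app \
      --properties-file /root/spark/conf/dev-app.properties \
      dev-app.jar   # hypothetical properties path and jar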

s3 - Can't make directory for path

2015-06-21 Thread nizang
hi, I'm trying to set up a standalone server, and in one of my tests I got the following exception: java.io.IOException: Can't make directory for path 's3n://ww-sandbox/name_of_path' since it is a file. at org.apache.hadoop.fs.s3native.NativeS3FileSystem.mkdir(NativeS3FileSystem.java:541)
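This error usually means an ordinary object already exists at that exact key, so s3n refuses to treat it as a directory. A sketch of a check and fix, assuming the Hadoop config already carries the s3n credentials:

    # confirm the path is an existing file object, not a directory
    hadoop fs -ls s3n://ww-sandbox/name_of_path
    # remove (or rename) the conflicting object, then rerun the job
    hadoop fs -rm s3n://ww-sandbox/name_of_path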

Worker is KILLED for no reason

2015-06-15 Thread nizang
hi, I'm using the new 1.4.0 installation and ran a job there. The job finished and everything seems fine, but when I open the application in the UI, I can see that the job is marked as KILLED: Removed Executors ExecutorID Worker Cores Memory State Logs 0 worker-20150615080550-172.31.11.225-

Re: Job marked as killed in spark 1.4

2015-06-14 Thread nizang
hi, A simple way to recreate the problem - I have two server installations, one with spark 1.3.1 and one with spark 1.4.0. I ran the following on both servers: root@ip-172-31-6-108 ~]$ spark/bin/spark-shell --total-executor-cores 1 scala> val text = sc.textFile("hdfs:///some-file.txt"); s
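A hypothetical completion of the truncated repro (the action after textFile is an assumption; any small job should do):

    spark/bin/spark-shell --total-executor-cores 1
    scala> val text = sc.textFile("hdfs:///some-file.txt")
    scala> text.count()   // run one job, exit, then check the state in the UI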

Job marked as killed in spark 1.4

2015-06-13 Thread nizang
hi, I have a running and working cluster with spark 1.3.1, and I tried to install a new cluster running spark 1.4.0. I ran a job on the new 1.4.0 cluster, and the same job on the old 1.3.1 cluster. After the job finished (in both clusters), I opened the job in the UI, and in the new

Get all servers in security group in bash(ec2)

2015-05-28 Thread nizang
hi, Is there any way in bash (from an ec2 spark server) to list all the servers in my security group (or better - in a given security group)? I tried using: wget -q -O - http://instance-data/latest/meta-data/security-groups which returns security_group_xxx, but now I want all the servers in security group secu
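Instance metadata only reports the groups of the instance you query it from; to list every server in a group you have to ask the EC2 API. A sketch using the aws CLI, assuming it is installed and has credentials configured:

    # take this instance's group name from metadata, then list members
    GROUP=$(wget -q -O - http://instance-data/latest/meta-data/security-groups)
    aws ec2 describe-instances \
      --filters "Name=instance.group-name,Values=$GROUP" \
      --query 'Reservations[].Instances[].PrivateIpAddress' \
      --output text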

Adding slaves on spark standalone on ec2

2015-05-28 Thread nizang
hi, I'm working on a spark standalone system on ec2, and I'm having problems resizing the cluster (meaning - adding or removing slaves). In the basic ec2 scripts (http://spark.apache.org/docs/latest/ec2-scripts.html), there's only a script for launching the cluster, not for adding slaves to it. On the
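For adding a slave by hand, a sketch assuming a stock 1.4.0 layout: copy the spark directory to the new node, then point a worker at the running master (the addresses are placeholders):

    # on the new node: register a worker with the existing master
    spark/sbin/start-slave.sh spark://<master-ip>:7077
    # on the master (optional): record it so start-all.sh / stop-all.sh manage it
    echo "<new-slave-ip>" >> spark/conf/slaves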