java.io.FileNotFoundException: shuffle

2014-07-02 Thread nit
pressure..but I could never figure out the root cause. -- 14/07/02 07:34:45 WARN TaskSetManager: Loss was due to java.io.FileNotFoundException java.io.FileNotFoundException: /var/storage/sda3/nm-local/usercache/nit/appcache/application_1403208801430_0183/spark-local-20140702065054-388d/0e

Re: Yay for 1.0.0! EC2 Still has problems.

2014-07-10 Thread nit
I am also running into the "modules/mod_authn_alias.so" issue on r3.8xlarge when launching a cluster with ./spark-ec2, so Ganglia is not accessible. From the posts it seems that Patrick suggested using Ubuntu 12.04. Can you please provide the name of an AMI that can be used with the -a flag that will not have this

spark-ec2 script with Tachyon

2014-07-16 Thread nit
Hi, it seems that the spark-ec2 script deploys the Tachyon module along with the rest of the setup. I am trying to use .persist(OFF_HEAP) for RDD persistence, but on the worker I see this error -- Failed to connect (2) to master localhost/127.0.0.1:19998 : java.net.ConnectException: Connection refused -- From netsta

Readin from Amazon S3 behaves inconsistently: return different number of lines...

2014-07-31 Thread nit
*First Question:* On Amazon S3 I have a directory with 1024 files, where each file is ~9Mb in size and each line in a file has two entries separated by '\t'. Here is my program, which calculates the total number of entries in the dataset -- val inputId = sc.textFile(inputhPath, noParts).flat

Re: Installing Spark 0.9.1 on EMR Cluster

2014-07-31 Thread nit
Have you tried the "--spark-version" flag of spark-ec2? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Installing-Spark-0-9-1-on-EMR-Cluster-tp11084p11096.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Spark job finishes then command shell is blocked/hangs?

2014-07-31 Thread nit
Which version of Spark are you running? Have you tried sc.stop as the last line of your program? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-job-finishes-then-command-shell-is-blocked-hangs-tp11095p11097.html
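
A minimal sketch of the sc.stop suggestion above, assuming a standalone Scala driver launched through spark-submit (so the master URL comes from the launcher). The object name, app name, and toy job are hypothetical and only make the example complete. Calling stop in a finally block shuts the context down even if the job throws, which may help when a finished job leaves the shell blocked.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical driver program; all names here are illustrative only.
object StopContextExample {
  def main(args: Array[String]): Unit = {
    // Assumes the master URL is supplied by the launcher (e.g. spark-submit).
    val conf = new SparkConf().setAppName("stop-context-example")
    val sc = new SparkContext(conf)
    try {
      // Placeholder job standing in for the real work.
      val total = sc.parallelize(1 to 100).count()
      println(s"counted $total elements")
    } finally {
      // Shut the context down explicitly as the last step of the program.
      sc.stop()
    }
  }
}
```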

Re: Readin from Amazon S3 behaves inconsistently: return different number of lines...

2014-08-01 Thread nit
@sean - I am using the latest code from the master branch, up to commit a7d145e98c55fa66a541293930f25d9cdc25f3b4. In my case I have multiple directories with 1024 files each (the file sizes may differ). For some directories I always get a consistent result... and for others I can reproduce the inc

Re: Compiling Spark master (284771ef) with sbt/sbt assembly fails on EC2

2014-08-01 Thread nit
I also ran into the same issue. What is the solution? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Compiling-Spark-master-284771ef-with-sbt-sbt-assembly-fails-on-EC2-tp11155p11189.html

Re: java.lang.ClassCastException: java.lang.Long cannot be cast to scala.Tuple2

2014-09-17 Thread nit
@ankur - I have also seen this recently. Is there a patch available for this issue? (In my recent experience with non-GraphX apps, sort-based shuffle seems to cope better with memory pressure...) -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/java-lang-