How to read zip files from HDFS into spark-shell using scala

2014-08-09 Thread Alton Alexander
I've tried uploading a zip file that contains a CSV to HDFS and then reading it into Spark using spark-shell, and the first line is all messed up. However, when I upload a gzip to HDFS and then read it into Spark, it works just fine. See output below: Is there a way to read a zip file as is from HDFS in

Re: Using pyspark shell in local[n] (single machine) mode unnecessarily tries to connect to HDFS NameNode ...

2014-04-10 Thread Alton Alexander
I am doing the exact same thing for the purpose of learning. I also don't have a Hadoop cluster and plan to scale on EC2 as soon as I get it working locally. I am having good success just using the binaries on and not compiling from source... Is there a reason why you aren't just using the binarie

0.9 wont start cluster on ec2, SSH connection refused?

2014-04-11 Thread Alton Alexander
I run the following command and it correctly starts one head and one master, but then it fails because it can't log onto the head with the ssh key. The weird thing is that I can log onto the head with that same public key. (ssh -i myamazonkey.pem r...@ec2-54-86-3-208.compute-1.amazonaws.com) Thanks

Re: 0.9 wont start cluster on ec2, SSH connection refused?

2014-04-11 Thread Alton Alexander
reachable? > > Mayur Rustagi > Ph: +1 (760) 203 3257 > http://www.sigmoidanalytics.com > @mayur_rustagi > > > > On Fri, Apr 11, 2014 at 12:37 PM, Alton Alexander > wrote: >> >> I run the following command and it correctly starts one head and one >>