Hi,

We have a deployment tool for GCE that we use internally for Spark. Let me know if you want access to it. It's not really clean enough to open-source, though :).

Regards,
Mayur
Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>


On Sat, Apr 19, 2014 at 10:24 AM, Aureliano Buendia <buendia...@gmail.com> wrote:

> Frank,
>
> Thanks for the prompt reply. Unfortunately I've been experiencing this for
> the past few weeks on the N Virginia farm; note that the latency might also
> depend on the instance type.
>
> I'll try to amend the ec2 script as you suggested, but that will mean
> waiting even longer for the cluster to come up. The current waiting time
> cannot be classified as short (above 15 mins for 50 instances).
>
> I have tried this with and without spot pricing, and there was no
> difference. It seems like Amazon is not catching up fast enough with the
> clustering demands.
>
> I wish Spark would officially support Google Compute Engine as well,
> especially with the recent price drop, and given that GCE is known to start
> up much faster [1].
>
>
> [1]
> http://gigaom.com/2013/03/15/by-the-numbers-how-google-compute-engine-stacks-up-to-amazon-ec2/
>
>
>
> On Sat, Apr 19, 2014 at 5:11 AM, FRANK AUSTIN NOTHAFT <
> fnoth...@berkeley.edu> wrote:
>
>> Aureliano,
>>
>> I've been noticing this error recently as well:
>>
>> ssh: connect to host ec-xx-xx-xx-xx.compute-1.amazonaws.com port 22:
>> Connection refused
>> Error 255 while executing remote command, retrying after 30 seconds
>>
>> However, this isn't an issue with the spark-ec2 scripts. After the
>> scripts fail, if you wait a bit longer (e.g., another 2 minutes), the EC2
>> hosts will finish launching and port 22 will open up. Until the EC2 host
>> has launched and opened port 22 for SSH, SSH cannot succeed, and the
>> spark-ec2 scripts will fail. I've noticed that EC2 machine launch latency
>> seems to be highest in Oregon; I haven't run into this problem on either
>> the California or Virginia EC2 farms. To work around this issue, I've
>> manually modified my copy of the EC2 scripts to wait for 6 failures (i.e.,
>> 3 minutes), which seems to work OK. Might be worth a try on your end. I
>> can't comment about the password request; I haven't seen that on my end.
>>
>> Regards,
>>
>> Frank Austin Nothaft
>> fnoth...@berkeley.edu
>> fnoth...@eecs.berkeley.edu
>> 202-340-0466
>>
>>
>> On Fri, Apr 18, 2014 at 8:57 PM, Aureliano Buendia
>> <buendia...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Since 0.9.0 spark-ec2 has gone unstable. During launch it throws many
>>> errors like:
>>>
>>> ssh: connect to host ec-xx-xx-xx-xx.compute-1.amazonaws.com port 22:
>>> Connection refused
>>> Error 255 while executing remote command, retrying after 30 seconds
>>>
>>> .. and recently, it prompts for passwords!:
>>>
>>> Warning: Permanently added '' (RSA) to the list of known hosts.
>>> Password:
>>>
>>> Note that the hostname in Permanently added '' is missing in the log,
>>> which is probably why it asks for a password.
>>>
>>> Is this a known bug?
>>>
>>
>>
>
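
For anyone who wants to try the "wait for 6 failures" workaround mentioned in the thread without patching spark-ec2 itself, here is a minimal sketch of the idea in Python. It is not the actual change Frank made to the scripts; the function name `wait_for_port` and all of its parameters are hypothetical, and the defaults (6 attempts, 30 seconds apart) are only taken from the numbers quoted above. It just polls the TCP port until it accepts a connection.

```python
import socket
import time


def wait_for_port(host, port=22, max_attempts=6, delay_seconds=30.0, timeout=5.0):
    """Return True once a TCP connection to host:port succeeds, else False.

    Hypothetical helper: the defaults follow the numbers quoted in the thread
    (6 attempts, 30 seconds between them), not any spark-ec2 setting.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            # A successful connect means sshd (or whatever listens there) is up.
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            # Connection refused or timed out: the instance is likely still booting.
            if attempt < max_attempts:
                time.sleep(delay_seconds)
    return False
```

One could call something like `wait_for_port(hostname)` for each instance before running any remote command, and only proceed once it returns True. Note this only checks that the port accepts connections; it says nothing about the password prompt reported earlier in the thread.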