Frank,

Thanks for the prompt reply. Unfortunately, I've been experiencing this for
the past few weeks on the N. Virginia farm. Note that the latency might also
depend on the instance type.

I'll try to amend the ec2 script as you suggested, but that will mean
waiting even longer for the cluster to come up. The current waiting time
cannot be classified as short (above 15 mins for 50 instances).
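
For reference, here is a minimal sketch of the kind of wait loop I have in
mind. It is not the actual spark-ec2 code: the function name wait_for_port
and its parameters are my own, and it assumes Python 3. It polls the SSH port
directly instead of counting failed ssh commands, and the defaults mirror
your 6 attempts at 30 second intervals.

```python
import socket
import time


def wait_for_port(host, port=22, attempts=6, delay=30.0, timeout=5.0):
    """Return True once a TCP connection to host:port succeeds.

    Hypothetical helper, not part of spark-ec2. Tries up to `attempts`
    times and sleeps `delay` seconds between failed tries.
    """
    for attempt in range(attempts):
        try:
            # A successful connect means sshd is at least accepting connections.
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            # Connection refused or timed out: the host is probably still booting.
            if attempt < attempts - 1:
                time.sleep(delay)
    return False
```

Calling wait_for_port(hostname) for each instance before the first ssh
command would return as soon as the port opens, so hosts that come up quickly
would not pay the full 3 minutes.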

I have tried this with and without spot pricing, and there was no
difference. It seems like Amazon is not keeping up with the demand for
clusters.

I wish Spark would officially support Google Compute Engine as well,
especially with the recent price drop, and given that GCE is known to start
up much faster [1].


[1]
http://gigaom.com/2013/03/15/by-the-numbers-how-google-compute-engine-stacks-up-to-amazon-ec2/



On Sat, Apr 19, 2014 at 5:11 AM, FRANK AUSTIN NOTHAFT <[email protected]> wrote:

> Aureliano,
>
> I've been noticing this error recently as well:
>
> ssh: connect to host ec-xx-xx-xx-xx.compute-1.amazonaws.com port 22:
> Connection refused
> Error 255 while executing remote command, retrying after 30 seconds
>
> However, this isn't an issue with the spark-ec2 scripts. After the scripts
> fail, if you wait a bit longer (e.g., another 2 minutes), the EC2 hosts
> will finish launching and port 22 will open up. Until the EC2 host has
> launched and opened port 22 for SSH, SSH cannot succeed, and the spark-ec2
> scripts will fail. I've noticed that EC2 machine launch latency seems to be
> highest in Oregon; I haven't run into this problem on either the California
> or Virginia EC2 farms. To work around this issue, I've manually modified my
> copy of the EC2 scripts to wait for 6 failures (i.e., 3 minutes), which
> seems to work OK. Might be worth a try on your end. I can't comment about
> the password request; I haven't seen that on my end.
>
> Regards,
>
> Frank Austin Nothaft
> [email protected]
> [email protected]
> 202-340-0466
>
>
> On Fri, Apr 18, 2014 at 8:57 PM, Aureliano Buendia <[email protected]> wrote:
>
>> Hi,
>>
>> Since 0.9.0, spark-ec2 has become unstable. During launch it throws many
>> errors like:
>>
>> ssh: connect to host ec-xx-xx-xx-xx.compute-1.amazonaws.com port 22:
>> Connection refused
>> Error 255 while executing remote command, retrying after 30 seconds
>>
>> ... and recently, it prompts for passwords:
>>
>> Warning: Permanently added '' (RSA) to the list of known hosts.
>> Password:
>>
>> Note that the hostname in Permanently added '' is missing in the log,
>> which is probably why it asks for a password.
>>
>> Is this a known bug?
>>
>
>
