My experience is that acquiring 20 spot instances accounts for a tiny fraction of the total time it takes to provision a cluster with spark-ec2. This is not (solely) an AWS issue.
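One way to check a claim like this is to time each provisioning phase separately and compare the shares. Below is a minimal sketch, assuming you wrap each step yourself; `timed`, `timings`, and `share` are hypothetical names for illustration and are not part of spark-ec2.

```python
import time
from contextlib import contextmanager

# Hypothetical bookkeeping: wall-clock seconds accumulated per named phase.
timings = {}

@contextmanager
def timed(phase):
    """Add the elapsed wall-clock time of the enclosed block to timings[phase]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[phase] = timings.get(phase, 0.0) + elapsed

def share(phase):
    """Return the fraction of the total recorded time spent in `phase`."""
    total = sum(timings.values())
    return timings[phase] / total if total else 0.0
```

For example, wrapping the instance request in `with timed("acquire_instances"):` and the cluster setup in `with timed("setup"):` would let `share("acquire_instances")` show how much of the total is actually spent waiting on AWS.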
--
Martin Goodson | VP Data Science
(0)20 3397 1240

On Thu, Jun 26, 2014 at 10:14 PM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:

> Hmm, I remember a discussion on here about how the way in which spark-ec2
> rsyncs stuff to the cluster for setup could be improved, and I’m assuming
> there are other such improvements to be made. Perhaps those improvements
> don’t matter much when compared to EC2 instance launch times, but I’m not
> sure.
>
> On Thu, Jun 26, 2014 at 4:48 PM, Aureliano Buendia <buendia...@gmail.com>
> wrote:
>
>> On Thu, Jun 26, 2014 at 9:42 PM, Nicholas Chammas <
>> nicholas.cham...@gmail.com> wrote:
>>
>>> That’s technically true, but I’d be surprised if there wasn’t a lot of
>>> room for improvement in spark-ec2 regarding cluster launch+config
>>> times.
>>
>> Unfortunately, this is not a Spark support issue, but an AWS one. Starting a
>> few months ago, Amazon AWS services have been having bigger and bigger
>> lags. Indeed, the default timeout hard-coded in spark-ec2 is no longer
>> long enough to launch the cluster successfully, and many people here
>> reported that they had to increase it.