When a spot instance terminates, you lose all data (RDD partitions) held
by the executors that ran on that instance. Spark can recompute the lost
partitions from the input data by replaying their lineage, but if that
means going back through multiple preceding shuffles, a good chunk of the
job will have to be redone.
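
To put rough numbers on that, here is a toy Python model. It is not Spark
code: the function name, the stage numbering, and the worst-case assumption
(the lost instance held data from every earlier stage) are all made up for
illustration. It only counts how many stages have to rerun, with and
without intermediate results saved to storage that survives the instance.

```python
from typing import Optional


def stages_to_rerun(num_stages: int, checkpointed_stage: Optional[int] = None) -> int:
    """Count the stages that must rerun to rebuild data lost in the last stage.

    Toy model, not a Spark API. Stages are numbered 1..num_stages, with a
    shuffle between each consecutive pair. Assumes the worst case: the lost
    instance held data from every earlier stage, so recomputation walks back
    until it reaches durable data (the job input or a checkpoint).
    """
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    if checkpointed_stage is None:
        # Nothing durable except the input: every stage reruns.
        return num_stages
    if not 1 <= checkpointed_stage <= num_stages:
        raise ValueError("checkpointed_stage must be between 1 and num_stages")
    # Output of the checkpointed stage survives, so only later stages rerun.
    return num_stages - checkpointed_stage


if __name__ == "__main__":
    print(stages_to_rerun(5))      # no checkpoint: all 5 stages rerun
    print(stages_to_rerun(5, 4))   # checkpoint after stage 4: 1 stage reruns
```

In Spark itself, the usual way to get that kind of cut-off point is to call
RDD.checkpoint() after setting a directory on reliable storage with
SparkContext.setCheckpointDir(). Whether it pays off depends on how
expensive the upstream stages are compared to writing the data out.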
-Sven

On Thu, Mar 24, 2016 at 10:15 PM, Dillian Murphey <crackshotm...@gmail.com>
wrote:

> I'm very new to Apache Spark. I'm just a user, not a developer.
>
> I'm running a cluster with many spot instances. Am I correct in
> understanding that Spark can handle an unlimited number of spot instance
> failures and restarts?  Sometimes all the spot instances will disappear
> without warning, and then they come back.  Can I trust Spark to pick up all
> jobs where it left off?
>
> I'm noticing some instability with my system. I suspect it could be
> disk or RAM issues.  When I add a lot of slaves I run low on RAM on my
> master.  Maybe that's part of the problem. But I just want to confirm my
> understanding.
>



-- 
www.skrasser.com <http://www.skrasser.com/?utm_source=sig>
