Hey Rajesh,

From my experience, it's a stable feature. However, keep in mind that it does 
not guarantee you won't lose the data on the pods of nodes getting a spot 
kill. Once a spot kill is issued, you have 120s before the node must be handed 
back to the cloud provider. That is when the decommission script kicks in, and 
sometimes 120s is enough to migrate the shuffle/RDD blocks, and sometimes it 
isn't. In the end, it really depends on your workload and data.
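
If it helps, this is roughly the set of properties involved (names are from 
the Spark 3.1+ docs; I haven't checked the defaults for your version, and the 
fallback storage path below is just a made-up example):

    spark.decommission.enabled                          true
    spark.storage.decommission.enabled                  true
    spark.storage.decommission.shuffleBlocks.enabled    true
    spark.storage.decommission.rddBlocks.enabled        true
    # Optional: write blocks to external storage when no peer executor
    # can take them in time (example path, adjust to your setup)
    spark.storage.decommission.fallbackStorage.path     s3a://some-bucket/spark-fallback/

With fallback storage configured, blocks that can't be handed off to another 
executor before the deadline can be written to the external path instead, 
which can help if 120s keeps turning out to be too short.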


Best regards,

Ahmed Khaldi
Solutions Architect

NetApp Limited.
Mobile: +33617424566
kah...@netapp.com



From: Rajesh Mahindra <rjshmh...@gmail.com>
Date: Tuesday, 18 June 2024 at 23:54
To: user@spark.apache.org <user@spark.apache.org>
Subject: Spark Decommission

Hi folks,

I am planning to leverage the "Spark Decommission" feature in production, 
since our company uses Spot instances on Kubernetes. I wanted to get a sense 
of how stable the feature is for production usage, and whether anyone has 
thoughts on trying it out in production, especially in a Kubernetes 
environment.

Thanks,
Rajesh
