Thanks for the info about adding/removing nodes dynamically. That's valuable.
On Friday, May 16, 2014, Akhil Das <ak...@sigmoidanalytics.com> wrote:

> Hi Han :)
>
> 1. Is there a way to automatically re-spawn spark workers? We've
> situations where executor OOM causes worker process to be DEAD and it does
> not come back automatically.
>
> => Yes. You can either add an OOM killer exception
> <http://backdrift.org/how-to-create-oom-killer-exceptions> on all of your
> Spark processes, or have a cron job that keeps monitoring your worker
> processes and brings them back up if they go down.
>
> 2. How to dynamically add (or remove) some worker machines to (from) the
> cluster? We'd like to leverage the auto-scaling group in EC2 for example.
>
> => You can add/remove worker nodes on the fly by spawning a new machine,
> adding that machine's IP address to the *slaves* file on the master node,
> and then rsyncing the Spark directory to all worker machines, including
> the one you added. Then simply use the *start-all.sh* script on the master
> node to bring the new worker into action. Removing a worker machine works
> the same way: remove the worker's IP address from the master's *slaves*
> file and then restart your slaves, and the worker will be removed.
>
> FYI, we have a deployment tool (a web-based UI) that we use for internal
> purposes. It is built on top of the spark-ec2 script (with some changes)
> and it has a module for adding/removing worker nodes on the fly. It looks
> like the attached screenshot. If you want, I can give you some access.
>
> Thanks
> Best Regards
>
>
> On Wed, May 14, 2014 at 9:52 PM, Han JU <ju.han.fe...@gmail.com> wrote:
>
>> Hi all,
>>
>> Just 2 questions:
>>
>> 1. Is there a way to automatically re-spawn spark workers? We've
>> situations where executor OOM causes worker process to be DEAD and it does
>> not come back automatically.
>>
>> 2. How to dynamically add (or remove) some worker machines to (from)
>> the cluster? We'd like to leverage the auto-scaling group in EC2 for
>> example.
>>
>> We're using spark-standalone.
>>
>> Thanks a lot.
>>
>> --
>> *JU Han*
>>
>> Data Engineer @ Botify.com
>>
>> +33 0619608888
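
For the cron job approach in answer 1, here is a minimal sketch of what such a watchdog could look like. It assumes a Linux worker machine with `pgrep` available and looks for the standalone worker's main class on the process command line. The function names and the idea of passing the restart command as arguments are my own assumptions, not something from Akhil's setup; the restart command itself depends on your Spark version, so it is left to the caller.

```python
import subprocess
import sys
from typing import Callable, Sequence

# Main class of the standalone worker JVM; pgrep -f matches it
# against the full command line of running processes.
WORKER_CLASS = "org.apache.spark.deploy.worker.Worker"


def worker_is_alive(pattern: str = WORKER_CLASS) -> bool:
    """Return True if at least one process command line matches pattern."""
    # pgrep exits with 0 when something matches and 1 when nothing does.
    result = subprocess.run(["pgrep", "-f", pattern],
                            stdout=subprocess.DEVNULL)
    return result.returncode == 0


def ensure_worker(restart_cmd: Sequence[str],
                  is_alive: Callable[[], bool] = worker_is_alive,
                  run: Callable[[Sequence[str]], object] = subprocess.run) -> bool:
    """Issue restart_cmd if the worker is down.

    Returns True if a restart was issued, False if the worker was alive.
    is_alive and run are injectable so the logic can be tested without
    touching real processes.
    """
    if is_alive():
        return False
    run(list(restart_cmd))
    return True


if __name__ == "__main__":
    # Assumption: the restart command (e.g. the worker start script of
    # your Spark version plus its arguments) is passed on the command line.
    if len(sys.argv) > 1:
        ensure_worker(sys.argv[1:])
```

Run from cron every minute or so, with the restart command as arguments, this would bring a dead worker back without manual intervention. It does not address why the executor ran out of memory in the first place.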
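
For answer 2, the part that is easiest to get wrong by hand is editing the *slaves* file on the master. Below is a rough sketch of that one step only, assuming the file holds one host per line with optional `#` comments. The helper name `update_slaves` and the example addresses are hypothetical. The rsync of the Spark directory and the *start-all.sh* call described above are not performed here and would still have to follow.

```python
from typing import Iterable, List


def update_slaves(lines: Iterable[str],
                  add: Iterable[str] = (),
                  remove: Iterable[str] = ()) -> List[str]:
    """Return slaves-file lines with some hosts removed and others appended.

    Comment and blank lines are kept as they are. Hosts that are already
    listed are not added a second time.
    """
    drop = set(remove)
    present = set()
    out = []
    for line in lines:
        host = line.strip()
        if host and not host.startswith("#"):
            if host in drop:
                continue  # this worker is being taken out of the cluster
            present.add(host)
        out.append(line.rstrip("\n"))
    for host in add:
        if host not in present:
            out.append(host)
            present.add(host)
    return out


if __name__ == "__main__":
    # Hypothetical example: swap one worker for another.
    current = ["# workers", "10.0.0.1", "10.0.0.2"]
    for entry in update_slaves(current, add=["10.0.0.3"], remove=["10.0.0.1"]):
        print(entry)
```

With an EC2 auto-scaling group, something like this could be called from whatever hook reacts to instances joining or leaving, with the new list written back to the *slaves* file before the rsync and restart steps.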