Re: Autoscaling of Spark YARN cluster

2015-12-14 Thread Mingyu Kim
Cool. Using Ambari to monitor and scale the cluster up and down sounds promising. Thanks for the pointer!

Mingyu

Re: Autoscaling of Spark YARN cluster

2015-12-14 Thread Deepak Sharma
One approach I can think of is using the Ambari Metrics Service (AMS). Using these metrics, you can decide whether the cluster is low on resources. If so, call the Ambari management API to add a node to the cluster.

Thanks
Deepak
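[Editor's note] A minimal sketch of the loop Deepak describes, in Python. The host names, port, credentials, metric name, scale-up threshold, and new-node hostname are all assumptions or placeholders; the AMS and Ambari endpoint paths follow their documented REST conventions but should be verified against your own Ambari version.

import requests

AMS = "http://ams-host:6188/ws/v1/timeline/metrics"          # assumption: AMS collector location
AMBARI = "http://ambari-host:8080/api/v1/clusters/mycluster" # assumption: cluster name "mycluster"
AUTH = ("admin", "admin")                                    # assumption: default Ambari credentials
HEADERS = {"X-Requested-By": "autoscaler"}                   # Ambari requires this header on writes

def available_mb():
    # Ask AMS for a YARN available-memory metric; the metric name below is an assumption.
    params = {
        "metricNames": "yarn.QueueMetrics.Queue=root.AvailableMB",
        "appId": "resourcemanager",
    }
    resp = requests.get(AMS, params=params)
    resp.raise_for_status()
    series = resp.json().get("metrics", [])
    if not series:
        return None
    points = series[0].get("metrics", {})     # {timestamp: value}
    if not points:
        return None
    return points[max(points, key=int)]       # most recent datapoint

def add_node(hostname):
    # Register a new host with the cluster via the Ambari management API.
    # Provisioning the VM itself (e.g. on EC2) happens outside this call.
    resp = requests.post(f"{AMBARI}/hosts/{hostname}", auth=AUTH, headers=HEADERS)
    resp.raise_for_status()

if __name__ == "__main__":
    mb = available_mb()
    if mb is not None and mb < 8192:              # assumption: scale up below 8 GB free
        add_node("new-worker-01.example.com")     # hypothetical hostname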

Re: Autoscaling of Spark YARN cluster

2015-12-14 Thread cs user
Hi Mingyu,

I'd be interested in hearing about anything else you find which might meet your needs for this. One way this could perhaps be done would be to use Ambari. Ambari comes with a nice API which you can use to add additional nodes into a cluster: https://github.com/apache/ambari/blob/trunk
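[Editor's note] A hedged sketch of adding a node through the Ambari REST API, along the lines cs user points at. The cluster name, hostname, component, and credentials are placeholders; exact payloads and the need to wait for the async install request vary by Ambari version.

import requests

BASE = "http://ambari-host:8080/api/v1/clusters/mycluster"   # assumption
AUTH = ("admin", "admin")                                    # assumption
HEADERS = {"X-Requested-By": "autoscaler"}                   # Ambari rejects writes without this header

host = "new-worker-01.example.com"   # hypothetical new node whose Ambari agent is already registered

# 1. Attach the host to the cluster.
requests.post(f"{BASE}/hosts/{host}", auth=AUTH, headers=HEADERS).raise_for_status()

# 2. Declare the YARN NodeManager component on that host.
requests.post(f"{BASE}/hosts/{host}/host_components/NODEMANAGER",
              auth=AUTH, headers=HEADERS).raise_for_status()

# 3. Install, then start, the component. In practice you would poll the
#    request returned by each PUT until it completes before moving on.
for state in ("INSTALLED", "STARTED"):
    requests.put(f"{BASE}/hosts/{host}/host_components/NODEMANAGER",
                 json={"HostRoles": {"state": state}},
                 auth=AUTH, headers=HEADERS).raise_for_status()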

Autoscaling of Spark YARN cluster

2015-12-14 Thread Mingyu Kim
Hi all,

Has anyone tried out autoscaling a Spark YARN cluster on a public cloud (e.g. EC2) based on workload? To be clear, I'm interested in scaling the cluster itself up and down by adding and removing YARN nodes based on cluster resource utilization (e.g. # of applications queued, # of resourc
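[Editor's note] The utilization signals Mingyu mentions (queued applications, free resources) are exposed by the YARN ResourceManager's cluster-metrics REST endpoint. A small sketch of reading them and applying a hypothetical scaling policy; the RM host/port and thresholds are assumptions.

import requests

RM = "http://resourcemanager:8088"    # assumption: default RM web port

def cluster_metrics():
    # The RM reports cluster-wide counters, including pending apps and free memory.
    resp = requests.get(f"{RM}/ws/v1/cluster/metrics")
    resp.raise_for_status()
    return resp.json()["clusterMetrics"]

def should_scale_up(m, pending_threshold=5, min_free_mb=8192):
    # Hypothetical policy: grow when applications are queueing or free memory is low.
    return m["appsPending"] > pending_threshold or m["availableMB"] < min_free_mb

def should_scale_down(m, max_free_fraction=0.5):
    # Hypothetical policy: shrink when most of the cluster memory sits idle.
    return m["totalMB"] > 0 and m["availableMB"] / m["totalMB"] > max_free_fraction

if __name__ == "__main__":
    m = cluster_metrics()
    print("pending apps:", m["appsPending"], "available MB:", m["availableMB"])
    print("scale up?", should_scale_up(m), "scale down?", should_scale_down(m))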