On Wed, Dec 29, 2010 at 2:27 PM, Erik Carlin <erik.car...@rackspace.com> wrote:
> We know Amazon is highly, highly elastic. While the instances launched
> per day is impressive, we know that many of those instances have a short
> life.

OK, good point. But this raises the question: what should Nova's priority be?

Elasticity -- in other words, being able to quickly spin up and down hundreds of thousands of instances per day?

Or manageability at large scale -- in other words, a system that is easy to administer at hundreds of thousands of physical nodes?

Or pure scalability on the user end -- meaning, given a specific installation of applications on a given type of instance (say, m1.large), what is the pattern of throughput for that set of applications as the size of the grid increases to hundreds of thousands of physical nodes?

Or do we take an ambivalent position on the above and go for some sort of "general scalability"?

> I see Guy is now teaming up with CloudKick on this report. The EC2
> instance ID enables precise measurement of instances launched, and
> CloudKick provides some quantitative measure of lifetime of instances.
> Last time I checked, those numbers were something like 3% of EC2
> instances launched via CK were still running (as a point of reference,
> something like 80% of Rackspace cloud servers were still running).

I see this as tangential at best, and mostly a localized issue with Rackspace Cloud Servers, not something that is inherently important to Nova. Let me explain.

IMHO, there are two big reasons why the instance "churn rate" is lower on Cloud Servers than on EC2:

1) Different level/type of customers

RS Cloud Servers tends to attract a more "corporate" or "enterprisey" type of customer. These customers tend to deploy applications into the RS Cloud with more permanent patterns. Applications like departmental or financial applications don't tend to "disappear" or be experimental.

2) Application bursting/overflow capacity

Perhaps more important than the type of customer RS Cloud Servers attracts, I think many people believe that automating capacity bursting into the RS Cloud is more difficult than into EC2 (possibly due to a larger feature set in the EC2 API for managing groups of servers/IP ranges?), and that may contribute to the lower instance churn rate. Due to EC2's elasticity (and "hackability" as termie would call it...), companies are better able to programmatically spin up instances to offload peak traffic from web applications and then spin those instances down after traffic subsides. If RS Cloud Servers had better hackability, I think you'd see the RS Cloud churn rate increase dramatically.

These are just my thoughts, though. I'd be interested to hear others' opinions on this.

> To meet the elasticity demands of EC2, nova would need to support a high
> change rate of adds/deletes (not to mention state polling, resizes, etc).
> Is there a nova change rate target as well or just a physical host limit?
> The 1M host limit still seems reasonable to me. Large scale deployments
> will break into regions where each region is an independent nova
> deployment that each has a 1M host limit.

This change rate is something that should be tracked in the continuous integration project.

Cheers!
-jay