When you look at the scalability issue solely from the perspective of the cloud 
provider, requiring polling is the lazier solution, but not really the more 
scalable one. If you go nuts with caching, it might even be a bit more 
scalable.

But when you look at the distributed systems use cases, it's terrible for the 
system as a whole. People are polling your API because they want to know 
whether or not the state of the system is different from what they "remember" 
it to be. The more critical it is for them to rapidly know about changes, the 
more often they poll. Yes, there are ways to determine when changes are more 
likely to occur and thus optimize the polling interval. Some clients may be, in 
your eyes as the provider, overly aggressive. But who the hell are you to judge 
their use case and throttle them?

But here's the bottom line: The vast majority of work is completely wasted when 
polling is the change propagation method.

Push notifications don't make your core system any more complex. You push the 
change to a message queue and rely on another system to do the work.

The other system is scalable. It has no need to be stateless and can be run in 
an on-demand format using agents to handle the growing/shrinking notification 
needs.
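
A minimal sketch of that split, assuming an in-process queue.Queue as a 
stand-in for a real message queue. The function names, the subscriber URLs, and 
the "deliver" step (which only records what it would send) are all hypothetical:

```python
import queue
import threading


def record_state_change(change_queue, vm_id, new_state):
    """Core system: enqueue the change and return. No delivery work happens here."""
    change_queue.put({"vm_id": vm_id, "state": new_state})


def notification_worker(change_queue, subscribers, delivered, lock):
    """Separate system: drain the queue and fan each event out to subscribers."""
    while True:
        event = change_queue.get()
        if event is None:  # sentinel: this worker should shut down
            break
        for endpoint in subscribers:
            # A real worker would POST to the endpoint; here we only record it.
            with lock:
                delivered.append((endpoint, event["vm_id"], event["state"]))


def run_demo(worker_count=2):
    """Start workers, push two changes, stop the workers, return the deliveries."""
    change_queue = queue.Queue()
    delivered = []
    lock = threading.Lock()
    subscribers = ["https://client-a.example/hook", "https://client-b.example/hook"]

    workers = [
        threading.Thread(
            target=notification_worker,
            args=(change_queue, subscribers, delivered, lock),
        )
        for _ in range(worker_count)
    ]
    for worker in workers:
        worker.start()

    record_state_change(change_queue, "vm-1", "RUNNING")
    record_state_change(change_queue, "vm-2", "TERMINATED")

    for _ in workers:  # one sentinel per worker
        change_queue.put(None)
    for worker in workers:
        worker.join()
    return delivered


if __name__ == "__main__":
    for row in sorted(run_demo()):
        print(row)
```

The core system's only job is the put() call. Adding or removing workers 
changes delivery capacity without touching it.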

Bryan brings up the point that some of these subscription endpoints may go 
away. That's a total red herring. You have mechanisms in place to detect failed 
deliveries and unsubscribe after a time (among other strategies).
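
One such mechanism could look roughly like this. It is a sketch, not anyone's 
actual implementation: the SubscriptionRegistry class, the one-hour failure 
window, and the injected send callable are assumptions made for illustration:

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

# Assumed policy: drop an endpoint once it has been failing for this long.
MAX_FAILURE_WINDOW_SECONDS = 3600.0


@dataclass
class Subscription:
    endpoint: str
    first_failure_at: Optional[float] = None  # None means the last delivery worked


class SubscriptionRegistry:
    """Tracks subscribers and unsubscribes those that keep failing."""

    def __init__(
        self,
        send: Callable[[str, Any], bool],
        max_failure_window: float = MAX_FAILURE_WINDOW_SECONDS,
    ) -> None:
        self._send = send  # hypothetical transport: returns True on success
        self._window = max_failure_window
        self._subs: Dict[str, Subscription] = {}

    def subscribe(self, endpoint: str) -> None:
        self._subs[endpoint] = Subscription(endpoint)

    def endpoints(self) -> List[str]:
        return sorted(self._subs)

    def publish(self, event: Any, now: float) -> None:
        """Deliver event to every subscriber; now is a timestamp in seconds."""
        for sub in list(self._subs.values()):
            try:
                ok = self._send(sub.endpoint, event)
            except Exception:
                ok = False  # treat any transport error as a failed delivery
            if ok:
                sub.first_failure_at = None  # a success resets the failure clock
            elif sub.first_failure_at is None:
                sub.first_failure_at = now  # first failure in a row: start the clock
            elif now - sub.first_failure_at >= self._window:
                del self._subs[sub.endpoint]  # failing for too long: unsubscribe
```

An endpoint that recovers before the window expires stays subscribed. One that 
has gone away for good is removed on the first publish after the window passes.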

The bottom line: When pushing is the change propagation method, the vast 
majority of your work is meaningful work.

Let's do the math. Let's say I am interested in any change in VM state so I can 
auto-scale/auto-recover. Let's use a fairly simplistic polling strategy, but do 
it efficiently (and assume the API enables me to make a single API call to get 
state for all VMs). Let's pick 1 query/minute (in reality, you wouldn't pick a 
flat polling rate like this, but it is useful for this thought experiment).

Now multiply that by 1,000 customers. Or 100,000. Or 1,000,000.
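
To put rough numbers on that: at 1 query/minute, each customer makes 1,440 
calls a day. The sketch below only does the multiplication. The 
changes-per-day figure used for the wasted fraction is a made-up assumption, 
not a measurement:

```python
# Back-of-the-envelope polling load for a flat 1 query/minute per customer.
QUERIES_PER_MINUTE = 1
MINUTES_PER_DAY = 60 * 24  # 1,440

# Assumption for illustration only: real VM state changes per customer per day.
ASSUMED_CHANGES_PER_DAY = 10


def polls_per_day(customers: int, queries_per_minute: int = QUERIES_PER_MINUTE) -> int:
    """Total API calls per day across all polling customers."""
    return customers * queries_per_minute * MINUTES_PER_DAY


def wasted_fraction(polls: int, real_changes: int) -> float:
    """Share of polls that found nothing new (at most one useful poll per change)."""
    if polls == 0:
        return 0.0
    useful = min(polls, real_changes)
    return 1 - useful / polls


if __name__ == "__main__":
    for customers in (1_000, 100_000, 1_000_000):
        print(f"{customers:>9,} customers -> {polls_per_day(customers):>13,} polls/day")
    per_customer = polls_per_day(1)
    share = wasted_fraction(per_customer, ASSUMED_CHANGES_PER_DAY)
    print(f"wasted per customer: {share:.1%}")
```

That is 1.44 million calls a day at 1,000 customers and 1.44 billion at 
1,000,000, before anyone polls anything other than VM state.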

Now let's say that the client is going through a cloud management service. And 
that service is serving 20% of your customer base. They are likely making 
queries across a wide range of resources, not just VMs. And they have to scale 
the polling from their end.

Both sides are thus engaged in trying to figure out a way to scale work that is 
almost entirely pointless.

There's a reason you see the cloud management tools "pushing push". We've seen 
this IaaS polling across a bunch of clouds. It sucks.

-George

--
George Reese - Chief Technology Officer, enStratus
e: george.re...@enstratus.com    t: @GeorgeReese    p: +1.207.956.0217    f: 
+1.612.338.5041
enStratus: Governance for Public, Private, and Hybrid Clouds - @enStratus - 
http://www.enstratus.com
To schedule a meeting with me: http://tungle.me/GeorgeReese
