Hey Ed,

Great idea. I think there are a lot of ways we can go with this.  I started 
working on a branch called tasks that did something like this.  It essentially 
allowed you to put every action into a tasks database and you could re-run 
tasks that didn't complete properly.  It was based on the idea that every 
action should be composed of a bunch of idempotent chunks.
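
To make the "idempotent chunks" idea concrete, here is a minimal Python sketch. 
It is not the code from the tasks branch; run_steps, allocate_ip and 
attach_volume are hypothetical names made up for illustration, and a plain set 
stands in for the tasks database.

```python
def run_steps(steps, done, ctx):
    """Run (name, func) steps in order, skipping any already recorded in `done`.

    Returns True when every step has completed, False on the first failure.
    `done` stands in for the per-task record kept in a tasks database.
    """
    for name, func in steps:
        if name in done:
            continue  # finished on an earlier run; do not repeat it
        try:
            func(ctx)
        except Exception:
            return False  # leave `done` as-is so a re-run resumes here
        done.add(name)
    return True


def allocate_ip(ctx):
    # Idempotent: keeps the existing address if one is already assigned.
    ctx.setdefault("ip", "10.0.0.5")


def attach_volume(ctx):
    # Fails on the first call to simulate a flaky host, then succeeds.
    if ctx.setdefault("attach_attempts", 0) == 0:
        ctx["attach_attempts"] += 1
        raise RuntimeError("volume attach failed")
    ctx["volume_attached"] = True


steps = [("allocate_ip", allocate_ip), ("attach_volume", attach_volume)]
done, ctx = set(), {}
first_run = run_steps(steps, done, ctx)   # stops at attach_volume
second_run = run_steps(steps, done, ctx)  # resumes with attach_volume only
```

Because each step is safe to repeat and completed steps are recorded, re-running 
a task that didn't complete properly only redoes the work that is missing.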

I have since discovered that a lot of what I had done is very similar to the 
concept of a task in Celery.  In general I think that actions should become 
first class citizens.  Action requests to the api should get back a UUID 
representing the "task" (or "reservation" if you prefer), and we should use the 
task object to keep track of the status of the action.  Right now the status of 
actions is stored in status fields in the objects that are acted on.  This 
makes for difficult logic when tasks involve multiple objects (boot from volume 
for example). If we have rich information in a task object, a recovery service 
could be written to retry or fix failing tasks.
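
As a rough sketch of what a first-class task and a recovery pass might look 
like (assumptions throughout: Task, TASKS, submit and recover are hypothetical 
names, not nova or celery APIs, and a dict stands in for the tasks table):

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Task:
    action: str                      # e.g. "boot_from_volume"
    object_ids: list                 # every object the action touches
    status: str = "pending"          # pending, done or failed
    attempts: int = 0                # retries made by the recovery pass
    id: str = field(default_factory=lambda: str(uuid.uuid4()))


TASKS = {}  # stand-in for a tasks table, keyed by task UUID


def submit(action, object_ids):
    """What an API request would do: record the task and hand back its UUID."""
    task = Task(action=action, object_ids=list(object_ids))
    TASKS[task.id] = task
    return task.id


def recover(handler, max_attempts=3):
    """Retry each failed task that still has attempts left.

    `handler` is whatever re-runs the action; returns the IDs it retried.
    """
    retried = []
    for task in TASKS.values():
        if task.status != "failed" or task.attempts >= max_attempts:
            continue
        task.attempts += 1
        try:
            handler(task)
            task.status = "done"
        except Exception:
            task.status = "failed"
        retried.append(task.id)
    return retried
```

The status lives on the task rather than on the instance or the volume, so an 
action that spans several objects has a single place to record success or 
failure, and the recovery pass only needs to look at tasks.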

Regardless of the approach, this seems like a great topic for the summit, so 
coming up with some proposals would be awesome.

Vish

On Aug 26, 2011, at 12:22 PM, Ed Leafe wrote:

>       Sorry I haven't come up with a snazzy name for it yet, but what I have 
> in mind is a new service that is essential for my employer (Rackspace), and 
> might be important for other OpenStack deployments. This new service would be 
> completely optional, of course - only those for whom it is relevant would run 
> it.
> 
>       Let me start by stating the problem: when a customer requests that we 
> create instances for them, nova casts those requests into the queue, where 
> they are eventually acted upon. That usually works great, but in cases where 
> the instance creation fails, we need to detect that failure and re-issue the 
> create request with a different host. This is currently not possible with the 
> asynchronous design of the compute-scheduler interactions.
> 
>       So what I envision is a service that scans a list of recent requests' 
> reservation IDs, and follows up to see if the request was successful or not, 
> and takes action if needed. The blueprint for this can be found at 
> https://blueprints.launchpad.net/nova/+spec/instance-creation-assurance, with 
> an Etherpad created for ongoing idea exchange at 
> http://etherpad.openstack.org/instance-creation-assurance
> 
> 
> 
> -- Ed Leafe
> 
> 
> 
> 
> _______________________________________________
> Mailing list: https://launchpad.net/~openstack
> Post to     : openstack@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~openstack
> More help   : https://help.launchpad.net/ListHelp

