Thanks, it was along the lines of what I was thinking.
If messages are made persistent (which I hope is planned), or persistence becomes a configuration option, what would be the effects of leaving them non-persistent? Right now, if a message is lost, it seems the DB and other nodes are left in a bad state. Is there any plan to have a "reaper" Python object that will reap this bad data and these stuck instances?

On 8/18/11 4:54 PM, "Edward "koko" Konetzko" <konet...@quixoticagony.com> wrote:

On 08/16/2011 04:50 PM, Joshua Harlow wrote:
> Are there any good documentations on making openstack fault tolerant or
> exactly how it will handle failures?
>
> Like say the mq server dies, can another mq server take over. Similar
> with the database (mysql replication?)....
>
> Seems like having that kind of information for corporate users would be
> nice, at least a recommended "guide".
>
> -Josh
>
> _______________________________________________
> Mailing list: https://launchpad.net/~openstack
> Post to : openstack@lists.launchpad.net
> Unsubscribe : https://launchpad.net/~openstack
> More help : https://help.launchpad.net/ListHelp

Josh,

I have a very bare-bones start of a doc on making parts of Nova HA. The problem is that this document is nowhere near ready for release, as I am probably the only person who can understand it. I will try to point you in the right direction on things I have done that work pretty well.

Rabbitmq

http://www.rabbitmq.com/pacemaker.html

Right now, in the version of Nova the team I am working with uses, nothing is marked 'persistent'. In this use case, if a node fails, rabbitmq moves over and all the managers reconnect with no issues, but all in-flight messages are lost. Maybe someone here can clarify the direction on this.

We are using Ubuntu 10.04, and the version of Rabbitmq in that release does not have the pacemaker scripts. I just pulled the current package from the rabbitmq.com apt repo; after that the pacemaker setup worked perfectly.

MySQL

For MySQL I just did a simple setup using DRBD to replicate /var/lib/mysql and set up corosync/pacemaker to manage all the MySQL resources between two nodes. Again, on failover in this setup I had no issues with clients reconnecting to the VIP.

I hope this points you in the right direction; I know it's not exactly what you wanted. Maybe next week I can clean up my documentation and send it out to the list.

Edward Konetzko
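
For anyone wondering what "clients reconnecting to the VIP" might look like on the client side, here is a rough sketch of a retry loop with exponential backoff. It is not taken from Nova or from any driver: `connect_with_retry` and its parameters are made-up names, and catching `OSError` is an assumption, since real MySQL or AMQP client libraries raise their own exception types.

```python
import time


def connect_with_retry(connect, attempts=5, base_delay=0.5, max_delay=8.0,
                       sleep=time.sleep):
    """Call connect() until it succeeds, backing off between attempts.

    `connect` is any zero-argument callable that returns a connection or
    raises OSError (ConnectionError is a subclass of OSError).
    """
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except OSError:
            if attempt == attempts:
                # Out of attempts: let the caller see the last error.
                raise
            sleep(delay)
            # Double the wait each time, capped at max_delay.
            delay = min(delay * 2, max_delay)
```

The `sleep` argument is only there so the wait can be swapped out in tests. During a pacemaker failover the VIP is unreachable for a short window, so a handful of attempts with a capped delay is probably enough, but the right numbers depend on how long your failover actually takes.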
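
On the "reaper" question at the top of the thread, here is a minimal sketch of what such an object could do, assuming lost messages leave instances sitting in a transitional state forever. Nothing here is Nova code: the `instances` table, the state names, and `reap_stuck_instances` are all hypothetical, and an in-memory SQLite database stands in for the real one.

```python
import sqlite3
import time

# Hypothetical state names; the real schema and states would differ.
TRANSITIONAL_STATES = ("scheduling", "building")


def reap_stuck_instances(conn, max_age_seconds, now=None):
    """Mark instances stuck in a transitional state too long as 'error'.

    Returns the ids of the instances that were reaped, in ascending order.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_seconds
    placeholders = ",".join("?" for _ in TRANSITIONAL_STATES)
    rows = conn.execute(
        "SELECT id FROM instances "
        "WHERE state IN (%s) AND updated_at < ? ORDER BY id" % placeholders,
        TRANSITIONAL_STATES + (cutoff,),
    ).fetchall()
    stuck_ids = [row[0] for row in rows]
    conn.executemany(
        "UPDATE instances SET state = 'error', updated_at = ? WHERE id = ?",
        [(now, instance_id) for instance_id in stuck_ids],
    )
    conn.commit()
    return stuck_ids


def make_demo_db():
    """Build a tiny in-memory table to try the reaper against."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE instances "
        "(id INTEGER PRIMARY KEY, state TEXT, updated_at REAL)"
    )
    conn.executemany(
        "INSERT INTO instances (id, state, updated_at) VALUES (?, ?, ?)",
        [(1, "running", 0.0), (2, "building", 0.0),
         (3, "building", 950.0), (4, "scheduling", 100.0)],
    )
    conn.commit()
    return conn
```

Calling `reap_stuck_instances(make_demo_db(), 300, now=1000.0)` flags only the rows that have been in a transitional state for more than 300 seconds. A real reaper would also have to guard against a slow but still-live request finishing between the SELECT and the UPDATE, and would likely need to clean up on the compute node as well as in the DB.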