Hey Kyrylo,

On Tue, Nov 17, 2015 at 8:28 AM, Kyrylo Galanov <kgala...@mirantis.com> wrote:
> Hi Team,
>
> I have been testing fail-over after free disk space is less than 512 MB.
> (https://review.openstack.org/#/c/240951/)
> Affected node is stopped correctly and services migrate to a healthy node.
>
> However, after free disk space is more than 512 MB again the node does not
> recover its state to operating. Moreover, starting the resources manually
> would rather fail. In a nutshell, the pacemaker service / node should be
> restarted. Detailed information is available here:
> https://www.suse.com/documentation/sle_ha/book_sleha/data/sec_ha_configuration_basics_monitor_health.html
>
> How do we address this issue?
>

The original change for this was https://review.openstack.org/#/c/226062/.
As the commit message indicates, pacemaker does not recover on its own:
the operator must run a pacemaker command to clear the disk alert.

    crm node status-attr <hostname> delete "#health_disk"

Once the operator has freed up disk space and run the command above, the
node will rejoin the cluster and pacemaker will start its services again.
The documentation bug for this is
https://bugs.launchpad.net/fuel/+bug/1500422.

Thanks,
-Alex

>
> Best regards,
> Kyrylo
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
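
P.S. For anyone who wants to script the manual recovery step, here is a
rough sketch, not something that ships with Fuel. It only wraps the crm
command quoted in the reply. The function names, the monitored path ("/"),
and the dry-run default are my own assumptions for illustration; the
512 MB threshold is the figure mentioned in this thread. Check which
filesystem your health monitor actually watches before relying on it.

```python
import shutil
import subprocess

# Threshold from this thread; the monitored path below is an assumption.
THRESHOLD_MB = 512


def free_mb(path="/"):
    """Return free space on the filesystem holding `path`, in whole MB."""
    return shutil.disk_usage(path).free // (1024 * 1024)


def clear_health_cmd(hostname):
    """Build the crm command from the reply as an argument list."""
    return ["crm", "node", "status-attr", hostname, "delete", "#health_disk"]


def maybe_clear(hostname, path="/", dry_run=True):
    """Clear the disk alert only if free space is back above the threshold.

    Returns the command (run or merely proposed), or None when disk space
    is still too low and clearing the alert would be premature.
    """
    if free_mb(path) <= THRESHOLD_MB:
        return None
    cmd = clear_health_cmd(hostname)
    if not dry_run:
        # Requires crmsh on the node; raises CalledProcessError on failure.
        subprocess.run(cmd, check=True)
    return cmd
```

With dry_run left at True, maybe_clear("node-1") only returns the command
it would run, so you can review it before touching the cluster. The
hostname must be the node name as pacemaker knows it.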