Hello,
We have an OpenStack Icehouse cluster set up with two management nodes running Nova, Neutron, Horizon, etc., as well as qpidd (version 0.18) for the message queue. Everything sits behind an HAProxy setup which round-robins requests to both nodes.

It works fine for a while, but after a couple of days all the agents on the compute nodes (Nova, Neutron) show as down in the Horizon web interface. Running "openstack-services restart" on both management nodes normally fixes it, and the agents are shown as up again.

In the Nova logs on the compute nodes I see a lot of messages like the ones below. It seems the connection to the message queue is lost:

    ERROR nova.openstack.common.periodic_task [-] Error during ComputeManager.update_available_resource: Timed out waiting for a reply to message ID b28ae4098c31453c83d963c2a9d6c1ee
    [.]
    TRACE nova.openstack.common.periodic_task     reply, ending = self._poll_connection(msg_id, timeout)
    TRACE nova.openstack.common.periodic_task   File "/usr/lib/python2.6/site-packages/oslo/messaging/_drivers/amqpdriver.py", line 217, in _poll_connection
    TRACE nova.openstack.common.periodic_task     % msg_id)
    TRACE nova.openstack.common.periodic_task MessagingTimeout: Timed out waiting for a reply to message ID b28ae4098c31453c83d963c2a9d6c1ee

Here is the HAProxy configuration for the qpidd port:

    listen qpid_message_broker
        bind 10.xxx.xxx.xxx:5672
        timeout server 1h
        timeout client 1h
        timeout connect 240s
        server xx-xxxxx-x001 10.xxx.xxx.xx1:5672 check inter 10s rise 9999999 fall 5
        server xx-xxxxx-x002 10.xxx.xxx.xx2:5672 check backup

Any ideas, or experiences you have had with setting up HAProxy for qpidd? Any help appreciated!

Cheers,
Chris
_______________________________________________
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to     : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack