I would verify that the VIP failover is actually occurring. Your master should hold the IP address, and if you shut down keepalived on it the VIP should move to one of the others. I generally set the state to MASTER on all systems and give one a higher priority than the rest (e.g. 150 on it vs. 100 on the others).
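For reference, roughly what that looks like in keepalived.conf — a minimal sketch, where the VIP 172.16.21.20 is taken from your endpoints and the interface name, virtual_router_id, password and garp_* values are placeholders to adjust (the garp_* options also depend on your keepalived version):

    vrrp_instance VI_1 {
        state MASTER            # MASTER on every node
        interface eth0          # adjust to the interface carrying the VIP
        virtual_router_id 51    # must match on all three nodes
        priority 150            # 150 here, e.g. 100 on the other two
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass mysecret
        }
        virtual_ipaddress {
            172.16.21.20
        }
        # re-announce the VIP's MAC after a takeover so the switch
        # updates its ARP/CAM tables
        garp_master_delay 1
        garp_master_repeat 5
        garp_master_refresh 10
    }

The garp_* lines make keepalived send gratuitous ARPs when a node takes over the VIP, which is the same kind of stale-MAC problem you describe with the floating IPs further down.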
On Tuesday, January 13, 2015 at 12:18 PM, Pedro Sousa wrote:
> As expected If I reboot the Keepalived MASTER node, I get timeouts again,
> so my understanding is that this happens when the VIP fails over to
> another node. Anyone has explanation for this?
>
> Thanks
>
> On Tue, Jan 13, 2015 at 8:08 PM, Pedro Sousa <pgso...@gmail.com> wrote:
> > Hi,
> >
> > I think I found out the issue, as I have all the 3 nodes running
> > Keepalived as MASTER, when I reboot one of the servers, one of the VIPS
> > failsover to it, causing the timeout issues. So I left only one server
> > as MASTER and the other 2 as BACKUP, and If I reboot the BACKUP servers
> > everything will work fine.
> >
> > As a note aside, I don't know if this is some ARP issue because I have
> > a similar problem with Neutron L3 running in HA Mode. If I reboot the
> > server that is running as MASTER I loose connection to my floating IPS
> > because the switch doesn't know yet that the Mac Addr has changed. To
> > everything start working I have to ping an outside host like google
> > from an instance.
> >
> > Maybe someone could share some experience on this,
> >
> > Thank you for your help.
> >
> > On Tue, Jan 13, 2015 at 7:18 PM, Pedro Sousa <pgso...@gmail.com> wrote:
> > > Jesse,
> > >
> > > I see a lot of these messages in glance-api:
> > >
> > > 2015-01-13 19:16:29.084 29269 DEBUG
> > > glance.api.middleware.version_negotiation
> > > [29d94a9a-135b-4bf2-a97b-f23b0704ee15 eb7ff2b5f0f34f51ac9ea0f75b60065d
> > > 2524b02b63994749ad1fed6f3a825c15 - - -] Unknown version. Returning
> > > version choices. process_request
> > > /usr/lib/python2.7/site-packages/glance/api/middleware/version_negotiation.py:64
> > >
> > > While running openstack-status (glance image-list):
> > >
> > > == Glance images ==
> > > Error finding address for
> > > http://172.16.21.20:9292/v1/images/detail?sort_key=name&sort_dir=asc&limit=20:
> > > HTTPConnectionPool(host='172.16.21.20', port=9292): Max retries exceeded
> > > with url: /v1/images/detail?sort_key=name&sort_dir=asc&limit=20 (Caused
> > > by <class 'httplib.BadStatusLine'>: '')
> > >
> > > Thanks
> > >
> > > On Tue, Jan 13, 2015 at 6:52 PM, Jesse Keating <j...@bluebox.net> wrote:
> > > > On 1/13/15 10:42 AM, Pedro Sousa wrote:
> > > > > Hi
> > > > >
> > > > > I've changed some haproxy confs, now I'm getting a different error:
> > > > >
> > > > > == Nova networks ==
> > > > > ERROR (ConnectionError): HTTPConnectionPool(host='172.16.21.20',
> > > > > port=8774): Max retries exceeded with url:
> > > > > /v2/2524b02b63994749ad1fed6f3a825c15/os-networks (Caused by <class
> > > > > 'httplib.BadStatusLine'>: '')
> > > > > == Nova instance flavors ==
> > > > >
> > > > > If I restart my openstack services everything will start working.
> > > > >
> > > > > I'm attaching my new haproxy conf.
> > > > >
> > > > > Thanks
> > > >
> > > > Sounds like your services are losing access to something, like rabbit
> > > > or the database. What do your service logs show prior to restart? Are
> > > > they throwing any errors?
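Regarding the Neutron L3 HA / floating IP issue above: that also sounds like the upstream switch keeping a stale MAC entry for the router's external port. Instead of pinging an outside host from an instance, you can usually force the switch to relearn it with a gratuitous ARP from the node that just became master. A rough sketch — the qrouter namespace ID, qg- interface name and floating IP are placeholders for your own values, and whether this is needed at all depends on your Neutron/keepalived versions:

    # on the node that just took over the router
    ip netns | grep qrouter                      # find the router namespace

    # send unsolicited (gratuitous) ARP for the floating IP out of the
    # external qg- interface so the switch relearns the MAC
    ip netns exec qrouter-<router-uuid> \
        arping -U -I qg-<interface-id> -c 3 <floating-ip>

Treat that as a workaround to confirm the diagnosis rather than a fix; if it helps, the longer-term answer is making keepalived/neutron send those gratuitous ARPs itself on failover.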