----- Original Message -----
> Hi Britt,
>
> Some update on this after running tcpdump:
>
> I have the keepalived master running on controller01. If I reboot this
> server, it fails over to controller02, which now becomes the keepalived
> master, and I see ping packets arriving at controller02. This is good.
>
> However, when controller01 comes back online, I see that ping requests stop
> being forwarded to controller02 and start being sent to controller01, which
> is now in backup state, so it stops working.
>
If traffic is being forwarded to a backup node, that sounds like L2pop is on.
Is that true by chance?

> Any hint for this?
>
> Thanks
>
>
> On Mon, Dec 29, 2014 at 11:06 AM, Pedro Sousa <pgso...@gmail.com> wrote:
>
> Yes,
>
> I was using l2pop and disabled it, but the issue remains.
>
> I also stopped the "bogus VRRP" messages by configuring a user/password for
> keepalived, but when I reboot the servers, I see the keepalived process
> running on them but I cannot ping the virtual router IP address anymore.
>
> So I rebooted the node that is running keepalived as master and pinging
> starts again, but when that node comes back online, everything stops
> working. Has anyone experienced this?
>
> Thanks
>
>
> On Tue, Dec 23, 2014 at 5:03 PM, David Martin <dmart...@gmail.com> wrote:
>
> Are you using l2pop? Until https://bugs.launchpad.net/neutron/+bug/1365476 is
> fixed it's pretty broken.
>
> On Tue, Dec 23, 2014 at 10:48 AM, Britt Houser (bhouser) <bhou...@cisco.com>
> wrote:
>
> Unfortunately I've not had a chance yet to play with neutron router HA, so no
> hints from me. =( Can you give a little more detail about "it stops
> working"? I.e., do you see packets dropped while controller1 is down? Do
> packets begin flowing before controller1 comes back online? Does controller1
> come back online successfully? Do packets begin to flow after controller1
> comes back online? Perhaps that will help.
>
> Thx,
> britt
>
> From: Pedro Sousa <pgso...@gmail.com>
> Date: Tuesday, December 23, 2014 at 11:14 AM
> To: Britt Houser <bhou...@cisco.com>
> Cc: "OpenStack-operators@lists.openstack.org"
> <OpenStack-operators@lists.openstack.org>
> Subject: Re: [Openstack-operators] Neutron DVR HA
>
> I understand, Britt, thanks.
>
> So I disabled DVR and tried to test L3_HA, but it's not working properly; it
> seems to be a keepalived issue.
> I see that it's running on 3 nodes:
>
> [root@controller01 keepalived]# neutron l3-agent-list-hosting-router harouter
> +--------------------------------------+--------------+----------------+-------+
> | id                                   | host         | admin_state_up | alive |
> +--------------------------------------+--------------+----------------+-------+
> | 09cfad44-2bb2-4683-a803-ed70f3a46a6a | controller01 | True           | :-)   |
> | 58ff7c42-7e71-4750-9f05-61ad5fbc5776 | compute03    | True           | :-)   |
> | 8d778c6a-94df-40b7-a2d6-120668e699ca | compute02    | True           | :-)   |
> +--------------------------------------+--------------+----------------+-------+
>
> However if I reboot one of the l3-agent nodes it stops working. I see this in
> the logs:
>
> Dec 23 16:12:28 Compute02 Keepalived_vrrp[18928]: ip address associated with VRID not present in received packet : 172.16.28.20
> Dec 23 16:12:28 Compute02 Keepalived_vrrp[18928]: one or more VIP associated with VRID mismatch actual MASTER advert
> Dec 23 16:12:28 Compute02 Keepalived_vrrp[18928]: bogus VRRP packet received on ha-a509de81-1c !!!
> Dec 23 16:12:28 Compute02 Keepalived_vrrp[18928]: VRRP_Instance(VR_1) ignoring received advertisment...
>
> Dec 23 16:13:10 Compute03 Keepalived_vrrp[12501]: VRRP_Instance(VR_1) ignoring received advertisment...
> Dec 23 16:13:12 Compute03 Keepalived_vrrp[12501]: ip address associated with VRID not present in received packet : 172.16.28.20
> Dec 23 16:13:12 Compute03 Keepalived_vrrp[12501]: one or more VIP associated with VRID mismatch actual MASTER advert
> Dec 23 16:13:12 Compute03 Keepalived_vrrp[12501]: bogus VRRP packet received on ha-d5718741-ef !!!
> Dec 23 16:13:12 Compute03 Keepalived_vrrp[12501]: VRRP_Instance(VR_1) ignoring received advertisment...
>
> Any hint?
>
> Thanks
>
>
> On Tue, Dec 23, 2014 at 3:17 PM, Britt Houser (bhouser) <bhou...@cisco.com>
> wrote:
>
> Currently HA and DVR are mutually exclusive features.
>
> From: Pedro Sousa <pgso...@gmail.com>
> Date: Tuesday, December 23, 2014 at 9:42 AM
> To: "OpenStack-operators@lists.openstack.org"
> <OpenStack-operators@lists.openstack.org>
> Subject: [Openstack-operators] Neutron DVR HA
>
> Hi all,
>
> I've been trying Neutron DVR with 2 controllers + 2 computes. When I create a
> router I can see that it is running on all the servers:
>
> [root@controller01 ~]# neutron l3-agent-list-hosting-router router
> +--------------------------------------+--------------+----------------+-------+
> | id                                   | host         | admin_state_up | alive |
> +--------------------------------------+--------------+----------------+-------+
> | 09cfad44-2bb2-4683-a803-ed70f3a46a6a | controller01 | True           | :-)   |
> | 0ca01d56-b6dd-483d-9c49-cc7209da2a5a | controller02 | True           | :-)   |
> | 52379f0f-9046-4b73-9d87-bab7f96be5e7 | compute01    | True           | :-)   |
> | 8d778c6a-94df-40b7-a2d6-120668e699ca | compute02    | True           | :-)   |
> +--------------------------------------+--------------+----------------+-------+
>
> However, if the controller01 server dies, I cannot ping the external gateway
> IP anymore. Is this the expected behavior? Shouldn't it fail over to the
> other controller node?
>
> Thanks
>
>
> _______________________________________________
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators