Do you already have that patch in your environment? If not, can you apply it and confirm that it fixes the issue?
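In the meantime, setting it by hand should be something along these lines (a sketch, not verified on your exact setup):

# ovs-vsctl get-fail-mode br-ex
# ovs-vsctl set-fail-mode br-ex secure

Empty output from the first command means no fail mode is configured, i.e. the default standalone behaviour. With secure set, the bridge only forwards according to the flows installed by its controller and will not fall back to acting as a plain learning switch, so neutron-openvswitch-agent should be running and connected when you change it.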
On Tue, May 30, 2017 at 9:49 AM, Gustavo Randich <gustavo.rand...@gmail.com> wrote:

> While dumping OVS flows as you suggested, we finally found the cause of
> the problem: our br-ex OVS bridge lacked the secure fail mode
> configuration.
>
> Maybe the issue is related to this:
> https://bugs.launchpad.net/neutron/+bug/1607787
>
> Thank you
>
> On Fri, May 26, 2017 at 6:03 AM, Kevin Benton <ke...@benton.pub> wrote:
>
>> Sorry about the long delay.
>>
>> Can you dump the OVS flows before and after the outage? This will let us
>> know whether the flows Neutron set up are getting wiped out.
>>
>> On Tue, May 2, 2017 at 12:26 PM, Gustavo Randich
>> <gustavo.rand...@gmail.com> wrote:
>>
>>> Hi Kevin, here is some information about this issue:
>>>
>>> - if the network outage lasts less than ~1 minute, connectivity to the
>>>   host and the instances is restored automatically without problems
>>>
>>> - otherwise:
>>>
>>>   - upon the outage, "ovs-vsctl show" reports "is_connected: true" on
>>>     all bridges (br-ex / br-int / br-tun)
>>>
>>>   - after about ~1 minute, "ovs-vsctl show" ceases to show
>>>     "is_connected: true" on every bridge
>>>
>>>   - upon restoring the physical interface (fixing the outage):
>>>
>>>     - "ovs-vsctl show" again reports "is_connected: true" on all
>>>       bridges (br-ex / br-int / br-tun)
>>>
>>>     - access to the host and the VMs is NOT restored, although some
>>>       pings are sporadically answered by the host (~1 out of 20)
>>>
>>> - to restore connectivity, we:
>>>
>>>   - execute "ifdown br-ex; ifup br-ex" -> access to the host is
>>>     restored, but not to the VMs
>>>
>>>   - restart neutron-openvswitch-agent -> access to the VMs is restored
>>>
>>> Thank you!
>>>
>>> On Fri, Apr 28, 2017 at 5:07 PM, Kevin Benton <ke...@benton.pub> wrote:
>>>
>>>> With the network down, does ovs-vsctl show that it is connected to
>>>> the controller?
>>>>
>>>> On Fri, Apr 28, 2017 at 2:21 PM, Gustavo Randich
>>>> <gustavo.rand...@gmail.com> wrote:
>>>>
>>>>> Exactly, we access via a tagged interface, which is part of br-ex:
>>>>>
>>>>> # ip a show vlan171
>>>>> 16: vlan171: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UNKNOWN group default qlen 1
>>>>>     link/ether 8e:14:8d:c1:1a:5f brd ff:ff:ff:ff:ff:ff
>>>>>     inet 10.171.1.240/20 brd 10.171.15.255 scope global vlan171
>>>>>        valid_lft forever preferred_lft forever
>>>>>     inet6 fe80::8c14:8dff:fec1:1a5f/64 scope link
>>>>>        valid_lft forever preferred_lft forever
>>>>>
>>>>> # ovs-vsctl show
>>>>> ...
>>>>>     Bridge br-ex
>>>>>         Controller "tcp:127.0.0.1:6633"
>>>>>             is_connected: true
>>>>>         Port "vlan171"
>>>>>             tag: 171
>>>>>             Interface "vlan171"
>>>>>                 type: internal
>>>>> ...
>>>>>
>>>>> On Fri, Apr 28, 2017 at 3:03 PM, Kevin Benton <ke...@benton.pub> wrote:
>>>>>
>>>>>> Ok, that's likely not the issue then. I assume the way you access
>>>>>> each host is via an IP assigned to an OVS bridge or an interface
>>>>>> that somehow depends on OVS?
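(For reference, the before/after flow dumps mentioned above can be captured with something along these lines and diffed; depending on the OpenFlow versions enabled on a bridge, the -O flag may or may not be needed:

# ovs-ofctl dump-flows br-ex
# ovs-ofctl dump-flows br-int
# ovs-ofctl -O OpenFlow13 dump-flows br-tun

Run once while things are healthy and once after the outage, on the same host.)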
>>>>>>
>>>>>> On Apr 28, 2017 12:04, "Gustavo Randich" <gustavo.rand...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Kevin, we are using the default listen address, the loopback
>>>>>>> interface:
>>>>>>>
>>>>>>> # grep -r of_listen_address /etc/neutron
>>>>>>> /etc/neutron/plugins/ml2/openvswitch_agent.ini:#of_listen_address = 127.0.0.1
>>>>>>>
>>>>>>> tcp/127.0.0.1:6640 -> ovsdb-server /etc/openvswitch/conf.db
>>>>>>> -vconsole:emer -vsyslog:err -vfile:info
>>>>>>> --remote=punix:/var/run/openvswitch/db.sock
>>>>>>> --private-key=db:Open_vSwitch,SSL,private_key
>>>>>>> --certificate=db:Open_vSwitch,SSL,certificate
>>>>>>> --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --no-chdir
>>>>>>> --log-file=/var/log/openvswitch/ovsdb-server.log
>>>>>>> --pidfile=/var/run/openvswitch/ovsdb-server.pid --detach --monitor
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> On Fri, Apr 28, 2017 at 5:00 AM, Kevin Benton <ke...@benton.pub> wrote:
>>>>>>>
>>>>>>>> Are you using an of_listen_address value of an interface being
>>>>>>>> brought down?
>>>>>>>>
>>>>>>>> On Apr 25, 2017 17:34, "Gustavo Randich" <gustavo.rand...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> (using Mitaka / Ubuntu 16 / Neutron DVR / OVS / VXLAN /
>>>>>>>>> l2_population)
>>>>>>>>>
>>>>>>>>> This sounds very strange (to me): recently, after a switch
>>>>>>>>> outage, we lost connectivity to all our Mitaka hosts. We had to
>>>>>>>>> enter via iLO host by host and restart the networking service to
>>>>>>>>> regain access, and then restart neutron-openvswitch-agent to
>>>>>>>>> regain access to the VMs.
>>>>>>>>>
>>>>>>>>> At first glance we thought it was a problem with the hosts' NIC
>>>>>>>>> Linux driver not detecting link state correctly.
>>>>>>>>>
>>>>>>>>> Then we reproduced the issue simply by bringing the physical
>>>>>>>>> interfaces down for around 5 minutes and then up again. Same
>>>>>>>>> issue.
>>>>>>>>>
>>>>>>>>> And then... we found that if, instead of using the native (ryu)
>>>>>>>>> OpenFlow interface in the Neutron Open vSwitch agent, we used
>>>>>>>>> ovs-ofctl, the problem disappears.
>>>>>>>>>
>>>>>>>>> Any clue?
>>>>>>>>>
>>>>>>>>> Thanks in advance.
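(For anyone landing on this thread later: the ovs-ofctl workaround mentioned in the original report amounts to something like this in /etc/neutron/plugins/ml2/openvswitch_agent.ini, treated as a sketch since the default value differs between releases:

[ovs]
of_interface = ovs-ofctl

followed by a restart of neutron-openvswitch-agent on the host.)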