All,

I wanted to send you another update on this networking issue. I have been unable to find a solution, so I have decided to move forward with the upgrade to Queens to see whether that resolves it. I will send an email shortly about when we'll do the upgrade.

Thanks-

On Tue, May 21, 2019 at 12:16 PM Lance Albertson <la...@osuosl.org> wrote:

> All,
>
> I wanted to send you an update on where we are at on this issue. So far
> I've narrowed the problem down to the removal of a VM that uses a private
> network, which causes certain iptables rules on the hypervisor to get
> out of order. It only seems to affect inbound connections to the VM, as
> outbound connections still seem to work. Unfortunately, I haven't been
> able to reproduce the issue easily, which makes it difficult to
> troubleshoot. I've looked through the source code and searched online to
> see if anyone else had run into this, without success.
>
> I've rebooted all of the hypervisors on our x86 cluster and two on our ppc
> cluster (which was needed for the MDS updates). So far we haven't seen any
> issues on the nodes that have been rebooted, but I need to let those run
> for a few days to verify that theory. These machines were also due for a
> reboot because of the CentOS 7.5 -> 7.6 upgrade, so perhaps it's
> related to that.
>
> At any rate, I've deployed a temporary cronjob on the nodes that haven't
> been rebooted which should "fix" the networking issue. I have it set to run
> every minute so that the downtime should be minimal.
>
> I'll send another update when I have one.
>
> Thanks-
>
> On Thu, May 16, 2019 at 8:58 AM Lance Albertson <la...@osuosl.org> wrote:
>
>> All,
>>
>> Since the upgrade to Pike we've noticed virtual machines suddenly losing
>> network connectivity. The issue sometimes fixes itself, or goes away when
>> we restart the neutron-linuxbridge-agent service on the hypervisors. We
>> are doing our best to track down why this is happening and how to fix it.
>> Since we're not monitoring every host on the cluster, it's difficult for us
>> to know when it happens, so if you do have a problem with one of your VMs,
>> please let us know either via IRC in #osuosl on Freenode, or via a support
>> email.
>>
>> I'll be sending further updates as we have them.
>>
>> Thanks for your patience!
>>
>> --
>> Lance Albertson
>> Director
>> Oregon State University | Open Source Lab
>>
>
>
> --
> Lance Albertson
> Director
> Oregon State University | Open Source Lab
>

--
Lance Albertson
Director
Oregon State University | Open Source Lab
_______________________________________________
openpower mailing list
openpo...@osuosl.org
https://lists.osuosl.org/mailman/listinfo/openpower