Re: [Openstack-operators] How to deal with MTU issue on routers servicing vxlan tenant networks? (openstack-ansible with LXC containers)

David Young Tue, 12 Dec 2017 01:25:11 -0800

Hey Jean-Philippe,

No, after I disastrously split-brained/partitioned my rabbitmq and galera clusters by allowing LXC to start the containers up without the dnsmasq process to address their eth0 interfaces (due to what _may_ be a template/Xenial bug), I've spent the last few days cleaning up the mess :)

I have two unused hosts set aside as a test environment for pre-testing, and I'll be leveraging these in the next few days to test the issue on a fresh Xenial install.


I'll update you (and the list) once I've positively confirmed the issue.

Cheers!
D




On 12/12/2017 21:52, Jean-Philippe Evrard wrote:

Hello David,

Did you solve your issue?
Did you check that it depends on the default container interface's mtu itself?

Best regards,
JP


On 6 December 2017 at 18:45, David Young <[email protected]> wrote:

So..

On 07/12/2017 03:12, Jean-Philippe Evrard wrote:

For the mtu, it would be impactful to do it on a live environment. I
expect that if you change the container configuration, it would
restart.

It’s a busy lab environment, but given that it’s fully HA (2 controllers), I
didn’t anticipate a significant problem with changing container
configuration one-at-a-time.

However, the change has had an unexpected side effect - one of the
controllers (I haven’t rebooted the other one yet) seems to have lost the
ability to bring up lxcbr0, and so while it can start all its containers,
none of them have any management connectivity on eth0, which of course
breaks all sorts of things.

I.e.

root@nbs-dh-10:~# systemctl status networking.service
● networking.service - Raise network interfaces
   Loaded: loaded (/lib/systemd/system/networking.service; enabled; vendor preset: enabled)
  Drop-In: /run/systemd/generator/networking.service.d
           └─50-insserv.conf-$network.conf
   Active: failed (Result: exit-code) since Thu 2017-12-07 06:37:00 NZDT; 14min ago
     Docs: man:interfaces(5)
  Process: 2717 ExecStart=/sbin/ifup -a --read-environment (code=exited, status=1/FAILURE)
  Process: 2656 ExecStartPre=/bin/sh -c [ "$CONFIGURE_INTERFACES" != "no" ] && [ -n "$(ifquery --read-environment --list --exclude=lo)" ] && udevadm settle (code=e
 Main PID: 2717 (code=exited, status=1/FAILURE)

Dec 07 06:36:58 nbs-dh-10 systemd[1]: Starting Raise network interfaces...
Dec 07 06:36:58 nbs-dh-10 ifup[2717]: RTNETLINK answers: Invalid argument
Dec 07 06:36:58 nbs-dh-10 ifup[2717]: /sbin/ifup: waiting for lock on /run/network/ifstate.enp4s0
Dec 07 06:36:58 nbs-dh-10 ifup[2717]: /sbin/ifup: waiting for lock on /run/network/ifstate.br-mgmt
Dec 07 06:37:00 nbs-dh-10 ifup[2717]: /sbin/ifup: waiting for lock on /run/network/ifstate.br-vlan
Dec 07 06:37:00 nbs-dh-10 ifup[2717]: Failed to bring up lxcbr0.
Dec 07 06:37:00 nbs-dh-10 systemd[1]: networking.service: Main process exited, code=exited, status=1/FAILURE
Dec 07 06:37:00 nbs-dh-10 systemd[1]: Failed to start Raise network interfaces.
Dec 07 06:37:00 nbs-dh-10 systemd[1]: networking.service: Unit entered failed state.
Dec 07 06:37:00 nbs-dh-10 systemd[1]: networking.service: Failed with result 'exit-code'.
root@nbs-dh-10:~#

I’ve manually reversed the “lxc.network.mtu = 1550” entry in
/etc/lxc/lxc-openstack.conf, but this doesn’t seem to have made a
difference.
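
For context on the 1550 value: assuming it was picked as a 1500-byte tenant MTU plus the VXLAN-over-IPv4 encapsulation overhead (an assumption, since the thread does not spell out where the number came from), the arithmetic can be sketched as below. The function names are illustrative only and do not come from openstack-ansible or LXC.

```python
# Header sizes in bytes for a VXLAN-encapsulated frame.
INNER_ETHERNET = 14   # inner Ethernet header carried inside the tunnel
VXLAN_HEADER = 8
UDP_HEADER = 8
IPV4_HEADER = 20
IPV6_HEADER = 40


def vxlan_overhead(ipv6_underlay=False):
    """Bytes added to each tenant packet by VXLAN encapsulation."""
    ip_header = IPV6_HEADER if ipv6_underlay else IPV4_HEADER
    return INNER_ETHERNET + VXLAN_HEADER + UDP_HEADER + ip_header


def underlay_mtu_for(tenant_mtu, ipv6_underlay=False):
    """MTU the underlay needs so tenants can keep `tenant_mtu`."""
    return tenant_mtu + vxlan_overhead(ipv6_underlay)


def tenant_mtu_for(underlay_mtu, ipv6_underlay=False):
    """Largest tenant MTU that fits inside `underlay_mtu`."""
    return underlay_mtu - vxlan_overhead(ipv6_underlay)


if __name__ == "__main__":
    print(underlay_mtu_for(1500))  # underlay MTU for a 1500-byte tenant MTU
    print(tenant_mtu_for(1500))    # tenant MTU if the underlay stays at 1500
```

Under that assumption, a 1500-byte tenant MTU needs every hop on the tunnel path (physical NIC, bridges, container veth pairs) to carry at least 1550 bytes; otherwise the tenant networks have to drop to 1450.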

What’s also odd is that lxcbr0 appears to be perfectly normal:

root@nbs-dh-10:~# brctl show lxcbr0
bridge name     bridge id               STP enabled     interfaces
lxcbr0          8000.fe0a7fa28303       no              04063403_eth0
                                                        075266dc_eth0
                                                        160c9b30_eth0
                                                        38ac19ae_eth0
                                                        4f57300f_eth0
                                                        59b2b5a5_eth0
                                                        5b7bbeb4_eth0
                                                        64a1fcdd_eth0
                                                        6c99f5fe_eth0
                                                        6f93ebb2_eth0
                                                        70ce61e5_eth0
                                                        745ba80d_eth0
                                                        85df2fa5_eth0
                                                        99e6adf8_eth0
                                                        cbdfa2f3_eth0
                                                        e15dc279_eth0
                                                        ea67ce7e_eth0
                                                        ed5c7af9_eth0
root@nbs-dh-10:~#

… But, no matter the value of lxc.network.mtu, it doesn’t change from 1500
(I suppose this could actually have reduced itself based on the lower MTUs
of the member interfaces though):

root@nbs-dh-10:~# ifconfig lxcbr0
lxcbr0    Link encap:Ethernet  HWaddr fe:0c:5d:1c:36:da
           inet addr:10.0.3.1  Bcast:10.0.3.255  Mask:255.255.255.0
           inet6 addr: fe80::f4b0:bff:fec3:63b0/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:499 errors:0 dropped:0 overruns:0 frame:0
           TX packets:10 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:128882 (128.8 KB)  TX bytes:828 (828.0 B)

root@nbs-dh-10:~#
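
One way to test the guess that the bridge MTU is being held down by its member interfaces is to list the MTU of every port attached to lxcbr0. The following is a minimal sketch, assuming the standard Linux sysfs layout (/sys/class/net/<bridge>/brif/ and /sys/class/net/<iface>/mtu); the helper names are hypothetical and not part of any LXC or openstack-ansible tooling.

```python
import os

SYSFS_NET = "/sys/class/net"


def bridge_member_mtus(bridge, sysfs_root=SYSFS_NET):
    """Return {port_name: mtu} for every port attached to `bridge`."""
    brif_dir = os.path.join(sysfs_root, bridge, "brif")
    mtus = {}
    for port in sorted(os.listdir(brif_dir)):
        # Each port is also a top-level interface with its own mtu file.
        with open(os.path.join(sysfs_root, port, "mtu")) as fh:
            mtus[port] = int(fh.read().strip())
    return mtus


def smallest_member_mtu(bridge, sysfs_root=SYSFS_NET):
    """Return the lowest MTU among the bridge's ports, or None if it has none."""
    mtus = bridge_member_mtus(bridge, sysfs_root)
    return min(mtus.values()) if mtus else None


if __name__ == "__main__":
    bridge = "lxcbr0"
    # Only touch sysfs if the bridge actually exists on this host.
    if os.path.isdir(os.path.join(SYSFS_NET, bridge, "brif")):
        for port, mtu in bridge_member_mtus(bridge).items():
            print(f"{port}: {mtu}")
        print(f"smallest member MTU: {smallest_member_mtu(bridge)}")
```

If any port still reports 1500, that would be consistent with the bridge staying at 1500 regardless of lxc.network.mtu, although it would not by itself explain the ifup failure.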

Any debugging suggestions?

Thanks,
D

_______________________________________________
OpenStack-operators mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
