** Also affects: neutron/juno
   Importance: Undecided
       Status: New

** Changed in: neutron/juno
   Importance: Undecided => High

** Changed in: neutron/juno
       Status: New => Fix Committed

** Changed in: neutron/juno
    Milestone: None => 2014.2.3

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1403860

Title:
  L3 HA routers have IPv6 link local address on devices, periodically
  send traffic, moving MACs around and disrupting traffic

Status in OpenStack Neutron (virtual network service):
  Fix Released
Status in neutron juno series:
  Fix Committed

Bug description:
  In the HA routers case we place the same Neutron port on all HA router
  instances. This means that they share the same MAC and IP addresses.
  We configure all IP addresses in keepalived.conf so that keepalived
  takes care to move the IP addresses, and configure them only on the
  master instance. The MAC address, however, is present on all HA router
  devices on all network nodes, and so is the IPv6 link local address
  that is generated from that MAC address. This means that we have an
  active (IPv6) address in multiple places in the network. Any traffic
  generated from said address on a standby node will change the MAC
  tables of the underlay network, causing it to think that the MAC
  address has moved from the master instance to any of the standbys.
  This causes network disruption.
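
  To illustrate why every instance ends up with the same link local
  address, here is a minimal sketch of the modified EUI-64 derivation.
  It assumes the kernel generates the address from the MAC in the
  default EUI-64 mode; the function name and the MAC value are
  hypothetical and not taken from Neutron.

```python
import ipaddress


def link_local_from_mac(mac: str) -> ipaddress.IPv6Address:
    """Derive the modified EUI-64 link local address (fe80::/64) for a MAC."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Flip the universal/local bit of the first octet.
    octets[0] ^= 0x02
    # Insert ff:fe between the two halves of the MAC to form 64 bits.
    eui64 = octets[:3] + [0xFF, 0xFE] + octets[3:]
    interface_id = int.from_bytes(bytes(eui64), "big")
    # Combine the fe80::/64 prefix with the interface identifier.
    return ipaddress.IPv6Address((0xFE80 << 112) | interface_id)


# Hypothetical MAC of the shared Neutron port. The same MAC on every
# network node yields the same link local address on every node.
shared_mac = "fa:16:3e:12:34:56"
print(link_local_from_mac(shared_mac))
```

  Since the input is only the MAC, the master and all standbys compute
  an identical address, which is the duplication described above.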

  Severity / reproduction:
  Create an HA router on a setup with 3 network nodes. The HA router is
  created on all nodes. Connect it to an internal and an external
  network. Create an instance and configure it with a floating IP. Ping
  the floating IP: every two minutes, we've observed the standby nodes
  sending an ICMPv6 multicast listener report. The MAC address of the
  external interface of the master router then moves (from the
  perspective of the underlay), so traffic no longer reaches the correct
  (master) node. After 30 seconds of packet loss the client re-issues an
  ARP request for the IPv4 address, which the master answers, moving the
  MAC back and fixing the issue. This repeats every 2 minutes, with 30
  seconds of packet loss each time, resulting in 75% up-time. Note: I
  think we can do better than 75%.
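
  The 75% figure follows from the observed cycle. A small sketch of the
  arithmetic, assuming each 2 minute cycle contains exactly one 30
  second outage (the function name is hypothetical):

```python
def uptime_fraction(cycle_seconds: float, loss_seconds: float) -> float:
    """Fraction of each cycle during which traffic reaches the master."""
    return (cycle_seconds - loss_seconds) / cycle_seconds


# 120 second cycle with 30 seconds of packet loss per cycle.
print(f"{uptime_fraction(120, 30):.0%}")
```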

  Solutions:
  The sledgehammer solution would be to shut down all NICs on standby
  routers and bring them up on the master instance using the keepalived
  notifier scripts. In the spirit of keeping these scripts as
  lightweight as possible, I'd like to solve this issue instead by
  handling the IPv6 link local address like we do IPv4 addresses: not
  configuring it on the device, but adding it as a VIP to
  keepalived.conf and letting keepalived configure the address on the
  master node only.
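
  As a rough sketch of what the proposed approach could look like, the
  snippet below renders a keepalived virtual_ipaddress block that lists
  the link local address next to an IPv4 VIP. This is not Neutron's
  actual code: the function name, the device name qg-example and both
  addresses are hypothetical, and the block is assumed to sit inside a
  vrrp_instance section of keepalived.conf.

```python
from typing import Iterable, Optional, Tuple

# (address in CIDR form, device name, optional scope)
Vip = Tuple[str, str, Optional[str]]


def render_virtual_ipaddress(vips: Iterable[Vip]) -> str:
    """Render a keepalived virtual_ipaddress block from VIP tuples."""
    lines = ["virtual_ipaddress {"]
    for cidr, device, scope in vips:
        entry = f"    {cidr} dev {device}"
        if scope:
            # Link local addresses are given an explicit link scope.
            entry += f" scope {scope}"
        lines.append(entry)
    lines.append("}")
    return "\n".join(lines)


# Hypothetical values for illustration only.
stanza = render_virtual_ipaddress([
    ("203.0.113.10/24", "qg-example", None),
    ("fe80::f816:3eff:fe12:3456/64", "qg-example", "link"),
])
print(stanza)
```

  With the address managed this way, keepalived would add it on the
  node that becomes master and remove it on transition to standby, so
  standbys would have no link local source address to send from.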

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1403860/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
