Reviewed:  https://review.opendev.org/c/openstack/neutron/+/861124
Committed: https://opendev.org/openstack/neutron/commit/edf48e46a1f0227f84b05ab39da005393e5fa73f
Submitter: "Zuul (22348)"
Branch:    master
commit edf48e46a1f0227f84b05ab39da005393e5fa73f
Author: Miro Tomaska <[email protected]>
Date:   Wed Oct 12 08:42:18 2022 -0500

    Improve agent provision performance for large networks

    Before this patch, the metadata agent would provision the network
    namespace for all subnets under a network (datapath) as soon as the
    first VM (VIF port) was mounted on the chassis. This operation can
    take a very long time for networks with lots of subnets. See the
    linked bug for more details.

    This patch changes this mechanism to "lazy load", where the metadata
    agent provisions the metadata namespace with only the subnets
    belonging to the active ports on the chassis. This results in
    virtually constant throughput that is not affected by the number of
    subnets.

    Closes-Bug: #1981113
    Change-Id: Ia2a66cfd3fd1380c5204109742d44f09160548d2

** Changed in: neutron
       Status: In Progress => Fix Released

https://bugs.launchpad.net/bugs/1981113

Title:
  OVN metadata agent can be slow with large amount of subnets

Status in neutron:
  Fix Released

Bug description:
  The OVN metadata agent can take a very long time (observed ~40s) to
  add CIDRs to a metadata namespace tap interface when a network
  consists of many subnets (observed ~1700 subnets). Because of the long
  processing time, ovn-metadata-agent may not have haproxy ready by the
  time cloud-init on the first VM requests its metadata. The VM then
  lacks the metadata it needs to operate properly.

  Steps to reproduce:
  - Create a network with hundreds or thousands of subnets. The more
    subnets, the more obvious the problem is.
  - Create a VM connected to that network. Make sure this is the first
    VM on the deployed compute node (hypervisor).
  - Once the VM is created, observe that the VM's cloud-init requests
    time out due to no response from 169.254.169.254/openstack.
  - Inspect the ovn-metadata-agent log and notice that this is due to
    ovn-metadata-agent taking a very long time to process [1].

  Possible solutions:
  1. (Low-hanging fruit?) See if there is a way to improve the execution
     time of the `ip.add` call. Perhaps passing a list of CIDRs instead
     of a single CIDR at a time can improve performance?
  2. (More involved) Refactor the code so that ovn-metadata-agent only
     adds the single CIDR that belongs to the VM being created, instead
     of unconditionally adding all CIDRs for the network when the first
     VM is created (the current implementation).

  [1] https://github.com/openstack/neutron/blob/41bf8054017c72815226d5df50fd321b30fcba13/neutron/agent/ovn/metadata/agent.py#L488-L495
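
The "lazy load" idea from the commit message (and possible solution 2 in the bug description) can be illustrated with a minimal sketch. This is not the code from the merged patch: the `Port` class and the `cidrs_for_active_ports` function are hypothetical names made up for this illustration, and the sketch only uses Python's standard `ipaddress` module. It shows how the set of CIDRs to configure could be narrowed to the subnets that actually contain an address of a port active on the chassis.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    """Hypothetical stand-in for a VIF port bound to this chassis."""
    name: str
    ip_addresses: tuple


def cidrs_for_active_ports(subnet_cidrs, ports):
    """Return only the subnet CIDRs that contain an address of a given port.

    Illustrative only: the real agent reads ports and subnets from OVN,
    which is not modelled here.
    """
    networks = [ipaddress.ip_network(cidr) for cidr in subnet_cidrs]
    needed = set()
    for port in ports:
        for addr in port.ip_addresses:
            ip = ipaddress.ip_address(addr)
            for net in networks:
                # Compare only within the same IP family (IPv4 vs IPv6).
                if ip.version == net.version and ip in net:
                    needed.add(str(net))
    return sorted(needed)


# A network with 1700 /24 subnets, as in the bug report's observation.
subnets = [f"10.{i // 256}.{i % 256}.0/24" for i in range(1700)]
first_vm = Port(name="vm1", ip_addresses=("10.0.5.7",))

# Only the subnet of the first VM needs an address in the namespace.
print(cidrs_for_active_ports(subnets, [first_vm]))
```

With this approach the number of addresses added to the namespace grows with the number of active ports on the chassis rather than with the number of subnets in the network, which is the behaviour the commit message describes.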

