We can confirm that this change fixes the issue on 2024.1. It has been tested in two different environments.

** Changed in: neutron
       Status: New => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2096802

Title:
  keepalived spawn failures in neutron-l3-agent

Status in neutron:
  Fix Released

Bug description:
  On ML2/OVS deployments with many Neutron L3 routers in HA mode, we can
  see the following kind of errors in neutron-l3-agent logs:

  2024-11-07 03:14:58.109 1289 ERROR neutron.agent.linux.external_process [-] keepalived for router with uuid d34b2bf3-878c-431d-946f-b8766555f5dc not found. The process should not have died
  2024-11-07 03:14:58.110 1289 WARNING neutron.agent.linux.external_process [-] Respawning keepalived for uuid d34b2bf3-878c-431d-946f-b8766555f5dc

  Only a small, random subset of all the routers is affected.

  This appears to be due to the presence of old PID files for
  keepalived. A stale PID file can make neutron-l3-agent fail to detect
  that keepalived has not yet been started for a specific router, if
  another keepalived process (for another router) has already been
  started using the same PID.

  I suspect that change
  https://review.opendev.org/c/openstack/neutron/+/895832 might be the
  source of the issue (introduced in Caracal but backported to
  Antelope).

  A workaround is to delete all the PID files before restarting
  neutron-l3-agent, which is being proposed in kolla-ansible:
  https://review.opendev.org/c/openstack/kolla-ansible/+/934383

  It is probably easier for this bug to happen in a containerized
  environment because the PIDs start from 1 after each restart of the
  containers.

  Version: Seen with both 2023.1 and 2024.1 using recent code.
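
For illustration, the stale PID file problem described above can be sketched in a few lines of Python. This is not the neutron code and not the actual fix; it is a minimal sketch under several assumptions: PID files are named <router_id>.pid in a single directory (a hypothetical layout), the router UUID appears somewhere in the keepalived command line (for example in a config file path), and the host is Linux with /proc mounted. The helper names read_cmdline, pid_file_is_stale and remove_stale_pid_files are made up for this example.

```python
from pathlib import Path
from typing import Callable, List, Optional


def read_cmdline(pid: int) -> Optional[str]:
    """Return the command line of `pid` from /proc, or None if no such process (Linux only)."""
    try:
        raw = Path(f"/proc/{pid}/cmdline").read_bytes()
    except OSError:
        return None
    # Arguments in /proc/<pid>/cmdline are separated by NUL bytes.
    return raw.replace(b"\0", b" ").decode(errors="replace").strip()


def pid_file_is_stale(
    pid_file: Path,
    router_id: str,
    cmdline_of: Callable[[int], Optional[str]] = read_cmdline,
) -> bool:
    """True if the PID file does not point at a live process belonging to this router."""
    try:
        pid = int(pid_file.read_text().strip())
    except (OSError, ValueError):
        return True  # missing or unparsable PID file: nothing to trust
    cmdline = cmdline_of(pid)
    # Checking only "is this PID alive?" is not enough: after a container
    # restart the same PID may belong to keepalived for a *different* router.
    # Assumption: the router id shows up in the process command line.
    return cmdline is None or router_id not in cmdline


def remove_stale_pid_files(
    pid_dir: Path,
    cmdline_of: Callable[[int], Optional[str]] = read_cmdline,
) -> List[Path]:
    """Delete stale PID files in `pid_dir` (hypothetical <router_id>.pid layout)."""
    removed = []
    for pid_file in sorted(pid_dir.glob("*.pid")):
        router_id = pid_file.stem  # file name without the ".pid" suffix
        if pid_file_is_stale(pid_file, router_id, cmdline_of):
            pid_file.unlink()
            removed.append(pid_file)
    return removed
```

The workaround proposed for kolla-ansible is simpler than this sketch: it deletes all PID files before neutron-l3-agent starts. The per-file check above only shows why a reused PID can be mistaken for a running keepalived when the owner of the PID is not verified.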