[Yahoo-eng-team] [Bug 2086691] [NEW] nova-novncproxy is printing symbols to journal as I open more than 1 VNC console
Public bug reported:

Description
===========

I am getting this kind of output in the journal. As I use the VNC console, the journal fills up with these logs, consuming space with unnecessary symbols.

Logs & Configs
==============

Here's a snippet of it:

Nov 05 08:10:39 226host1 nova-novncproxy[2456742]: <}<}<{<}>{<
Nov 05 08:10:39 226host1 nova-novncproxy[2456744]: <
Nov 05 08:10:39 226host1 nova-novncproxy[2456742]: <}>>}>
Nov 05 08:10:39 226host1 nova-novncproxy[2456744]: <<
Nov 05 08:10:39 226host1 nova-novncproxy[2456742]: <{<<{<}>}>}>{<
Nov 05 08:10:39 226host1 nova-novncproxy[2456744]: <<{<
Nov 05 08:10:40 226host1 nova-novncproxy[2456742]: <<}>}>}>}>}>}>{<
Nov 05 08:10:40 226host1 nova-novncproxy[2456744]: <>{<
Nov 05 08:10:40 226host1 nova-novncproxy[2456742]: <>
Nov 05 08:10:41 226host1 nova-novncproxy[2456744]: <}>}>>{<{<{<}}>>}>}>}>
Nov 05 08:10:52 226host1 nova-novncproxy[2456742]: <>{<}>{<}>}>}>}>{<}>{<}>{<}>}>}>{<{<}>{<}>{<}>{<}>{<}}>>}>}>{<}>{<}>{<}>}>{<{<}>{<}>{<}>{<}>{<}>{<}>{<}>{}<>}>}>{<}>}>{<{<}>{<}>{<}>}>}>}>{<}>{<}>{<}>{<}>{<}>}>}>}>}>}>}>}>}>}>{<}>}>}>{{<<{{<<{{<<}>}}>}>}>}>}>>}>}>}>}>}>}}>>}>}>{<}>{<}>}>
Nov 05 08:11:02 226host1 nova-novncproxy[2456744]: <>}>}>}>{<}>{<}>{<{<}>{<}>{<{{<<{{<<{{<<}>}>}>{<}>{<}>{<}>{<}>{<}>}>{<{<}>{<}>{<}>{<}>{<}>}>}>}>}>}>{<{<}>{<}>{<}>{<}>{<}>{<}>{<}>{<}>}>}>}>}>}>}>}>}>}>}>}>}}>>}>}>}>}>}>}>}>}>
Nov 05 08:11:02 226host1 nova-novncproxy[2456742]: <<
Nov 05 08:11:02 226host1 nova-novncproxy[2456744]: <<
Nov 05 08:11:02 226host1 nova-novncproxy[2456742]: <>
Nov 05 08:12:04 226host1 nova-novncproxy[2456744]: <>}>}>}>{<}>{<}>{<}>{<}>{<{<}>{<}>{<}>{<}>}>}>}>{<}>{<}>}>}}>>}>}>}>}>}>
Nov 05 08:13:04 226host1 nova-novncproxy[2456742]: <<}>
Nov 05 08:13:06 226host1 nova-novncproxy[2456744]: <<}>
Nov 05 08:14:05 226host1 nova-novncproxy[2456742]: <<}>{<}>
Nov 05 08:15:03 226host1 nova-novncproxy[2456744]: <<}>
Nov 05 08:15:05 226host1 nova-novncproxy[2456742]: <<}>
Nov 05 08:16:04 226host1 nova-novncproxy[2456744]: <<}>
Nov 05 08:16:06 226host1 nova-novncproxy[2456742]: <<}>
Nov 05 08:17:04 226host1 nova-novncproxy[2456744]: <<}>{<}>
Nov 05 08:17:20 226host1 nova-novncproxy[2456742]: <<}>
Nov 05 08:23:31 226host1 nova-novncproxy[2456742]: <<}>}>}>}>}>{<}>{<}>{<}>}>}>}>{<}>{<}>{<}>}>{<}>}>{<}>}>}>{<}>{<}>{<}>{<}>{<}}>>}>}>}>}>{<}>{<}>{<}>{<}}>>}>}>{<}>{<}>}>}>}>}>}>}>}>}>{<}>{<}>{<}>{<}>{<}>{<}>{<}>{<}>{<}>{<}>{<}>{<}>{<}>{<}}>>}>}>}>{<}>}>}>}>}>{<}>{<}>}>}>}>}}>>}>}>}>}>}>}>}>{<}>{<}>{<}>{<}>{<}>{<}>{<}>{<}>{<}>}>}>}>}>}>}>}>}>}>}>}>}>}>{<}>}>}>}>}>{<}>{<}>}>{<}>{<}>{<}>

Environment
===========

1. OpenStack version:

$ rpm -qa | grep nova
openstack-nova-common-24.1.1-1.el8.noarch
openstack-nova-conductor-24.1.1-1.el8.noarch
openstack-nova-scheduler-24.1.1-1.el8.noarch
python3-novaclient-17.6.0-1.el8.noarch
openstack-nova-novncproxy-24.1.1-1.el8.noarch
openstack-nova-compute-24.1.1-1.el8.noarch
python3-nova-24.1.1-1.el8.noarch
openstack-nova-api-24.1.1-1.el8.noarch

2. Hypervisor: Libvirt + KVM

$ libvirtd --version
libvirtd (libvirt) 8.0.0

$ rpm -qa | grep kvm
qemu-kvm-ui-spice-6.2.0-20.module_el8.7.0+1218+f626c2ff.1.x86_64
qemu-kvm-core-6.2.0-20.module_el8.7.0+1218+f626c2ff.1.x86_64
qemu-kvm-block-curl-6.2.0-20.module_el8.7.0+1218+f626c2ff.1.x86_64
qemu-kvm-hw-usbredir-6.2.0-20.module_el8.7.0+1218+f626c2ff.1.x86_64
qemu-kvm-common-6.2.0-20.module_el8.7.0+1218+f626c2ff.1.x86_64
qemu-kvm-docs-6.2.0-20.module_el8.7.0+1218+f626c2ff.1.x86_64
qemu-kvm-block-rbd-6.2.0-20.module_el8.7.0+1218+f626c2ff.1.x86_64
qemu-kvm-6.2.0-20.module_el8.7.0+1218+f626c2ff.1.x86_64
qemu-kvm-block-ssh-6.2.0-20.module_el8.7.0+1218+f626c2ff.1.x86_64
libvirt-daemon-kvm-8.0.0-10.module_el8.7.0+1218+f626c2ff.x86_64
qemu-kvm-block-gluster-6.2.0-20.module_el8.7.0+1218+f626c2ff.1.x86_64
qemu-kvm-block-iscsi-6.2.0-20.module_el8.7.0+1218+f626c2ff.1.x86_64
qemu-kvm-ui-opengl-6.2.0-20.module_el8.7.0+1218+f626c2ff.1.x86_64

3. Storage: SAN storage connected through iSCSI and managed with LVM

$ lvm version
LVM version: 2.03.14(2)-RHEL8 (2021-10-20)
Library version: 1.02.181-RHEL8 (2021-10-20)
Driver version: 4.46.0

$ iscsiadm --version
iscsiadm version 6.2.1.4-1

Steps to reproduce
==================

- Launch an OpenStack deployment on CentOS Stream 8.
- Upload a custom Windows image in .raw format with the latest spice guest tools and cloud-init.
- Create a VM and open the VM console from the Horizon dashboard.

Expected result
===============

No random output in the journal.

Actual result
=============

Random output appears in the journal logs when running "journalctl -u openstack-nova-novncproxy.service -f". I figured it's only the STDOUT, since setting StandardOutput=null in the service file stopped it.

** Affects: nova
   Importance: Undecided
       Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2086691

Title: nova-novncproxy is printing symbols to journal as I open more than 1 VNC console
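[Editor's note] A minimal sketch of the stdout-suppression workaround the reporter mentions above, expressed as a systemd drop-in rather than an edit to the packaged unit file. The drop-in path and file name are assumptions, and note this silences all stdout from the service, not just the stray traffic markers:

# /etc/systemd/system/openstack-nova-novncproxy.service.d/quiet-stdout.conf
# Hypothetical drop-in reflecting the reporter's workaround: discard stdout
# (the symbol spam) while leaving stderr and normal logging untouched.
[Service]
StandardOutput=null

Apply it with "systemctl daemon-reload" followed by a restart of openstack-nova-novncproxy.service.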
[Yahoo-eng-team] [Bug 2072154] Re: Port status goes BUILD when migrating non-sriov instance in sriov setting.
** Also affects: cloud-archive
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/caracal
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/yoga
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/epoxy
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/zed
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/bobcat
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/antelope
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/dalmation
   Importance: Undecided
       Status: New

** No longer affects: neutron (Ubuntu Focal)

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2072154

Title: Port status goes BUILD when migrating non-sriov instance in sriov setting.

Status in Ubuntu Cloud Archive: New
Status in Ubuntu Cloud Archive antelope series: New
Status in Ubuntu Cloud Archive bobcat series: New
Status in Ubuntu Cloud Archive caracal series: New
Status in Ubuntu Cloud Archive dalmation series: New
Status in Ubuntu Cloud Archive epoxy series: New
Status in Ubuntu Cloud Archive yoga series: New
Status in Ubuntu Cloud Archive zed series: New
Status in neutron: Fix Released
Status in neutron package in Ubuntu: In Progress
Status in neutron source package in Jammy: In Progress
Status in neutron source package in Noble: In Progress
Status in neutron source package in Oracular: In Progress

Bug description:

[ Impact ]

Port status goes BUILD when migrating a non-SR-IOV instance in an SR-IOV setting.

[ Test Plan ]

1. Deploy OpenStack using Juju & Charms (upstream also has the same code)
2. Enable SR-IOV
3. Create a VM without an SR-IOV NIC (test)
4. Migrate it to another host:
   openstack server migrate --live-migration --os-compute-api-version 2.30 --host node-04.maas test
5. Check the port status - https://paste.ubuntu.com/p/RKGnP76MvB/

[ Where problems could occur ]

This patch is related to the SR-IOV agent. It adds a check of whether a port is an SR-IOV port or not, so it is possible that an SR-IOV port could be handled improperly.

[ Other Info ]

nova-compute runs both neutron-sriov-nic-agent and neutron-ovn-metadata-agent. So far, I've checked that ovn_monitor changes the port to ACTIVE but sriov-nic-agent changes it back to BUILD by calling _get_new_status:

./plugins/ml2/drivers/mech_sriov/agent/sriov_nic_agent.py
binding_activate -> get_device_details_from_port_id -> get_device_details -> _get_new_status  <- this makes the status BUILD

As the running order is not fixed, the port sometimes ends up ACTIVE and sometimes BUILD.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/2072154/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to: yahoo-eng-team@lists.launchpad.net
Unsubscribe: https://launchpad.net/~yahoo-eng-team
More help: https://help.launchpad.net/ListHelp
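[Editor's note] To make the "check if the port is SR-IOV or not" idea above concrete, here is a hedged, illustrative Python sketch. It is not the actual Neutron patch; the helper names and the call site are assumptions, though vnic types of the "direct"/"macvtap" family are what the SR-IOV NIC agent normally binds:

    # Illustrative only: a guard of the shape the fix describes, so the SR-IOV
    # NIC agent leaves non-SR-IOV ports alone instead of flipping them back to
    # BUILD. Names are hypothetical, not Neutron's actual code.

    SRIOV_VNIC_TYPES = ("direct", "direct-physical", "macvtap")


    def is_sriov_port(port: dict) -> bool:
        """Ports the SR-IOV agent actually binds carry one of these vnic types."""
        return port.get("binding:vnic_type") in SRIOV_VNIC_TYPES


    def handle_port_update(port: dict, mark_build) -> None:
        if not is_sriov_port(port):
            return  # e.g. an OVN-bound virtio port being live-migrated
        mark_build(port["id"])  # only SR-IOV ports go through the agent's status flow

Without a guard like this, an event for a migrating non-SR-IOV port would race with the OVN monitor, which matches the ACTIVE/BUILD flapping described in the report.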
[Yahoo-eng-team] [Bug 2085543] Re: [OVN] Port device_owner is not set in the Trunk subport
Reviewed: https://review.opendev.org/c/openstack/neutron/+/933836
Committed: https://opendev.org/openstack/neutron/commit/c0bdb0c8a33286acb4d44ad865f309fc79b6
Submitter: "Zuul (22348)"
Branch: master

commit c0bdb0c8a33286acb4d44ad865f309fc79b6
Author: Rodolfo Alonso Hernandez
Date: Wed Oct 30 18:08:15 2024 +

    [OVN] Check LSP.up status before setting the port host info

    Before updating the Logical_Switch_Port host information, it is
    necessary to check the current status of the port. If it doesn't
    match the event calling this update, the host information is not
    updated.

    Closes-Bug: #2085543
    Change-Id: I92afb190375caf27c815f9fe1cb627e87c49d4ca

** Changed in: neutron
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2085543

Title: [OVN] Port device_owner is not set in the Trunk subport

Status in neutron: Fix Released

Bug description:

This issue was found in the test ``neutron_tempest_plugin.scenario.test_trunk.TrunkTest.test_trunk_subport_lifecycle``. The subport "463f7c45-3c06-4340-a509-e96b2faae525" [1] is created and then assigned to a trunk. When the port is assigned to the trunk, the device owner is not updated to "trunk:subport".

UPDATE: the description above is not accurate. The problem is that the port deactivation is executed before the port activation has finished. When the port is activated (the VM starts and binds the parent port and subports), the method ``set_port_status_up`` is called from the event ``LogicalSwitchPortUpdateUpEvent``. The problem is that the event actions are executed in a loop thread (``RowEventHandler.notify_loop``) that is not synchronous with the API call; the API call exits before ``set_port_status_up`` finishes. The tempest test checks that the subport is ACTIVE and proceeds to unbind it (remove it from the trunk). That removes the port device_owner and binding host. That's a problem because ``set_port_status_up`` is still being executed and needs the "older" values (device_owner="trunk:subport").

In a nutshell, this is a race condition because the OVN event processing is done asynchronously to the API call.

Logs: https://f918f4eca95000e5dd6c-6bcda3a769a6c31ee12f465dd60bb9a2.ssl.cf5.rackcdn.com/933210/3/check/neutron-tempest-plugin-ovn-10/43a1557/testr_results.html

[1] https://paste.opendev.org/show/bzmtiytDBKKkgi4IgZ15/

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2085543/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to: yahoo-eng-team@lists.launchpad.net
Unsubscribe: https://launchpad.net/~yahoo-eng-team
More help: https://help.launchpad.net/ListHelp
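[Editor's note] A hedged illustration of the guard the commit message describes (checking the LSP's current "up" state against the triggering event before writing host info). This is a simplified sketch under stated assumptions, not the merged Neutron change; the function and parameter names are hypothetical:

    # Simplified sketch of the "check LSP.up before updating host info" idea
    # from the commit message above. Names are assumptions, not Neutron's code.

    def maybe_update_host_info(lsp, event_is_up: bool, update_host_info) -> bool:
        """Apply the host-info update only if the port's current state matches
        the event that triggered it; otherwise treat the event as stale."""
        # OVSDB optional booleans are commonly exposed as a (possibly empty) list.
        current_up = bool(lsp.up[0]) if getattr(lsp, "up", None) else False
        if current_up != event_is_up:
            return False  # stale event, e.g. the port already went down again
        update_host_info(lsp)
        return True

The point is the early return: an "up" event processed after the port has already been torn down no longer overwrites the freshly cleared host information.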
[Yahoo-eng-team] [Bug 2086675] [NEW] Suspected performance regression for RBD back end linked to location sorting
Public bug reported:

Hi,

For the past few releases (Caracal, Bobcat, maybe earlier) we have noticed that listing images is particularly slow. I've finally had some time to dig into this, and I believe I've tracked it to https://review.opendev.org/c/openstack/glance/+/886811

We're running the RBD back end primarily, but also have HTTP and Cinder listed in 'enabled_backends', meaning that sort_image_locations executes in full.

It appears that when this function executes for the RBD back end, it causes a connection to be opened to the back end. When doing a full image list operation, this happens once for every image in the list (the connection is not re-used). This appears to carry a 20-30ms time penalty per image. As such, for any reasonable set of images the response ends up taking several seconds.

In our case, images are unlikely to be held in more than one back end at a time, and I noted that adding a length check to the locations list in https://github.com/openstack/glance/blob/master/glance/common/utils.py#L718 so that the sorting doesn't occur when the list has just one element resolves the performance issue entirely.

Whilst a length check is a workaround, does the sorting operation actually require connections to be opened to the RBD back end? If they are required, could the connections at least be re-used to avoid this time penalty growing linearly with the number of images held by Glance?

Thanks

** Affects: glance
   Importance: Undecided
       Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/2086675

Title: Suspected performance regression for RBD back end linked to location sorting

Status in Glance: New

Bug description:

Hi,

For the past few releases (Caracal, Bobcat, maybe earlier) we have noticed that listing images is particularly slow. I've finally had some time to dig into this, and I believe I've tracked it to https://review.opendev.org/c/openstack/glance/+/886811

We're running the RBD back end primarily, but also have HTTP and Cinder listed in 'enabled_backends', meaning that sort_image_locations executes in full.

It appears that when this function executes for the RBD back end, it causes a connection to be opened to the back end. When doing a full image list operation, this happens once for every image in the list (the connection is not re-used). This appears to carry a 20-30ms time penalty per image. As such, for any reasonable set of images the response ends up taking several seconds.

In our case, images are unlikely to be held in more than one back end at a time, and I noted that adding a length check to the locations list in https://github.com/openstack/glance/blob/master/glance/common/utils.py#L718 so that the sorting doesn't occur when the list has just one element resolves the performance issue entirely.

Whilst a length check is a workaround, does the sorting operation actually require connections to be opened to the RBD back end? If they are required, could the connections at least be re-used to avoid this time penalty growing linearly with the number of images held by Glance?

Thanks

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/2086675/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to: yahoo-eng-team@lists.launchpad.net
Unsubscribe: https://launchpad.net/~yahoo-eng-team
More help: https://help.launchpad.net/ListHelp
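[Editor's note] For clarity, a minimal sketch of the length-check workaround the reporter describes (skipping the sort when there is a single location). This is illustrative only, assuming a sort key that may open a store connection; it is not Glance's actual implementation:

    # Minimal sketch of the reporter's workaround, not Glance's actual code:
    # skip sorting (and the per-image store connections it can trigger) when a
    # single location makes ordering meaningless.

    def sort_image_locations(locations, store_weight):
        """store_weight(location) may be expensive (e.g. it opens an RBD connection)."""
        if len(locations) <= 1:
            return locations  # nothing to order; avoid touching the back end
        return sorted(locations, key=store_weight, reverse=True)

As the reporter notes, this only sidesteps the cost for single-location images; re-using or caching the store connection would be the more general fix.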
[Yahoo-eng-team] [Bug 2072154] Re: Port status goes BUILD when migrating non-sriov instance in sriov setting.
** Changed in: cloud-archive/epoxy
       Status: New => Fix Released

** Changed in: cloud-archive/dalmation
       Status: New => Fix Released

** Changed in: neutron (Ubuntu Oracular)
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2072154

Title: Port status goes BUILD when migrating non-sriov instance in sriov setting.

Status in Ubuntu Cloud Archive: Fix Released
Status in Ubuntu Cloud Archive antelope series: New
Status in Ubuntu Cloud Archive bobcat series: New
Status in Ubuntu Cloud Archive caracal series: New
Status in Ubuntu Cloud Archive dalmation series: Fix Released
Status in Ubuntu Cloud Archive epoxy series: Fix Released
Status in Ubuntu Cloud Archive yoga series: New
Status in Ubuntu Cloud Archive zed series: New
Status in neutron: Fix Released
Status in neutron package in Ubuntu: In Progress
Status in neutron source package in Jammy: In Progress
Status in neutron source package in Noble: In Progress
Status in neutron source package in Oracular: Fix Released

Bug description:

[ Impact ]

Port status goes BUILD when migrating a non-SR-IOV instance in an SR-IOV setting.

[ Test Plan ]

1. Deploy OpenStack using Juju & Charms (upstream also has the same code)
2. Enable SR-IOV
3. Create a VM without an SR-IOV NIC (test)
4. Migrate it to another host:
   openstack server migrate --live-migration --os-compute-api-version 2.30 --host node-04.maas test
5. Check the port status - https://paste.ubuntu.com/p/RKGnP76MvB/

[ Where problems could occur ]

This patch is related to the SR-IOV agent. It adds a check of whether a port is an SR-IOV port or not, so it is possible that an SR-IOV port could be handled improperly.

[ Other Info ]

nova-compute runs both neutron-sriov-nic-agent and neutron-ovn-metadata-agent. So far, I've checked that ovn_monitor changes the port to ACTIVE but sriov-nic-agent changes it back to BUILD by calling _get_new_status:

./plugins/ml2/drivers/mech_sriov/agent/sriov_nic_agent.py
binding_activate -> get_device_details_from_port_id -> get_device_details -> _get_new_status  <- this makes the status BUILD

As the running order is not fixed, the port sometimes ends up ACTIVE and sometimes BUILD.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/2072154/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to: yahoo-eng-team@lists.launchpad.net
Unsubscribe: https://launchpad.net/~yahoo-eng-team
More help: https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 2085946] Re: [OVN] Revision number registers must be filtered by resource ID and type
Reviewed: https://review.opendev.org/c/openstack/neutron/+/933752
Committed: https://opendev.org/openstack/neutron/commit/a298a37fe7ee41d25db02fdde36e134b01ef5d9a
Submitter: "Zuul (22348)"
Branch: master

commit a298a37fe7ee41d25db02fdde36e134b01ef5d9a
Author: Rodolfo Alonso Hernandez
Date: Wed Oct 30 00:58:16 2024 +

    [OVN] Fix the revision number retrieval method

    The "ovn_revision_numbers" table has a unique constraint that is a
    combination of the "resource_uuid" and the "resource_type". There is
    a case where the resource_uuid can be the same for two registers. A
    router interface will create a single Neutron DB register ("ports")
    but it will require two OVN DB registers ("Logical_Switch_Port" and
    "Logical_Router_Port"). In this case it is necessary to define the
    "resource_type" when retrieving the revision number. The exception
    "RevisionNumberNotDefined" will be thrown if only the "resource_uuid"
    is provided in this case.

    Closes-Bug: #2085946
    Change-Id: I12079de78773f7409503392d4791848aea90cb7b

** Changed in: neutron
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2085946

Title: [OVN] Revision number registers must be filtered by resource ID and type

Status in neutron: Fix Released

Bug description:

The OVN revision numbers have a multicolumn index: (resource_uuid, resource_type) [1], in particular because of the Neutron ports that belong to a router. A router interface is a single Neutron register ("ports"), but in OVN two registers are created: "Logical_Switch_Port" and "Logical_Router_Port". When retrieving an "ovn_revision_numbers" register from the Neutron database, it is necessary to provide both the resource_uuid and the resource_type [2].

[1] https://github.com/openstack/neutron/blob/febdfb5d8b1cf261c13b40e330d91a5bcb6c7642/neutron/db/models/ovn.py#L41-L46
[2] https://github.com/openstack/neutron/blob/febdfb5d8b1cf261c13b40e330d91a5bcb6c7642/neutron/db/ovn_revision_numbers_db.py#L159-L167

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2085946/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to: yahoo-eng-team@lists.launchpad.net
Unsubscribe: https://launchpad.net/~yahoo-eng-team
More help: https://help.launchpad.net/ListHelp
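[Editor's note] As an illustration of why the lookup must filter on both columns, here is a hedged SQLAlchemy-style sketch. The model is deliberately simplified (the real table has more columns) and the query helper is hypothetical, not Neutron's actual code:

    # Simplified sketch, not Neutron's implementation: with a composite unique
    # key on (resource_uuid, resource_type), filtering by UUID alone can match
    # two rows for a router interface (one LSP, one LRP), so the type must be
    # part of the query.
    import sqlalchemy as sa
    from sqlalchemy.orm import DeclarativeBase, Session


    class Base(DeclarativeBase):
        pass


    class OVNRevisionNumbers(Base):
        __tablename__ = "ovn_revision_numbers"
        resource_uuid = sa.Column(sa.String(36), primary_key=True)
        resource_type = sa.Column(sa.String(36), primary_key=True)
        revision_number = sa.Column(sa.BigInteger, nullable=False, default=0)


    def get_revision_row(session: Session, resource_uuid: str, resource_type: str):
        return (
            session.query(OVNRevisionNumbers)
            .filter_by(resource_uuid=resource_uuid, resource_type=resource_type)
            .one_or_none()
        )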
[Yahoo-eng-team] [Bug 2017748] Re: [SRU] OVN: ovnmeta namespaces missing during scalability test causing DHCP issues
Deleted the debdiffs previously submitted, as per @haleyb, since they need resubmitting now that the upstream backports have been done.

** Changed in: cloud-archive/yoga
       Status: Fix Committed => New

** Also affects: neutron (Ubuntu Noble)
   Importance: Undecided
       Status: New

** Also affects: neutron (Ubuntu Oracular)
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/epoxy
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/zed
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/bobcat
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/dalmation
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/antelope
   Importance: Undecided
       Status: New

** Also affects: cloud-archive/caracal
   Importance: Undecided
       Status: New

** Patch removed: "focal_yoga.debdiff"
   https://bugs.launchpad.net/cloud-archive/+bug/2017748/+attachment/5759481/+files/focal_yoga.debdiff

** Patch removed: "jammy.debdiff"
   https://bugs.launchpad.net/cloud-archive/+bug/2017748/+attachment/5778137/+files/jammy.debdiff

** Changed in: cloud-archive/yoga
     Assignee: Hua Zhang (zhhuabj) => (unassigned)

** Changed in: cloud-archive/epoxy
       Status: New => Fix Released

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2017748

Title: [SRU] OVN: ovnmeta namespaces missing during scalability test causing DHCP issues

Status in Ubuntu Cloud Archive: Fix Released
Status in Ubuntu Cloud Archive antelope series: New
Status in Ubuntu Cloud Archive bobcat series: New
Status in Ubuntu Cloud Archive caracal series: New
Status in Ubuntu Cloud Archive dalmation series: New
Status in Ubuntu Cloud Archive epoxy series: Fix Released
Status in Ubuntu Cloud Archive yoga series: New
Status in Ubuntu Cloud Archive zed series: New
Status in neutron: New
Status in neutron ussuri series: Fix Released
Status in neutron victoria series: New
Status in neutron wallaby series: New
Status in neutron xena series: New
Status in neutron package in Ubuntu: Fix Released
Status in neutron source package in Focal: New
Status in neutron source package in Jammy: New
Status in neutron source package in Noble: New
Status in neutron source package in Oracular: New

Bug description:

[Impact]

ovnmeta- namespaces are intermittently missing, so VMs cannot be reached.

[Test Case]

Not able to reproduce this easily, so I ran charmed-openstack-tester; the result is below:

== Totals ==
Ran: 469 tests in 4273.6309 sec.
- Passed: 398
- Skipped: 69
- Expected Fail: 0
- Unexpected Success: 0
- Failed: 2
Sum of execute time for each test: 4387.2727 sec.

The 2 failed tests (tempest.api.object_storage.test_account_quotas.AccountQuotasTest and octavia_tempest_plugin.tests.scenario.v2.test_traffic_ops.TrafficOperationsScenarioTest) are not related to the fix.

[Where problems could occur]

These patches are related to the OVN metadata agent on the compute node. VM connectivity could possibly be affected by this patch when OVN is used. Binding a port to a datapath could be affected.

[Others]

== ORIGINAL DESCRIPTION ==

Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=2187650

During a scalability test it was noted that a few VMs were having issues being pinged (2 out of ~5000 VMs in the test conducted). After some investigation it was found that the VMs in question did not receive a DHCP lease:

udhcpc: no lease, failing
FAIL
checking http://169.254.169.254/2009-04-04/instance-id
failed 1/20: up 181.90. request failed

And the ovnmeta- namespaces for the networks that the VMs were booting from were missing. Looking into the ovn-metadata-agent.log:

2023-04-18 06:56:09.864 353474 DEBUG neutron.agent.ovn.metadata.agent [-] There is no metadata port for network 9029c393-5c40-4bf2-beec-27413417eafa or it has no MAC or IP addresses configured, tearing the namespace down if needed _get_provision_params /usr/lib/python3.9/site-packages/neutron/agent/ovn/metadata/agent.py:495

Apparently, when the system is under stress (scalability tests) there are some edge cases where the metadata port information has not yet been propagated by OVN to the Southbound database, and when the PortBindingChassisEvent event is handled and tries to find either the metadata port or the IP information on it (which is updated by ML2/OVN during subnet creation), it cannot be found and fails silently with the error shown above.
[Yahoo-eng-team] [Bug 2017748] Re: [SRU] OVN: ovnmeta namespaces missing during scalability test causing DHCP issues
** Changed in: cloud-archive/dalmation
       Status: New => Fix Released

** Changed in: cloud-archive/caracal
       Status: New => Fix Released

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2017748

Title: [SRU] OVN: ovnmeta namespaces missing during scalability test causing DHCP issues

Status in Ubuntu Cloud Archive: Fix Released
Status in Ubuntu Cloud Archive antelope series: New
Status in Ubuntu Cloud Archive bobcat series: New
Status in Ubuntu Cloud Archive caracal series: Fix Released
Status in Ubuntu Cloud Archive dalmation series: Fix Released
Status in Ubuntu Cloud Archive epoxy series: Fix Released
Status in Ubuntu Cloud Archive yoga series: New
Status in Ubuntu Cloud Archive zed series: New
Status in neutron: New
Status in neutron ussuri series: Fix Released
Status in neutron victoria series: New
Status in neutron wallaby series: New
Status in neutron xena series: New
Status in neutron package in Ubuntu: Fix Released
Status in neutron source package in Focal: New
Status in neutron source package in Jammy: New
Status in neutron source package in Noble: New
Status in neutron source package in Oracular: New

Bug description:

[Impact]

ovnmeta- namespaces are intermittently missing, so VMs cannot be reached.

[Test Case]

Not able to reproduce this easily, so I ran charmed-openstack-tester; the result is below:

== Totals ==
Ran: 469 tests in 4273.6309 sec.
- Passed: 398
- Skipped: 69
- Expected Fail: 0
- Unexpected Success: 0
- Failed: 2
Sum of execute time for each test: 4387.2727 sec.

The 2 failed tests (tempest.api.object_storage.test_account_quotas.AccountQuotasTest and octavia_tempest_plugin.tests.scenario.v2.test_traffic_ops.TrafficOperationsScenarioTest) are not related to the fix.

[Where problems could occur]

These patches are related to the OVN metadata agent on the compute node. VM connectivity could possibly be affected by this patch when OVN is used. Binding a port to a datapath could be affected.

[Others]

== ORIGINAL DESCRIPTION ==

Reported at: https://bugzilla.redhat.com/show_bug.cgi?id=2187650

During a scalability test it was noted that a few VMs were having issues being pinged (2 out of ~5000 VMs in the test conducted). After some investigation it was found that the VMs in question did not receive a DHCP lease:

udhcpc: no lease, failing
FAIL
checking http://169.254.169.254/2009-04-04/instance-id
failed 1/20: up 181.90. request failed

And the ovnmeta- namespaces for the networks that the VMs were booting from were missing. Looking into the ovn-metadata-agent.log:

2023-04-18 06:56:09.864 353474 DEBUG neutron.agent.ovn.metadata.agent [-] There is no metadata port for network 9029c393-5c40-4bf2-beec-27413417eafa or it has no MAC or IP addresses configured, tearing the namespace down if needed _get_provision_params /usr/lib/python3.9/site-packages/neutron/agent/ovn/metadata/agent.py:495

Apparently, when the system is under stress (scalability tests) there are some edge cases where the metadata port information has not yet been propagated by OVN to the Southbound database, and when the PortBindingChassisEvent event is handled and tries to find either the metadata port or the IP information on it (which is updated by ML2/OVN during subnet creation), it cannot be found and fails silently with the error shown above.

Note that running the same tests with less concurrency did not trigger this issue, so it only happens when the system is overloaded.
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/2017748/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to: yahoo-eng-team@lists.launchpad.net
Unsubscribe: https://launchpad.net/~yahoo-eng-team
More help: https://help.launchpad.net/ListHelp
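[Editor's note] To illustrate the failure mode described in the bug above (the Southbound DB not yet showing the metadata port when the chassis event fires), here is a hedged retry sketch. It is illustrative only; the function names are assumptions and this is not the actual Neutron fix:

    # Illustrative only: a retry-on-propagation-lag sketch for the symptom
    # described above, where the metadata port is briefly invisible in the
    # Southbound DB under heavy load. Not the actual Neutron change.
    import time


    def get_metadata_port_with_retry(lookup_metadata_port, network_id,
                                     attempts=5, delay=2.0):
        """lookup_metadata_port(network_id) returns the port row or None."""
        for _ in range(attempts):
            port = lookup_metadata_port(network_id)
            if port is not None:
                return port
            # Propagation to the SB DB can lag during scale tests; wait and retry.
            time.sleep(delay)
        return None  # caller decides whether to tear the namespace down or log loudly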
[Yahoo-eng-team] [Bug 2086740] [NEW] Deadlock when metadata agent starts
Public bug reported:

There is a small chance that, if a port binding update event occurs right after the SB IDL has been instantiated but before the post_fork event is set, the event match function accesses the sb_idl attribute, which waits indefinitely.

** Affects: neutron
   Importance: Undecided
       Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/2086740

Title: Deadlock when metadata agent starts

Status in neutron: New

Bug description:

There is a small chance that, if a port binding update event occurs right after the SB IDL has been instantiated but before the post_fork event is set, the event match function accesses the sb_idl attribute, which waits indefinitely.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/2086740/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to: yahoo-eng-team@lists.launchpad.net
Unsubscribe: https://launchpad.net/~yahoo-eng-team
More help: https://help.launchpad.net/ListHelp
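[Editor's note] A hedged, self-contained Python sketch of the hazard described above: an attribute that blocks on a "ready" event, invoked from an OVSDB event thread before the main thread has set that event. The names mirror the description (sb_idl, post_fork) but the code is illustrative, not the agent's actual implementation:

    # Illustrative reproduction of the described deadlock shape, not Neutron's code.
    import threading


    class Agent:
        def __init__(self):
            self.post_fork_event = threading.Event()
            self._sb_idl = None

        @property
        def sb_idl(self):
            # Danger: called from the IDL event thread. If the event is never
            # set before this thread needs it, wait() blocks forever.
            self.post_fork_event.wait()
            return self._sb_idl

        def on_port_binding_update(self, row):
            # Event "match" logic touching self.sb_idl before start() completes
            # is what can hang. A guard like this avoids blocking:
            if not self.post_fork_event.is_set():
                return False  # ignore/defer the event until the agent is ready
            return self.sb_idl is not None

        def start(self):
            self._sb_idl = object()      # stand-in for the real SB IDL connection
            self.post_fork_event.set()   # only now is sb_idl safe to use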
[Yahoo-eng-team] [Bug 2086750] [NEW] VlanTransparencyTest.test_vlan_transparent_port_sec_disabled failed to create a server, n-api returns 500, "nova.exception.NovaException: Failed to access port" re
Public bug reported:

https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_92f/931993/3/check/neutron-tempest-plugin-ovn/92fc4e3/testr_results.html

2024-11-06 01:24:07,246 90683 INFO [tempest.lib.common.rest_client] Request (VlanTransparencyTest:test_vlan_transparent_port_sec_disabled): 500 POST https://158.69.77.94/compute/v2.1/servers 300.064s
2024-11-06 01:24:07,247 90683 DEBUG [tempest.lib.common.rest_client] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': ''}
    Body: {"server": {"flavorRef": "d6da27ed-2673-4a67-9d11-2163c67f61d3", "imageRef": "89db8c48-81a7-4dd2-9192-c41d4cddabbf", "key_name": "tempest-VlanTransparencyTest-1374679314", "networks": [{"port": "a68e5f79-48b1-41bf-9de3-e1e713380ef1"}], "name": "server-tempest-VlanTransparencyTest-1374679314-0", "security_groups": [{"name": "default"}]}}
    Response - Headers: {'date': 'Wed, 06 Nov 2024 01:19:07 GMT', 'server': 'Apache/2.4.52 (Ubuntu)', 'content-length': '610', 'content-type': 'text/html; charset=iso-8859-1', 'connection': 'close', 'status': '500', 'content-location': 'https://158.69.77.94/compute/v2.1/servers'}
    Body: b'\n\n500 Internal Server Error\n\nInternal Server Error\nThe server encountered an internal error or\nmisconfiguration and was unable to complete\nyour request.\nPlease contact the server administrator at \n webmaster@localhost to inform them of the time this error occurred,\n and the actions you performed just before this error.\nMore information about this error may be available\nin the server error log.\n\nApache/2.4.52 (Ubuntu) Server at 158.69.77.94 Port 80\n\n'
}}}

Traceback (most recent call last):
  File "/opt/stack/tempest/.tox/tempest/lib/python3.10/site-packages/neutron_tempest_plugin/scenario/test_vlan_transparency.py", line 178, in test_vlan_transparent_port_sec_disabled
    self._test_basic_vlan_transparency_connectivity(
  File "/opt/stack/tempest/.tox/tempest/lib/python3.10/site-packages/neutron_tempest_plugin/scenario/test_vlan_transparency.py", line 133, in _test_basic_vlan_transparency_connectivity
    vms.append(self._create_port_and_server(
  File "/opt/stack/tempest/.tox/tempest/lib/python3.10/site-packages/neutron_tempest_plugin/scenario/test_vlan_transparency.py", line 86, in _create_port_and_server
    return self.create_server(flavor_ref=self.flavor_ref,
  File "/opt/stack/tempest/.tox/tempest/lib/python3.10/site-packages/neutron_tempest_plugin/scenario/base.py", line 135, in create_server
    server = client.create_server(
  File "/opt/stack/tempest/tempest/lib/services/compute/servers_client.py", line 119, in create_server
    resp, body = self.post('servers', post_body)
  File "/opt/stack/tempest/tempest/lib/common/rest_client.py", line 314, in post
    resp_header, resp_body = self.request(
  File "/opt/stack/tempest/tempest/lib/services/compute/base_compute_client.py", line 47, in request
    resp, resp_body = super(BaseComputeClient, self).request(
  File "/opt/stack/tempest/tempest/lib/common/rest_client.py", line 762, in request
    self._error_checker(resp, resp_body)
  File "/opt/stack/tempest/tempest/lib/common/rest_client.py", line 856, in _error_checker
    raise exceptions.UnexpectedContentType(str(resp.status),
tempest.lib.exceptions.UnexpectedContentType: Unexpected content type provided
Details: 500

In n-api log:
https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_92f/931993/3/check/neutron-tempest-plugin-ovn/92fc4e3/controller/logs/screen-n-api.txt

Nov 06 01:24:07.353465 np0038980715 devstack@n-api.service[59370]: DEBUG neutronclient.v2_0.client [None req-eb975fa0-ecd2-44dc-968c-b905124050d0 tempest-VlanTransparencyTest-258921115 tempest-VlanTransparencyTest-258921115-project-member] Error message:
Nov 06 01:24:07.353465 np0038980715 devstack@n-api.service[59370]: 500 Internal Server Error
Nov 06 01:24:07.353465 np0038980715 devstack@n-api.service[59370]: Internal Server Error
Nov 06 01:24:07.353465 np0038980715 devstack@n-api.service[59370]: The server encountered an internal error or
Nov 06 01:24:07.353465 np0038980715 devstack@n-api.service[59370]: misconfiguration and was unable to complete
Nov 06 01:24:07.353465 np0038980715 devstack@n-api.service[59370]: your request.
Nov 06 01:24:07.353465 np0038980715 devstack@n-api.service[59370]: Please contact the server administrator at
Nov 06 01:24:07.353465 np0038980715 devstack@n-api.service[59370]: webmaster@localhost to inform them of the time this error occurred,
Nov 06 01:24:07.353465 np0038980715 devstack@n-api.service[59370]: and the actions you performed just before this error.
Nov 06 01:24:07.353465 np0038980715 devstack@n-api.service[59370]: More information about this error may be a
[Yahoo-eng-team] [Bug 2077897] Re: Cannot deploy Ubuntu 24.10
** Changed in: ubuntu-power-systems
       Status: New => Fix Released

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/2077897

Title: Cannot deploy Ubuntu 24.10

Status in cloud-init: Unknown
Status in MAAS: Invalid
Status in The Ubuntu-power-systems project: Fix Released

Bug description:

Failed deployment when trying to install a POWER9 and POWER10 LPAR with Ubuntu Oracular Oriole 24.10. The following error appears after the Rebooting step:

```
[ OK ] Finished snapd.seeded.service - Wait until snapd is fully seeded.
[ OK ] Reached target multi-user.target - Multi-User System.
[ OK ] Reached target graphical.target - Graphical Interface.
       Starting cloud-final.service - Cloud-init: Final Stage...
       Starting systemd-update-utmp-runle…- Record Runlevel Change in UTMP...
[ OK ] Finished systemd-update-utmp-runle…e - Record Runlevel Change in UTMP.
[7.734835] cloud-init[1043]: 2024-08-26 13:24:42,056 - util.py[WARNING]: Can not apply stage final, no datasource found! Likely bad things to come!
[7.735125] cloud-init[1043]: Can not apply stage final, no datasource found! Likely bad things to come!
[7.735188] cloud-init[1043]:
[7.735314] cloud-init[1043]: Traceback (most recent call last):
[7.735432] cloud-init[1043]:   File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 630, in main_modules
[7.735550] cloud-init[1043]:     init.fetch(existing="trust")
[7.735706] cloud-init[1043]:   File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 552, in fetch
[7.735791] cloud-init[1043]:     return self._get_data_source(existing=existing)
[7.735843] cloud-init[1043]:
[7.735902] cloud-init[1043]:   File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 403, in _get_data_source
[7.735961] cloud-init[1043]:     raise e
[7.736025] cloud-init[1043]:   File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 390, in _get_data_source
[7.736135] cloud-init[1043]:     ds, dsname = sources.find_source(
[7.736227] cloud-init[1043]:
[7.736302] cloud-init[1043]:   File "/usr/lib/python3/dist-packages/cloudinit/sources/__init__.py", line 1063, in find_source
[7.736382] cloud-init[1043]:     raise DataSourceNotFoundException(msg)
[7.736486] cloud-init[1043]: cloudinit.sources.DataSourceNotFoundException: Did not find any data source, searched classes: ()
[7.736564] cloud-init[1043]:
[7.737018] sh[1789]: Completed socket interaction for boot stage final
[FAILED] Failed to start cloud-final.service - Cloud-init: Final Stage.
See 'systemctl status cloud-final.service' for details.
[ OK ] Reached target cloud-init.target - Cloud-init target.
```

Able to install the same partitions with Ubuntu 22.04 and 24.04 (via MAAS). Also able to boot Oracular 24.10 on the same partitions when installing via .ISO (not via MAAS).

Debian-based MAAS: 3.2.11

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/2077897/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to: yahoo-eng-team@lists.launchpad.net
Unsubscribe: https://launchpad.net/~yahoo-eng-team
More help: https://help.launchpad.net/ListHelp