[Yahoo-eng-team] [Bug 1503708] [NEW] InstanceV2 backports to V1 lack a context
Public bug reported:

When we convert a V2 instance to a V1 instance, we don't provide it a
context, which could, under some circumstances, cause a failure to
lazy-load things we need to construct the older instance.

** Affects: nova
   Importance: High
   Assignee: Dan Smith (danms)
   Status: In Progress

https://bugs.launchpad.net/bugs/1503708
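As a rough, self-contained illustration of the failure mode (toy classes only, not Nova's actual object code): the backported copy has to inherit the request context, otherwise any field the newer object never loaded cannot be lazy-loaded while building the older version.

    class TinyInstance(object):
        """Toy stand-in for a versioned object that lazy-loads missing fields."""
        def __init__(self, context=None, **fields):
            self._context = context
            self.fields = fields

        def lazy_load(self, name):
            if self._context is None:
                raise RuntimeError("orphaned object: cannot lazy-load %r" % name)
            self.fields[name] = "loaded via %s" % self._context
            return self.fields[name]

    def backport_to_v1(v2_obj):
        # The point of the report: the V1 copy must carry the V2 context,
        # otherwise anything not already set on the V2 object is unreachable.
        return TinyInstance(context=v2_obj._context, **v2_obj.fields)

    v1 = backport_to_v1(TinyInstance(context="req-ctx", uuid="abc"))
    print(v1.lazy_load("metadata"))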
[Yahoo-eng-team] [Bug 1506089] [NEW] Nova incorrectly calculates service version
Public bug reported:

Nova will incorrectly calculate the service version from the database,
resulting in improper upgrade decisions like automatic compute rpc
version pinning.

For a dump that looks like this:

2015-10-13 23:53:15.824 | created_at  updated_at  deleted_at  id  host  binary  topic  report_count  disabled  deleted  disabled_reason  last_seen_up  forced_down  version
2015-10-13 23:53:15.824 | 2015-10-13 23:42:34  2015-10-13 23:50:39  NULL  1  devstack-trusty-hpcloud-b2-5398906  nova-conductor  conductor  49  0  0  NULL  2015-10-13 23:50:39  0  2
2015-10-13 23:53:15.824 | 2015-10-13 23:42:34  2015-10-13 23:50:39  NULL  2  devstack-trusty-hpcloud-b2-5398906  nova-cert  cert  49  0  0  NULL  2015-10-13 23:50:39  0  2
2015-10-13 23:53:15.824 | 2015-10-13 23:42:34  2015-10-13 23:50:39  NULL  3  devstack-trusty-hpcloud-b2-5398906  nova-scheduler  scheduler  49  0  0  NULL  2015-10-13 23:50:39  0  2
2015-10-13 23:53:15.824 | 2015-10-13 23:42:34  2015-10-13 23:50:40  NULL  4  devstack-trusty-hpcloud-b2-5398906  nova-compute  compute  49  0  0  NULL  2015-10-13 23:50:40  0  2
2015-10-13 23:53:15.824 | 2015-10-13 23:42:44  2015-10-13 23:50:39  NULL  5  devstack-trusty-hpcloud-b2-5398906  nova-network  network  48  0  0  NULL  2015-10-13 23:50:39  0  2

Where all versions are 2, this is displayed in logs that load the
compute rpcapi module:

2015-10-13 23:56:05.149 INFO nova.compute.rpcapi [req-d3601f93-73a2-4427-91d0-bb5964002592 None None] Automatically selected compute RPC version 4.0 from minimum service version 0

Which is clearly wrong (the minimum service_version should be 2, not 0).

** Affects: nova
   Importance: Medium
   Assignee: Dan Smith (danms)
   Status: In Progress

https://bugs.launchpad.net/bugs/1506089
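The intended calculation is simple enough to show with a toy snippet (illustrative only, not the actual Nova query): the deployment-wide value should be the minimum of the per-service version column, which for the dump above is 2, not 0.

    def minimum_service_version(rows):
        # Ignore rows that carry no version at all rather than letting them
        # drag the minimum down to 0.
        versions = [r["version"] for r in rows if r.get("version") is not None]
        return min(versions) if versions else 0

    rows = [
        {"binary": "nova-conductor", "version": 2},
        {"binary": "nova-compute", "version": 2},
        {"binary": "nova-network", "version": 2},
    ]
    print(minimum_service_version(rows))  # 2 -- the bug reports 0 instead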
[Yahoo-eng-team] [Bug 1351020] [NEW] FloatingIP fails to load from database when not associated
Public bug reported:

A FloatingIP can be unassociated with a FixedIP, which will cause its
fixed_ip field in the database model to be None. Currently, FloatingIP's
_from_db_object() method always assumes it is non-None and thus tries to
load a FixedIP from None, which fails.

** Affects: nova
   Importance: Undecided
   Assignee: Dan Smith (danms)
   Status: In Progress

https://bugs.launchpad.net/bugs/1351020
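A minimal sketch of the kind of guard the report implies, using a plain dict in place of the real _from_db_object() machinery (names here are illustrative):

    def floating_ip_from_db(db_row):
        # Only hydrate a FixedIP when the joined column actually returned one;
        # an unassociated floating IP simply keeps fixed_ip=None.
        fixed = db_row.get("fixed_ip")
        return {
            "address": db_row["address"],
            "fixed_ip": {"address": fixed["address"]} if fixed is not None else None,
        }

    print(floating_ip_from_db({"address": "172.24.4.10", "fixed_ip": None}))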
[Yahoo-eng-team] [Bug 1360320] [NEW] Unit tests fail in handle_schedule_error with wrong instance
Public bug reported:

From http://logs.openstack.org/70/113270/3/check/gate-nova-python26/038b3fa/console.html:

2014-08-21 20:08:33.507 | Traceback (most recent call last):
2014-08-21 20:08:33.507 |   File "nova/tests/conductor/test_conductor.py", line 1343, in test_build_instances_scheduler_failure
2014-08-21 20:08:33.507 |     legacy_bdm=False)
2014-08-21 20:08:33.507 |   File "nova/conductor/rpcapi.py", line 415, in build_instances
2014-08-21 20:08:33.507 |     cctxt.cast(context, 'build_instances', **kw)
2014-08-21 20:08:33.508 |   File "/home/jenkins/workspace/gate-nova-python26/.tox/py26/lib/python2.6/site-packages/oslo/messaging/rpc/client.py", line 152, in call
2014-08-21 20:08:33.508 |     retry=self.retry)
2014-08-21 20:08:33.508 |   File "/home/jenkins/workspace/gate-nova-python26/.tox/py26/lib/python2.6/site-packages/oslo/messaging/transport.py", line 90, in _send
2014-08-21 20:08:33.508 |     timeout=timeout, retry=retry)
2014-08-21 20:08:33.508 |   File "/home/jenkins/workspace/gate-nova-python26/.tox/py26/lib/python2.6/site-packages/oslo/messaging/_drivers/impl_fake.py", line 194, in send
2014-08-21 20:08:33.508 |     return self._send(target, ctxt, message, wait_for_reply, timeout)
2014-08-21 20:08:33.508 |   File "/home/jenkins/workspace/gate-nova-python26/.tox/py26/lib/python2.6/site-packages/oslo/messaging/_drivers/impl_fake.py", line 181, in _send
2014-08-21 20:08:33.508 |     raise failure
2014-08-21 20:08:33.509 | UnexpectedMethodCallError: Unexpected method call.  unexpected:-  expected:+
2014-08-21 20:15:52.443 | - handle_schedule_error.__call__(, NoValidHost(u'No valid host was found. fake-reason',), '8eb9d649-0985-43libvir: error : internal error could not initialize domain event timer
2014-08-21 20:16:46.065 | Exception TypeError: "'NoneType' object is not callable" in > ignored
2014-08-21 20:19:45.254 | 50-8946-570ce100534c', {'instance_properties': Instance(access_ip_v4=None,access_ip_v6=None,architecture=None,auto_disk_config=False,availability_zone=None,cell_name=None,cleaned=False,config_drive=None,created_at=1955-11-05T00:00:00Z,default_ephemeral_device=None,default_swap_device=None,deleted=False,deleted_at=None,disable_terminate=False,display_description=None,display_name=None,ephemeral_gb=0,ephemeral_key_uuid=None,fault=,host='fake-host',hostname=None,id=1,image_ref=None,info_cache=,instance_type_id=None,kernel_id=None,key_data=None,key_name=None,launch_index=None,launched_at=None,launched_on=None,locked=False,locked_by=None,memory_mb=None,metadata=,node=None,os_type=None,pci_devices=,power_state=None,progress=None,project_id='fake-project',ramdisk_id=None,reservation_id=None,root_device_name=None,root_gb=0,scheduled_at=None,security_groups=,shutdown_terminate=False,system_metadata=,task_state=None,terminated_at=None,updated_at=None,user_data=None,user_id='fake-user',uuid=8eb9d649-0985-4350-8946-570ce100534c,vcpus=None,vm_mode=None,vm_state=None), 'fake': 'specs'}) -> None
2014-08-21 20:19:45.254 | + handle_schedule_error.__call__(, NoValidHost(u'No valid host was found. fake-reason',), '712006a3-7ca5-4350-8f7f-028a9e4c78b2', {'instance_properties': Instance(access_ip_v4=None,access_ip_v6=None,architecture=None,auto_disk_config=False,availability_zone=None,cell_name=None,cleaned=False,config_drive=None,created_at=1955-11-05T00:00:00Z,default_ephemeral_device=None,default_swap_device=None,deleted=False,deleted_at=None,disable_terminate=False,display_description=None,display_name=None,ephemeral_gb=0,ephemeral_key_uuid=None,fault=,host='fake-host',hostname=None,id=1,image_ref=None,info_cache=,instance_type_id=None,kernel_id=None,key_data=None,key_name=None,launch_index=None,launched_at=None,launched_on=None,locked=False,locked_by=None,memory_mb=None,metadata=,node=None,os_type=None,pci_devices=,power_state=None,progress=None,project_id='fake-project',ramdisk_id=None,reservation_id=None,root_device_name=None,root_gb=0,scheduled_at=None,security_groups=,shutdown_terminate=False,system_metadata=,task_state=None,terminated_at=None,updated_at=None,user_data=None,user_id='fake-user',uuid=8eb9d649-0985-4350-8946-570ce100534c,vcpus=None,vm_mode=None,vm_state=None), 'fake': 'specs'}) -> None

** Affects: nova
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1360320
[Yahoo-eng-team] [Bug 1360333] [NEW] Object hash test fails to detect changes when serialize_args is used
Public bug reported:

The object hash test will fail to detect method signature changes when
something like the serialize_args decorator is used. The test needs to
drill down until it finds the remotable level and do the calculation
there.

** Affects: nova
   Importance: Low
   Assignee: Dan Smith (danms)
   Status: Confirmed

** Tags: testing unified-objects

https://bugs.launchpad.net/bugs/1360333
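As a rough illustration of the "drill down" idea (toy decorator, not Nova's serialize_args or its actual hash test): the fingerprinting test needs to reach the innermost function before hashing its signature; functools.wraps-style decorators leave a __wrapped__ attribute that makes this possible.

    import functools
    import inspect

    def serialize_args(fn):
        # Stand-in for a decorator such as serialize_args; functools.wraps
        # records the original function in __wrapped__.
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            return fn(*args, **kwargs)
        return wrapper

    def innermost(fn):
        while hasattr(fn, "__wrapped__"):
            fn = fn.__wrapped__
        return fn

    @serialize_args
    def get_by_host(cls, context, host, use_slave=False):
        pass

    # Hash this signature, not the wrapper's, so real argument changes are seen.
    print(inspect.signature(innermost(get_by_host)))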
[Yahoo-eng-team] [Bug 1361683] [NEW] Instance pci_devices and security_groups refreshing can break backporting
Public bug reported:

In the Instance object, on a remotable operation such as save(), we
refresh the pci_devices and security_groups fields with the information
we get back from the database. Since this *replaces* the objects
currently attached to the instance object (which might be backlevel)
with current versions, an older client could get a failure upon
deserializing the result. We need to figure out some way to either
backport the results of remotable methods, or put matching backlevel
objects into the instance during the refresh in the first place.

** Affects: nova
   Importance: Medium
   Assignee: Dan Smith (danms)
   Status: Confirmed

** Tags: unified-objects

https://bugs.launchpad.net/bugs/1361683
[Yahoo-eng-team] [Bug 1155800] Re: Cannot delete / confirm / revert resize an instance if nova-compute crashes after VERIFY_RESIZE
This is super old, lots has changed since then, and several folks have
not been able to reproduce. Please re-open if this is still valid.

** Changed in: nova
   Importance: High => Undecided

** Changed in: nova
   Status: Triaged => Invalid

** Changed in: nova
   Assignee: Dan Smith (danms) => (unassigned)

https://bugs.launchpad.net/bugs/1155800

Title:
  Cannot delete / confirm / revert resize an instance if nova-compute
  crashes after VERIFY_RESIZE

Status in OpenStack Compute (Nova): Invalid

Bug description:
  How to reproduce the bug:

    nova boot ... vm1
    nova migrate vm1 (or resize)
    wait for the vm status to reach VERIFY_RESIZE
    stop nova-compute on the host where vm1 is running
    nova delete vm1
      Error: The server has either erred or is incapable of performing the requested operation. (HTTP 500) (Request-ID: req-be1379bc-6a5b-41f5-a554-60e02acfdb79)
    restart the nova-compute service quickly, before the status becomes "XXX" in "nova-manage service list"

  Note: the vm is still running on the hypervisor.

    nova show vm1
      VM status is still: VERIFY_RESIZE
    nova resize-confirm vm1
      ERROR: Cannot 'confirmResize' while instance is in task_state deleting (HTTP 409) (Request-ID: req-9660c776-ebc3-4397-a8e2-7ad83e8b6a0f)
    nova resize-revert vm1
      ERROR: Cannot 'revertResize' while instance is in task_state deleting (HTTP 409) (Request-ID: req-3cf0141b-ee3d-478f-8aa0-89091028a227)
    nova delete vm1
      The server has either erred or is incapable of performing the requested operation. (HTTP 500) (Request-ID: req-2cb17333-6cc9-42ca-baaa-da88ec90153f)

  nova-api log when running nova delete:
  http://paste.openstack.org/show/33783/

  Notes: Tests have been performed using the Hyper-V driver, but the
  issue seems to be unrelated to the driver. After stopping nova-compute
  and waiting long enough for the service to be marked as XXX in
  "nova-manage service list", issuing "nova delete vm1" succeeds.
[Yahoo-eng-team] [Bug 1370536] [NEW] DB migrations can go unchecked
Public bug reported:

Currently DB migrations can be added to the tree without the
corresponding migration tests. This is bad and means that we have some
that are untested in the tree already.

** Affects: nova
   Importance: Medium
   Assignee: Dan Smith (danms)
   Status: In Progress

** Tags: db

https://bugs.launchpad.net/bugs/1370536
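One way to catch this is sketched below, under the assumption that migration scripts are named NNN_description.py and that each should have a matching _check_NNN test method; that mirrors a common convention but is not necessarily Nova's exact mechanism.

    import os
    import re

    def untested_migrations(versions_dir, test_class):
        """Return migration scripts that have no matching _check_<number> method."""
        missing = []
        for name in sorted(os.listdir(versions_dir)):
            match = re.match(r"^(\d+)_.+\.py$", name)
            if match and not hasattr(test_class, "_check_%s" % match.group(1)):
                missing.append(name)
        return missing

    # A test can then assert that this list is empty, so a new migration
    # without a test fails the gate instead of landing silently.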
[Yahoo-eng-team] [Bug 1198142] Re: server fails to start after a "stop" action
I'm marking this as invalid given my last findings and the lack of any
response. We can reopen it if new details become available.

** Changed in: nova
   Status: Triaged => Invalid

https://bugs.launchpad.net/bugs/1198142

Title:
  server fails to start after a "stop" action

Status in OpenStack Compute (Nova): Invalid

Bug description:
  After a shutoff operation, the server fails to start, even though
  "Success: Start Instance" is reported in the GUI. The issue is
  reproducible in the CLI as well.

  Steps to reproduce:
  1. Stop the server using the command:
     # nova stop
  2. Start the server back after the server status shows "SHUTOFF":
     # nova start
[Yahoo-eng-team] [Bug 1201784] Re: Resize doesn't fail when the operation doesn't complete
This *is* by design because the call to start the resize is cast-ed
(like almost everything else) from the api node and returns immediately.
We don't know that it failed until potentially much later. I'm going to
mark this as invalid, but if I'm missing something, please reopen.

** Changed in: nova
   Status: New => Invalid

https://bugs.launchpad.net/bugs/1201784

Title:
  Resize doesn't fail when the operation doesn't complete

Status in OpenStack Compute (Nova): Invalid

Bug description:
  I've noticed nova resize doesn't fail on the client side when the
  server doesn't actually do the resize. 2 examples:

  * resizing to a flavor with too much RAM: the scheduler can't find a
    host, but the command line call succeeds, and the server state stays
    the same.
  * resizing a shutdown server, where nothing seems to be happening.

  Using devstack and latest master.
[Yahoo-eng-team] [Bug 1201873] Re: dnsmasq does not use -h, so /etc/hosts sends folks to loopback when they look up the machine it's running on
This sounds like a JuJu problem to me :) IMHO, /etc/hosts should not
redirect $HOSTNAME to anything other than a routable external interface
in a real environment with working DNS. Assuming your machine is not
called "localhost", I think that this is a configuration issue.

** Changed in: nova
   Status: New => Opinion

https://bugs.launchpad.net/bugs/1201873

Title:
  dnsmasq does not use -h, so /etc/hosts sends folks to loopback when
  they look up the machine it's running on

Status in OpenStack Compute (Nova): Opinion

Bug description:
  From dnsmasq(8):

    -h, --no-hosts
        Don't read the hostnames in /etc/hosts.

  I reliably get bit by this during certain kinds of deployments, where
  my nova-network/dns host has an entry in /etc/hosts such as:

    127.0.1.1  hostname.example.com  hostname

  I keep having to edit /etc/hosts on that machine to use a real IP,
  because juju gets really confused when it looks up certain openstack
  hostnames and gets sent to its own instance!
[Yahoo-eng-team] [Bug 1258256] [NEW] Live upgrade from Havana broken by commit 62e9829
Public bug reported:

Commit 62e9829 inadvertently broke live upgrades from Havana to master.
This was not really related to the patch itself, other than that it
bumped the Instance version, which uncovered a bunch of issues in the
object infrastructure that weren't yet ready to handle this properly.

** Affects: nova
   Importance: Medium
   Assignee: Dan Smith (danms)
   Status: Confirmed

** Tags: unified-objects

https://bugs.launchpad.net/bugs/1258256
[Yahoo-eng-team] [Bug 1265607] [NEW] Instance.refresh() sends new info_cache objects
Public bug reported:

If an older node does an Instance.refresh(), it will fail because
conductor will overwrite the info_cache field with a new
InstanceInfoCache object. This happens during the LifecycleEvent handler
in nova-compute.

** Affects: nova
   Importance: Undecided
   Assignee: Dan Smith (danms)
   Status: Confirmed

** Tags: unified-objects

https://bugs.launchpad.net/bugs/1265607
[Yahoo-eng-team] [Bug 1265618] [NEW] image_snapshot_pending state breaks havana nodes
Public bug reported:

Icehouse introduced a state called image_snapshot_pending which havana
nodes do not understand. If they call save with
expected_task_state="image_snapshot" they will crash on the new state.

2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File "/opt/upstack/nova-havana/local/lib/python2.7/site-packages/nova/compute/manager.py", line 2341, in _snapshot_instance
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp     update_task_state)
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File "/opt/upstack/nova-havana/local/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 1386, in snapshot
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp     update_task_state(task_state=task_states.IMAGE_PENDING_UPLOAD)
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File "/opt/upstack/nova-havana/local/lib/python2.7/site-packages/nova/compute/manager.py", line 2338, in update_task_state
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp     instance.save(expected_task_state=expected_state)
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File "/opt/upstack/nova-havana/local/lib/python2.7/site-packages/nova/objects/base.py", line 139, in wrapper
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp     ctxt, self, fn.__name__, args, kwargs)
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File "/opt/upstack/nova-havana/local/lib/python2.7/site-packages/nova/conductor/rpcapi.py", line 497, in object_action
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp     objmethod=objmethod, args=args, kwargs=kwargs)
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File "/opt/upstack/nova-havana/local/lib/python2.7/site-packages/nova/rpcclient.py", line 85, in call
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp     return self._invoke(self.proxy.call, ctxt, method, **kwargs)
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File "/opt/upstack/nova-havana/local/lib/python2.7/site-packages/nova/rpcclient.py", line 63, in _invoke
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp     return cast_or_call(ctxt, msg, **self.kwargs)
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File "/opt/upstack/nova-havana/local/lib/python2.7/site-packages/nova/openstack/common/rpc/proxy.py", line 126, in call
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp     result = rpc.call(context, real_topic, msg, timeout)
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File "/opt/upstack/nova-havana/local/lib/python2.7/site-packages/nova/openstack/common/rpc/__init__.py", line 139, in call
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp     return _get_impl().call(CONF, context, topic, msg, timeout)
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File "/opt/upstack/nova-havana/local/lib/python2.7/site-packages/nova/openstack/common/rpc/impl_kombu.py", line 816, in call
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp     rpc_amqp.get_connection_pool(conf, Connection))
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File "/opt/upstack/nova-havana/local/lib/python2.7/site-packages/nova/openstack/common/rpc/amqp.py", line 574, in call
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp     rv = list(rv)
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File "/opt/upstack/nova-havana/local/lib/python2.7/site-packages/nova/openstack/common/rpc/amqp.py", line 539, in __iter__
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp     raise result
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp UnexpectedTaskStateError_Remote: Unexpected task state: expecting (u'image_snapshot',) but the actual state is image_snapshot_pending
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp Traceback (most recent call last):
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File "/opt/upstack/nova/nova/conductor/manager.py", line 576, in _object_dispatch
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp     return getattr(target, method)(context, *args, **kwargs)
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File "/opt/upstack/nova/nova/objects/base.py", line 152, in wrapper
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp     return fn(self, ctxt, *args, **kwargs)
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File "/opt/upstack/nova/nova/objects/instance.py", line 459, in save
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp     columns_to_join=_expected_cols(expected_attrs))
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp
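The mechanism is easy to see in isolation. The toy check below (illustrative, not Nova's code) mirrors what the conductor-side save() does with expected_task_state, and why a Havana node that only knows 'image_snapshot' trips over the new state:

    class UnexpectedTaskStateError(Exception):
        pass

    def check_task_state(actual, expected):
        expected = expected if isinstance(expected, tuple) else (expected,)
        if actual not in expected:
            raise UnexpectedTaskStateError(
                "expecting %r but the actual state is %s" % (expected, actual))

    try:
        # The Havana node only knows 'image_snapshot'; the instance is already
        # in the new 'image_snapshot_pending' state set by an Icehouse node.
        check_task_state("image_snapshot_pending", ("image_snapshot",))
    except UnexpectedTaskStateError as e:
        print("save() fails: %s" % e)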
[Yahoo-eng-team] [Bug 981263] Re: Nova API should present deleted flavors (instance_types) in some cases
This was fixed at some point, probably after several recent changes, and
is no longer an issue according to the reporter.

** Changed in: nova
   Status: Triaged => Invalid

https://bugs.launchpad.net/bugs/981263

Title:
  Nova API should present deleted flavors (instance_types) in some cases

Status in OpenStack Compute (Nova): Invalid

Bug description:
  In certain cases Nova API should return instance flavors
  (instance_types) that are deleted. Notably, if there is an instance
  that is "active" and the flavor has been deleted, we need to pull the
  instance_type data down to ensure that we can apply network specifics
  attached to that instance_type on startup of nova-compute. The second
  case where a deleted flavor should be returned is if the instance_type
  is being requested by ID, as IDs should not be reused.

  This is important for Horizon to be able to properly retrieve
  "instances" for a given project (in Nova Dashboard and Syspanel
  Dashboard).

  Example traceback you can see if you delete a flavor and restart
  nova-compute:

  resource: 'NoneType' object is not subscriptable
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi Traceback (most recent call last):
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi   File "/opt/stack/nova/nova/api/openstack/wsgi.py", line 851, in _process_stack
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi     action_result = self.dispatch(meth, request, action_args)
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi   File "/opt/stack/nova/nova/api/openstack/wsgi.py", line 926, in dispatch
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi     return method(req=request, **action_args)
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi   File "/opt/stack/nova/nova/api/openstack/compute/servers.py", line 382, in detail
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi     servers = self._get_servers(req, is_detail=True)
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi   File "/opt/stack/nova/nova/api/openstack/compute/servers.py", line 465, in _get_servers
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi     return self._view_builder.detail(req, limited_list)
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi   File "/opt/stack/nova/nova/api/openstack/compute/views/servers.py", line 123, in detail
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi     return self._list_view(self.show, request, instances)
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi   File "/opt/stack/nova/nova/api/openstack/compute/views/servers.py", line 127, in _list_view
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi     server_list = [func(request, server)["server"] for server in servers]
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi   File "/opt/stack/nova/nova/api/openstack/compute/views/servers.py", line 61, in wrapped
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi     return func(self, request, instance)
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi   File "/opt/stack/nova/nova/api/openstack/compute/views/servers.py", line 97, in show
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi     "flavor": self._get_flavor(request, instance),
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi   File "/opt/stack/nova/nova/api/openstack/compute/views/servers.py", line 172, in _get_flavor
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi     flavor_id = instance["instance_type"]["flavorid"]
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi TypeError: 'NoneType' object is not subscriptable
[Yahoo-eng-team] [Bug 1089386] Re: destroying an instance not possible if a broken cinder volume is attached
Unable to reproduce, and the original submitter was unable to provide
more information.

** Changed in: nova
   Status: Incomplete => Invalid

https://bugs.launchpad.net/bugs/1089386

Title:
  destroying an instance not possible if a broken cinder volume is attached

Status in OpenStack Compute (Nova): Invalid

Bug description:
  I just managed to have an instance (in state shutoff) on nova with an
  attached cinder volume which is no longer available. It's not possible
  to destroy this instance; I got an exception in class ComputeManager,
  in method _shutdown_instance (file nova/compute/manager.py). The
  problem is the call to cinder to detach the volume, which will fail
  because the volume no longer exists. The exception (cinder's
  ClientException) is not handled in the try-except block and should be
  added.
[Yahoo-eng-team] [Bug 1119873] Re: nova-compute crashes if restarted with an instance in VERIFY_RESIZE state
** Changed in: nova
   Status: Incomplete => Invalid

https://bugs.launchpad.net/bugs/1119873

Title:
  nova-compute crashes if restarted with an instance in VERIFY_RESIZE state

Status in OpenStack Compute (Nova): Invalid

Bug description:
  Steps to reproduce the issue:

    boot a vm
    nova migrate vm1 (or nova resize)
    wait for the vm to reach the VERIFY_RESIZE state
    stop nova-compute (kill -9 or similar)
    restart nova-compute

  The process will terminate after a few seconds with the following
  error: http://paste.openstack.org/show/30836/

  The only workaround I found consists in changing the VM status in the
  database. "nova delete" before starting the service is not enough.
[Yahoo-eng-team] [Bug 1161538] Re: migrate fails with 'ProcessExecutionError'
The log shows you're out of space on a disk that is trying to get
something copied to it.

** Changed in: nova
   Status: New => Invalid

https://bugs.launchpad.net/bugs/1161538

Title:
  migrate fails with 'ProcessExecutionError'

Status in OpenStack Compute (Nova): Invalid

Bug description:
  I applied the latest build to my vs347 system in an attempt to verify
  a fix for Bug 1160489. I ended up hitting a different error:

  nova boot --image be8b6475-26d8-410f-aaa5-8b278d98c8f9 --flavor 1 MIGRATE1

  [root@vs347 ~]# nova show MIGRATE1
  | Property | Value |
  | status | BUILD |
  | updated | 2013-03-28T16:48:22Z |
  | OS-EXT-STS:task_state | networking |
  | OS-EXT-SRV-ATTR:host | vs342.rch.kstart.ibm.com |
  | key_name | None |
  | image | Rhel6MasterFile (be8b6475-26d8-410f-aaa5-8b278d98c8f9) |
  | hostId | 7ddf45b44e3e8078fa9401525a630083670fdf5a5792784c506a73f7 |
  | OS-EXT-STS:vm_state | building |
  | OS-EXT-SRV-ATTR:instance_name | bvt-instance-003d |
  | OS-EXT-SRV-ATTR:hypervisor_hostname | olyblade02.rch.stglabs.ibm.com |
  | flavor | m1.tiny (1) |
  | id | 29931d39-8d18-4a25-b733-59e894d94731 |
  | security_groups | [{u'name': u'default'}] |
  | user_id | 3ccf55fd609b45319f24fe681338886d |
  | name | MIGRATE1 |
  | created | 2013-03-28T16:48:20Z |
  | tenant_id | 67b1c37f4ca64283908c7077e9e59997 |
  | OS-DCF:diskConfig | MANUAL |
  | metadata | {} |
  | accessIPv4 | |
  | accessIPv6 | |
  | progress | 0 |
  | OS-EXT-STS:power_state | 0 |
  | OS-EXT-AZ:availability_zone | nova |
  | config_drive | |

  [root@vs347 ~]# nova list
  | ID | Name | Status | Networks |
  | 29931d39-8d18-4a25-b733-59e894d94731 | MIGRATE1 | ACTIVE | demonet=172.0.0.5 |

  [root@vs347 ~]# nova migrate MIGRATE1
  [root@vs347 ~]#

  * State as of 11:53 a.m.

  [root@vs347 ~]# nova list
  | ID | Name | Status | Networks |
  | 29931d39-8d18-4a25-b733-59e894d94731 | MIGRATE1 | RESIZE | demonet=172.0.0.5 |

  [root@vs347 ~]# nova list
  | ID | Name | Status | Networks |
  | 29931d39-8d18-4a25-b733-59e894d94731 | MIGRATE1 | ERROR | demonet=172.0.
[Yahoo-eng-team] [Bug 1161496] Re: Boot from volume will attach the VM to all networks
The OP realized this is a dupe.

** Changed in: nova
   Status: New => Invalid

https://bugs.launchpad.net/bugs/1161496

Title:
  Boot from volume will attach the VM to all networks

Status in OpenStack Compute (Nova): Invalid

Bug description:
  When launching a new instance with the option 'Boot from volume', the
  vm will be attached to all the networks available for the tenant. I'm
  launching the instance through Horizon and using Quantum for the
  network. I found a question related to this bug:
  https://answers.launchpad.net/nova/+question/217379
[Yahoo-eng-team] [Bug 1161709] Re: confirm-resize failed, after migration. "KeyError: 'old_instance_type_memory_mb'"
Yes, that's the fix I'm talking about. I'm going to mark this bug as
invalid since it has already been fixed.

** Changed in: nova
   Status: Incomplete => Invalid

https://bugs.launchpad.net/bugs/1161709

Title:
  confirm-resize failed, after migration. "KeyError: 'old_instance_type_memory_mb'"

Status in OpenStack Compute (Nova): Invalid

Bug description:
  confirm-resize failed after a migration with "KeyError:
  'old_instance_type_memory_mb'", because after a migration (not a
  resize) no "old_*" information exists in sys_meta (old_* and new_*
  both exist only after resize operations).

  ---

  2013-03-28 01:24:50.716 ERROR nova.api.openstack.compute.servers [req-cb15c1c5-3045-479e-a921-3f05a94c27be e9d9c977a94c4204b59192689347c126 e30341b47c714bf8b5f92b531cea9caf] Error in confirm-resize 'old_instance_type_memory_mb'
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers Traceback (most recent call last):
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers   File "/usr/lib/python2.6/site-packages/nova/api/openstack/compute/servers.py", line 1051, in _action_confirm_resize
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers     self.compute_api.confirm_resize(context, instance)
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers   File "/usr/lib/python2.6/site-packages/nova/compute/api.py", line 174, in wrapped
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers     return func(self, context, target, *args, **kwargs)
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers   File "/usr/lib/python2.6/site-packages/nova/compute/api.py", line 164, in inner
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers     return function(self, context, instance, *args, **kwargs)
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers   File "/usr/lib/python2.6/site-packages/nova/compute/api.py", line 145, in inner
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers     return f(self, context, instance, *args, **kw)
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers   File "/usr/lib/python2.6/site-packages/nova/compute/api.py", line 1868, in confirm_resize
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers     deltas = self._downsize_quota_delta(context, instance)
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers   File "/usr/lib/python2.6/site-packages/nova/compute/api.py", line 1948, in _downsize_quota_delta
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers     'old_')
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers   File "/usr/lib/python2.6/site-packages/nova/compute/instance_types.py", line 250, in extract_instance_type
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers     instance_type[key] = type_fn(sys_meta[type_key])
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers KeyError: 'old_instance_type_memory_mb'
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers
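A toy reconstruction of the failure (illustrative dicts, not the real system_metadata handling): a resize stores both old_* and new_* flavor keys, a bare migration does not, so extracting with the 'old_' prefix raises KeyError.

    def extract_flavor_memory(sys_meta, prefix=""):
        # Blindly indexing the prefixed key is what the traceback shows failing.
        return int(sys_meta["%sinstance_type_memory_mb" % prefix])

    resize_meta = {"instance_type_memory_mb": "512",
                   "old_instance_type_memory_mb": "256",
                   "new_instance_type_memory_mb": "512"}
    migrate_meta = {"instance_type_memory_mb": "512"}

    print(extract_flavor_memory(resize_meta, "old_"))   # 256, fine after a resize
    try:
        print(extract_flavor_memory(migrate_meta, "old_"))
    except KeyError as e:
        print("confirm-resize after migrate fails: KeyError %s" % e)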
[Yahoo-eng-team] [Bug 1165895] Re: image-create/snapshot image_state property/metadata always 'available'
** Changed in: nova
   Importance: Undecided => Wishlist

** Changed in: nova
   Status: New => Opinion

https://bugs.launchpad.net/bugs/1165895

Title:
  image-create/snapshot image_state property/metadata always 'available'

Status in OpenStack Compute (Nova): Opinion

Bug description:
  https://github.com/openstack/nova/blob/147eebe613d5d1756ce4f11066c62474eabb6076/nova/virt/libvirt/driver.py#L1113

  The 'image_state': 'available' property is added to every libvirt
  snapshot. I do not see the reason behind this "constant" property. A
  similar property was removed by:
  https://github.com/openstack/nova/commit/c3b7cce8101548428b64abb23ab88482bc79c36e

  Example glance output:

  glance image-show cd0bd937-e2b3-4e3e-b22f-3bdb58c63755
  | Property | Value |
  | Property 'base_image_ref' | d943f775-b228-4dde-b8e8-076a9fc60351 |
  | Property 'image_location' | snapshot |
  | Property 'image_state' | available |
  | Property 'image_type' | snapshot |
  | Property 'instance_type_ephemeral_gb' | 0 |
  | Property 'instance_type_flavorid' | 3 |
  | Property 'instance_type_id' | 1 |
  | Property 'instance_type_memory_mb' | 4096 |
  | Property 'instance_type_name' | m1.medium |
  | Property 'instance_type_root_gb' | 40 |
  | Property 'instance_type_rxtx_factor' | 1 |
  | Property 'instance_type_swap' | 0 |
  | Property 'instance_type_vcpu_weight' | None |
  | Property 'instance_type_vcpus' | 2 |
  | Property 'instance_uuid' | f2d9f28a-24a3-4068-8ee0-15f55122faef |
  | Property 'owner_id' | b8bc1464db39459d9c3f814b908ae079 |
  | Property 'user_id' | 695c21ac81c6499d851f3a560516f19c |
  | checksum | fca8b0fb9346ea0c4ea167a7a7d9ce45 |
  | container_format | bare |
  | created_at | 2013-04-07T21:13:33 |
  | deleted | False |
  | disk_format | qcow2 |
  | id | cd0bd937-e2b3-4e3e-b22f-3bdb58c63755 |
  | is_public | False |
  | min_disk | 0 |
  | min_ram | 0 |
  | name | snap_2Gb_urandom |
  | owner | b8bc1464db39459d9c3f814b908ae079 |
  | protected | False |
  | size | 2991521792 |
  | status | active |
  | updated_at | 2013-04-07T21:15:19 |

  nova image-show cd0bd937-e2b3-4e3e-b22f-3bdb58c63755
  | Property | Value |
  | metadata owner_id | b8bc1464db39459d9c3f814b908ae079 |
  | minDisk | 0 |
  | metadata instance_type_name | m1.medium |
  | metadata instance_type_swap | 0 |
  | metadata instance_type_memory_mb | 4096 |
  | id | cd0bd937-e2b3-4e3e-b22f-3bdb58c63755 |
  | metadata instance_type_rxtx_factor | 1 |
  | metadata image_state | available |
  | metadata image_location | snapshot |
  | minRam | 0
[Yahoo-eng-team] [Bug 1180618] Re: fault['message'] needs to be updated with exception message
I don't think this bug is valid. Isn't the problem just that you're
failing to schedule both times and ending up with the same error
message?

** Changed in: nova
   Status: In Progress => Invalid

https://bugs.launchpad.net/bugs/1180618

Title:
  fault['message'] needs to be updated with exception message

Status in OpenStack Compute (Nova): Invalid

Bug description:
  The current implementation of nova/compute/utils.py will not update
  the exception message thrown from the exception class. Here are the
  steps taken to produce the defect:

  1. Created a fake glance image:

     glance image-create --name=Fake_Image --is-public=true --container-format=ovf --min-ram=4000 --disk-format=raw < /mnt/download/test2.raw

     (test2.raw is only a txt file, not an image file)

  2. The image Fake_Image is shown:

     ubuntu@osee0221:/mnt/download$ nova image-list
     | ID | Name | Status | Server |
     | f474b249-e7fb-45de-adad-c5338fa53c53 | cirros-0.3.1-x86_64-uec | ACTIVE | |
     | 4664b408-ad40-4fba-9d71-20c217189090 | cirros-0.3.1-x86_64-uec-kernel | ACTIVE | |
     | bec25ebd-6543-4599-a8bf-c97b7ad3a649 | cirros-0.3.1-x86_64-uec-ramdisk | ACTIVE | |
     | 97bcfabc-5fab-4dd0-9d55-233613c0fdea | Fake_Image | ACTIVE | |

  3. Now boot that Fake_Image:

     ubuntu@osee0221:/mnt/download$ nova boot --flavor 3 --image Fake_Image 97bcfabc-5fab-4dd0-9d55-233613c0fdea
     | Property | Value |
     | OS-EXT-STS:task_state | scheduling |
     | image | Fake_Image |
     | OS-EXT-STS:vm_state | building |
     | OS-EXT-SRV-ATTR:instance_name | instance-0002 |
     | flavor | m1.medium |
     | id | bcae969f-ece0-4c20-8738-354fb3a7cf68 |
     | security_groups | [{u'name': u'default'}] |
     | user_id | 8a6aac216f3241bba8b6cfda8255 |
     | OS-DCF:diskConfig | MANUAL |
     | accessIPv4 | |
     | accessIPv6 | |
     | progress | 0 |
     | OS-EXT-STS:power_state | 0 |
     | OS-EXT-AZ:availability_zone | nova |
     | config_drive | |
     | status | BUILD |
     | updated | 2013-05-15T22:38:28Z |
     | hostId | |
     | OS-EXT-SRV-ATTR:host | None |
     | key_name | None |
     | OS-EXT-SRV-ATTR:hypervisor_hostname | None |
     | name | 97bcfabc-5fab-4dd0-9d55-233613c0fdea |
     | adminPass | cs4ULhBnb545 |
     | tenant_id | 97ba217a35a14b5aa09fefe9c95610c0 |
     | created | 2013-05-15T22:38:28Z |
     | metadata | {} |

  4. See the servers (in ERROR state):

     ubuntu@osee0221:/mnt/download$ nova list
     | ID | Name | Status | Networks |
     | 16c7fc43-8cab-48e4-be63-03c9305807d8 | 4664b408-ad40-4fba-9d71-20c217189090 | ACTIVE | private=10.0.0.2 |
     | bcae969f-ece0-4c20-8738-354fb3a7cf68 | 97bcfabc-5fab-4dd0-9d55-233613c0fdea | ERROR | private=10.0.0.3 |
[Yahoo-eng-team] [Bug 1932337] [NEW] Cinder store migration will fail if first GET'er is not the owner
Public bug reported:

During an upgrade to Xena, cinder-backed image locations are migrated to
include the store name in the URL field. This is lazily done on the
first GET of the image. The problem is that the first user to GET an
image after the migration may not be an admin or the owner of the image,
as would be the case for a public or shared image. If that happens, the
user gets a 404 for a valid image because the DB layer refuses the
modify operation. This is logged:

  2021-06-17 08:50:06,559 WARNING [glance.db.sqlalchemy.api] Attempted to modify image user did not own.

The lazy migration code needs to tolerate this and allow someone else to
perform the migration, without breaking non-owner GET operations until
the migration is complete.

** Affects: glance
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1932337
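A minimal sketch of the tolerance being asked for, with toy data and a hypothetical store name (this is not Glance's code or its actual location API): attempt the lazy rewrite, but if it is refused, serve the old location rather than surfacing an error to the reader.

    class Forbidden(Exception):
        """Stand-in for the DB layer refusing to modify an image you don't own."""

    def get_image_url(image, caller_is_owner):
        if "store=" not in image["url"]:
            new_url = image["url"] + "?store=cinder-fast"  # hypothetical store name
            try:
                if not caller_is_owner:
                    # The real refusal comes from the DB layer; simulated here.
                    raise Forbidden()
                image["url"] = new_url  # migration completed by owner/admin
            except Forbidden:
                pass  # leave the migration for a later privileged GET
        return image["url"]

    print(get_image_url({"url": "cinder://volume-id"}, caller_is_owner=False))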
[Yahoo-eng-team] [Bug 1933360] [NEW] Test (and enforcement?) for os_hidden mutability on queued images is wrong
Public bug reported:

The test
glance.tests.unit.v2.test_images_resource.TestImagesController.test_update_queued_image_with_hidden
seems to be looking to confirm that queued images cannot be marked as
hidden. However, if that were the case, it should be checking for
BadRequest (or similar) and not Forbidden. Currently it appears that the
"everything is immutable if not the owner" authorization layer is what
is triggering the Forbidden response.

If we want to assert that os_hidden cannot be modified for queued
images, we need to actually enforce that (as it does not appear to be
enforced anywhere). In that case, the test needs to be modified to check
for the proper return code as well.

** Affects: glance
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1933360
[Yahoo-eng-team] [Bug 1940460] [NEW] ORM fixes broke opportunistic testing on py36
Public bug reported:

The patch 9e002a77f2131d3594a2a4029a147beaf37f5b17, which is aimed at
fixing things in advance of SQLAlchemy 2.0, seems to have broken our
opportunistic testing of DB migrations on py36 only. This manifests as a
total lockup of one worker during functional tests, which fails to
report anything and eventually times out the job.

** Affects: glance
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1940460
[Yahoo-eng-team] [Bug 1958883] [NEW] Service version check breaks FFU
Public bug reported:

As reported on the mailing list:
http://lists.openstack.org/pipermail/openstack-discuss/2022-January/026603.html

The service version check at startup can prevent FFUs from being
possible without hacking the database. As implemented here:
https://review.opendev.org/c/openstack/nova/+/738482

We currently filter "forced down" computes from the check, but we should
probably also eliminate those down long enough due to missed heartbeats
(i.e. offline during the upgrade). However, a fast-moving FFU where
everything is switched from an old container to a new one would easily
still find computes that are considered "up" and effectively force a
wait.

** Affects: nova
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1958883
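A sketch of the extra filtering suggested above (field names modeled on the service record shown in other reports in this digest; this is not the actual Nova implementation): treat a compute as irrelevant to the version check when it is forced down, or when it has missed heartbeats for longer than the service-down threshold.

    import datetime

    def services_relevant_to_version_check(services, service_down_time=60):
        now = datetime.datetime.utcnow()
        limit = datetime.timedelta(seconds=service_down_time)
        return [
            svc for svc in services
            if not svc["forced_down"]
            and svc["last_seen_up"] is not None
            and now - svc["last_seen_up"] <= limit
        ]

    stale = {"forced_down": False,
             "last_seen_up": datetime.datetime.utcnow() - datetime.timedelta(hours=3)}
    fresh = {"forced_down": False, "last_seen_up": datetime.datetime.utcnow()}
    print(len(services_relevant_to_version_check([stale, fresh])))  # 1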
[Yahoo-eng-team] [Bug 1820125] [NEW] Libvirt driver ungracefully explodes if unsupported arch is found
Public bug reported: If a new libvirt exposes an arch name that nova does not support, we fail to gracefully skip it during the instance capability gathering: 2019-03-14 19:11:31.709 6 ERROR nova.compute.manager [req-4e626631-fefc-4c58-a1cd-5207c9384a1b - - - - -] Error updating resources for node primary.: InvalidArchitectureName: Architecture name 'armv6l' is not recognised 2019-03-14 19:11:31.709 6 ERROR nova.compute.manager Traceback (most recent call last): 2019-03-14 19:11:31.709 6 ERROR nova.compute.manager File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/compute/manager.py", line 7956, in _update_available_resource_for_node 2019-03-14 19:11:31.709 6 ERROR nova.compute.manager startup=startup) 2019-03-14 19:11:31.709 6 ERROR nova.compute.manager File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 727, in update_available_resource 2019-03-14 19:11:31.709 6 ERROR nova.compute.manager resources = self.driver.get_available_resource(nodename) 2019-03-14 19:11:31.709 6 ERROR nova.compute.manager File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 7070, in get_available_resource 2019-03-14 19:11:31.709 6 ERROR nova.compute.manager data["supported_instances"] = self._get_instance_capabilities() 2019-03-14 19:11:31.709 6 ERROR nova.compute.manager File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5943, in _get_instance_capabilities 2019-03-14 19:11:31.709 6 ERROR nova.compute.manager fields.Architecture.canonicalize(g.arch), 2019-03-14 19:11:31.709 6 ERROR nova.compute.manager File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/objects/fields.py", line 200, in canonicalize 2019-03-14 19:11:31.709 6 ERROR nova.compute.manager raise exception.InvalidArchitectureName(arch=name) 2019-03-14 19:11:31.709 6 ERROR nova.compute.manager InvalidArchitectureName: Architecture name 'armv6l' is not recognised 2019-03-14 19:11:31.709 6 ERROR nova.compute.manager ** Affects: nova Importance: Undecided Assignee: Dan Smith (danms) Status: In Progress -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). 
https://bugs.launchpad.net/bugs/1820125 Title: Libvirt driver ungracefully explodes if unsupported arch is found Status in OpenStack Compute (nova): In Progress Bug description: If a new libvirt exposes an arch name that nova does not support, we fail to gracefully skip it during the instance capability gathering: 2019-03-14 19:11:31.709 6 ERROR nova.compute.manager [req-4e626631-fefc-4c58-a1cd-5207c9384a1b - - - - -] Error updating resources for node primary.: InvalidArchitectureName: Architecture name 'armv6l' is not recognised 2019-03-14 19:11:31.709 6 ERROR nova.compute.manager Traceback (most recent call last): 2019-03-14 19:11:31.709 6 ERROR nova.compute.manager File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/compute/manager.py", line 7956, in _update_available_resource_for_node 2019-03-14 19:11:31.709 6 ERROR nova.compute.manager startup=startup) 2019-03-14 19:11:31.709 6 ERROR nova.compute.manager File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 727, in update_available_resource 2019-03-14 19:11:31.709 6 ERROR nova.compute.manager resources = self.driver.get_available_resource(nodename) 2019-03-14 19:11:31.709 6 ERROR nova.compute.manager File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 7070, in get_available_resource 2019-03-14 19:11:31.709 6 ERROR nova.compute.manager data["supported_instances"] = self._get_instance_capabilities() 2019-03-14 19:11:31.709 6 ERROR nova.compute.manager File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5943, in _get_instance_capabilities 2019-03-14 19:11:31.709 6 ERROR nova.compute.manager fields.Architecture.canonicalize(g.arch), 2019-03-14 19:11:31.709 6 ERROR nova.compute.manager File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/objects/fields.py", line 200, in canonicalize 2019-03-14 19:11:31.709 6 ERROR nova.compute.manager raise exception.InvalidArchitectureName(arch=name) 2019-03-14 19:11:31.709 6 ERROR nova.compute.manager InvalidArchitectureName: Architecture name 'armv6l' is not recognised 2019-03-14 19:11:31.709 6 ERROR nova.compute.manager To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1820125/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
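The fix direction is simply to skip unrecognized architectures rather than letting the exception abort the whole resource update. A minimal sketch using the names from the traceback (fields.Architecture.canonicalize, exception.InvalidArchitectureName); the loop itself is illustrative, not the actual driver code:

    from nova import exception
    from nova.objects import fields


    def supported_instance_arches(guests):
        """Return canonical arch names for the guests nova recognizes."""
        arches = []
        for guest in guests:
            try:
                arches.append(fields.Architecture.canonicalize(guest.arch))
            except exception.InvalidArchitectureName:
                # e.g. 'armv6l' exposed by a newer libvirt: skip it instead
                # of failing the whole capability scan.
                continue
        return arches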
[Yahoo-eng-team] [Bug 1888713] [NEW] Async tasks, image import not supported in pure-WSGI mode
Public bug reported: The wsgi_app.py file in the tree allows operators to run Glance API as a proper WSGI app. This has been the default devstack deployment for some time and multiple real clouds in the wild deploy like this. However, an attempt to start an import will be met with an image state of "queued" forever and no tasks will ever start, run, or complete. (note that this has been a known issue and the Glance team prescribes running standalone eventlet-based glance-api for deployments that need import to work). ** Affects: glance Importance: Undecided Status: In Progress -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance. https://bugs.launchpad.net/bugs/1888713 Title: Async tasks, image import not supported in pure-WSGI mode Status in Glance: In Progress Bug description: The wsgi_app.py file in the tree allows operators to run Glance API as a proper WSGI app. This has been the default devstack deployment for some time and multiple real clouds in the wild deploy like this. However, an attempt to start an import will be met with an image state of "queued" forever and no tasks will ever start, run, or complete. (note that this has been a known issue and the Glance team prescribes running standalone eventlet-based glance-api for deployments that need import to work). To manage notifications about this bug go to: https://bugs.launchpad.net/glance/+bug/1888713/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
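Since the symptom is an image stuck in "queued" with no task ever starting, a client-side guard with a hard timeout is the practical workaround while the server side is sorted out. A minimal sketch, assuming get_image is any callable that returns the current image record (for example a python-glanceclient images.get call):

    import time


    def wait_for_import(get_image, image_id, timeout=300, interval=5):
        """Poll until the image goes active, or give up after `timeout` seconds."""
        deadline = time.time() + timeout
        while True:
            image = get_image(image_id)
            # Works for attribute-style or dict-style image records.
            status = getattr(image, 'status', None) or image['status']
            if status == 'active':
                return image
            if time.time() > deadline:
                raise RuntimeError(
                    'image %s still %r after %ss; async tasks may not be '
                    'running (e.g. glance-api deployed as pure WSGI)'
                    % (image_id, status, timeout))
            time.sleep(interval)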
[Yahoo-eng-team] [Bug 1891190] [NEW] test_reload() functional test causes hang and jobs TIMED_OUT
Public bug reported: The glance.tests.functional.test_reload.TestReload.test_reload() test has been causing spurious deadlocks in functional test jobs, resulting in TIMED_OUT job statuses due to the global timeout expiring. This can be reproduced locally with lots of exposure, but Zuul runs things enough to hit it fairly often. I have tracked this down to the test_reload() test, which if I reproduce this locally, I find it is in an infinite waitpid() on the API master process that the FunctionalTest base class has started for it. The test tracks child PIDs of the master as it initiates several SIGHUP operations. Upon exit, the FunctionalTest.cleanup() routine runs and ends up waitpid()ing on the master process forever. A process list shows all the other stestr workers in Z state waiting for the final worker to complete. The final worker being stuck on waitpid() has the master process and both worker processes still running. Upon killing the master, stestr frees up, reports status from the test and exits normally. Stack trace of the hung test process after signaling the master it is waiting for manually is: Traceback (most recent call last): File "/usr/lib64/python3.7/runpy.py", line 193, in _run_module_as_main "__main__", mod_spec) File "/usr/lib64/python3.7/runpy.py", line 85, in _run_code exec(code, run_globals) File "/home/dan/glance/.tox/functional/lib/python3.7/site-packages/stestr/subunit_runner/run.py", line 93, in main() File "/home/dan/glance/.tox/functional/lib/python3.7/site-packages/stestr/subunit_runner/run.py", line 89, in main testRunner=partial(runner, stdout=sys.stdout)) File "/home/dan/glance/.tox/functional/lib/python3.7/site-packages/stestr/subunit_runner/program.py", line 185, in __init__ self.runTests() File "/home/dan/glance/.tox/functional/lib/python3.7/site-packages/stestr/subunit_runner/program.py", line 226, in runTests self.result = testRunner.run(self.test) File "/home/dan/glance/.tox/functional/lib/python3.7/site-packages/stestr/subunit_runner/run.py", line 52, in run test(result) File "/usr/lib64/python3.7/unittest/suite.py", line 84, in __call__ return self.run(*args, **kwds) File "/usr/lib64/python3.7/unittest/suite.py", line 122, in run test(result) File "/usr/lib64/python3.7/unittest/suite.py", line 84, in __call__ return self.run(*args, **kwds) File "/usr/lib64/python3.7/unittest/suite.py", line 122, in run test(result) File "/usr/lib64/python3.7/unittest/suite.py", line 84, in __call__ return self.run(*args, **kwds) File "/usr/lib64/python3.7/unittest/suite.py", line 122, in run test(result) File "/home/dan/glance/.tox/functional/lib/python3.7/site-packages/unittest2/case.py", line 673, in __call__ return self.run(*args, **kwds) File "/home/dan/glance/.tox/functional/lib/python3.7/site-packages/testtools/testcase.py", line 675, in run return run_test.run(result) File "/home/dan/glance/.tox/functional/lib/python3.7/site-packages/testtools/runtest.py", line 80, in run return self._run_one(actual_result) File "/home/dan/glance/.tox/functional/lib/python3.7/site-packages/testtools/runtest.py", line 94, in _run_one return self._run_prepared_result(ExtendedToOriginalDecorator(result)) File "/home/dan/glance/.tox/functional/lib/python3.7/site-packages/testtools/runtest.py", line 119, in _run_prepared_result raise e File "/home/dan/glance/.tox/functional/lib/python3.7/site-packages/testtools/runtest.py", line 191, in _run_user return fn(*args, **kwargs) File "/home/dan/glance/glance/tests/functional/__init__.py", line 881, in cleanup s.stop() File 
"/home/dan/glance/glance/tests/functional/__init__.py", line 293, in stop rc = test_utils.wait_for_fork(self.process_pid, raise_error=False) File "/home/dan/glance/glance/tests/utils.py", line 294, in wait_for_fork (pid, rc) = os.waitpid(pid, 0) File "/home/dan/glance/.tox/functional/lib/python3.7/site-packages/eventlet/green/os.py", line 96, in waitpid greenthread.sleep(0.01) File "/home/dan/glance/.tox/functional/lib/python3.7/site-packages/eventlet/greenthread.py", line 36, in sleep hub.switch() File "/home/dan/glance/.tox/functional/lib/python3.7/site-packages/eventlet/hubs/hub.py", line 298, in switch return self.greenlet.switch() File "/home/dan/glance/.tox/functional/lib/python3.7/site-packages/eventlet/hubs/hub.py", line 350, in run self.wait(sleep_time) File "/home/dan/glance/.tox/functional/lib/python3.7/site-packages/eventlet/hubs/poll.py", line 80, in wait presult = self.do_poll(seconds) File "/home/dan/glance/.tox/functional/lib/python3.7/site-packages/eventlet/hubs/epolls.py", line 31, in do_poll return self.poll.poll(seconds) ** Affects: glance Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance. h
[Yahoo-eng-team] [Bug 1891352] [NEW] Failed import of one store will remain in progress forever if all_stores_must_succeed=True
Public bug reported:

If import is called with all_stores_must_succeed=True and a store fails during set_image_data(), the store will remain in os_glance_importing_to_stores forever, never going into the os_glance_failed_import list. This means a polling client will never notice that the import failed.

Further, if multiple stores are included in the import, and the failure happens in the later stores, the revert process will remove the earlier stores (after they had already been reported as available in stores). This means a polling client doing an import on an image already in store1 to store2,store3,store4 will see the following progression:

stores=store1;os_glance_importing_to_stores=store2,store3,store4
stores=store1,store2;os_glance_importing_to_stores=store3,store4
stores=store1,store2,store3;os_glance_importing_to_stores=store4
stores=store1,store2;os_glance_importing_to_stores=store4
stores=store1;os_glance_importing_to_stores=store4

The client will see the last line forever, and never see anything in os_glance_failed_import.

** Affects: glance
   Importance: Undecided
   Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1891352

Title: Failed import of one store will remain in progress forever if all_stores_must_succeed=True

Status in Glance: New

Bug description:
If import is called with all_stores_must_succeed=True and a store fails during set_image_data(), the store will remain in os_glance_importing_to_stores forever, never going into the os_glance_failed_import list. This means a polling client will never notice that the import failed.

Further, if multiple stores are included in the import, and the failure happens in the later stores, the revert process will remove the earlier stores (after they had already been reported as available in stores). This means a polling client doing an import on an image already in store1 to store2,store3,store4 will see the following progression:

stores=store1;os_glance_importing_to_stores=store2,store3,store4
stores=store1,store2;os_glance_importing_to_stores=store3,store4
stores=store1,store2,store3;os_glance_importing_to_stores=store4
stores=store1,store2;os_glance_importing_to_stores=store4
stores=store1;os_glance_importing_to_stores=store4

The client will see the last line forever, and never see anything in os_glance_failed_import.

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1891352/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help : https://help.launchpad.net/ListHelp
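For completeness, this is the polling pattern the report assumes on the client side, with a timeout guard as a workaround for the failed store never being reported. It assumes a python-glanceclient-style client whose images.get() exposes the image properties as attributes; treat it as a sketch rather than a reference client:

    import time


    def _split(value):
        return {v for v in (value or '').split(',') if v}


    def wait_for_stores(client, image_id, wanted, timeout=600, interval=5):
        """Wait until every store in `wanted` has either imported or failed."""
        deadline = time.time() + timeout
        while True:
            image = client.images.get(image_id)
            done = _split(getattr(image, 'stores', ''))
            failed = _split(getattr(image, 'os_glance_failed_import', ''))
            importing = _split(
                getattr(image, 'os_glance_importing_to_stores', ''))
            if wanted & failed:
                raise RuntimeError('import failed for: %s' % (wanted & failed))
            if wanted <= done:
                return image
            if time.time() > deadline:
                # This is the case described above: a store never resolves to
                # either the stores list or os_glance_failed_import.
                raise RuntimeError('timed out; still importing: %s, seen: %s'
                                   % (wanted & importing, done))
            time.sleep(interval)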
[Yahoo-eng-team] [Bug 1897907] [NEW] DELETE fails on StaleDataError when updating image_properties
Public bug reported: During the MultiStoresImportTest module in tempest, when we go to clean up images during tearDown, we occasionally get a 500 from the delete, which yields this from the test: ft1.1: tearDownClass (tempest.api.image.v2.test_images.MultiStoresImportImagesTest)testtools.testresult.real._StringException: Traceback (most recent call last): File "/opt/stack/tempest/tempest/test.py", line 242, in tearDownClass six.reraise(etype, value, trace) File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/six.py", line 703, in reraise raise value File "/opt/stack/tempest/tempest/test.py", line 214, in tearDownClass teardown() File "/opt/stack/tempest/tempest/test.py", line 585, in resource_cleanup raise testtools.MultipleExceptions(*cleanup_errors) testtools.runtest.MultipleExceptions: ((, Got server fault Details: The server has either erred or is incapable of performing the requested operation. , ), (, Request timed out Details: (MultiStoresImportImagesTest:tearDownClass) Failed to delete image 9c4bba30-c244-4712-9995-86446a38eed8 within the required time (300 s)., )) The corresponding g-api.log message shows that we're failing to delete something from image_properties, I'm guessing because something has changed the image underneath us between fetch and delete. Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR glance.common.wsgi [None req-4d353638-2da8-4a8b-8c6b-fb879b27c90b tempest-MultiStoresImportImagesTest-208757482 tempest-MultiStoresImportImagesTest-208757482] Caught error: UPDATE statement on table 'image_properties' expected to update 1 row(s); 0 were matched.: sqlalchemy.orm.exc.StaleDataError: UPDATE statement on table 'image_properties' expected to update 1 row(s); 0 were matched. Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR glance.common.wsgi Traceback (most recent call last): Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR glance.common.wsgi File "/opt/stack/glance/glance/common/wsgi.py", line 1347, in __call__ Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR glance.common.wsgi action_result = self.dispatch(self.controller, action, Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR glance.common.wsgi File "/opt/stack/glance/glance/common/wsgi.py", line 1391, in dispatch Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR glance.common.wsgi return method(*args, **kwargs) Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR glance.common.wsgi File "/opt/stack/glance/glance/common/utils.py", line 416, in wrapped Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR glance.common.wsgi return func(self, req, *args, **kwargs) Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR glance.common.wsgi File "/opt/stack/glance/glance/api/v2/images.py", line 664, in delete Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR glance.common.wsgi image_repo.remove(image) Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR glance.common.wsgi File "/opt/stack/glance/glance/domain/proxy.py", line 104, in remove Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR glance.common.wsgi result = self.base.remove(base_item) Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR glance.common.wsgi File 
"/opt/stack/glance/glance/notifier.py", line 542, in remove Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR glance.common.wsgi super(ImageRepoProxy, self).remove(image) Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR glance.common.wsgi File "/opt/stack/glance/glance/domain/proxy.py", line 104, in remove Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR glance.common.wsgi result = self.base.remove(base_item) Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR glance.common.wsgi File "/opt/stack/glance/glance/domain/proxy.py", line 104, in remove Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR glance.common.wsgi result = self.base.remove(base_item) Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR glance.common.wsgi File "/opt/stack/glance/glance/domain/proxy.py", line 104, in remove Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR glance.common.wsgi result = self.base.remove(base_item) Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR glance.common.wsgi [Previous line repeated 1 more time] Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR gl
[Yahoo-eng-team] [Bug 1912001] [NEW] glance allows reserved properties during create()
Public bug reported: Certain image properties are reserved for internal glance usage, such as os_glance_import_task. Changing these properties is disallowed during PATCH. However, glance does not enforce that they are not present in an image POST. It should. This command: openstack --debug image create --container-format bare --disk-format qcow2 \ --property os_glance_import_task=foobar test succeeds in creating an image with os_glance_import_task set. ** Affects: glance Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance. https://bugs.launchpad.net/bugs/1912001 Title: glance allows reserved properties during create() Status in Glance: New Bug description: Certain image properties are reserved for internal glance usage, such as os_glance_import_task. Changing these properties is disallowed during PATCH. However, glance does not enforce that they are not present in an image POST. It should. This command: openstack --debug image create --container-format bare --disk-format qcow2 \ --property os_glance_import_task=foobar test succeeds in creating an image with os_glance_import_task set. To manage notifications about this bug go to: https://bugs.launchpad.net/glance/+bug/1912001/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
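The missing check amounts to rejecting reserved keys at create time the same way PATCH already refuses to modify them. A minimal sketch, using webob (which glance's API layer already uses) and the os_glance_ prefix convention from the report:

    import webob.exc

    RESERVED_PREFIX = 'os_glance_'


    def reject_reserved_properties(extra_properties):
        """Refuse image creation that tries to set glance-internal properties."""
        reserved = sorted(k for k in extra_properties
                          if k.startswith(RESERVED_PREFIX))
        if reserved:
            raise webob.exc.HTTPForbidden(
                explanation='Attribute(s) %s are reserved' % ', '.join(reserved))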
[Yahoo-eng-team] [Bug 1913625] [NEW] Glance will leak staging data
Public bug reported: In various situations, glance will leak (potentially very large) temporary files in the staging store. One example is doing a web-download import, where glance initially downloads the image to its staging store. If the worker doing that activity crashes, loses power, etc, the user may delete the image and try again on another worker. When the crashed worker resumes, the staging data will remain but nothing will ever clean it up. Another example would be a misconfigured glance that uses local staging directories, but glance-direct is used, where the user stages data, and then deletes the image from another worker. Even in a situation where shared staging is properly configured, a failure to access the staging location during the delete call will result in the image being deleted, but the staging file not being purged. IMHO, glance workers should clean their staging directories at startup, purging any data that is attributable to a previous image having been deleted. Another option is to add a store location for each staged image, and make sure the scrubber can clean those things from the staging directory periodically (this requires also running the scrubber on each node, which may not be common practice currently). ** Affects: glance Importance: Undecided Status: Invalid ** Changed in: glance Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance. https://bugs.launchpad.net/bugs/1913625 Title: Glance will leak staging data Status in Glance: Invalid Bug description: In various situations, glance will leak (potentially very large) temporary files in the staging store. One example is doing a web-download import, where glance initially downloads the image to its staging store. If the worker doing that activity crashes, loses power, etc, the user may delete the image and try again on another worker. When the crashed worker resumes, the staging data will remain but nothing will ever clean it up. Another example would be a misconfigured glance that uses local staging directories, but glance-direct is used, where the user stages data, and then deletes the image from another worker. Even in a situation where shared staging is properly configured, a failure to access the staging location during the delete call will result in the image being deleted, but the staging file not being purged. IMHO, glance workers should clean their staging directories at startup, purging any data that is attributable to a previous image having been deleted. Another option is to add a store location for each staged image, and make sure the scrubber can clean those things from the staging directory periodically (this requires also running the scrubber on each node, which may not be common practice currently). To manage notifications about this bug go to: https://bugs.launchpad.net/glance/+bug/1913625/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
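A sketch of the startup sweep suggested above: remove anything in the staging directory whose image no longer exists. The uuid-based file naming is an assumption about the staging layout, and image_exists stands in for a real DB lookup:

    import os


    def clean_staging_dir(staging_dir, image_exists):
        """Delete staging residue left behind for images deleted elsewhere."""
        for name in os.listdir(staging_dir):
            image_id = name.split('.')[0]     # '<uuid>' or '<uuid>.<suffix>'
            if image_exists(image_id):
                continue
            try:
                os.unlink(os.path.join(staging_dir, name))
            except OSError:
                # Best effort: log and move on in real code.
                pass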
[Yahoo-eng-team] [Bug 1914664] [NEW] QEMU monitor read failure in ServerStableDeviceRescueTest
Public bug reported: Seeing this failure in the gate: https://zuul.opendev.org/t/openstack/build/7c71502b04fe47039b87f76fbe04fe56/log/controller/logs/screen-n-cpu.txt#33096 Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova-compute[90163]: ERROR nova.virt.libvirt.driver [req-77f51485-cbc2-4f2a-8a6d-4a8ed910e585 req-a221d4f9-401e-420a-911e-8d32536a1d23 service nova] [instance: 7174e97c-8cf4-46c7-9498-2c5dbc452431] detaching network adapter failed.: libvirt.libvirtError: internal error: End of file from qemu monitor Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova- compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c- 8cf4-46c7-9498-2c5dbc452431] Traceback (most recent call last): Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova- compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c- 8cf4-46c7-9498-2c5dbc452431] File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 2210, in detach_interface Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova- compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c- 8cf4-46c7-9498-2c5dbc452431] wait_for_detach = guest.detach_device_with_retry( Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova- compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c- 8cf4-46c7-9498-2c5dbc452431] File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 423, in detach_device_with_retry Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova- compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c- 8cf4-46c7-9498-2c5dbc452431] _try_detach_device(conf, persistent, live) Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova- compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c- 8cf4-46c7-9498-2c5dbc452431] File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 412, in _try_detach_device Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova- compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c- 8cf4-46c7-9498-2c5dbc452431] ctx.reraise = True Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova- compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c- 8cf4-46c7-9498-2c5dbc452431] File "/usr/local/lib/python3.8/dist- packages/oslo_utils/excutils.py", line 220, in __exit__ Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova- compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c- 8cf4-46c7-9498-2c5dbc452431] self.force_reraise() Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova- compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c- 8cf4-46c7-9498-2c5dbc452431] File "/usr/local/lib/python3.8/dist- packages/oslo_utils/excutils.py", line 196, in force_reraise Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova- compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c- 8cf4-46c7-9498-2c5dbc452431] six.reraise(self.type_, self.value, self.tb) Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova- compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c- 8cf4-46c7-9498-2c5dbc452431] File "/usr/local/lib/python3.8/dist- packages/six.py", line 703, in reraise Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova- compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c- 8cf4-46c7-9498-2c5dbc452431] raise value Feb 04 20:54:32.857198 
ubuntu-focal-limestone-regionone-0022873642 nova- compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c- 8cf4-46c7-9498-2c5dbc452431] File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 398, in _try_detach_device Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova- compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c- 8cf4-46c7-9498-2c5dbc452431] self.detach_device(conf, persistent=persistent, live=live) Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova- compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c- 8cf4-46c7-9498-2c5dbc452431] File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 473, in detach_device Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova- compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c- 8cf4-46c7-9498-2c5dbc452431] self._domain.detachDeviceFlags(device_xml, flags=flags) Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova- compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c- 8cf4-46c7-9498-2c5dbc452431] File "/usr/local/lib/python3.8/dist- packages/eventlet/tpool.py", line 190, in doit Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova- compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c- 8cf4-46c7-9498-2c5dbc452431] result = proxy_call(self._autowrap, f, *arg
[Yahoo-eng-team] [Bug 1914665] [NEW] Cinder Multistore job hits Cinder Quota error
Public bug reported: Noticed during a cinder multistore test run, we hit a quota not found error. It looks like we don't handle this well, which causes nova to see a 503: Proxy Error. I dunno if there's anything better can do than raise a 5xx, but we should probably explain in the error what happened when we know, as we clearly do here. >From this: https://cbff25b854b00bc0ff99-8ce5690b0835baabd00baac02d43f418.ssl.cf5.rackcdn.com/770629/5/check /glance-multistore-cinder- import/7c71502/controller/logs/screen-g-api.txt this log text (see the end): Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 devstack@g-api.service[93292]: ERROR glance.common.wsgi Traceback (most recent call last): Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 devstack@g-api.service[93292]: ERROR glance.common.wsgi File "/opt/stack/glance/glance/common/wsgi.py", line 1347, in __call__ Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 devstack@g-api.service[93292]: ERROR glance.common.wsgi action_result = self.dispatch(self.controller, action, Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 devstack@g-api.service[93292]: ERROR glance.common.wsgi File "/opt/stack/glance/glance/common/wsgi.py", line 1391, in dispatch Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 devstack@g-api.service[93292]: ERROR glance.common.wsgi return method(*args, **kwargs) Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 devstack@g-api.service[93292]: ERROR glance.common.wsgi File "/opt/stack/glance/glance/common/utils.py", line 416, in wrapped Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 devstack@g-api.service[93292]: ERROR glance.common.wsgi return func(self, req, *args, **kwargs) Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 devstack@g-api.service[93292]: ERROR glance.common.wsgi File "/opt/stack/glance/glance/api/v2/image_data.py", line 299, in upload Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 devstack@g-api.service[93292]: ERROR glance.common.wsgi self._restore(image_repo, image) Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 devstack@g-api.service[93292]: ERROR glance.common.wsgi File "/usr/local/lib/python3.8/dist-packages/oslo_utils/excutils.py", line 220, in __exit__ Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 devstack@g-api.service[93292]: ERROR glance.common.wsgi self.force_reraise() Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 devstack@g-api.service[93292]: ERROR glance.common.wsgi File "/usr/local/lib/python3.8/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 devstack@g-api.service[93292]: ERROR glance.common.wsgi six.reraise(self.type_, self.value, self.tb) Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 devstack@g-api.service[93292]: ERROR glance.common.wsgi File "/usr/local/lib/python3.8/dist-packages/six.py", line 703, in reraise Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 devstack@g-api.service[93292]: ERROR glance.common.wsgi raise value Feb 04 21:07:13.370209 ubuntu-focal-limestone-regionone-0022873642 devstack@g-api.service[93292]: ERROR glance.common.wsgi File "/opt/stack/glance/glance/api/v2/image_data.py", line 164, in upload Feb 04 21:07:13.370209 ubuntu-focal-limestone-regionone-0022873642 devstack@g-api.service[93292]: ERROR glance.common.wsgi 
image.set_data(data, size, backend=backend) Feb 04 21:07:13.370209 ubuntu-focal-limestone-regionone-0022873642 devstack@g-api.service[93292]: ERROR glance.common.wsgi File "/opt/stack/glance/glance/domain/proxy.py", line 208, in set_data Feb 04 21:07:13.370209 ubuntu-focal-limestone-regionone-0022873642 devstack@g-api.service[93292]: ERROR glance.common.wsgi self.base.set_data(data, size, backend=backend, set_active=set_active) Feb 04 21:07:13.370209 ubuntu-focal-limestone-regionone-0022873642 devstack@g-api.service[93292]: ERROR glance.common.wsgi File "/opt/stack/glance/glance/notifier.py", line 501, in set_data Feb 04 21:07:13.370209 ubuntu-focal-limestone-regionone-0022873642 devstack@g-api.service[93292]: ERROR glance.common.wsgi _send_notification(notify_error, 'image.upload', msg) Feb 04 21:07:13.370209 ubuntu-focal-limestone-regionone-0022873642 devstack@g-api.service[93292]: ERROR glance.common.wsgi File "/usr/local/lib/python3.8/dist-packages/oslo_utils/excutils.py", line 220, in __exit__ Feb 04 21:07:13.370209 ubuntu-focal-limestone-regionone-0022873642 devstack@g-api.service[93292]: ERROR glance.common.wsgi self.force_reraise() Feb 04 21:07:13.370209 ubuntu-focal-limestone-regionone-0022873642 devstack@g-api.service[93292]: ERROR glance.common.wsgi File "/usr/local/lib/python3.8/dist-packages/oslo_utils/ex
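The report's point is just that when the backend tells us exactly what went wrong (a Cinder quota problem here), that detail should make it into the error body instead of an opaque 5xx. A hand-wavy sketch; the exception class is a placeholder, not glance_store's real hierarchy:

    import webob.exc


    class StoreBackendError(Exception):
        """Placeholder for whatever the store raises (e.g. a quota failure)."""


    def upload_to_store(image, data, size, backend):
        try:
            image.set_data(data, size, backend=backend)
        except StoreBackendError as e:
            # Surface the backend's own explanation to the API caller.
            raise webob.exc.HTTPServiceUnavailable(
                explanation='Upload to store %r failed: %s' % (backend, e))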
[Yahoo-eng-team] [Bug 1914826] [NEW] web-download with invalid url does not report error
Public bug reported: In my testing, if I provide a URL to web-download that yields an error from urlopen(), I never see the store listed in the os_glance_failed_import list, and the store remains in os_glance_importing_to_stores. The image status does not change, which means there's no way for the API client to know that the import failed. I found this when debugging a gate issue where occasionally the tempest web-download test fails. It ends up waiting many minutes for the import to complete, even though it failed long before that. In that case, the cirros link we use for testing web-download raised a timeout. >From my log, here what we log as returning to the user just before we start the import: Feb 05 20:18:02 guaranine devstack@g-api.service[1008592]: DEBUG oslo_policy.policy [-] enforce: rule="modify_image" creds={"domain_id": null, "is_admin_project": true, "project_domain_id": "default", "project_id": "59a5997403484e97803cac28b7aa7366", "roles": ["reader", "member"], "service_project_domain_id": null, "service_project_id": null, "service_roles": [], "service_user_domain_id": null, "service_user_id": null, "system_scope": null, "user_domain_id": "default", "user_id": "10e5d60c60e54ab3889bcd57e367fe01"} target={"checksum": null, "container_format": "bare", "created_at": "2021-02-05T20:18:03.00", "disk_format": "raw", "extra_properties": {}, "image_id": "70917fce-bfc6-4d57-aa54-58235d09cf24", "locations": [], "min_disk": 0, "min_ram": 0, "name": "test", "os_glance_failed_import": "", "os_glance_import_task": "e2cb5441-8c92-45c6-9363-f4b7915401e1", "os_glance_importing_to_stores": "cheap", "os_hash_algo": null, "os_hash_value": null, "os_hidden": false, "owner": "59a5997403484e97803cac28b7aa7366", "protected": false, "size": null, "status": "importing", "tags": [], "updated_at": "2021-02-05T20:18:03.00", "virtual_size": null, "visibility": "shared"} {{(pid=1008592) enforce /usr/local/lib/python3.8/dist- packages/oslo_policy/policy.py:994}} Note that os_glance_importing_to_stores="cheap" and os_glance_failed_import="". 
Shortly after this, the web-download task fails: Feb 05 20:18:03 guaranine devstack@g-api.service[1008592]: ERROR glance.async_.flows._internal_plugins.web_download [-] Task e2cb5441-8c92-45c6-9363-f4b7915401e1 failed with exception : urllib.error.URLError: Here's where the task is fully reverted: Feb 05 20:18:03 guaranine devstack@g-api.service[1008592]: WARNING glance.async_.taskflow_executor [-] Task 'api_image_import-WebDownlo ad-e2cb5441-8c92-45c6-9363-f4b7915401e1' (bc722b5c-ddd4-404b-9c09-8625ed9c5941) transitioned into state 'REVERTED' from state 'REVERTIN G' with result 'None' And after that, here's what we're still returning to the user: Feb 05 20:18:03 guaranine devstack@g-api.service[1008592]: DEBUG oslo_policy.policy [-] enforce: rule="get_image" creds={"domain_id": n ull, "is_admin_project": true, "project_domain_id": "default", "project_id": "59a5997403484e97803cac28b7aa7366", "roles": ["reader", "m ember"], "service_project_domain_id": null, "service_project_id": null, "service_roles": [], "service_user_domain_id": null, "service_u ser_id": null, "system_scope": null, "user_domain_id": "default", "user_id": "10e5d60c60e54ab3889bcd57e367fe01"} target={"checksum": nu ll, "container_format": "bare", "created_at": "2021-02-05T20:18:03.00", "disk_format": "raw", "extra_properties": {}, "image_id": " 70917fce-bfc6-4d57-aa54-58235d09cf24", "locations": [], "min_disk": 0, "min_ram": 0, "name": "test", "os_glance_failed_import": "", "os _glance_import_task": "e2cb5441-8c92-45c6-9363-f4b7915401e1", "os_glance_importing_to_stores": "cheap", "os_hash_algo": null, "os_hash_ value": null, "os_hidden": false, "owner": "59a5997403484e97803cac28b7aa7366", "protected": false, "size": null, "status": "queued", "t ags": [], "updated_at": "2021-02-05T20:18:03.00", "virtual_size": null, "visibility": "shared"} {{(pid=1008592) enforce /usr/local/ lib/python3.8/dist-packages/oslo_policy/policy.py:994}} Note that os_glance_importing_to_stores="cheap" and os_glance_failed_import="". In this case, "cheap" should have moved from "importing" to "failed". I wrote a tempest negative test for this situation using a totally bogus URL, which is here: https://review.opendev.org/c/openstack/tempest/+/774303 ** Affects: glance Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance. https://bugs.launchpad.net/bugs/1914826 Title: web-download with invalid url does not report error Status in Glance: New Bug description: In my testing, if I provide a URL to web-download that yields an error from urlopen(), I never see the store listed in the os_glance_failed_import list, and the store remains in os_glance_importing_to_stores. The image status does not change, which means there's no way for the API client to know that the import failed.
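The bookkeeping that appears to be missing on revert looks roughly like this: take the store out of os_glance_importing_to_stores and record it in os_glance_failed_import so the polling client can see the failure. Property handling here is illustrative, not the actual glance patch:

    def mark_store_failed(image, store):
        """Move `store` from the importing list to the failed list."""
        props = image.extra_properties
        importing = [s for s in
                     props.get('os_glance_importing_to_stores', '').split(',')
                     if s and s != store]
        failed = [s for s in
                  props.get('os_glance_failed_import', '').split(',') if s]
        if store not in failed:
            failed.append(store)
        props['os_glance_importing_to_stores'] = ','.join(importing)
        props['os_glance_failed_import'] = ','.join(failed)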
[Yahoo-eng-team] [Bug 1915543] [NEW] Glance returns 403 instead of 404 when images are not found
Public bug reported: Glance is translating "Not Found" errors from the DB layer into "Not Authorized" errors in policy, which it should not be doing. In general, we should always return 404 when something either does not exist, or when permissions do not allow you to know if that thing exists. Glance is actually translating both cases into "not authorized", which is confusing and runs counter to the goal. ** Affects: glance Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance. https://bugs.launchpad.net/bugs/1915543 Title: Glance returns 403 instead of 404 when images are not found Status in Glance: New Bug description: Glance is translating "Not Found" errors from the DB layer into "Not Authorized" errors in policy, which it should not be doing. In general, we should always return 404 when something either does not exist, or when permissions do not allow you to know if that thing exists. Glance is actually translating both cases into "not authorized", which is confusing and runs counter to the goal. To manage notifications about this bug go to: https://bugs.launchpad.net/glance/+bug/1915543/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
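The principle stated above boils down to one translation rule at the API boundary: both "it does not exist" and "you may not know whether it exists" come back as 404. A minimal sketch with stand-in exception types (glance has its own NotFound/Forbidden classes):

    import webob.exc


    class NotFound(Exception):
        """Stand-in for glance's NotFound."""


    class Forbidden(Exception):
        """Stand-in for glance's Forbidden."""


    def show_image(image_repo, image_id):
        try:
            return image_repo.get(image_id)
        except (NotFound, Forbidden):
            # Hide existence from unauthorized callers too: always 404.
            raise webob.exc.HTTPNotFound(explanation='Image not found')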
[Yahoo-eng-team] [Bug 1913625] Re: Glance will leak staging data
** Changed in: glance Status: Invalid => Confirmed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance. https://bugs.launchpad.net/bugs/1913625 Title: Glance will leak staging data Status in Glance: Confirmed Bug description: In various situations, glance will leak (potentially very large) temporary files in the staging store. One example is doing a web-download import, where glance initially downloads the image to its staging store. If the worker doing that activity crashes, loses power, etc, the user may delete the image and try again on another worker. When the crashed worker resumes, the staging data will remain but nothing will ever clean it up. Another example would be a misconfigured glance that uses local staging directories, but glance-direct is used, where the user stages data, and then deletes the image from another worker. Even in a situation where shared staging is properly configured, a failure to access the staging location during the delete call will result in the image being deleted, but the staging file not being purged. IMHO, glance workers should clean their staging directories at startup, purging any data that is attributable to a previous image having been deleted. Another option is to add a store location for each staged image, and make sure the scrubber can clean those things from the staging directory periodically (this requires also running the scrubber on each node, which may not be common practice currently). To manage notifications about this bug go to: https://bugs.launchpad.net/glance/+bug/1913625/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1921399] [NEW] check_instance_shared_storage RPC call is broken
Public bug reported: We broke check_instance_shared_storage() in this change: https://review.opendev.org/c/openstack/nova/+/761452/13..15/nova/compute/rpcapi.py Where we re-ordered the rpcapi client signature without adjusting the caller. This leads to this failure: Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova- compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c- 90a8-cd7f654b28ef] Traceback (most recent call last): Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova- compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c- 90a8-cd7f654b28ef] File "/opt/stack/new/nova/nova/compute/manager.py", line 797, in _is_instance_storage_shared Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova- compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c- 90a8-cd7f654b28ef] instance, data, host=host)) Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova- compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c- 90a8-cd7f654b28ef] File "/opt/stack/new/nova/nova/compute/rpcapi.py", line 618, in check_instance_shared_storage Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova- compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c- 90a8-cd7f654b28ef] return cctxt.call(ctxt, 'check_instance_shared_storage', **msg_args) Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova- compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c- 90a8-cd7f654b28ef] File "/usr/local/lib/python3.6/dist- packages/oslo_messaging/rpc/client.py", line 179, in call Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova- compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c- 90a8-cd7f654b28ef] transport_options=self.transport_options) Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova- compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c- 90a8-cd7f654b28ef] File "/usr/local/lib/python3.6/dist- packages/oslo_messaging/transport.py", line 128, in _send Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova- compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c- 90a8-cd7f654b28ef] transport_options=transport_options) Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova- compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c- 90a8-cd7f654b28ef] File "/usr/local/lib/python3.6/dist- packages/oslo_messaging/_drivers/amqpdriver.py", line 682, in send Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova- compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c- 90a8-cd7f654b28ef] transport_options=transport_options) Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova- compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c- 90a8-cd7f654b28ef] File "/usr/local/lib/python3.6/dist- packages/oslo_messaging/_drivers/amqpdriver.py", line 672, in _send Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova- compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c- 90a8-cd7f654b28ef] raise result Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova- compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c- 90a8-cd7f654b28ef] AttributeError: 'Instance' object has no attribute 'filename' Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova- 
compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c- 90a8-cd7f654b28ef] Traceback (most recent call last): Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova- compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c- 90a8-cd7f654b28ef] Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova- compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c- 90a8-cd7f654b28ef] File "/usr/local/lib/python3.6/dist- packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova- compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c- 90a8-cd7f654b28ef] res = self.dispatcher.dispatch(message) Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova- compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c- 90a8-cd7f654b28ef] Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova- compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c- 90a8-cd7f654b28ef] File "/usr/local/lib/python3.6/dist- packages/oslo_messaging/rpc/dispatcher.py", line 309, in dispatch Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova- compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c- 90a8-cd7f654b28ef] return self._do_dispa
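The failure distils to a positional call that no longer matches a re-ordered signature: the Instance ends up bound to the parameter that should hold the driver's shared-storage data, hence the AttributeError on 'filename'. A toy reproduction (names are illustrative, not nova's actual signatures); passing keyword arguments is what makes the call order-independent:

    class FakeInstance:
        uuid = 'xyz'


    class FakeSharedStorageData:
        filename = '/var/lib/nova/instances/xyz/disk'


    def check_instance_shared_storage(ctxt, data, instance=None, host=None):
        # The compute side expects `data` to be the driver's data object.
        return data.filename


    ctxt = object()

    # Old positional caller against the re-ordered signature:
    try:
        check_instance_shared_storage(ctxt, FakeInstance(), FakeSharedStorageData())
    except AttributeError as e:
        print(e)   # 'FakeInstance' object has no attribute 'filename'

    # Keyword arguments survive parameter re-ordering:
    print(check_instance_shared_storage(
        ctxt, data=FakeSharedStorageData(), instance=FakeInstance()))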
[Yahoo-eng-team] [Bug 1922928] [NEW] Image tasks API excludes in-progress tasks
Public bug reported: The glance /images/$uuid/tasks API is excluding in-progress tasks, leading to test failures like this one: Traceback (most recent call last): File "/opt/stack/tempest/tempest/api/image/v2/test_images.py", line 111, in test_image_glance_direct_import self.assertEqual(1, len(tasks['tasks'])) File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/testtools/testcase.py", line 415, in assertEqual self.assertThat(observed, matcher, message) File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/testtools/testcase.py", line 502, in assertThat raise mismatch_error testtools.matchers._impl.MismatchError: 1 != 0 This is caused by the fact that we assert that the task is not expired by comparing the expires_at column to the current time. However, if the task is not completed yet, the expires_at will be NULL and never pass that test. ** Affects: glance Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance. https://bugs.launchpad.net/bugs/1922928 Title: Image tasks API excludes in-progress tasks Status in Glance: New Bug description: The glance /images/$uuid/tasks API is excluding in-progress tasks, leading to test failures like this one: Traceback (most recent call last): File "/opt/stack/tempest/tempest/api/image/v2/test_images.py", line 111, in test_image_glance_direct_import self.assertEqual(1, len(tasks['tasks'])) File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/testtools/testcase.py", line 415, in assertEqual self.assertThat(observed, matcher, message) File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/testtools/testcase.py", line 502, in assertThat raise mismatch_error testtools.matchers._impl.MismatchError: 1 != 0 This is caused by the fact that we assert that the task is not expired by comparing the expires_at column to the current time. However, if the task is not completed yet, the expires_at will be NULL and never pass that test. To manage notifications about this bug go to: https://bugs.launchpad.net/glance/+bug/1922928/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
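The fix direction implied by the description is to treat a NULL expires_at (task still running) as not-expired when filtering. A SQLAlchemy-flavoured sketch; the model and session names are illustrative rather than glance's actual DB API:

    import datetime

    from sqlalchemy import or_


    def tasks_for_image(session, Task, image_id):
        """Tasks for an image that are in progress or not yet expired."""
        now = datetime.datetime.utcnow()
        return session.query(Task).filter(
            Task.image_id == image_id,
            or_(Task.expires_at.is_(None),   # in progress: keep it
                Task.expires_at > now),      # finished but not yet expired
        ).all()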
[Yahoo-eng-team] [Bug 2018612] [NEW] Guest kernel crashes with GPF on volume attach
Public bug reported: This isn't really a bug in nova, but it's something that we're hitting in CI quite a bit, so I'm filing here to record the details and so I can recheck against it. The actual bug is either in the guest (cirros 0.5.2) kernel, QEMU, or something similar. In tests where we attach a volume to a running guest, we occasionally get a guest kernel crash and stack trace that pretty much prevents anything else from working later in the test. Here's what the trace looks like: [ 10.152160] virtio_blk virtio2: [vda] 2093056 512-byte logical blocks (1.07 GB/1022 MiB) [ 10.198313] GPT:Primary header thinks Alt. header is not at the end of the disk. [ 10.199033] GPT:229375 != 2093055 [ 10.199278] GPT:Alternate GPT header not at the end of the disk. [ 10.199632] GPT:229375 != 2093055 [ 10.199857] GPT: Use GNU Parted to correct GPT errors. [ 11.291631] random: fast init done [ 11.312007] random: crng init done [ 11.419215] general protection fault: [#1] SMP PTI [ 11.420843] CPU: 0 PID: 199 Comm: modprobe Not tainted 5.3.0-26-generic #28~18.04.1-Ubuntu [ 11.421917] Hardware name: OpenStack Foundation OpenStack Nova, BIOS 1.13.0-1ubuntu1.1 04/01/2014 [ 11.424732] RIP: 0010:__kmalloc_track_caller+0xa1/0x250 [ 11.425934] Code: 65 49 8b 50 08 65 4c 03 05 b4 48 37 6f 4d 8b 38 4d 85 ff 0f 84 77 01 00 00 41 8b 59 20 49 8b 39 48 8d 4a 01 4c 89 f8 4c 01 fb <48> 33 1b 49 33 99 70 01 00 00 65 48 0f c7 0f 0f 94 c0 84 c0 74 bd [ 11.428460] RSP: 0018:b524801afaf0 EFLAGS: 0206 [ 11.429261] RAX: 51f2a72f63305b11 RBX: 51f2a72f63305b11 RCX: 2b7e [ 11.430205] RDX: 2b7d RSI: 0cc0 RDI: 0002f040 [ 11.431123] RBP: b524801afb28 R08: 90480762f040 R09: 904807001c40 [ 11.432032] R10: b524801afc28 R11: 0001 R12: 0cc0 [ 11.432953] R13: 0004 R14: 904807001c40 R15: 51f2a72f63305b11 [ 11.434125] FS: 7fb31d2486a0() GS:90480760() knlGS: [ 11.435139] CS: 0010 DS: ES: CR0: 80050033 [ 11.435909] CR2: 00abf9a8 CR3: 027c2000 CR4: 06f0 [ 11.437208] Call Trace: [ 11.438716] ? kstrdup_const+0x24/0x30 [ 11.439170] kstrdup+0x31/0x60 [ 11.439668] kstrdup_const+0x24/0x30 [ 11.440036] kvasprintf_const+0x86/0xa0 [ 11.440397] kobject_set_name_vargs+0x23/0x90 [ 11.440791] kobject_set_name+0x49/0x70 [ 11.452382] bus_register+0x80/0x270 [ 11.462448] ? 0xc033b000 [ 11.471469] hid_init+0x2b/0x62 [hid] [ 11.480198] do_one_initcall+0x4a/0x1fa [ 11.487738] ? _cond_resched+0x19/0x40 [ 11.495227] ? kmem_cache_alloc_trace+0x1ff/0x210 [ 11.502700] do_init_module+0x5f/0x227 [ 11.510944] load_module+0x1b96/0x2140 [ 11.517993] __do_sys_finit_module+0xfc/0x120 [ 11.525101] ? 
__do_sys_finit_module+0xfc/0x120 [ 11.533182] __x64_sys_finit_module+0x1a/0x20 [ 11.542123] do_syscall_64+0x5a/0x130 [ 11.549183] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 11.557921] RIP: 0033:0x7fb31cbaba7d [ 11.565182] Code: 48 89 57 30 48 8b 04 24 48 89 47 38 e9 79 9e 02 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3a fd ff ff c3 48 c7 c6 01 00 00 00 e9 a1 [ 11.581697] RSP: 002b:7ffdf6793c18 EFLAGS: 0206 ORIG_RAX: 0139 [ 11.589245] RAX: ffda RBX: RCX: 7fb31cbaba7d [ 11.597913] RDX: RSI: 004ab235 RDI: 0003 [ 11.605694] RBP: 004ab235 R08: 00c7 R09: 7fb31cbeba5f [ 11.613566] R10: R11: 0206 R12: 0003 [ 11.620772] R13: 00ab3c70 R14: 00ab3cc0 R15: [ 11.628586] Modules linked in: hid(+) virtio_rng virtio_gpu drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm virtio_scsi virtio_net net_failover failover virtio_input virtio_blk qemu_fw_cfg 9pnet_virtio 9pnet pcnet32 8139cp mii ne2k_pci 8390 e1000 [ 11.654944] ---[ end trace 9a9e8eebda38a127 ]--- [ 11.663441] RIP: 0010:__kmalloc_track_caller+0xa1/0x250 [ 11.671942] Code: 65 49 8b 50 08 65 4c 03 05 b4 48 37 6f 4d 8b 38 4d 85 ff 0f 84 77 01 00 00 41 8b 59 20 49 8b 39 48 8d 4a 01 4c 89 f8 4c 01 fb <48> 33 1b 49 33 99 70 01 00 00 65 48 0f c7 0f 0f 94 c0 84 c0 74 bd [ 11.689167] RSP: 0018:b524801afaf0 EFLAGS: 0206 [ 11.698903] RAX: 51f2a72f63305b11 RBX: 51f2a72f63305b11 RCX: 2b7e [ 11.707107] RDX: 2b7d RSI: 0cc0 RDI: 0002f040 [ 11.715748] RBP: b524801afb28 R08: 90480762f040 R09: 904807001c40 [ 11.724372] R10: b524801afc28 R11: 0001 R12: 0cc0 [ 11.735147] R13: 0004 R14: 904807001c40 R15: 51f2a72f63305b11 [ 11.747065] FS: 7fb31d2486a0() GS:90480760() knlGS: [ 11.755136] CS: 0010 DS: ES:
[Yahoo-eng-team] [Bug 2033393] Re: Nova does not update libvirts instance name after server rename
The instance name in the XML is not the instance name according to nova. It is generated based on a template by the compute driver and is not otherwise mutable. So this is operating as designed. ** Changed in: nova Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/2033393 Title: Nova does not update libvirts instance name after server rename Status in OpenStack Compute (nova): Won't Fix Bug description: Description === When renaming an OpenStack instance, the change is not reflected in the Libvirt XML configuration, leading to inconsistency between the instance name in OpenStack and the name stored in the Libvirt configuration. Steps to reproduce == * Launch an instance * Verify the instance name is correct: virsh dumpxml instance-00xx | grep '' * Rename the instance: openstack server set --name NEW * Check Libvirt config again Expected result === The instance name change should be synchronized across all components, including the underlying Libvirt configuration. Actual result = The instance name is only changed in the database. The change is not propagated to the Libvirt configuration. Environment === Kolla Containers Version: Xena Hypervisor Type: Libvirt KVM To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/2033393/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
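For illustration, the libvirt domain name is rendered from nova's instance_name_template option (default 'instance-%08x') applied to the instance's integer database id, so it is fixed at creation and unrelated to the user-visible display name:

    # Simplified rendering of how the domain name is produced.
    INSTANCE_NAME_TEMPLATE = 'instance-%08x'   # [DEFAULT]/instance_name_template


    def libvirt_domain_name(instance_db_id):
        return INSTANCE_NAME_TEMPLATE % instance_db_id


    print(libvirt_domain_name(0x4a))   # instance-0000004a, regardless of what
                                       # `openstack server set --name` did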
[Yahoo-eng-team] [Bug 2038840] [NEW] CPU state management fails if cpu0 is in dedicated set
Public bug reported: If an operator configures cpu0 in the dedicated set and enables state management, nova-compute will fail on startup with this obscure error: Oct 06 20:08:43.195137 np0035436890 nova-compute[104711]: ERROR oslo_service.service nova.exception.FileNotFound: File /sys/devices/system/cpu/cpu0/online could not be found. The problem is that cpu0 is not hot-pluggable and thus has no online knob. Nova should log a better error message in this case, at least. ** Affects: nova Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/2038840 Title: CPU state management fails if cpu0 is in dedicated set Status in OpenStack Compute (nova): New Bug description: If an operator configures cpu0 in the dedicated set and enables state management, nova-compute will fail on startup with this obscure error: Oct 06 20:08:43.195137 np0035436890 nova-compute[104711]: ERROR oslo_service.service nova.exception.FileNotFound: File /sys/devices/system/cpu/cpu0/online could not be found. The problem is that cpu0 is not hot-pluggable and thus has no online knob. Nova should log a better error message in this case, at least. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/2038840/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
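A sketch of the friendlier handling the report asks for: cpu0 (and any other non-hot-pluggable CPU) has no sysfs online knob, so detect that up front and say so, rather than surfacing a generic FileNotFound. The sysfs path is the standard one; the function is illustrative, not nova's implementation:

    import os

    ONLINE_PATH = '/sys/devices/system/cpu/cpu%d/online'


    def set_cpu_online(cpu_id, online=True):
        path = ONLINE_PATH % cpu_id
        if not os.path.exists(path):
            raise RuntimeError(
                'CPU %d has no %s (it is not hot-pluggable); remove it from '
                'the dedicated set or disable CPU state management.'
                % (cpu_id, path))
        with open(path, 'w') as f:
            f.write('1' if online else '0')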
[Yahoo-eng-team] [Bug 2039463] [NEW] live migration jobs failing missing lxml
Public bug reported: Our jobs that run the evacuate post hook are failing due to not being able to run the ansible virt module because of a missing lxml library: 2023-10-16 14:38:57.818847 | TASK [run-evacuate-hook : Register running domains on subnode] 2023-10-16 14:38:58.598524 | controller -> 172.99.67.184 | ERROR 2023-10-16 14:38:58.598912 | controller -> 172.99.67.184 | { 2023-10-16 14:38:58.598981 | controller -> 172.99.67.184 | "msg": "The `lxml` module is not importable. Check the requirements." 2023-10-16 14:38:58.599046 | controller -> 172.99.67.184 | } Not sure why this is coming up now, but it's likely related to the recent switch to global venv for our services and some other dep change that no longer gets us this on the host for free. ** Affects: nova Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/2039463 Title: live migration jobs failing missing lxml Status in OpenStack Compute (nova): New Bug description: Our jobs that run the evacuate post hook are failing due to not being able to run the ansible virt module because of a missing lxml library: 2023-10-16 14:38:57.818847 | TASK [run-evacuate-hook : Register running domains on subnode] 2023-10-16 14:38:58.598524 | controller -> 172.99.67.184 | ERROR 2023-10-16 14:38:58.598912 | controller -> 172.99.67.184 | { 2023-10-16 14:38:58.598981 | controller -> 172.99.67.184 | "msg": "The `lxml` module is not importable. Check the requirements." 2023-10-16 14:38:58.599046 | controller -> 172.99.67.184 | } Not sure why this is coming up now, but it's likely related to the recent switch to global venv for our services and some other dep change that no longer gets us this on the host for free. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/2039463/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 2051108] Re: Support for the "bring your own keys" approach for Cinder
** Also affects: cinder Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/2051108 Title: Support for the "bring your own keys" approach for Cinder Status in Cinder: New Status in OpenStack Compute (nova): New Bug description: Description === Cinder currently lacks an API to create a volume with a predefined (e.g. already stored in Barbican) encryption key. This feature would be useful for use cases where end users should be able to store keys that are later used to encrypt volumes. The workflow would be as follows: 1. End user creates a new key and stores it in OpenStack Barbican 2. User requests a new volume with volume type "LUKS" and gives an "encryption_reference_key_id" (or just "key_id"). 3. Internally the key is copied (like in volume_utils.clone_encryption_key_()) and a new "encryption_key_id" is assigned to the volume. To manage notifications about this bug go to: https://bugs.launchpad.net/cinder/+bug/2051108/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
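A hedged sketch of the requested flow from a client's point of view. Step 1 uses keystoneauth1 and python-barbicanclient as documented; the auth URL and credentials are placeholders, and "encryption_reference_key_id" is only the parameter proposed in this report -- it does not exist in Cinder today, so step 2 is shown purely as a comment:

import os

from keystoneauth1 import identity, session
from barbicanclient import client as barbican_client

auth = identity.Password(auth_url='http://controller:5000/v3',   # placeholder
                         username='demo', password='secret',
                         project_name='demo',
                         user_domain_name='Default',
                         project_domain_name='Default')
sess = session.Session(auth=auth)

# Step 1: the end user creates and stores a key in Barbican.
barbican = barbican_client.Client(session=sess)
secret = barbican.secrets.create(name='my-volume-key',
                                 payload=os.urandom(32).hex())
key_ref = secret.store()   # Barbican secret href the user keeps

# Step 2 (proposed in this report, not implemented today): ask Cinder for a
# LUKS volume that reuses this key, e.g. something along the lines of
#   openstack volume create --type LUKS --size 10 \
#       --property encryption_reference_key_id=<key_ref> my-volume
# Cinder would then clone the secret internally (as in
# volume_utils.clone_encryption_key_()) and store the clone's id as the
# volume's encryption_key_id.
print('stored key reference:', key_ref)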
[Yahoo-eng-team] [Bug 2079850] [NEW] Ephemeral with vfat format fails inspection
Public bug reported: When configured to format ephemerals as vfat, we get this failure: Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.358 2 DEBUG oslo_utils.imageutils.format_inspector [None req-fcf3a278-3417-4a6d-8b10-66e91ca1677d 60ed4d3e522640b6ad19633b28c5b5bb ae43aec9c3c242a785c8256abdda1747 - - default default] Format inspector failed, aborting: Signature KDMV not found: b'\xebX\x90m' _process_chunk /usr/lib/python3.9/site-packages/oslo_utils/imageutils/format_inspector.py:1302 Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.365 2 DEBUG oslo_utils.imageutils.format_inspector [None req-fcf3a278-3417-4a6d-8b10-66e91ca1677d 60ed4d3e522640b6ad19633b28c5b5bb ae43aec9c3c242a785c8256abdda1747 - - default default] Format inspector failed, aborting: Region signature not found at 3 _process_chunk /usr/lib/python3.9/site-packages/oslo_utils/imageutils/format_inspector.py:1302 Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.366 2 WARNING oslo_utils.imageutils.format_inspector [None req-fcf3a278-3417-4a6d-8b10-66e91ca1677d 60ed4d3e522640b6ad19633b28c5b5bb ae43aec9c3c242a785c8256abdda1747 - - default default] Safety check mbr on gpt failed because GPT MBR has no partitions defined: oslo_utils.imageutils.format_inspector.SafetyViolation: GPT MBR has no partitions defined Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.366 2 WARNING nova.virt.libvirt.imagebackend [None req-fcf3a278-3417-4a6d-8b10-66e91ca1677d 60ed4d3e522640b6ad19633b28c5b5bb ae43aec9c3c242a785c8256abdda1747 - - default default] Base image /var/lib/nova/instances/_base/ephemeral_1_0706d66 failed safety check: Safety checks failed: mbr: oslo_utils.imageutils.format_inspector.SafetyCheckFailed: Safety checks failed: mbr Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2 ERROR nova.compute.manager [None req-fcf3a278-3417-4a6d-8b10-66e91ca1677d 60ed4d3e522640b6ad19633b28c5b5bb ae43aec9c3c242a785c8256abdda1747 - - default default] [instance: 263ccd01-10b1-46a6-9f81-a6fc27c7177a] Instance failed to spawn: nova.exception.InvalidDiskInfo: Disk info file is invalid: Base image failed safety check Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2 ERROR nova.compute.manager [instance: 263ccd01-10b1-46a6-9f81-a6fc27c7177a] Traceback (most recent call last): Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2 ERROR nova.compute.manager [instance: 263ccd01-10b1-46a6-9f81-a6fc27c7177a] File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/imagebackend.py", line 685, in create_image Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2 ERROR nova.compute.manager [instance: 263ccd01-10b1-46a6-9f81-a6fc27c7177a] inspector.safety_check() Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2 ERROR nova.compute.manager [instance: 263ccd01-10b1-46a6-9f81-a6fc27c7177a] File "/usr/lib/python3.9/site-packages/oslo_utils/imageutils/format_inspector.py", line 430, in safety_check Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2 ERROR nova.compute.manager [instance: 263ccd01-10b1-46a6-9f81-a6fc27c7177a] raise SafetyCheckFailed(failures) Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2 ERROR nova.compute.manager [instance: 263ccd01-10b1-46a6-9f81-a6fc27c7177a] oslo_utils.imageutils.format_inspector.SafetyCheckFailed: Safety checks failed: mbr Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2 ERROR 
nova.compute.manager [instance: 263ccd01-10b1-46a6-9f81-a6fc27c7177a] Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2 ERROR nova.compute.manager [instance: 263ccd01-10b1-46a6-9f81-a6fc27c7177a] During handling of the above exception, another exception occurred: Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2 ERROR nova.compute.manager [instance: 263ccd01-10b1-46a6-9f81-a6fc27c7177a] Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2 ERROR nova.compute.manager [instance: 263ccd01-10b1-46a6-9f81-a6fc27c7177a] Traceback (most recent call last): Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2 ERROR nova.compute.manager [instance: 263ccd01-10b1-46a6-9f81-a6fc27c7177a] File "/usr/lib/python3.9/site-packages/nova/compute/manager.py", line 2894, in _build_resources Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2 ERROR nova.compute.manager [instance: 263ccd01-10b1-46a6-9f81-a6fc27c7177a] yield resources Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2 ERROR nova.compute.manager [instance: 263ccd01-10b1-46a6-9f81-a6fc27c7177a] File "/usr/lib/python3.9/site-packages/nova/compute/manager.py", line 2641, in _build_and_run_instance Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2
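For context on the first failure line in the trace above, here is a hedged, stdlib-only illustration (not nova code): a freshly formatted FAT32 filesystem starts with an x86 jump (EB 58 90) followed by the OEM name mkfs writes (e.g. "mkfs.fat"), so its first four bytes are exactly the b'\xebX\x90m' the VMDK check reports while looking for "KDMV":

# Hedged illustration: why a vfat ephemeral trips the image signature checks.
# The header bytes below mimic what mkfs.vfat writes for FAT32; nothing here
# is nova or oslo.utils code.
FAT32_HEADER = b"\xebX\x90" + b"mkfs.fat"

print(FAT32_HEADER[:4])                 # b'\xebX\x90m' -- matches the log
print(FAT32_HEADER[:4] == b"KDMV")      # False: not a VMDK signature
print(FAT32_HEADER[:4] == b"QFI\xfb")   # False: not a qcow2 signature either
# A raw filesystem image has no disk-image signature at all, so every format
# probe "fails" and the GPT/MBR safety check then rejects the file, which is
# what the imagebackend safety check turns into InvalidDiskInfo above.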
[Yahoo-eng-team] [Bug 2012530] [NEW] nova-scheduler will crash at startup if placement is not up
Public bug reported: This is the same problem as https://bugs.launchpad.net/nova/+bug/1846820 but for scheduler. Because we initialize our placement client during manager init, we will crash (and loop) on startup if keystone or placement are down. Example trace: Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova.scheduler.client.report [None req-edf5-6f86-4910-a458-72decae8e451 None None] Failed to initialize placement client (is keystone available?): openstack.exceptions.NotSupported: The placement service for 192.168.122.154:RegionOne exists but does not have any supported versions. Mar 22 15:54:39 jammy nova-scheduler[119746]: CRITICAL nova [None req-edf5-6f86-4910-a458-72decae8e451 None None] Unhandled error: openstack.exceptions.NotSupported: The placement service for 192.168.122.154:RegionOne exists but does not have any supported versions. Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova Traceback (most recent call last): Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova File "/usr/local/bin/nova-scheduler", line 10, in Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova sys.exit(main()) Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova File "/opt/stack/nova/nova/cmd/scheduler.py", line 47, in main Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova server = service.Service.create( Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova File "/opt/stack/nova/nova/service.py", line 252, in create Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova service_obj = cls(host, binary, topic, manager, Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova File "/opt/stack/nova/nova/service.py", line 116, in __init__ Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova self.manager = manager_class(host=self.host, *args, **kwargs) Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova File "/opt/stack/nova/nova/scheduler/manager.py", line 70, in __init__ Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova self.placement_client = report.report_client_singleton() Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova File "/opt/stack/nova/nova/scheduler/client/report.py", line 91, in report_client_singleton Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova PLACEMENTCLIENT = SchedulerReportClient() Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova File "/opt/stack/nova/nova/scheduler/client/report.py", line 234, in __init__ Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova self._client = self._create_client() Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova File "/opt/stack/nova/nova/scheduler/client/report.py", line 277, in _create_client Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova client = self._adapter or utils.get_sdk_adapter('placement') Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova File "/opt/stack/nova/nova/utils.py", line 984, in get_sdk_adapter Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova return getattr(conn, service_type) Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova File "/usr/local/lib/python3.10/dist-packages/openstack/service_description.py", line 87, in __get__ Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova proxy = self._make_proxy(instance) Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova File "/usr/local/lib/python3.10/dist-packages/openstack/service_description.py", line 266, in _make_proxy Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova raise exceptions.NotSupported( Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR 
nova openstack.exceptions.NotSupported: The placement service for 192.168.122.154:RegionOne exists but does not have any supported versions. Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova ** Affects: nova Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/2012530 Title: nova-scheduler will crash at startup if placement is not up Status in OpenStack Compute (nova): New Bug description: This is the same problem as https://bugs.launchpad.net/nova/+bug/1846820 but for scheduler. Because we initialize our placement client during manager init, we will crash (and loop) on startup if keystone or placement are down. Example trace: Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova.scheduler.client.report [None req-edf5-6f86-4910-a458-72decae8e451 None None] Failed to initialize placement client (is keystone available?): openstack.exceptions.NotSupported: The placement service for 192.168.122.154:RegionOne exists but does not have any supported versions. Mar 22 15:54:39 jammy nova-scheduler[119746]: CRITICAL nova [None req-edf5-6f86-4910-a458-72decae8e451 None None] Unhandled error: openstack.exception
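A hedged sketch of the retry-on-startup behaviour this report argues for (generic Python with hypothetical function names, not the actual nova change): keep retrying client construction, and fall back to lazy initialization, instead of letting the service manager die and restart-loop while keystone or placement are down:

import logging
import time

LOG = logging.getLogger(__name__)

class PlacementUnavailable(Exception):
    """Stand-in for openstack.exceptions.NotSupported / keystone errors."""

def build_placement_client():
    # Hypothetical constructor standing in for SchedulerReportClient();
    # it raises while keystone/placement are unreachable.
    raise PlacementUnavailable("placement has no supported versions yet")

def get_client_with_retries(retries=5, delay=2.0):
    """Retry client setup instead of crashing the scheduler at startup."""
    for attempt in range(1, retries + 1):
        try:
            return build_placement_client()
        except PlacementUnavailable as exc:
            LOG.warning("Placement client init failed (attempt %d/%d): %s",
                        attempt, retries, exc)
            time.sleep(delay)
    # Fall back to lazy init: let the first scheduling request try again
    # rather than killing the whole service.
    return None

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    print("client:", get_client_with_retries(retries=2, delay=0.1))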
[Yahoo-eng-team] [Bug 1853048] Re: Nova not updating VM's XML in KVM
Nova does not even call down to the compute node when attributes like display_name are changed. The next time the xml is updated would be when it is regenerated, like during a lifecycle event (hard reboot) or migration. Ceilometer scraping that information out of the libvirt XML underneath nova is, as expected, not reliable. Changing this would require a new RPC call, would add load to rabbit and the computes, and would introduce additional traffic between nova and libvirt. If there were a strong use case for this, maybe that would be worthwhile, but I don't think ceilometer wanting to scrape those metadata items from the libvirt XML is strong enough. ** Changed in: nova Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1853048 Title: Nova not updating VM's XML in KVM Status in Ceilometer: New Status in OpenStack Compute (nova): Invalid Bug description: Ceilometer was causing resources to have a huge amount of revisions on Gnocchi (1000+), because the compute pollsters were constantly pushing outdated attributes. This can happen when users (as an example) update the name of VMs. The name is not updated in the VM's XML that is stored in the KVM host. This causes the Ceilometer compute pollster to constantly push outdated attributes that trigger resource revisions on Gnocchi (if we have other pollsters pushing the right attribute value that is gathered from OpenStack API). We are using OpenStack Rocky, and Nova version is 18.0.1. To manage notifications about this bug go to: https://bugs.launchpad.net/ceilometer/+bug/1853048/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1858877] [NEW] Silent wasted storage with multiple RBD backends
Public bug reported: Nova does not currently support multiple rbd backends. However, Glance does and an operator may point Nova at a Glance with access to multiple RBD clusters. If this happens, Nova will silently download the image from Glance, flatten it, and upload it to the local RBD cluster named privately to the image. If another instance is booted from the same image, this will happen again, using more network resources and duplicating the image on ceph for the second and subsequent instances. When configuring Nova and Glance for shared RBD, the expectation is that instances are fast-cloned from Glance base images, so this silent behavior of using a lot of storage would be highly undesirable and unexpected. Since operators control the backend config, but users upload images (and currently only to one backend), it is the users that would trigger this additional consumption of storage. This isn't really a bug in Nova per se, since Nova does not claim to support multiple backends and is download/uploading the image in the same way it would if the image was located on any other not-the-same-as- my-RBD-cluster location. It is, however, unexpected and undesirable behavior. ** Affects: nova Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1858877 Title: Silent wasted storage with multiple RBD backends Status in OpenStack Compute (nova): New Bug description: Nova does not currently support multiple rbd backends. However, Glance does and an operator may point Nova at a Glance with access to multiple RBD clusters. If this happens, Nova will silently download the image from Glance, flatten it, and upload it to the local RBD cluster named privately to the image. If another instance is booted from the same image, this will happen again, using more network resources and duplicating the image on ceph for the second and subsequent instances. When configuring Nova and Glance for shared RBD, the expectation is that instances are fast-cloned from Glance base images, so this silent behavior of using a lot of storage would be highly undesirable and unexpected. Since operators control the backend config, but users upload images (and currently only to one backend), it is the users that would trigger this additional consumption of storage. This isn't really a bug in Nova per se, since Nova does not claim to support multiple backends and is download/uploading the image in the same way it would if the image was located on any other not-the-same- as-my-RBD-cluster location. It is, however, unexpected and undesirable behavior. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1858877/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
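For reference, a hedged sketch (not the imagebackend code) of the decision behind the fallback described above: a glance RBD location looks like rbd://<cluster_fsid>/<pool>/<image>/<snapshot>, and roughly speaking nova can only COW-clone when that fsid matches the cluster it is configured for; anything else silently becomes download, flatten and re-upload:

from urllib.parse import urlparse

LOCAL_FSID = "6a9e92c2-0d25-4b0f-8fdb-5a2f3c8d2f10"   # made-up local cluster fsid

def is_cloneable(direct_url, local_fsid=LOCAL_FSID):
    """Rough check mirroring the 'same cluster?' decision.

    If the fsid in the image location is not our cluster's, nova cannot
    fast-clone and instead downloads, flattens and re-uploads the image.
    """
    parsed = urlparse(direct_url)
    if parsed.scheme != "rbd":
        return False
    pieces = [parsed.netloc] + parsed.path.lstrip("/").split("/")
    if len(pieces) != 4 or not all(pieces):
        return False
    return pieces[0] == local_fsid

same = "rbd://%s/images/abc123/snap" % LOCAL_FSID
other = "rbd://11111111-2222-3333-4444-555555555555/images/abc123/snap"
print(is_cloneable(same))    # True  -> fast clone from the shared cluster
print(is_cloneable(other))   # False -> silent download/flatten/upload per instance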
[Yahoo-eng-team] [Bug 1884587] [NEW] image import copy-to-store API should reflect proper authorization
Public bug reported: In testing the image import copy-to-store mechanism from Nova, I hit an issue that seems clearly to be a bug. Scenario: A user boots an instance from an image they have permission to see. Nova uses their credentials to start an image import copy-to-store operation, which succeeds: "POST /v2/images/e6b1a7d0-ccd8-4be3-bef7-69c68fca4313/import HTTP/1.1" 202 211 0.481190 Task [888e97e5-496d-4b94-b530-218d633f866a] status changing from pending to processing Note the 202 return code. My code polls for a $timeout period, waiting for the image to either arrive at the new store, or be marked as error, which never happens ($timeout=600s). The glance log shows (trace truncated): glance-api[14039]: File "/opt/stack/glance/glance/async_/flows/api_image_import.py", line 481, in get_flow glance-api[14039]: stores if glance-api[14039]: File "/opt/stack/glance/glance/api/authorization.py", line 296, in forbidden_key glance-api[14039]: raise exception.Forbidden(message % key) glance-api[14039]: glance.common.exception.Forbidden: You are not permitted to modify 'os_glance_importing_to_stores' on this image. So apparently Nova is unable to use the user's credentials to initiate a copy-to-store operation. That surprises me and I think it likely isn't the access control we should be enforcing. However, if we're going to reject the operation, we should reject it at the time the HTTP response is sent, not later async, since we can check authorization right then and there. The problem in this case is that from the outside, I have no way of knowing that the task fails subsequently. I receive a 202, which means I should start polling for completion. The task fails to load/run and thus can't update any status on the image, and I'm left to wait for 600s before I give up. So, at the very least, we're not checking the same set of permissions during the HTTP POST call, and we should be. I also would tend to argue that the user should be allowed to copy the image and not require an admin to do it, perhaps with some additional policy element to control that. However, I have to be able to determine when and when not to wait for 600s. ** Affects: glance Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance. https://bugs.launchpad.net/bugs/1884587 Title: image import copy-to-store API should reflect proper authorization Status in Glance: New Bug description: In testing the image import copy-to-store mechanism from Nova, I hit an issue that seems clearly to be a bug. Scenario: A user boots an instance from an image they have permission to see. Nova uses their credentials to start an image import copy-to-store operation, which succeeds: "POST /v2/images/e6b1a7d0-ccd8-4be3-bef7-69c68fca4313/import HTTP/1.1" 202 211 0.481190 Task [888e97e5-496d-4b94-b530-218d633f866a] status changing from pending to processing Note the 202 return code. My code polls for a $timeout period, waiting for the image to either arrive at the new store, or be marked as error, which never happens ($timeout=600s). 
The glance log shows (trace truncated): glance-api[14039]: File "/opt/stack/glance/glance/async_/flows/api_image_import.py", line 481, in get_flow glance-api[14039]: stores if glance-api[14039]: File "/opt/stack/glance/glance/api/authorization.py", line 296, in forbidden_key glance-api[14039]: raise exception.Forbidden(message % key) glance-api[14039]: glance.common.exception.Forbidden: You are not permitted to modify 'os_glance_importing_to_stores' on this image. So apparently Nova is unable to use the user's credentials to initiate a copy-to-store operation. That surprises me and I think it likely isn't the access control we should be enforcing. However, if we're going to reject the operation, we should reject it at the time the HTTP response is sent, not later async, since we can check authorization right then and there. The problem in this case is that from the outside, I have no way of knowing that the task fails subsequently. I receive a 202, which means I should start polling for completion. The task fails to load/run and thus can't update any status on the image, and I'm left to wait for 600s before I give up. So, at the very least, we're not checking the same set of permissions during the HTTP POST call, and we should be. I also would tend to argue that the user should be allowed to copy the image and not require an admin to do it, perhaps with some additional policy element to control that. However, I have to be able to determine when and when not to wait for 600s. To manage notifications about this bug go to: https://bugs.launchpad.net/glance/+bug/1884587/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https:
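A hedged sketch of the polling loop the reporter describes (plain requests against the images API; the endpoint, token and the exact property names are assumptions, not verified glance contract): because the only signals a client gets are the image's store list and the failed-import property, a task that dies before updating either leaves the caller waiting out the full timeout:

import time
import requests   # assumes python-requests is available

GLANCE = "http://controller:9292"          # placeholder endpoint
TOKEN = "gAAAA..."                         # placeholder keystone token
IMAGE_ID = "e6b1a7d0-ccd8-4be3-bef7-69c68fca4313"

def wait_for_copy(store, timeout=600, interval=5):
    """Poll until `store` appears in the image's stores, fails, or we time out."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        resp = requests.get("%s/v2/images/%s" % (GLANCE, IMAGE_ID),
                            headers={"X-Auth-Token": TOKEN})
        resp.raise_for_status()
        image = resp.json()
        # 'stores' and 'os_glance_failed_import' are comma-separated lists
        # glance maintains during import (property names as an assumption).
        if store in image.get("stores", "").split(","):
            return True
        if store in image.get("os_glance_failed_import", "").split(","):
            raise RuntimeError("copy to %s failed" % store)
        time.sleep(interval)
    # This is the bad case from the report: the task died before updating
    # any property, so the only signal the caller ever gets is this timeout.
    raise TimeoutError("no result for store %s after %ss" % (store, timeout))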
[Yahoo-eng-team] [Bug 1884596] [NEW] image import copy-to-store will start multiple importing threads due to race condition
Public bug reported: I'm filing this bug a little prematurely because Abhi and I didn't get a chance to fully discuss it. However, looking at the code and the behavior I'm seeing due to another bug (1884587), I feel rather confident. Especially in a situation where glance is running on multiple control plane nodes (i.e. any real-world situation), I believe there is a race condition whereby two closely-timed requests to copy an image to a store will result in two copy operations in glance proceeding in parallel. I believe this to be the case due to a common "test-and-set that isn't atomic" error. In the API layer, glance checks that an import copy-to-store operation isn't already in progress here: https://github.com/openstack/glance/blob/e6db0b10a703037f754007bef6f56451086850cd/glance/api/v2/images.py#L167 And if that passes, it proceeds to setup the task as a thread here: https://github.com/openstack/glance/blob/e6db0b10a703037f754007bef6f56451086850cd/glance/api/v2/images.py#L197 which may start running immediately or sometime in the future. Once running, that code updates a property on the image to indicate that the task is running here: https://github.com/openstack/glance/blob/e6db0b10a703037f754007bef6f56451086850cd/glance/async_/flows/api_image_import.py#L479-L484 Between those two events, if another API user makes the same request, glance will not realize that a thread is already running to complete the initial task and will start another. In a situation where a user spawns a thousand new instances to a thousand compute nodes in a single operation where the image needs copying first, it's highly plausible to have _many_ duplicate glance operations going, impacting write performance on the rbd cluster at the very least. As evidence that this can happen, we see an abnormally extended race window because of the aforementioned bug (1884587) where we fail to update the property that indicates the task is running. In a test we see a large number of them get started, followed by a cascade of failures when they fail to update that image property, implying that many such threads are running. If this situation is allowed to happen when the property does *not* fail to update, I believe we would end up with glance copying the image to the destination in multiple threads simultaneously. That is much harder to simulate in practice in a development environment, but the other bug makes it happen every time since we never update the image property to prevent it and thus the window is long. Abhi also brought up the case where if this race occurs on the same node, the second attempt *may* actually start copying the partial image in the staging directory to the destination, finish early, and then mark the image as "copied to $store" such that nova will attempt to use the partial image immediately, resulting in a corrupted disk and various levels of failure after that. Note that it's not clear if that's really possible or not, but I'm putting it here so the glance gurus can validate. The use of the os_glance_importing_to_stores property to "lock" a copy to a particular store is good, except that updating that list atomically means that the above mentioned race will not have anything to check after the update to see if it was the race loser. I don't see any checks in the persistence layer to ensure that an UPDATE to the row with this property doesn't already have a given store in it, or do any kind of merge. 
This also leads me to worry that two parallel requests to copy an image to two different stores may result in clobbering the list of stores-in-progress and potentially also the final list of stores at rest. This is just conjecture at this point, I just haven't seen anywhere that situation is accounted for. ** Affects: glance Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance. https://bugs.launchpad.net/bugs/1884596 Title: image import copy-to-store will start multiple importing threads due to race condition Status in Glance: New Bug description: I'm filing this bug a little prematurely because Abhi and I didn't get a chance to fully discuss it. However, looking at the code and the behavior I'm seeing due to another bug (1884587), I feel rather confident. Especially in a situation where glance is running on multiple control plane nodes (i.e. any real-world situation), I believe there is a race condition whereby two closely-timed requests to copy an image to a store will result in two copy operations in glance proceeding in parallel. I believe this to be the case due to a common "test-and-set that isn't atomic" error. In the API layer, glance checks that an import copy-to-store operation isn't already in progress here: https://github.com/openstack/glance/blob/e6db0b10a703037f754007bef6f56451086850cd/glance/api/v2/images.py#L167 And if that passes,
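To make the "test-and-set that isn't atomic" point concrete, here is a hedged, self-contained sketch (sqlite3 stand-in, not glance's schema or persistence layer): the check and the property update need to be one conditional UPDATE whose rowcount tells the loser it lost, rather than a SELECT followed by a separate UPDATE:

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (id TEXT PRIMARY KEY, importing_to_stores TEXT)")
conn.execute("INSERT INTO images VALUES ('img-1', '')")
conn.commit()

def try_claim(store):
    """Atomically claim the copy-to-store operation for `store`.

    The WHERE clause makes check+set a single statement; only one of two
    racing requests sees rowcount == 1, and the other learns it lost.
    """
    cur = conn.execute(
        "UPDATE images SET importing_to_stores = importing_to_stores || ?"
        " WHERE id = 'img-1'"
        " AND instr(importing_to_stores, ?) = 0",
        (store + ",", store))
    conn.commit()
    return cur.rowcount == 1

print(try_claim("rbd-backup"))   # True: the first request wins and starts the task
print(try_claim("rbd-backup"))   # False: the closely-timed duplicate backs off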
[Yahoo-eng-team] [Bug 1885003] [NEW] Interrupted copy-to-store may corrupt a subsequent operation
Public bug reported: This is a hypothetical (but very possible) scenario that will result in a corrupted image stored by glance. I don't have code to reproduce it, but discussion seems to indicate that it is possible. Scenario: 1. Upload image to glance to one store, everything is good 2. Start an image_import(method='copy-to-store') to copy the image to another store 3. Power failure, network failure, or `killall -9 glance-api` 4. After the failure, re-request the copy-to-store 5. That glance worker will see the residue of the image in the staging directory, which is only partial because the process never finished, and will start uploading that to the new store 6. Upon completion, the image will appear in two stores, but one of them will be quietly corrupted ** Affects: glance Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to Glance. https://bugs.launchpad.net/bugs/1885003 Title: Interrupted copy-to-store may corrupt a subsequent operation Status in Glance: New Bug description: This is a hypothetical (but very possible) scenario that will result in a corrupted image stored by glance. I don't have code to reproduce it, but discussion seems to indicate that it is possible. Scenario: 1. Upload image to glance to one store, everything is good 2. Start an image_import(method='copy-to-store') to copy the image to another store 3. Power failure, network failure, or `killall -9 glance-api` 4. After the failure, re-request the copy-to-store 5. That glance worker will see the residue of the image in the staging directory, which is only partial because the process never finished, and will start uploading that to the new store 6. Upon completion, the image will appear in two stores, but one of them will be quietly corrupted To manage notifications about this bug go to: https://bugs.launchpad.net/glance/+bug/1885003/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
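A hedged sketch of the kind of sanity check step 5 is missing (stdlib only; the staging path and expected values are illustrative): before re-using staging residue, compare it against the image's known size and checksum and discard partial data instead of uploading it:

import hashlib
import os

def staging_residue_is_trustworthy(staging_path, expected_size, expected_sha256=None):
    """Return True only if leftover staging data matches the original image.

    A partial file from an interrupted import fails the size check; a
    complete-but-corrupted one fails the checksum check.
    """
    if not os.path.exists(staging_path):
        return False
    if os.path.getsize(staging_path) != expected_size:
        return False
    if expected_sha256 is not None:
        digest = hashlib.sha256()
        with open(staging_path, "rb") as fh:
            for chunk in iter(lambda: fh.read(1 << 20), b""):
                digest.update(chunk)
        if digest.hexdigest() != expected_sha256:
            return False
    return True

# Illustrative usage with made-up values:
print(staging_residue_is_trustworthy("/tmp/staging/img-1", expected_size=1024))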
[Yahoo-eng-team] [Bug 1746294] [NEW] Scheduler requests unlimited results from placement
Public bug reported: The scheduler will request an infinitely-large host set from placement during scheduling operations. This may be very large on big clouds and makes for a huge JSON response from placement to scheduler each time a single host needs to be selected. ** Affects: nova Importance: Medium Assignee: Dan Smith (danms) Status: In Progress ** Tags: queens-rc-potential scheduler -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1746294 Title: Scheduler requests unlimited results from placement Status in OpenStack Compute (nova): In Progress Bug description: The scheduler will request an infinitely-large host set from placement during scheduling operations. This may be very large on big clouds and makes for a huge JSON response from placement to scheduler each time a single host needs to be selected. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1746294/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1755602] [NEW] Ironic computes may not be discovered when node count is less than compute count
Public bug reported: In an ironic deployment being built from day zero, there is an ordering problem, which generates a race condition for operators. Consider this common example: At config time, you create and start three nova-compute services pointing at your ironic deployment. These three will be HA using the ironic driver's hash ring functionality. At config time, there are no ironic nodes present yet, which means running discover_hosts will create no host mappings. Next, a single ironic node is added, which is owned by one of the computes per the hash rules. At this point, you can run discover_hosts and whatever compute owns that node will get a host mapping. Then you add a second ironic node, which causes all three nova-computes to rebalance the hash ring. One or more of the ironic nodes will definitely land on one of the other nova-computes and will suddenly be unreachable because there is no host mapping until the next time discover_hosts is run. Since we track the "mapped" bit on compute nodes, and compute nodes move between hosts with ironic, we won't even notice that the new owner nova-compute needs a host mapping. In fact, we won't notice until we get lucky enough to land a never-mapped ironic node on a nova-compute for the first time and then run discover_hosts after that point. For an automated config management system, this is a lot of complexity to handle in order to generate a stable output of a working system. In many cases where you're using ironic to bootstrap another deployment (i.e. tripleo) the number of nodes may be small (less than the computes) for quite some time. There are a couple obvious options I see: 1. Add a --and-services flag to nova-manage, which will also look for all nova-compute services in the cell and make sure those have mappings. This is ideal because we could get all services mapped at config time without even having to have an ironic node in place yet (which is not possible today). We can't do this efficiently right away because nova.services does not have a mapped flag, and thus the scheduler periodic should _not_ include services. 2. We could unset compute_node.mapped any time we re-home an ironic node to a different nova-compute. This would cause our scheduler periodic to notice the change and create a host mapping if it happens to move to an unmapped nova-compute. This generates extra work during normal operating state and also still leaves us with an interval of time where a previously-usable ironic node becomes unusable until the host discovery periodic task runs again. IMHO, we should do #1. It's a backportable change, and it's actually a better workflow for config automation tools than what we have today, even discounting this race. We can do what we did before, which is do it once for backports, and then add a mapped bit in master to make it more efficient, allowing it to be included in the scheduler periodic task. ** Affects: nova Importance: Medium Assignee: Dan Smith (danms) Status: Confirmed ** Tags: cells ** Changed in: nova Importance: Undecided => Medium ** Changed in: nova Status: New => Confirmed ** Bug watch added: Red Hat Bugzilla #1554460 https://bugzilla.redhat.com/show_bug.cgi?id=1554460 -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). 
https://bugs.launchpad.net/bugs/1755602 Title: Ironic computes may not be discovered when node count is less than compute count Status in OpenStack Compute (nova): Confirmed Bug description: In an ironic deployment being built from day zero, there is an ordering problem, which generates a race condition for operators. Consider this common example: At config time, you create and start three nova-compute services pointing at your ironic deployment. These three will be HA using the ironic driver's hash ring functionality. At config time, there are no ironic nodes present yet, which means running discover_hosts will create no host mappings. Next, a single ironic node is added, which is owned by one of the computes per the hash rules. At this point, you can run discover_hosts and whatever compute owns that node will get a host mapping. Then you add a second ironic node, which causes all three nova-computes to rebalance the hash ring. One or more of the ironic nodes will definitely land on one of the other nova-computes and will suddenly be unreachable because there is no host mapping until the next time discover_hosts is run. Since we track the "mapped" bit on compute nodes, and compute nodes move between hosts with ironic, we won't even notice that the new owner nova-compute needs a host mapping. In fact, we won't notice until we get lucky enough to land a never-mapped ironic node on a nova-compute for the first time and then run discover_hosts a
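A toy, self-contained sketch of option #1 above (hypothetical helper and data shapes; real nova would read the services table and host_mappings from the cell and API databases): deriving host mappings from the nova-compute service records makes discovery independent of which ironic nodes currently hash to which compute:

def hosts_needing_mappings(services, existing_mappings):
    """Return nova-compute hosts that have no host mapping yet.

    `services` stands in for the cell's services rows and
    `existing_mappings` for the API database's host_mappings.
    """
    compute_hosts = {s["host"] for s in services if s["binary"] == "nova-compute"}
    return sorted(compute_hosts - set(existing_mappings))

services = [
    {"host": "ironic-compute-1", "binary": "nova-compute"},
    {"host": "ironic-compute-2", "binary": "nova-compute"},
    {"host": "ironic-compute-3", "binary": "nova-compute"},
    {"host": "ctrl-1", "binary": "nova-conductor"},
]
existing = ["ironic-compute-1"]   # only the current node owner discovered so far

# With "--and-services" semantics, all three computes get mappings up front,
# even before any ironic node exists or rebalances onto them.
print(hosts_needing_mappings(services, existing))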
[Yahoo-eng-team] [Bug 1798158] [NEW] Non-templated transport_url will fail if not defined in config
Public bug reported: If transport_url is not defined in the config, we will fail to format a non-templated transport_url in the database like this: ERROR nova.objects.cell_mapping [None req-34831485-adf4-4a0d-bb20-e1736d93a451 None None] Failed to parse [DEFAULT]/transport_url to format cell mapping: AttributeError: 'NoneType' object has no attribute 'find' ERROR nova.objects.cell_mapping Traceback (most recent call last): ERROR nova.objects.cell_mapping File "/opt/stack/nova/nova/objects/cell_mapping.py", line 150, in _format_mq_url ERROR nova.objects.cell_mapping return CellMapping._format_url(url, CONF.transport_url) ERROR nova.objects.cell_mapping File "/opt/stack/nova/nova/objects/cell_mapping.py", line 101, in _format_url ERROR nova.objects.cell_mapping default_url = urlparse.urlparse(default) ERROR nova.objects.cell_mapping File "/usr/lib/python2.7/urlparse.py", line 143, in urlparse ERROR nova.objects.cell_mapping tuple = urlsplit(url, scheme, allow_fragments) ERROR nova.objects.cell_mapping File "/usr/lib/python2.7/urlparse.py", line 182, in urlsplit ERROR nova.objects.cell_mapping i = url.find(':') ERROR nova.objects.cell_mapping AttributeError: 'NoneType' object has no attribute 'find' ERROR nova.objects.cell_mapping ** Affects: nova Importance: Undecided Assignee: Dan Smith (danms) Status: In Progress -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1798158 Title: Non-templated transport_url will fail if not defined in config Status in OpenStack Compute (nova): In Progress Bug description: If transport_url is not defined in the config, we will fail to format a non-templated transport_url in the database like this: ERROR nova.objects.cell_mapping [None req-34831485-adf4-4a0d-bb20-e1736d93a451 None None] Failed to parse [DEFAULT]/transport_url to format cell mapping: AttributeError: 'NoneType' object has no attribute 'find' ERROR nova.objects.cell_mapping Traceback (most recent call last): ERROR nova.objects.cell_mapping File "/opt/stack/nova/nova/objects/cell_mapping.py", line 150, in _format_mq_url ERROR nova.objects.cell_mapping return CellMapping._format_url(url, CONF.transport_url) ERROR nova.objects.cell_mapping File "/opt/stack/nova/nova/objects/cell_mapping.py", line 101, in _format_url ERROR nova.objects.cell_mapping default_url = urlparse.urlparse(default) ERROR nova.objects.cell_mapping File "/usr/lib/python2.7/urlparse.py", line 143, in urlparse ERROR nova.objects.cell_mapping tuple = urlsplit(url, scheme, allow_fragments) ERROR nova.objects.cell_mapping File "/usr/lib/python2.7/urlparse.py", line 182, in urlsplit ERROR nova.objects.cell_mapping i = url.find(':') ERROR nova.objects.cell_mapping AttributeError: 'NoneType' object has no attribute 'find' ERROR nova.objects.cell_mapping To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1798158/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
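The failure reduces to handing None to urlparse when [DEFAULT]/transport_url is unset; a hedged sketch of the minimal guard (plain Python with hypothetical names, not the merged nova fix, and only a very rough stand-in for the real templating rules):

from urllib.parse import urlparse, urlunparse

def format_mq_url(cell_url, default_url=None):
    """Fill a cell's transport_url from the configured default, defensively.

    When no [DEFAULT]/transport_url is configured (default_url is None),
    return the stored URL unchanged instead of parsing None, which is what
    blew up above under Python 2's urlparse.
    """
    if default_url is None:
        return cell_url
    stored = urlparse(cell_url)
    configured = urlparse(default_url)
    # Borrow credentials/host from the configured default when the stored
    # URL omits them.
    netloc = stored.netloc or configured.netloc
    return urlunparse((stored.scheme or configured.scheme, netloc,
                       stored.path, '', '', ''))

print(format_mq_url("rabbit://cell1-host//"))                           # default unset
print(format_mq_url("rabbit:///", "rabbit://guest:guest@ctrl:5672//"))  # borrows netloc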
[Yahoo-eng-team] [Bug 1719966] [NEW] Microversion 2.47 punches nova in its special place
Public bug reported: Testing with 500 instances in ACTIVE, and 500 in ERROR state, using curl to pull all 1000 instances ten times in a row, 2.47 clearly shows a knee in the curve on average response time: https://imgur.com/a/2lmiw We should...fix that and stuff. ** Affects: nova Importance: High Status: Confirmed ** Affects: nova/pike Importance: Undecided Status: New ** Tags: api performance -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1719966 Title: Microversion 2.47 punches nova in its special place Status in OpenStack Compute (nova): Confirmed Status in OpenStack Compute (nova) pike series: New Bug description: Testing with 500 instances in ACTIVE, and 500 in ERROR state, using curl to pull all 1000 instances ten times in a row, 2.47 clearly shows a knee in the curve on average response time: https://imgur.com/a/2lmiw We should...fix that and stuff. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1719966/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1738094] [NEW] TEXT is not large enough to store RequestSpec
Public bug reported: This error occurs during Newton's online_data_migration phase: error: (pymysql.err.DataError) (1406, u"Data too long for column 'spec' at row 1") [SQL: u'INSERT INTO request_specs Which comes from RequestSpec.instance_group.members being extremely large ** Affects: nova Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1738094 Title: TEXT is not large enough to store RequestSpec Status in OpenStack Compute (nova): New Bug description: This error occurs during Newton's online_data_migration phase: error: (pymysql.err.DataError) (1406, u"Data too long for column 'spec' at row 1") [SQL: u'INSERT INTO request_specs Which comes from RequestSpec.instance_group.members being extremely large To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1738094/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
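One possible remedy, as a hedged sketch only (alembic-style migration with MySQL assumed; the actual nova schema change may differ): widen the column from TEXT (64 KB) to MEDIUMTEXT (16 MB) so very large serialized RequestSpecs fit:

# Hedged sketch of a widening migration; MySQL dialect assumed.
from alembic import op
from sqlalchemy.dialects import mysql


def upgrade():
    # TEXT tops out at 64 KB; RequestSpec JSON with a huge
    # instance_group.members list needs MEDIUMTEXT (16 MB).
    op.alter_column('request_specs', 'spec',
                    existing_type=mysql.TEXT(),
                    type_=mysql.MEDIUMTEXT(),
                    existing_nullable=True)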
[Yahoo-eng-team] [Bug 1648840] [NEW] libvirt driver leaves interface residue after failed start
Public bug reported: When the libvirt driver fails to start a VM due to reasons other than neutron plug timeout, it leaves interfaces on the system from the vif plugging. If a subsequent delete is performed and completes successfully, these will be removed. However, in cases where connectivity is preventing a normal delete, a local delete will be performed at the api level and the interfaces will remain. In at least one real world situation I have observed, a script was creating test instances which were failing and leaving residue. After the residue interface count reached about 6,000 on the system, VM creates started failing with "Argument list too long" as libvirt was choking on enumerating the interfaces it had left behind. ** Affects: nova Importance: Medium Assignee: Dan Smith (danms) Status: In Progress ** Affects: nova/newton Importance: Undecided Status: New ** Changed in: nova Status: New => Confirmed ** Changed in: nova Importance: Undecided => Medium ** Changed in: nova Assignee: (unassigned) => Dan Smith (danms) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1648840 Title: libvirt driver leaves interface residue after failed start Status in OpenStack Compute (nova): In Progress Status in OpenStack Compute (nova) newton series: New Bug description: When the libvirt driver fails to start a VM due to reasons other than neutron plug timeout, it leaves interfaces on the system from the vif plugging. If a subsequent delete is performed and completes successfully, these will be removed. However, in cases where connectivity is preventing a normal delete, a local delete will be performed at the api level and the interfaces will remain. In at least one real world situation I have observed, a script was creating test instances which were failing and leaving residue. After the residue interface count reached about 6,000 on the system, VM creates started failing with "Argument list too long" as libvirt was choking on enumerating the interfaces it had left behind. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1648840/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
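A hedged sketch of the cleanup-on-failure shape the report implies (toy class with hypothetical method names, not the actual libvirt driver): if the guest fails to start after the VIFs were plugged, unplug them before re-raising so nothing is left behind for a later API-level delete to miss:

import logging

LOG = logging.getLogger(__name__)

class FakeDriver:
    """Toy stand-in for the libvirt driver; names are hypothetical."""

    def plug_vifs(self, instance, network_info):
        LOG.info("plugging vifs for %s", instance)

    def unplug_vifs(self, instance, network_info):
        LOG.info("unplugging vifs for %s", instance)

    def _create_domain(self, instance):
        raise RuntimeError("libvirt failed to start the guest")

    def spawn(self, instance, network_info):
        self.plug_vifs(instance, network_info)
        try:
            self._create_domain(instance)
        except Exception:
            # Clean up the interfaces we just created so a failed start does
            # not leave residue on the host, even if the instance is later
            # deleted purely at the API level.
            self.unplug_vifs(instance, network_info)
            raise

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    try:
        FakeDriver().spawn("instance-0001", network_info=[])
    except RuntimeError as exc:
        print("spawn failed but vifs were unplugged:", exc)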
[Yahoo-eng-team] [Bug 1652233] Re: mitaka is incompatible with newton - IncompatibleObjectVersion Version 2.1 of InstanceList is not supported
Yeah, mixed-version controllers isn't supported. We've made some progress towards being able to support it in master, but it's definitely not going to work in mitaka/newton. You have to upgrade your controllers simultaneously (well, most critically, your conductor services), and then you can have any mix of versions among your computes that you want. ** Changed in: nova Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1652233 Title: mitaka is incompatible with newton - IncompatibleObjectVersion Version 2.1 of InstanceList is not supported Status in OpenStack Compute (nova): Invalid Bug description: Description === I get an error after upgrade half of my cluster. Can't place any VMs. "RemoteError: Remote error: IncompatibleObjectVersion Version 2.1 of InstanceList is not supported" Steps to reproduce == 1) Install 4 nodes with mitaka 2) Disable 2 nodes (1 api controller and 1 compute): nova service-disable 3) Upgrade to newton on the disable nodes 4) compute=mitaka to [upgrade_levels] 5) db sync 6) Start newton 7) Try to place any VMs, it will fail 8) See nova-compute.log on the mitaka nodes Expected result === Successful upgrade one half of cluster, then another half Actual result = Nova can't place any VMs. Compute logs: 2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task [req-41e6df10-b33b-47f5-be0c-86793cbcae6e - - - - -] Error during ComputeManager._sync_scheduler_instance_info 2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task Traceback (most recent call last): 2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/oslo_service/periodic_task.py", line 220, in run_periodic_tasks 2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task task(self, context) 2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1637, in _sync_scheduler_instance_info 2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task use_slave=True) 2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 177, in wrapper 2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task args, kwargs) 2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/nova/conductor/rpcapi.py", line 236, in object_class_action_versions 2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task args=args, kwargs=kwargs) 2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 169, in call 2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task retry=self.retry) 2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 97, in _send 2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task timeout=timeout, retry=retry) 2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 464, in send 2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task retry=retry) 2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 455, in _send 2016-12-23 07:26:11.434 15392 
ERROR oslo_service.periodic_task raise result 2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task RemoteError: Remote error: IncompatibleObjectVersion Version 2.1 of InstanceList is not supported 2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task [u'Traceback (most recent call last):\n', u' File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 138, in _dispatch_and_reply\n', u' File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 185, in _dispatch\n', u' File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 127, in _do_dispatch\n:param incoming: incoming message\n', u' File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 92, in object_class_action_versions\nobjname, object_versions[objname])\n', u' File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 374, in obj_class_from_name\nsupported=latest_ver)\n', u'IncompatibleObjectVersion: Version 2.1 of InstanceList is not supported\n']. 2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task nova-conductor: 2016-12-23 08:01:00.489 9958 ERROR oslo_messaging.rpc.di
[Yahoo-eng-team] [Bug 1655494] [NEW] Newton scheduler clients should keep trying to report
Public bug reported: Newton scheduler clients will stop reporting any time they encounter a setup-related error, which isn't very operator-friendly for the ocata upgrade process. ** Affects: nova Importance: Medium Assignee: Dan Smith (danms) Status: Confirmed ** Tags: newton-backport-potential -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1655494 Title: Newton scheduler clients should keep trying to report Status in OpenStack Compute (nova): Confirmed Bug description: Newton scheduler clients will stop reporting any time they encounter a setup-related error, which isn't very operator-friendly for the ocata upgrade process. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1655494/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1693911] [NEW] compute node statistics will lie if service records are deleted
Public bug reported: If a compute node references a deleted service, we will include it in the compute node statistics output. This happens even if the compute node record _is_ deleted, because of our join of the services table, which causes us to get back rows anyway. This results in the stats showing more resource than actually exists, and disagreeing with the sum of all the individual hypervisor-show operations. This is the query we're doing: MariaDB [nova]> SELECT SUM(memory_mb) FROM compute_nodes JOIN services ON compute_nodes.host=services.host WHERE services.binary="nova-compute" AND compute_nodes.deleted=0; ++ | SUM(memory_mb) | ++ |1047917 | ++ 1 row in set (0.00 sec) And this is what we *should* be doing MariaDB [nova]> SELECT SUM(memory_mb) FROM compute_nodes JOIN services ON compute_nodes.host=services.host WHERE services.binary="nova-compute" AND compute_nodes.deleted=0 AND services.deleted=0; ++ | SUM(memory_mb) | ++ | 655097 | ++ 1 row in set (0.00 sec) The second value is correct for the database in question. ** Affects: nova Importance: Undecided Status: Won't Fix -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1693911 Title: compute node statistics will lie if service records are deleted Status in OpenStack Compute (nova): Won't Fix Bug description: If a compute node references a deleted service, we will include it in the compute node statistics output. This happens even if the compute node record _is_ deleted, because of our join of the services table, which causes us to get back rows anyway. This results in the stats showing more resource than actually exists, and disagreeing with the sum of all the individual hypervisor-show operations. This is the query we're doing: MariaDB [nova]> SELECT SUM(memory_mb) FROM compute_nodes JOIN services ON compute_nodes.host=services.host WHERE services.binary="nova-compute" AND compute_nodes.deleted=0; ++ | SUM(memory_mb) | ++ |1047917 | ++ 1 row in set (0.00 sec) And this is what we *should* be doing MariaDB [nova]> SELECT SUM(memory_mb) FROM compute_nodes JOIN services ON compute_nodes.host=services.host WHERE services.binary="nova-compute" AND compute_nodes.deleted=0 AND services.deleted=0; ++ | SUM(memory_mb) | ++ | 655097 | ++ 1 row in set (0.00 sec) The second value is correct for the database in question. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1693911/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1693911] Re: compute node statistics will lie if service records are deleted
Dupe of 1692397 ** Changed in: nova Status: New => Won't Fix -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1693911 Title: compute node statistics will lie if service records are deleted Status in OpenStack Compute (nova): Won't Fix Bug description: If a compute node references a deleted service, we will include it in the compute node statistics output. This happens even if the compute node record _is_ deleted, because of our join of the services table, which causes us to get back rows anyway. This results in the stats showing more resource than actually exists, and disagreeing with the sum of all the individual hypervisor-show operations. This is the query we're doing: MariaDB [nova]> SELECT SUM(memory_mb) FROM compute_nodes JOIN services ON compute_nodes.host=services.host WHERE services.binary="nova-compute" AND compute_nodes.deleted=0; ++ | SUM(memory_mb) | ++ |1047917 | ++ 1 row in set (0.00 sec) And this is what we *should* be doing MariaDB [nova]> SELECT SUM(memory_mb) FROM compute_nodes JOIN services ON compute_nodes.host=services.host WHERE services.binary="nova-compute" AND compute_nodes.deleted=0 AND services.deleted=0; ++ | SUM(memory_mb) | ++ | 655097 | ++ 1 row in set (0.00 sec) The second value is correct for the database in question. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1693911/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1696125] Re: Detach interface failed - Unable to detach from guest transient domain (pike)
** Changed in: nova Status: Fix Released => Confirmed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1696125 Title: Detach interface failed - Unable to detach from guest transient domain (pike) Status in OpenStack Compute (nova): Confirmed Bug description: Seeing this in Tempest runs on master (pike): http://logs.openstack.org/24/471024/2/check/gate-tempest-dsvm-neutron- linuxbridge-ubuntu- xenial/6b98d38/logs/screen-n-cpu.txt.gz?level=TRACE#_Jun_06_02_16_02_855503 Jun 06 02:16:02.855503 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: WARNING nova.compute.manager [None req-b4a50024-a2fd-4279-b284-340d2074f1c1 tempest-TestNetworkBasicOps-1479445685 tempest-TestNetworkBasicOps-1479445685] [instance: 2668bcb9-b13d-4b5b-8ee5-edbdee3b15a8] Detach interface failed, port_id=3843caa3-ab04-45f1-94d8-f330390e40fe, reason: Device detach failed for fa:16:3e:ab:e3:3f: Unable to detach from guest transient domain.: DeviceDetachFailed: Device detach failed for fa:16:3e:ab:e3:3f: Unable to detach from guest transient domain. Jun 06 02:16:02.884007 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: ERROR oslo_messaging.rpc.server [None req-b4a50024-a2fd-4279-b284-340d2074f1c1 tempest-TestNetworkBasicOps-1479445685 tempest-TestNetworkBasicOps-1479445685] Exception during message handling: InterfaceDetachFailed: Failed to detach network adapter device from 2668bcb9-b13d-4b5b-8ee5-edbdee3b15a8 Jun 06 02:16:02.884180 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: ERROR oslo_messaging.rpc.server Traceback (most recent call last): Jun 06 02:16:02.884286 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 157, in _process_incoming Jun 06 02:16:02.884395 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message) Jun 06 02:16:02.884538 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 213, in dispatch Jun 06 02:16:02.884669 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args) Jun 06 02:16:02.884777 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 183, in _do_dispatch Jun 06 02:16:02.884869 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args) Jun 06 02:16:02.884968 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: ERROR oslo_messaging.rpc.server File "/opt/stack/new/nova/nova/exception_wrapper.py", line 77, in wrapped Jun 06 02:16:02.885069 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: ERROR oslo_messaging.rpc.server function_name, call_dict, binary) Jun 06 02:16:02.885171 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__ Jun 06 02:16:02.885272 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: ERROR oslo_messaging.rpc.server self.force_reraise() Jun 06 02:16:02.885367 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: ERROR oslo_messaging.rpc.server File 
"/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise Jun 06 02:16:02.885461 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb) Jun 06 02:16:02.885554 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: ERROR oslo_messaging.rpc.server File "/opt/stack/new/nova/nova/exception_wrapper.py", line 68, in wrapped Jun 06 02:16:02.885649 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: ERROR oslo_messaging.rpc.server return f(self, context, *args, **kw) Jun 06 02:16:02.885755 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: ERROR oslo_messaging.rpc.server File "/opt/stack/new/nova/nova/compute/manager.py", line 214, in decorated_function Jun 06 02:16:02.885856 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: ERROR oslo_messaging.rpc.server kwargs['instance'], e, sys.exc_info()) Jun 06 02:16:02.885950 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__ Jun 06 02:16:02.886053 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: ERROR oslo_messaging.rpc.server self.force_reraise() Jun 06 02:16:02.886143 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: ERRO
[Yahoo-eng-team] [Bug 1698383] [NEW] Resource tracker regressed reporting negative memory
Public bug reported: Nova's resource tracker is expected to publish negative values to the scheduler when resources are overcommitted. Nova's scheduler expects this: https://github.com/openstack/nova/blob/a43dbba2b8feea063ed2d0c79780b4c3507cf89b/nova/scheduler/host_manager.py#L215 In change https://review.openstack.org/#/c/306670, these values were filtered to never drop below zero, which is incorrect. That change was making a complex alteration for ironic and cells, specifically to avoid resources from ironic nodes showing up as negative when they were unavailable. That was a cosmetic fix (which I believe has been corrected for ironic only in this patch: https://review.openstack.org/#/c/230487/). Regardless, since the scheduler does the same calculation to determine available resources on the node, if the node reports 0 when the scheduler calculates -100 for a given resource, the scheduler will assume the node still has room (due to oversubscription) and will send builds there destined to fail. ** Affects: nova Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1698383 Title: Resource tracker regressed reporting negative memory Status in OpenStack Compute (nova): New Bug description: Nova's resource tracker is expected to publish negative values to the scheduler when resources are overcommitted. Nova's scheduler expects this: https://github.com/openstack/nova/blob/a43dbba2b8feea063ed2d0c79780b4c3507cf89b/nova/scheduler/host_manager.py#L215 In change https://review.openstack.org/#/c/306670, these values were filtered to never drop below zero, which is incorrect. That change was making a complex alteration for ironic and cells, specifically to avoid resources from ironic nodes showing up as negative when they were unavailable. That was a cosmetic fix (which I believe has been corrected for ironic only in this patch: https://review.openstack.org/#/c/230487/). Regardless, since the scheduler does the same calculation to determine available resources on the node, if the node reports 0 when the scheduler calculates -100 for a given resource, the scheduler will assume the node still has room (due to oversubscription) and will send builds there destined to fail. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1698383/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
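To illustrate the mismatch described above, here is a minimal sketch with made-up numbers; it is not nova code, just the shape of the calculation:

    # Both the compute node and the scheduler derive free memory the same
    # way; with overcommit the honest result can legitimately go negative.
    total_mb = 4096
    used_mb = 4196                      # oversubscribed: more than physical

    free_mb = total_mb - used_mb        # -100: the honest value
    clamped_free_mb = max(0, free_mb)   # 0: what the regression reports

    # The scheduler recomputes the same subtraction and gets -100, so a
    # node that publishes the clamped 0 looks 100 MB roomier than the
    # scheduler's own math says it is, and placement decisions drift.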
[Yahoo-eng-team] [Bug 1707071] [NEW] Compute nodes will fight over allocations during migration
Public bug reported: As far back as Ocata, compute nodes that manage allocations will end up overwriting allocations from other compute nodes when doing a migration. This stems from the fact that the Resource Tracker was designed to manage a per-compute-node set of accounting, but placement is per-instance accounting. When we try to create/update/delete allocations for instances on compute nodes from the existing resource tracker code paths, we end up deleting allocations that apply to other compute nodes in the process. For example, when an instance A is running against compute1, there is an allocation for its resources against that node. When migrating that instance to compute2, the target compute (or scheduler) may create allocations for instance A against compute2, which overwrite those for compute1. Then, compute1's periodic healing task runs, and deletes the allocation for instance A against compute2, replacing it with one for compute1. When migration completes, compute2 heals again and overwrites the allocation with one for the new home of the instance. Then, compute1 may delete the allocation it thinks it owns, followed finally by another heal on compute2. While this is going on, the scheduler (via placement) does not have a consistent view of resources to make proper decisions. In order to fix this, we need a combination of changes: 1. There should be allocations against both compute nodes for an instance during a migration 2. Compute nodes should respect the double claim, and not delete allocations for instances they used to own, if the allocation has no resources for its resource provider 3. Compute nodes should not delete allocations for instances unless they own the instance _and_ the instance is in DELETED/SHELVED_OFFLOADED state ** Affects: nova Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1707071 Title: Compute nodes will fight over allocations during migration Status in OpenStack Compute (nova): New Bug description: As far back as Ocata, compute nodes that manage allocations will end up overwriting allocations from other compute nodes when doing a migration. This stems from the fact that the Resource Tracker was designed to manage a per-compute-node set of accounting, but placement is per-instance accounting. When we try to create/update/delete allocations for instances on compute nodes from the existing resource tracker code paths, we end up deleting allocations that apply to other compute nodes in the process. For example, when an instance A is running against compute1, there is an allocation for its resources against that node. When migrating that instance to compute2, the target compute (or scheduler) may create allocations for instance A against compute2, which overwrite those for compute1. Then, compute1's periodic healing task runs, and deletes the allocation for instance A against compute2, replacing it with one for compute1. When migration completes, compute2 heals again and overwrites the allocation with one for the new home of the instance. Then, compute1 may delete the allocation it thinks it owns, followed finally by another heal on compute2. While this is going on, the scheduler (via placement) does not have a consistent view of resources to make proper decisions. In order to fix this, we need a combination of changes: 1. There should be allocations against both compute nodes for an instance during a migration 2.
Compute nodes should respect the double claim, and not delete allocations for instances they used to own, if the allocation has no resources for its resource provider 3. Compute nodes should not delete allocations for instances unless they own the instance _and_ the instance is in DELETED/SHELVED_OFFLOADED state To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1707071/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
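A minimal sketch of the guard implied by points 2 and 3 above; the helper arguments and the allocation format (a mapping keyed by resource provider UUID) are assumptions for illustration, not existing nova APIs:

    DELETED_STATES = ('deleted', 'shelved_offloaded')

    def may_delete_allocation(instance, allocation, my_rp_uuid, my_host):
        # Point 2: during a migration the allocation is legitimately
        # doubled; if none of its resources are against our resource
        # provider, it is not ours to touch.
        if my_rp_uuid not in allocation.get('allocations', {}):
            return False
        # Point 3: only delete for instances we own that are really gone.
        return (instance.host == my_host
                and instance.vm_state in DELETED_STATES)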
[Yahoo-eng-team] [Bug 1713095] [NEW] Nova compute driver init happens before conductor is ready
Public bug reported: In nova/service.py we poll for conductor readiness before we allow normal service startup behavior. The ironic driver does RPC to conductor in its _refresh_hash_ring() code, which may run before conductor is actually up. If so, we'll fail to start up because we called to conductor, waited a long time, and then timed out. ** Affects: nova Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1713095 Title: Nova compute driver init happens before conductor is ready Status in OpenStack Compute (nova): New Bug description: In nova/service.py we poll for conductor readiness before we allow normal service startup behavior. The ironic driver does RPC to conductor in its _refresh_hash_ring() code, which may run before conductor is actually up. If so, we'll fail to start up because we called to conductor, waited a long time, and then timed out. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1713095/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
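A minimal sketch of the intended ordering, with ping() standing in for whatever trivial conductor call the readiness poll uses; the names are illustrative, not the actual nova/service.py code:

    import time

    def wait_for_conductor(conductor_api, timeout=60, interval=1):
        # Block until conductor answers, so that driver init code that
        # does RPC (e.g. the ironic driver's hash ring refresh) only runs
        # once conductor is reachable instead of timing out.
        deadline = time.time() + timeout
        while time.time() < deadline:
            try:
                conductor_api.ping()      # hypothetical readiness call
                return True
            except Exception:
                time.sleep(interval)
        return False

    # Desired startup order:
    #   1. wait_for_conductor(conductor_api)
    #   2. driver.init_host()             # may RPC to conductor internally
    #   3. start the RPC server and periodic tasks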
[Yahoo-eng-team] [Bug 1660160] Re: No host-to-cell mapping found for selected host
Something in your config has been preventing compute nodes from creating their compute node records for much longer than the referenced patch has been in place. I picked a random older run and found the same compute node record create failure: http://logs.openstack.org/95/422795/4/check/gate-tripleo-ci-centos-7-undercloud/9d4dda4/logs/var/log/nova/nova-compute.txt.gz#_2017-01-20_15_58_59_030 The referenced patch does require those compute node records, just like many other pieces of nova (your resource tracking will be wrong without it) but it is only related inasmuch as it requires them to be there in order to boot an instance. The ComputeNode records are very fundamental to Nova and have been for years, before cellsv2 was even a thing. Without the compute node records, the discover_hosts step will not be able to create HostMapping records for the compute nodes, which is what the "No host-to-cell mapping" message is about. So, this is, IMHO, not a Nova bug but just something config-related on the tripleo side. I'm not sure what exactly would cause that compute node record create failure, but I expect it's something minor. ** Changed in: nova Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1660160 Title: No host-to-cell mapping found for selected host Status in OpenStack Compute (nova): Invalid Status in tripleo: Triaged Bug description: This report is maybe not a bug but I found it useful to share what happens in TripleO since this commit: https://review.openstack.org/#/c/319379/ We are unable to deploy the overcloud nodes anymore (in other words, create servers with Nova / Ironic). Nova Conductor sends this message: "No host-to-cell mapping found for selected host" http://logs.openstack.org/31/426231/1/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/915aeba/logs/undercloud/var/log/nova/nova-conductor.txt.gz#_2017-01-27_19_21_56_348 And it sounds like the compute host is not registered: http://logs.openstack.org/31/426231/1/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/915aeba/logs/undercloud/var/log/nova/nova-compute.txt.gz#_2017-01-27_18_56_56_543 Nova Config is available here: http://logs.openstack.org/31/426231/1/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/915aeba/logs/etc/nova/nova.conf.txt.gz That's all the details I have now, feel free to ask for more details if needed. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1660160/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1661312] [NEW] Evacuation will corrupt instance allocations
Public bug reported: The following sequence of events will result in a corrupted instance allocation in placement: 1. Instance running on host A, placement has allocations for instance on host A 2. Host A goes down 3. Instance is evacuated to host B, host B creates duplicated allocations in placement for instance 4. Host A comes up, notices that instance is gone, deletes all allocations for instance on both hosts A and B 5. Instance now has no allocations for a period 6. Eventually, host B will re-create the allocations for the instance The period between #4 and #6 will have the scheduler making bad decisions because it thinks host B is less loaded than it is. ** Affects: nova Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1661312 Title: Evacuation will corrupt instance allocations Status in OpenStack Compute (nova): New Bug description: The following sequence of events will result in a corrupted instance allocation in placement: 1. Instance running on host A, placement has allocations for instance on host A 2. Host A goes down 3. Instance is evacuated to host B, host B creates duplicated allocations in placement for instance 4. Host A comes up, notices that instance is gone, deletes all allocations for instance on both hosts A and B 5. Instance now has no allocations for a period 6. Eventually, host B will re-create the allocations for the instance The period between #4 and #6 will have the scheduler making bad decisions because it thinks host B is less loaded than it is. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1661312/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1341420] Re: gap between scheduler selection and claim causes spurious failures when the instance is the last one to fit
What you describe is fundamental to how nova works right now. We speculate in the scheduler, and if two requests race, we handle it with a reschedule. Nova specifically states that scheduling every last resource is out of scope. When trying to do that (which is often the use case for ironic) you're likely to hit this race as you run out of capacity: https://github.com/openstack/nova/blob/master/doc/source/project_scope.rst#iaas-not-batch-processing In the next few cycles we plan to move the claim process to the placement engine, which will eliminate most of these race-to-claim type issues, and in that situation things will be better for this kind of arrangement. Until that point, this is not a bug though, because it's specifically how nova is designed to work. ** Changed in: nova Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1341420 Title: gap between scheduler selection and claim causes spurious failures when the instance is the last one to fit Status in OpenStack Compute (nova): Invalid Status in tripleo: New Bug description: There is a race between the scheduler in select_destinations, which selects a set of hosts, and the nova compute manager, which claims resources on those hosts when building the instance. The race is particularly noticeable with Ironic, where every request will consume a full host, but can turn up on libvirt etc too. Multiple schedulers will likely exacerbate this too unless they are in a version of python with randomised dictionary ordering, in which case they will make it better :). I've put https://review.openstack.org/106677 up to remove a comment which comes from before we introduced this race. One mitigating aspect to the race in the filter scheduler _schedule method attempts to randomly select hosts to avoid returning the same host in repeated requests, but the default minimum set it selects from is size 1 - so when heat requests a single instance, the same candidate is chosen every time. Setting that number higher can avoid all concurrent requests hitting the same host, but it will still be a race, and still likely to fail fairly hard at near-capacity situations (e.g. deploying all machines in a cluster with Ironic and Heat). Folk wanting to reproduce this: take a decent size cloud - e.g. 5 or 10 hypervisor hosts (KVM is fine). Deploy up to 1 VM left of capacity on each hypervisor. Then deploy a bunch of VMs one at a time but very close together - e.g. use the python API to get cached keystone credentials, and boot 5 in a loop. If using Ironic you will want https://review.openstack.org/106676 to let you see which host is being returned from the selection. Possible fixes: - have the scheduler be a bit smarter about returning hosts - e.g. track destination selection counts since the last refresh and weight hosts by that count as well - reinstate actioning claims into the scheduler, allowing the audit to correct any claimed-but-not-started resource counts asynchronously - special case the retry behaviour if there are lots of resources available elsewhere in the cluster. Stats-wise, I just tested a 29 instance deployment with ironic and a heat stack, with 45 machines to deploy onto (so 45 hosts in the scheduler set) and 4 failed with this race - which means they rescheduled and failed 3 times each - or 12 cases of scheduler racing *at minimum*. background chat 15:43 < lifeless> mikal: around?
I need to sanity check something 15:44 < lifeless> ulp, nope, am sure of it. filing a bug. 15:45 < mikal> lifeless: ok 15:46 < lifeless> mikal: oh, you're here, I will run it past you :) 15:46 < lifeless> mikal: if you have ~5m 15:46 < mikal> Sure 15:46 < lifeless> so, symptoms 15:46 < lifeless> nova boot <...> --num-instances 45 -> works fairly reliably. Some minor timeout related things to fix but nothing dramatic. 15:47 < lifeless> heat create-stack <...> with a stack with 45 instances in it -> about 50% of instances fail to come up 15:47 < lifeless> this is with Ironic 15:47 < mikal> Sure 15:47 < lifeless> the failure on all the instances is the retry-three-times failure-of-death 15:47 < lifeless> what I believe is happening is this 15:48 < lifeless> the scheduler is allocating the same weighed list of hosts for requests that happen close enough together 15:49 < lifeless> and I believe its able to do that because the target hosts (from select_destinations) need to actually hit the compute node manager and have 15:49 < lifeless> with rt.instance_claim(context, instance, limits): 15:49 < lifeless> happen in _build_and_run_instance 15:49 < lifeless> before the resource usage is assigned 15:49 < mikal> Is heat making 45 separate requests to the nova API? 15:49 < lifeless> eys
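For reference, the subset-selection mitigation mentioned in the description boils down to something like the sketch below (illustrative only; the real filter scheduler wires this through a config option):

    import random

    def pick_host(weighed_hosts, subset_size=1):
        # The filter scheduler picks randomly from the top N weighed hosts
        # (assumes weighed_hosts is non-empty, best first).  With the
        # default subset size of 1, concurrent requests that compute the
        # same weights all pick the same host and then race on the
        # compute-side claim; a larger subset spreads them out but does
        # not remove the race.
        subset = weighed_hosts[:max(1, subset_size)]
        return random.choice(subset)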
[Yahoo-eng-team] [Bug 1659391] Re: Server list API does not show scheduled servers that are not assigned to any cell
Cells are not optional in Nova as of Ocata. Since cells are required, you should not see instances that are not assigned to a cell, because such a thing is not possible (post-scheduling). Creating an instance before nova is fully setup is not valid either. These two things combined are doubly invalid. ** Changed in: nova Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1659391 Title: Server list API does not show scheduled servers that are not assigned to any cell Status in OpenStack Compute (nova): Invalid Bug description: After merge of commit [1] command "nova list --all-" started returning list of servers that are assigned to some cell only. Revert of this change makes API return ALL servers including scheduled ones without assigned cells. In case server failed on scheduling step and hasn't been assigned to any cell, then we will never see it using "list servers" API. But, "list" operation should always show ALL servers. Steps to reproduce: 1) install latest nova that contains commit [1], not configuring cell service and not creating default cell. 2) create VM 3) run any of following commands: $ nova list --all- $ openstack server list --all $ openstack server show %name-of-server% $ nova show %name-of-server% Expected: we see data of server we created on second step. Actual: our server is absent in "list" command results or "NotFound" error on "show" command using "name" of server. There can be other approach for reproducing it, but we need to use "pdb" before step where we assign cell to server. [1] https://review.openstack.org/#/c/396775/ To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1659391/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1663729] [NEW] CellsV1 regression introduced with flavor migration to api database
Public bug reported: In Newton we migrated flavors to the api database, which requires using the Flavor object for proper compatibility. A piece of cellsv1 was missed which would cause it to start reporting resources incorrectly after the migration happened and the flavors are removed from the main database. ** Affects: nova Importance: Undecided Assignee: Dan Smith (danms) Status: In Progress ** Tags: newton-backport-potential ocata-backport-potential ** Tags added: newton-backport-potential -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1663729 Title: CellsV1 regression introduced with flavor migration to api database Status in OpenStack Compute (nova): In Progress Bug description: In Newton we migrated flavors to the api database, which requires using the Flavor object for proper compatibility. A piece of cellsv1 was missed which would cause it to start reporting resources incorrectly after the migration happened and the flavors are removed from the main database. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1663729/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1668310] [NEW] PCI device migration cannot continue with old deleted service records
Public bug reported: If deleted service records are present in the database, the Service minimum version calculation should ignore them, but it does not. One manifestation of this is the PCI device migration from mitaka/newton will never complete, emitting an error message like this: 2017-02-27 07:40:19.665 ERROR nova.db.sqlalchemy.api [req-ad21480f-613a- 445b-a913-c54532b64ffa None None] Data migrations for PciDevice are not safe, likely because not all services that access the DB directly are updated to the latest version ** Affects: nova Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1668310 Title: PCI device migration cannot continue with old deleted service records Status in OpenStack Compute (nova): New Bug description: If deleted service records are present in the database, the Service minimum version calculation should ignore them, but it does not. One manifestation of this is the PCI device migration from mitaka/newton will never complete, emitting an error message like this: 2017-02-27 07:40:19.665 ERROR nova.db.sqlalchemy.api [req-ad21480f- 613a-445b-a913-c54532b64ffa None None] Data migrations for PciDevice are not safe, likely because not all services that access the DB directly are updated to the latest version To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1668310/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
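The intended calculation is simply a minimum over non-deleted rows; a minimal sketch over plain dicts (not the actual DB API), with made-up version numbers:

    def minimum_service_version(service_rows):
        # Soft-deleted services are stale and must not drag the minimum
        # down, otherwise data migrations gated on the minimum version
        # (such as the PCI device migration) never get to run.
        live = [row['version'] for row in service_rows if not row['deleted']]
        return min(live) if live else None

    rows = [
        {'binary': 'nova-compute', 'version': 16, 'deleted': 0},
        {'binary': 'nova-compute', 'version': 7, 'deleted': 1},   # old record
    ]
    assert minimum_service_version(rows) == 16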
[Yahoo-eng-team] [Bug 1670525] [NEW] Nova logs CellMapping objects at DEBUG
Public bug reported: This could contain credentials for the DB and MQ ** Affects: nova Importance: Undecided Assignee: Dan Smith (danms) Status: In Progress ** Tags: newton-backport-potential -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1670525 Title: Nova logs CellMapping objects at DEBUG Status in OpenStack Compute (nova): In Progress Bug description: This could contain credentials for the DB and MQ To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1670525/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1672625] Re: Instance stuck in schedule state in Ocata release
The missed steps are documented here: https://docs.openstack.org/developer/nova/cells.html#first-time-setup That should get you a cell record created, hosts discovered, and back on track. ** Changed in: nova Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1672625 Title: Instance stuck in schedule state in Ocata release Status in OpenStack Compute (nova): Invalid Bug description: I have built a devstack multinode setup on the ocata release. Unable to launch any instances. All the instances are stuck in the "scheduling" state. The only error in the n-api log is:
n-api.log:1720:2017-03-14 11:48:59.805 25488 ERROR nova.compute.api [req-9a0f533b-b5af-4061-9b88-d00012762131 vks vks] No cells are configured, unable to list instances
n-api.log:1730:2017-03-14 11:49:21.641 25488 ERROR nova.compute.api [req-5bdcd955-ddec-4283-9b07-ccd1221912b5 vks vks] No cells are configured, unable to list instances
n-api.log:1740:2017-03-14 11:49:26.659 25488 ERROR nova.compute.api [req-8f9e858b-d80a-4770-a727-f94dcebcc986 vks vks] No cells are configured, unable to list instances
n-api.log:1816:2017-03-14 11:49:48.611 25487 ERROR nova.compute.api [req-da86d769-49d7-4eec-b389-0dca123b7e16 vks vks] No cells are configured, unable to list instances
n-api.log:1899:2017-03-14 11:51:04.481 25487 INFO nova.api.openstack.ws
n-sch has no error log. Output of nova service-list:
stack@stack:~$ nova service-list
/usr/local/lib/python2.7/dist-packages/novaclient/client.py:278: UserWarning: The 'tenant_id' argument is deprecated in Ocata and its use may result in errors in future releases. As 'project_id' is provided, the 'tenant_id' argument will be ignored. warnings.warn(msg)
+----+------------------+-------+----------+---------+-------+------------------------+-----------------+
| Id | Binary           | Host  | Zone     | Status  | State | Updated_at             | Disabled Reason |
+----+------------------+-------+----------+---------+-------+------------------------+-----------------+
| 3  | nova-conductor   | stack | internal | enabled | up    | 2017-03-14T07:21:38.00 | -               |
| 5  | nova-scheduler   | stack | internal | enabled | up    | 2017-03-14T07:21:37.00 | -               |
| 6  | nova-consoleauth | stack | internal | enabled | up    | 2017-03-14T07:21:44.00 | -               |
| 7  | nova-compute     | nfp   | nova     | enabled | up    | 2017-03-14T07:21:38.00 | -               |
+----+------------------+-------+----------+---------+-------+------------------------+-----------------+
To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1672625/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1684861] Re: Database online_data_migrations in newton fail due to missing keypairs
** Changed in: nova Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1684861 Title: Database online_data_migrations in newton fail due to missing keypairs Status in OpenStack Compute (nova): Invalid Bug description: Upgrading the deployment from Mitaka to Newton. This bug blocks people from upgrading to Ocata because the database migration for nova fails. Running nova newton 14.0.5, the database is 334 root@moby:/backups# nova-manage db online_data_migrations Option "verbose" from group "DEFAULT" is deprecated for removal. Its value may be silently ignored in the future. Running batches of 50 until complete 50 rows matched query migrate_flavors, 50 migrated 20 rows matched query migrate_flavors, 20 migrated Error attempting to run 30 rows matched query migrate_instances_add_request_spec, 30 migrated Error attempting to run 50 rows matched query migrate_instances_add_request_spec, 50 migrated Error attempting to run 50 rows matched query migrate_instances_add_request_spec, 50 migrated Error attempting to run 50 rows matched query migrate_instances_add_request_spec, 50 migrated Error attempting to run 50 rows matched query migrate_instances_add_request_spec, 50 migrated Error attempting to run 50 rows matched query migrate_instances_add_request_spec, 50 migrated Error attempting to run 50 rows matched query migrate_instances_add_request_spec, 50 migrated Error attempting to run 50 rows matched query migrate_instances_add_request_spec, 50 migrated Error attempting to run 50 rows matched query migrate_instances_add_request_spec, 50 migrated Error attempting to run 50 rows matched query migrate_instances_add_request_spec, 50 migrated Error attempting to run 50 rows matched query migrate_instances_add_request_spec, 50 migrated Error attempting to run /usr/lib/python2.7/dist-packages/pkg_resources/__init__.py:188: RuntimeWarning: You have iterated over the result of pkg_resources.parse_version. This is a legacy behavior which is inconsistent with the new version class introduced in setuptools 8.0. In most cases, conversion to a tuple is unnecessary. For comparison of versions, sort the Version instances directly. If you have another use case requiring the tuple, please file a bug with the setuptools project describing that need. stacklevel=1, 50 rows matched query migrate_instances_add_request_spec, 5 migrated 2017-04-20 14:48:36.586 396 ERROR nova.objects.keypair [req-565cbe62-030b-4b00-b9db-5ee82117889b - - - - -] Some instances are still missing keypair information. Unable to run keypair migration at this time. 5 rows matched query migrate_aggregates, 5 migrated 5 rows matched query migrate_instance_groups_to_api_db, 5 migrated 2 rows matched query delete_build_requests_with_no_instance_uuid, 2 migrated Error attempting to run 50 rows matched query migrate_instances_add_request_spec, 0 migrated 2017-04-20 14:48:40.620 396 ERROR nova.objects.keypair [req-565cbe62-030b-4b00-b9db-5ee82117889b - - - - -] Some instances are still missing keypair information. Unable to run keypair migration at this time. root@moby:/backups# Adding a 'raise' after https://github.com/openstack/nova/blob/stable/newton/nova/cmd/manage.py#L896 you can see: root@moby:/backups# nova-manage db online_data_migrations Option "verbose" from group "DEFAULT" is deprecated for removal. Its value may be silently ignored in the future. 
Running batches of 50 until complete Error attempting to run error: 'NoneType' object has no attribute 'key_name' To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1684861/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1686744] [NEW] Unable to add compute host to aggregate if no ironic nodes present
Public bug reported: After the cell-ification of the aggregates API, it is not possible to add a compute to an aggregate if that compute does not expose any ComputeNode objects. This can happen if the hash ring does not allocate any ironic nodes to one of the computes (i.e. more services than ironic nodes) or if there are not yet any nodes present in ironic. You get the following message: openstack aggregate add host baremetal-hosts overcloud- controller-0.localdomain Result: Host 'overcloud-controller-0.localdomain' is not mapped to any cell (HTTP 404) (Request-ID: req- 42525c1d-c419-4ea4-bb7c-7caa1d57a613) This is confusing because the service is exposed in service-list and should be a candidate for adding to an aggregate. ** Affects: nova Importance: Medium Assignee: Dan Smith (danms) Status: Confirmed ** Changed in: nova Assignee: (unassigned) => Dan Smith (danms) ** Changed in: nova Importance: Undecided => Medium ** Changed in: nova Status: New => Confirmed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1686744 Title: Unable to add compute host to aggregate if no ironic nodes present Status in OpenStack Compute (nova): Confirmed Bug description: After the cell-ification of the aggregates API, it is not possible to add a compute to an aggregate if that compute does not expose any ComputeNode objects. This can happen if the hash ring does not allocate any ironic nodes to one of the computes (i.e. more services than ironic nodes) or if there are not yet any nodes present in ironic. You get the following message: openstack aggregate add host baremetal-hosts overcloud- controller-0.localdomain Result: Host 'overcloud-controller-0.localdomain' is not mapped to any cell (HTTP 404) (Request-ID: req- 42525c1d-c419-4ea4-bb7c-7caa1d57a613) This is confusing because the service is exposed in service-list and should be a candidate for adding to an aggregate. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1686744/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1539271] [NEW] Libvirt live migration stalls
Public bug reported: The following message in nova gate test logs shows that libvirt live migration can stall on some sort of deadlock: 2016-01-28 16:53:20.878 INFO nova.virt.libvirt.driver [req-692a1f4f- 16aa-4d93-a694-1f7eef4df9f6 tempest- LiveBlockMigrationTestJSON-1471114638 tempest- LiveBlockMigrationTestJSON-1937054400] [instance: 7b1bc0e2-a6a9-4d85-a3f9-4568d52d1f1b] Migration running for 30 secs, memory 100% remaining; (bytes processed=0, remaining=0, total=0) Additionally, the libvirt logger thread seems to be deadlocked before this happens. ** Affects: nova Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1539271 Title: Libvirt live migration stalls Status in OpenStack Compute (nova): New Bug description: The following message in nova gate test logs shows that libvirt live migration can stall on some sort of deadlock: 2016-01-28 16:53:20.878 INFO nova.virt.libvirt.driver [req-692a1f4f- 16aa-4d93-a694-1f7eef4df9f6 tempest- LiveBlockMigrationTestJSON-1471114638 tempest- LiveBlockMigrationTestJSON-1937054400] [instance: 7b1bc0e2-a6a9-4d85-a3f9-4568d52d1f1b] Migration running for 30 secs, memory 100% remaining; (bytes processed=0, remaining=0, total=0) Additionally, the libvirt logger thread seems to be deadlocked before this happens. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1539271/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1540526] [NEW] Too many lazy-loads in predictable situations
Public bug reported: During a normal tempest run, way (way) too many object lazy-loads are being triggered, which causes extra RPC and database traffic. In a given tempest run, we should be able to pretty much prevent any lazy-loads in that predictable situation. The only case where we might want to have some is where we are iterating objects and conditionally taking action that needs to load extra information. On a random devstack-tempest job run sampled on 1-Feb-2016, a lot of lazy loads were seen: grep 'Lazy-loading' screen-n-cpu.txt.gz -c returned 624. We should be able to vastly reduce this number without much work. ** Affects: nova Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1540526 Title: Too many lazy-loads in predictable situations Status in OpenStack Compute (nova): New Bug description: During a normal tempest run, way (way) too many object lazy-loads are being triggered, which causes extra RPC and database traffic. In a given tempest run, we should be able to pretty much prevent any lazy-loads in that predictable situation. The only case where we might want to have some is where we are iterating objects and conditionally taking action that needs to load extra information. On a random devstack-tempest job run sampled on 1-Feb-2016, a lot of lazy loads were seen: grep 'Lazy-loading' screen-n-cpu.txt.gz -c returned 624. We should be able to vastly reduce this number without much work. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1540526/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
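Most of these are avoidable by naming the fields up front when fetching the object; a minimal sketch of that pattern (the field names are examples, and the call assumes a normal nova environment with objects registered):

    from nova import objects   # assumes nova's objects are registered

    def get_instance_for_audit(ctxt, uuid):
        # Asking for the attributes we know we will touch avoids a later
        # lazy-load, which would mean another conductor/DB round trip.
        return objects.Instance.get_by_uuid(
            ctxt, uuid,
            expected_attrs=['flavor', 'info_cache', 'system_metadata'])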
[Yahoo-eng-team] [Bug 1470153] [NEW] Nova object relationships ignore List objects
Public bug reported: In nova/tests/objects/test_objects.py, we have an important test called test_relationships(). This ensures that we have version mappings between objects that depend on each other, and that those versions and relationships are bumped when one object changes versions. That test currently excludes any objects that are based on the List mixin, which obscures dependencies that do things like Foo->BarList->Bar. The test needs to be modified to not exclude List-based objects, and the relationship map needs to be updated for the List objects that are currently excluded. ** Affects: nova Importance: Low Assignee: Ryan Rossiter (rlrossit) Status: Confirmed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1470153 Title: Nova object relationships ignore List objects Status in OpenStack Compute (Nova): Confirmed Bug description: In nova/tests/objects/test_objects.py, we have an important test called test_relationships(). This ensures that we have version mappings between objects that depend on each other, and that those versions and relationships are bumped when one object changes versions. That test currently excludes any objects that are based on the List mixin, which obscures dependencies that do things like Foo->BarList->Bar. The test needs to be modified to not exclude List-based objects, and the relationship map needs to be updated for the List objects that are currently excluded. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1470153/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1470154] [NEW] List objects should use obj_relationships
Public bug reported: Nova's List-based objects have something called child_versions, which is a naive mapping of the objects field and the version relationships between the list object and the content object. This was created before we generalized the work in obj_relationships, which normal objects now use. The list-based objects still use child_versions, which means we need a separate test and separate developer behaviors when updating these. For consistency, we should replace child_versions on all the list objects with obj_relationships, remove the list-specific test in test_objects.py, and make sure that the generalized tests properly cover list objects and relationships between list and non-list objects. ** Affects: nova Importance: Low Assignee: Ryan Rossiter (rlrossit) Status: Confirmed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1470154 Title: List objects should use obj_relationships Status in OpenStack Compute (Nova): Confirmed Bug description: Nova's List-based objects have something called child_versions, which is a naive mapping of the objects field and the version relationships between the list object and the content object. This was created before we generalized the work in obj_relationships, which normal objects now use. The list-based objects still use child_versions, which means we need a separate test and separate developer behaviors when updating these. For consistency, we should replace child_versions on all the list objects with obj_relationships, remove the list-specific test in test_objects.py, and make sure that the generalized tests properly cover list objects and relationships between list and non-list objects. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1470154/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
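For illustration, the two mapping shapes look roughly like the sketch below; the version numbers are made up and the exact formats should be checked against the tree:

    # Old style, only on List objects: list version -> child object version.
    child_versions = {
        '1.0': '1.0',
        '1.1': '1.1',
    }

    # Generalized style already used by non-list objects: per-field history
    # of (object version, field version) pairs, which the relationship test
    # can walk uniformly for every object.
    obj_relationships = {
        'objects': [('1.0', '1.0'), ('1.1', '1.1')],
    }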
[Yahoo-eng-team] [Bug 1471887] [NEW] nova-compute will delete all instances if hostname changes
Public bug reported: The evacuate code as it is currently in nova will delete instances when instance.host != $(hostname) of the host. This assumes that the instance has been evacuated (because its hostname changed). In that case, deleting the local residue is correct, but if the host's hostname changes, then we will just delete data based on a hunch. Nova-compute needs a better mechanism to detect if an evacuation has actually been requested before deleting the data. See Blueprint robustify-evacuate ** Affects: nova Importance: Undecided Assignee: Dan Smith (danms) Status: In Progress -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1471887 Title: nova-compute will delete all instances if hostname changes Status in OpenStack Compute (Nova): In Progress Bug description: The evacuate code as it is currently in nova will delete instances when instance.host != $(hostname) of the host. This assumes that the instance has been evacuated (because its hostname changed). In that case, deleting the local residue is correct, but if the host's hostname changes, then we will just delete data based on a hunch. Nova-compute needs a better mechanism to detect if an evacuation has actually been requested before deleting the data. See Blueprint robustify-evacuate To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1471887/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
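A sketch of the safer check the blueprint is after; find_evacuation_migrations is a hypothetical lookup standing in for querying migration records, not an existing nova API:

    def should_destroy_local_instance(instance, my_host,
                                      find_evacuation_migrations):
        # A bare hostname mismatch is not proof of an evacuation; only
        # delete local residue if there is an actual evacuation record
        # pointing away from this host.
        if instance.host == my_host:
            return False
        evacuations = find_evacuation_migrations(instance.uuid,
                                                 source_host=my_host)
        return bool(evacuations)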
[Yahoo-eng-team] [Bug 1478108] [NEW] Live migration should throttle itself
Public bug reported: Nova will accept an unbounded number of live migrations for a single host, which will result in timeouts and failures (at least for libvirt). Since live migrations are seriously IO intensive, allowing this to be unlimited is just never going to be the right thing to do, especially when we have functions in our own client to live migrate all instances to other hosts (nova host-evacuate-live). We recently added a build semaphore to allow capping the number of parallel builds being attempted on a compute host for a similar reason. This should be the same sort of thing for live migration. ** Affects: nova Importance: Low Status: New ** Changed in: nova Importance: Undecided => Low -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1478108 Title: Live migration should throttle itself Status in OpenStack Compute (nova): New Bug description: Nova will accept an unbounded number of live migrations for a single host, which will result in timeouts and failures (at least for libvirt). Since live migrations are seriously IO intensive, allowing this to be unlimited is just never going to be the right thing to do, especially when we have functions in our own client to live migrate all instances to other hosts (nova host-evacuate-live). We recently added a build semaphore to allow capping the number of parallel builds being attempted on a compute host for a similar reason. This should be the same sort of thing for live migration. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1478108/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
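A minimal sketch of the kind of cap described above, mirroring the build-semaphore approach; the limit value and names are placeholders:

    import threading

    # Cap concurrent live migrations per compute host the same way builds
    # are capped; a limit of 1 simply serializes them.
    MAX_CONCURRENT_LIVE_MIGRATIONS = 1
    _live_migration_semaphore = threading.Semaphore(
        MAX_CONCURRENT_LIVE_MIGRATIONS)

    def throttled_live_migrate(instance, do_migration):
        # Excess requests queue here instead of all hitting the disk and
        # network at once, which is what leads to timeouts and failures.
        with _live_migration_semaphore:
            do_migration(instance)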
[Yahoo-eng-team] [Bug 1493961] [NEW] nova-conductor object debug does not format
Public bug reported: The debug log statement in nova-conductor's object_backport_versions() method doesn't format and looks like this: 2015-09-09 11:26:57.126 DEBUG nova.conductor.manager [req-9ff7962c- c8b8-4579-8943-cbf2ef0be373 demo demo] Backporting %(obj)s to %(ver)s with versions %(manifest)s from (pid=14735) object_backport_versions /opt/stack/nova/nova/conductor/manager.py:506 ** Affects: nova Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1493961 Title: nova-conductor object debug does not format Status in OpenStack Compute (nova): New Bug description: The debug log statement in nova-conductor's object_backport_versions() method doesn't format and looks like this: 2015-09-09 11:26:57.126 DEBUG nova.conductor.manager [req-9ff7962c- c8b8-4579-8943-cbf2ef0be373 demo demo] Backporting %(obj)s to %(ver)s with versions %(manifest)s from (pid=14735) object_backport_versions /opt/stack/nova/nova/conductor/manager.py:506 To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1493961/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
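This is the usual missing-arguments logging call; a minimal standalone sketch of the broken and fixed forms (variable values are made up):

    import logging

    logging.basicConfig(level=logging.DEBUG)
    LOG = logging.getLogger(__name__)

    objname, ver, manifest = 'Instance', '1.23', {'Instance': '1.23'}

    # Broken: no argument mapping, so the %(...)s placeholders are emitted
    # literally, exactly as seen in the log above.
    LOG.debug('Backporting %(obj)s to %(ver)s with versions %(manifest)s')

    # Fixed: pass the mapping so the message actually formats.
    LOG.debug('Backporting %(obj)s to %(ver)s with versions %(manifest)s',
              {'obj': objname, 'ver': ver, 'manifest': manifest})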
[Yahoo-eng-team] [Bug 1387244] Re: Increasing number of InstancePCIRequests.get_by_instance_uuid RPC calls during compute host auditing
** Changed in: nova Status: Confirmed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1387244 Title: Increasing number of InstancePCIRequests.get_by_instance_uuid RPC calls during compute host auditing Status in OpenStack Compute (nova): Fix Released Status in nova package in Ubuntu: Triaged Bug description: Environment: Ubuntu 14.04/OpenStack Juno Release The periodic auditing on compute node becomes very RPC call intensive when a large number of instances are running on a cloud; the InstancePCIRequests.get_by_instance_uuid call is made on all instances running on the hypervisor - when this is multiplied across a large number of hypervisors, this impacts back onto the conductor processes as they try to service an increasing amount of RPC calls over time. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1387244/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1498023] [NEW] _cleanup_incomplete_migrations() does not check for shared storage
Public bug reported: The newly-added periodic task to cleanup residue from failed migrations does not properly consider shared storage before deleting instance files. This could easily lead to data loss in such an environment following a failed migration. ** Affects: nova Importance: High Assignee: Dan Smith (danms) Status: Confirmed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1498023 Title: _cleanup_incomplete_migrations() does not check for shared storage Status in OpenStack Compute (nova): Confirmed Bug description: The newly-added periodic task to cleanup residue from failed migrations does not properly consider shared storage before deleting instance files. This could easily lead to data loss in such an environment following a failed migration. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1498023/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
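A minimal sketch of the missing guard; is_on_shared_storage and delete_instance_files are hypothetical stand-ins for whatever checks and cleanup the driver actually exposes:

    def cleanup_failed_migration_residue(instance, migration,
                                         is_on_shared_storage,
                                         delete_instance_files):
        # On shared storage the "leftover" directory is the same one that
        # backs the instance wherever it now lives, so deleting it would
        # be data loss, not cleanup.
        if is_on_shared_storage(instance):
            return False
        delete_instance_files(instance, migration)
        return True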
[Yahoo-eng-team] [Bug 1498023] Re: _cleanup_incomplete_migrations() does not check for shared storage
** Changed in: nova Importance: High => Undecided ** Changed in: nova Status: In Progress => Invalid ** Changed in: nova Milestone: liberty-rc1 => None -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1498023 Title: _cleanup_incomplete_migrations() does not check for shared storage Status in OpenStack Compute (nova): Invalid Bug description: The newly-added periodic task to cleanup residue from failed migrations does not properly consider shared storage before deleting instance files. This could easily lead to data loss in such an environment following a failed migration. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1498023/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1555287] [NEW] Libvirt driver broken for non-disk-image backends
Public bug reported: Recently the ceph job (and any other configuration that doesn't use disk image as the backend storage) started failing like this: 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher Traceback (most recent call last): 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 138, in _dispatch_and_reply 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher incoming.message)) 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 183, in _dispatch 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher return self._do_dispatch(endpoint, method, ctxt, args) 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 127, in _do_dispatch 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher result = func(ctxt, **new_args) 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/exception.py", line 110, in wrapped 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher payload) 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__ 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher self.force_reraise() 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb) 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/exception.py", line 89, in wrapped 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher return f(self, context, *args, **kw) 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/compute/manager.py", line 359, in decorated_function 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher LOG.warning(msg, e, instance=instance) 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__ 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher self.force_reraise() 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb) 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/compute/manager.py", line 328, in decorated_function 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher return function(self, context, *args, **kwargs) 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/compute/manager.py", line 409, in decorated_function 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher return function(self, context, *args, **kwargs) 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/compute/manager.py", line 316, in decorated_function 2016-03-09 14:47:29.102 17597 ERROR 
oslo_messaging.rpc.dispatcher migration.instance_uuid, exc_info=True) 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__ 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher self.force_reraise() 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb) 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/compute/manager.py", line 293, in decorated_function 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher return function(self, context, *args, **kwargs) 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/compute/manager.py", line 387, in decorated_function 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher kwargs['instance'], e, sys.exc_info()) 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__ 2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.
[Yahoo-eng-team] [Bug 1435586] [NEW] trigger security group refresh gives 'dict' object has no attribute 'uuid'
Public bug reported: During trigger_rules_refresh(), we get this from compute manager: 2015-03-23 03:50:49.677 ERROR oslo_messaging.rpc.dispatcher [req-117e72d4-8c12-4805-9c65-695b62fad491 alt_demo alt_demo] Exception during message handling: 'dict' object has no attribute 'uuid' 2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher Traceback (most recent call last): 2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply 2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher executor_callback)) 2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch 2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher executor_callback) 2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 130, in _do_dispatch 2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher result = func(ctxt, **new_args) 2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/exception.py", line 88, in wrapped 2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher payload) 2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 85, in __exit__ 2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb) 2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/exception.py", line 71, in wrapped 2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher return f(self, context, *args, **kw) 2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/compute/manager.py", line 1301, in refresh_instance_security_rules 2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher @utils.synchronized(instance.uuid) 2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher AttributeError: 'dict' object has no attribute 'uuid' This happens because we're passing non-object instances to refresh_instance_security_rules() ** Affects: nova Importance: Undecided Assignee: Dan Smith (danms) Status: Confirmed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). 
https://bugs.launchpad.net/bugs/1435586 Title: trigger security group refresh gives 'dict' object has no attribute 'uuid' Status in OpenStack Compute (Nova): Confirmed Bug description: During trigger_rules_refresh(), we get this from compute manager: 2015-03-23 03:50:49.677 ERROR oslo_messaging.rpc.dispatcher [req-117e72d4-8c12-4805-9c65-695b62fad491 alt_demo alt_demo] Exception during message handling: 'dict' object has no attribute 'uuid' 2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher Traceback (most recent call last): 2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply 2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher executor_callback)) 2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch 2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher executor_callback) 2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 130, in _do_dispatch 2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher result = func(ctxt, **new_args) 2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/exception.py", line 88, in wrapped 2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher payload) 2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 85, in __exit__ 2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb) 2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/exception.py", line 71, in wrapped 2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher return
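For bug 1435586 above, a minimal, self-contained sketch of the shape of a fix: normalize whatever the RPC layer hands over into something with a uuid attribute before it is used as a lock key. All names here (FakeInstance, the coercion helper) are hypothetical; this is not Nova's actual implementation.

    # Hypothetical names throughout; this only illustrates coercing a legacy
    # dict payload into an object before attribute access like instance.uuid.
    class FakeInstance(object):
        """Stand-in for an instance object that carries a uuid attribute."""
        def __init__(self, uuid):
            self.uuid = uuid

    def _coerce_instance(instance):
        # Older callers may still send plain dicts over RPC; convert them so
        # the @utils.synchronized(instance.uuid)-style usage keeps working.
        if isinstance(instance, dict):
            return FakeInstance(instance['uuid'])
        return instance

    def refresh_instance_security_rules(context, instance):
        instance = _coerce_instance(instance)
        print('refreshing security rules for %s' % instance.uuid)

    refresh_instance_security_rules(None, {'uuid': 'abc-123'})      # legacy dict
    refresh_instance_security_rules(None, FakeInstance('def-456'))  # object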
[Yahoo-eng-team] [Bug 1441243] [NEW] EnumField can be None and thus unrestricted
Public bug reported: The objects Enum field can be passed valid_values=None, which disables enum checking entirely and defeats the whole purpose of the field. This would allow something unversioned to creep into our RPC API, which would be bad. ** Affects: nova Importance: High Assignee: Dan Smith (danms) Status: In Progress ** Tags: unified-objects -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1441243 Title: EnumField can be None and thus unrestricted Status in OpenStack Compute (Nova): In Progress Bug description: The objects Enum field can be passed valid_values=None, which disables enum checking entirely and defeats the whole purpose of the field. This would allow something unversioned to creep into our RPC API, which would be bad. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1441243/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
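For bug 1441243 above, a minimal sketch of the guard being argued for, assuming nothing about the real oslo/nova field classes: an enum field that refuses construction without an explicit, non-empty set of valid values.

    # Illustrative only; not the nova/oslo.versionedobjects implementation.
    class Enum(object):
        def __init__(self, valid_values):
            if not valid_values:
                # Fail fast instead of silently accepting any value.
                raise ValueError('valid_values must be a non-empty sequence')
            self._valid_values = tuple(valid_values)

        def coerce(self, value):
            if value not in self._valid_values:
                raise ValueError('%r is not in %r' % (value, self._valid_values))
            return value

    status = Enum(valid_values=['building', 'active', 'error'])
    print(status.coerce('active'))   # 'active'
    # Enum(valid_values=None) now raises ValueError instead of disabling checks.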
[Yahoo-eng-team] [Bug 1442236] [NEW] Bump compute RPC API to 4.0
Public bug reported: We badly need to bump the compute RPC version to 4.0 BEFORE we release kilo. ** Affects: nova Importance: Critical Assignee: Dan Smith (danms) Status: Confirmed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1442236 Title: Bump compute RPC API to 4.0 Status in OpenStack Compute (Nova): Confirmed Bug description: We badly need to bump the compute RPC version to 4.0 BEFORE we release kilo. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1442236/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1450624] [NEW] Nova waits for events from neutron on resize-revert that aren't coming
er.py", line 6095, in finish_revert_migration 2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher block_device_info, power_on) 2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher File "/opt/bbc/openstack-10.0-bbc40/nova/local/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4446, in _create_domain_and_network 2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher raise exception.VirtualInterfaceCreateException() 2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher VirtualInterfaceCreateException: Virtual Interface creation failed ** Affects: nova Importance: High Assignee: Dan Smith (danms) Status: In Progress ** Tags: juno-backport-potential kilo-backport-potential libvirt neutron -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1450624 Title: Nova waits for events from neutron on resize-revert that aren't coming Status in OpenStack Compute (Nova): In Progress Bug description: On resize-revert, the original host was waiting for plug events from neutron before restarting the instance. These aren't sent since we don't ever unplug the vifs. Thus, we'll always fail like this: 2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher Traceback (most recent call last): 2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher File "/opt/bbc/openstack-10.0-bbc40/nova/local/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py", line 134, in _dispatch_and_reply 2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher incoming.message)) 2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher File "/opt/bbc/openstack-10.0-bbc40/nova/local/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py", line 177, in _dispatch 2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher return self._do_dispatch(endpoint, method, ctxt, args) 2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher File "/opt/bbc/openstack-10.0-bbc40/nova/local/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py", line 123, in _do_dispatch 2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher result = getattr(endpoint, method)(ctxt, **new_args) 2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher File "/opt/bbc/openstack-10.0-bbc40/nova/local/lib/python2.7/site-packages/nova/exception.py", line 88, in wrapped 2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher payload) 2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher File "/opt/bbc/openstack-10.0-bbc40/nova/local/lib/python2.7/site-packages/nova/openstack/common/excutils.py", line 82, in __exit__ 2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb) 2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher File "/opt/bbc/openstack-10.0-bbc40/nova/local/lib/python2.7/site-packages/nova/exception.py", line 71, in wrapped 2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher return f(self, context, *args, **kw) 2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher File "/opt/bbc/openstack-10.0-bbc40/nova/local/lib/python2.7/site-packages/nova/compute/manager.py", line 298, in decorated_function 2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher pass 2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher File 
"/opt/bbc/openstack-10.0-bbc40/nova/local/lib/python2.7/site-packages/nova/openstack/common/excutils.py", line 82, in __exit__ 2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb) 2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher File "/opt/bbc/openstack-10.0-bbc40/nova/local/lib/python2.7/site-packages/nova/compute/manager.py", line 284, in decorated_function 2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher return function(self, context, *args, **kwargs) 2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher File "/opt/bbc/openstack-10.0-bbc40/nova/local/lib/python2.7/site-packages/nova/compute/manager.py", line 348, in decorated_function 2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher return function(self, context, *args, **kwargs) 2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher File "/opt/bbc/openstack-10.0-bbc40/nova/local/lib/python2.7/site-packages/nova/compute/manager.py", line 326, in decorated_function 2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher kwargs['
[Yahoo-eng-team] [Bug 1465799] [NEW] Instance object always re-saves flavor info
Public bug reported: Due to a logic bug in Instance._save_flavor(), the instance.extra.flavor field is saved every time we call Instance.save(), even if no changes have been made. This generates unnecessary database traffic on every save. ** Affects: nova Importance: Medium Status: New ** Changed in: nova Importance: Undecided => Medium -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1465799 Title: Instance object always re-saves flavor info Status in OpenStack Compute (Nova): New Bug description: Due to a logic bug in Instance._save_flavor(), the instance.extra.flavor field is saved every time we call Instance.save(), even if no changes have been made. This generates unnecessary database traffic on every save. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1465799/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
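For bug 1465799 above, the general pattern the report implies, reduced to a standalone sketch with hypothetical names (this is a toy change-tracking object, not Nova's Instance): persist the flavor only when it actually changed.

    class TrackedObject(object):
        """Toy change-tracking object; not Nova's Instance."""
        def __init__(self, **fields):
            self.__dict__['_fields'] = dict(fields)
            self.__dict__['_changed'] = set()

        def __getattr__(self, name):
            try:
                return self.__dict__['_fields'][name]
            except KeyError:
                raise AttributeError(name)

        def __setattr__(self, name, value):
            if self._fields.get(name) != value:
                self._fields[name] = value
                self._changed.add(name)

        def save(self):
            if 'flavor' in self._changed:
                print('persisting flavor: %s' % self.flavor)
            else:
                print('flavor unchanged; skipping the extra DB write')
            self._changed.clear()

    inst = TrackedObject(flavor='m1.small', host='node1')
    inst.save()                # flavor unchanged; skipping the extra DB write
    inst.flavor = 'm1.large'
    inst.save()                # persisting flavor: m1.large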
[Yahoo-eng-team] [Bug 1373106] Re: jogo and sdague are making me sad
** Changed in: nova Status: Opinion => Confirmed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1373106 Title: jogo and sdague are making me sad Status in OpenStack Compute (Nova): Confirmed Bug description: Just like when my parents would fight pre-separation... To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1373106/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1396324] [NEW] Instance object has no attribute get_flavor()
Public bug reported: The notifications code in nova is receiving a SQLAlchemy object when trying to send state update notifications, resulting in this in the conductor log: 2014-11-25 03:13:40.200 ERROR nova.notifications [req-1a9ed96d-7ce2-4c7d-a409-a6959852ce6a AggregatesAdminTestXML-569323565 AggregatesAdminTestXML-1788648791] [instance: 74bb24d3-ba69-41e2-b99a-1c35a2331c1b] Failed to send state update notification 2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 74bb24d3-ba69-41e2-b99a-1c35a2331c1b] Traceback (most recent call last): 2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 74bb24d3-ba69-41e2-b99a-1c35a2331c1b] File "/opt/stack/new/nova/nova/notifications.py", line 146, in send_update 2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 74bb24d3-ba69-41e2-b99a-1c35a2331c1b] old_display_name=old_display_name) 2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 74bb24d3-ba69-41e2-b99a-1c35a2331c1b] File "/opt/stack/new/nova/nova/notifications.py", line 226, in _send_instance_update_notification 2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 74bb24d3-ba69-41e2-b99a-1c35a2331c1b] payload = info_from_instance(context, instance, None, None) 2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 74bb24d3-ba69-41e2-b99a-1c35a2331c1b] File "/opt/stack/new/nova/nova/notifications.py", line 369, in info_from_instance 2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 74bb24d3-ba69-41e2-b99a-1c35a2331c1b] instance_type = instance.get_flavor() 2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 74bb24d3-ba69-41e2-b99a-1c35a2331c1b] AttributeError: 'Instance' object has no attribute 'get_flavor' 2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 74bb24d3-ba69-41e2-b99a-1c35a2331c1b] ** Affects: nova Importance: Medium Assignee: Dan Smith (danms) Status: Confirmed ** Changed in: nova Importance: Undecided => Medium ** Changed in: nova Status: New => Confirmed ** Changed in: nova Milestone: None => kilo-1 ** Changed in: nova Assignee: (unassigned) => Dan Smith (danms) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). 
https://bugs.launchpad.net/bugs/1396324 Title: Instance object has no attribute get_flavor() Status in OpenStack Compute (Nova): Confirmed Bug description: The notifications code in nova is receiving a SQLAlchemy object when trying to send state update notifications, resulting in this in the conductor log: 2014-11-25 03:13:40.200 ERROR nova.notifications [req-1a9ed96d-7ce2-4c7d-a409-a6959852ce6a AggregatesAdminTestXML-569323565 AggregatesAdminTestXML-1788648791] [instance: 74bb24d3-ba69-41e2-b99a-1c35a2331c1b] Failed to send state update notification 2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 74bb24d3-ba69-41e2-b99a-1c35a2331c1b] Traceback (most recent call last): 2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 74bb24d3-ba69-41e2-b99a-1c35a2331c1b] File "/opt/stack/new/nova/nova/notifications.py", line 146, in send_update 2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 74bb24d3-ba69-41e2-b99a-1c35a2331c1b] old_display_name=old_display_name) 2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 74bb24d3-ba69-41e2-b99a-1c35a2331c1b] File "/opt/stack/new/nova/nova/notifications.py", line 226, in _send_instance_update_notification 2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 74bb24d3-ba69-41e2-b99a-1c35a2331c1b] payload = info_from_instance(context, instance, None, None) 2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 74bb24d3-ba69-41e2-b99a-1c35a2331c1b] File "/opt/stack/new/nova/nova/notifications.py", line 369, in info_from_instance 2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 74bb24d3-ba69-41e2-b99a-1c35a2331c1b] instance_type = instance.get_flavor() 2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 74bb24d3-ba69-41e2-b99a-1c35a2331c1b] AttributeError: 'Instance' object has no attribute 'get_flavor' 2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 74bb24d3-ba69-41e2-b99a-1c35a2331c1b] To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1396324/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
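For bug 1396324 above, purely as an illustration (hypothetical helper, not the merged fix), the defensive shape of the problem: the notification payload builder should not assume that whatever it is handed implements get_flavor().

    def _get_flavor_for_notification(instance):
        # Object-style instances expose get_flavor(); a raw DB row does not.
        getter = getattr(instance, 'get_flavor', None)
        if callable(getter):
            return getter()
        # Fallback for dict-like rows; the key name here is an assumption.
        return instance.get('flavor') if hasattr(instance, 'get') else None

    class ObjInstance(object):
        def get_flavor(self):
            return {'name': 'm1.small'}

    print(_get_flavor_for_notification(ObjInstance()))                    # object path
    print(_get_flavor_for_notification({'flavor': {'name': 'm1.tiny'}}))  # row path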
[Yahoo-eng-team] [Bug 1403162] [NEW] fake_notifier: ValueError: Circular reference detected
Public bug reported: The fake_notifier code is using anyjson, which today is failing to serialize something in a notification payload. Failure looks like this: Traceback (most recent call last): File "/home/dan/nova/.tox/py27/lib/python2.7/site-packages/mock.py", line 1201, in patched return func(*args, **keywargs) File "nova/tests/unit/compute/test_compute.py", line 2774, in test_reboot_fail self._test_reboot(False, fail_reboot=True) File "nova/tests/unit/compute/test_compute.py", line 2744, in _test_reboot reboot_type=reboot_type) File "/home/dan/nova/.tox/py27/lib/python2.7/site-packages/testtools/testcase.py", line 420, in assertRaises self.assertThat(our_callable, matcher) File "/home/dan/nova/.tox/py27/lib/python2.7/site-packages/testtools/testcase.py", line 431, in assertThat mismatch_error = self._matchHelper(matchee, matcher, message, verbose) File "/home/dan/nova/.tox/py27/lib/python2.7/site-packages/testtools/testcase.py", line 481, in _matchHelper mismatch = matcher.match(matchee) File "/home/dan/nova/.tox/py27/lib/python2.7/site-packages/testtools/matchers/_exception.py", line 108, in match mismatch = self.exception_matcher.match(exc_info) File "/home/dan/nova/.tox/py27/lib/python2.7/site-packages/testtools/matchers/_higherorder.py", line 62, in match mismatch = matcher.match(matchee) File "/home/dan/nova/.tox/py27/lib/python2.7/site-packages/testtools/testcase.py", line 412, in match reraise(*matchee) File "/home/dan/nova/.tox/py27/lib/python2.7/site-packages/testtools/matchers/_exception.py", line 101, in match result = matchee() File "/home/dan/nova/.tox/py27/lib/python2.7/site-packages/testtools/testcase.py", line 965, in __call__ return self._callable_object(*self._args, **self._kwargs) File "nova/exception.py", line 88, in wrapped payload) File "nova/tests/unit/fake_notifier.py", line 57, in _notify anyjson.serialize(payload) File "/home/dan/nova/.tox/py27/lib/python2.7/site-packages/anyjson/__init__.py", line 141, in dumps return implementation.dumps(value) File "/home/dan/nova/.tox/py27/lib/python2.7/site-packages/anyjson/__init__.py", line 87, in dumps return self._encode(data) File "/home/dan/nova/.tox/py27/lib/python2.7/site-packages/oslo/serialization/jsonutils.py", line 186, in dumps return json.dumps(obj, default=default, **kwargs) File "/usr/lib64/python2.7/json/__init__.py", line 250, in dumps sort_keys=sort_keys, **kw).encode(obj) File "/usr/lib64/python2.7/json/encoder.py", line 207, in encode chunks = self.iterencode(o, _one_shot=True) File "/usr/lib64/python2.7/json/encoder.py", line 270, in iterencode return _iterencode(o, 0) ValueError: Circular reference detected ** Affects: nova Importance: Critical Assignee: Dan Smith (danms) Status: In Progress -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1403162 Title: fake_notifier: ValueError: Circular reference detected Status in OpenStack Compute (Nova): In Progress Bug description: The fake_notifier code is using anyjson, which today is failing to serialize something in a notification payload. 
Failure looks like this: Traceback (most recent call last): File "/home/dan/nova/.tox/py27/lib/python2.7/site-packages/mock.py", line 1201, in patched return func(*args, **keywargs) File "nova/tests/unit/compute/test_compute.py", line 2774, in test_reboot_fail self._test_reboot(False, fail_reboot=True) File "nova/tests/unit/compute/test_compute.py", line 2744, in _test_reboot reboot_type=reboot_type) File "/home/dan/nova/.tox/py27/lib/python2.7/site-packages/testtools/testcase.py", line 420, in assertRaises self.assertThat(our_callable, matcher) File "/home/dan/nova/.tox/py27/lib/python2.7/site-packages/testtools/testcase.py", line 431, in assertThat mismatch_error = self._matchHelper(matchee, matcher, message, verbose) File "/home/dan/nova/.tox/py27/lib/python2.7/site-packages/testtools/testcase.py", line 481, in _matchHelper mismatch = matcher.match(matchee) File "/home/dan/nova/.tox/py27/lib/python2.7/site-packages/testtools/matchers/_exception.py", line 108, in match mismatch = self.exception_matcher.match(exc_info) File "/home/dan/nova/.tox/py27/lib/python2.7/site-pack
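For bug 1403162 above, the failure mode is easy to reproduce standalone with the stock json module, which is effectively what a strict serialize-the-payload check in a test notifier amounts to:

    import json

    payload = {'event_type': 'compute.instance.reboot.error'}
    payload['exception'] = payload          # a cycle: payload refers to itself

    try:
        json.dumps(payload)
    except ValueError as exc:
        # Raises "Circular reference detected", matching the test failure.
        print('serialization failed: %s' % exc)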
[Yahoo-eng-team] [Bug 1275875] [NEW] Virt drivers should use standard image properties
Public bug reported: Several virt drivers are using non-standard driver-specific image metadata properties. This creates an API contract between the external user and the driver implementation. These non-standard ones should be marked as deprecated in some way, enforced in v3, etc. We need a global whitelist of keys and values that are allowed so that we can make sure others don't leak in. Examples: nova/virt/vmwareapi/vmops.py:os_type = image_properties.get("vmware_ostype", "otherGuest") nova/virt/vmwareapi/vmops.py:adapter_type = image_properties.get("vmware_adaptertype", nova/virt/vmwareapi/vmops.py:disk_type = image_properties.get("vmware_disktype", nova/virt/vmwareapi/vmops.py:vif_model = image_properties.get("hw_vif_model", "VirtualE1000") nova/virt/xenapi/vm_utils.py:device_id = image_properties.get('xenapi_device_id') I think it's important to try to get this fixed (or as close as possible) before the icehouse release. ** Affects: nova Importance: Medium Status: Confirmed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1275875 Title: Virt drivers should use standard image properties Status in OpenStack Compute (Nova): Confirmed Bug description: Several virt drivers are using non-standard driver-specific image metadata properties. This creates an API contract between the external user and the driver implementation. These non-standard ones should be marked as deprecated in some way, enforced in v3, etc. We need a global whitelist of keys and values that are allowed so that we can make sure others don't leak in. Examples: nova/virt/vmwareapi/vmops.py:os_type = image_properties.get("vmware_ostype", "otherGuest") nova/virt/vmwareapi/vmops.py:adapter_type = image_properties.get("vmware_adaptertype", nova/virt/vmwareapi/vmops.py:disk_type = image_properties.get("vmware_disktype", nova/virt/vmwareapi/vmops.py:vif_model = image_properties.get("hw_vif_model", "VirtualE1000") nova/virt/xenapi/vm_utils.py:device_id = image_properties.get('xenapi_device_id') I think it's important to try to get this fixed (or as close as possible) before the icehouse release. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1275875/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
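For bug 1275875 above, a sketch of the whitelist idea using the driver-specific keys quoted in the report; the standard-property set below is illustrative, not Nova's actual list.

    STANDARD_IMAGE_PROPERTIES = {'hw_vif_model', 'os_type', 'architecture'}

    def check_image_properties(image_properties):
        unknown = sorted(set(image_properties) - STANDARD_IMAGE_PROPERTIES)
        if unknown:
            # A real implementation might warn, translate to a standard key,
            # or reject outright depending on the API version.
            print('non-standard image properties: %s' % ', '.join(unknown))

    check_image_properties({
        'hw_vif_model': 'VirtualE1000',
        'vmware_ostype': 'otherGuest',   # driver-specific
        'xenapi_device_id': '0002',      # driver-specific
    })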
[Yahoo-eng-team] [Bug 1276731] [NEW] simple_tenant_usage extension should not rely on looking up flavors
Public bug reported: The simple_tenant_usage extension gets the flavor data from the instance and then looks up the flavor from the database to return usage information. Since we now store all of the flavor data in the instance itself, we should use that embedded data instead of the flavor's current definition. This both (a) makes the report more accurate and (b) avoids failing to return usage info if a flavor disappears. ** Affects: nova Importance: Medium Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1276731 Title: simple_tenant_usage extension should not rely on looking up flavors Status in OpenStack Compute (Nova): New Bug description: The simple_tenant_usage extension gets the flavor data from the instance and then looks up the flavor from the database to return usage information. Since we now store all of the flavor data in the instance itself, we should use that embedded data instead of the flavor's current definition. This both (a) makes the report more accurate and (b) avoids failing to return usage info if a flavor disappears. To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1276731/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
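For bug 1276731 above, a hedged sketch of the suggested change (the field names are assumptions for the example): derive usage from the flavor copy embedded in the instance record rather than re-reading the flavor table, so a deleted or edited flavor cannot break or skew the report.

    def usage_from_instance(instance):
        # Before: flavor = flavor_get(instance['instance_type_id'])  # may be gone
        flavor = instance['flavor']          # embedded copy travels with the row
        return {
            'vcpus': flavor['vcpus'],
            'memory_mb': flavor['memory_mb'],
            'local_gb': flavor['root_gb'] + flavor['ephemeral_gb'],
        }

    instance = {
        'uuid': 'abc-123',
        'flavor': {'vcpus': 2, 'memory_mb': 4096, 'root_gb': 40, 'ephemeral_gb': 0},
    }
    print(usage_from_instance(instance))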
[Yahoo-eng-team] [Bug 1280034] [NEW] compute_node_update broken with havana compute nodes
Public bug reported: This change: https://review.openstack.org/#/c/66469 Changed the format of the data in the "values" dictionary of compute_node_update. This causes an icehouse conductor to generate a broken SQL query when called from a havana compute node: http://logs.openstack.org/75/64075/13/check/check-grenade- dsvm/b70c839/logs/new/screen-n-cond.txt.gz?level=TRACE executors.base [-] Exception during message handling: (ProgrammingError) (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ': '2', u'io_workload': '0', u'num_instances': '2', u'num_vm_building': '0', u'nu' at line 1") 'UPDATE compute_nodes SET updated_at=%s, vcpus_used=%s, memory_mb_used=%s, free_ram_mb=%s, running_vms=%s, stats=%s WHERE compute_nodes.id = %s' (datetime.datetime(2014, 2, 12, 21, 2, 12, 395978), 4, 1216, 6737, 4, {u'num_task_None': 2, u'io_workload': 0, u'num_instances': 2, u'num_vm_active': 1, u'num_task_scheduling': 0, u'num_vm_building': 0, u'num_proj_d0e1e781676f4fe5b1b81e31b8ae87de': 1, u'num_vcpus_used': 2, u'num_proj_a8a2f9c3e3bd44edb1c5fd2ae4cc7b3c': 1, u'num_os_type_None': 2, u'num_vm_error': 1}, 1) 2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base Traceback (most recent call last): 2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base File "/opt/stack/new/oslo.messaging/oslo/messaging/_executors/base.py", line 36, in _dispatch 2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base incoming.reply(self.callback(incoming.ctxt, incoming.message)) 2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base File "/opt/stack/new/oslo.messaging/oslo/messaging/rpc/dispatcher.py", line 134, in __call__ 2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base return self._dispatch(endpoint, method, ctxt, args) 2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base File "/opt/stack/new/oslo.messaging/oslo/messaging/rpc/dispatcher.py", line 104, in _dispatch 2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base result = getattr(endpoint, method)(ctxt, **new_args) 2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base File "/opt/stack/new/nova/nova/conductor/manager.py", line 458, in compute_node_update 2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base result = self.db.compute_node_update(context, node['id'], values) 2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base File "/opt/stack/new/nova/nova/db/api.py", line 228, in compute_node_update 2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base return IMPL.compute_node_update(context, compute_id, values) 2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base File "/opt/stack/new/nova/nova/db/sqlalchemy/api.py", line 110, in wrapper 2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base return f(*args, **kwargs) 2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base File "/opt/stack/new/nova/nova/db/sqlalchemy/api.py", line 166, in wrapped 2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base return f(*args, **kwargs) 2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base File "/opt/stack/new/nova/nova/db/sqlalchemy/api.py", line 614, in compute_node_update 2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base compute_ref.update(values) 2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base File 
"/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 456, in __exit__ 2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base self.commit() 2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 368, in commit 2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base self._prepare_impl() 2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 347, in _prepare_impl 2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base self.session.flush() 2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base File "/opt/stack/new/nova/nova/openstack/common/db/sqlalchemy/session.py", line 616, in _wrap 2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base raise exception.DBError(e) 2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base DBError: (ProgrammingError) (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ': '2', u'io_workload': '0', u'num_instances': '2', u'num_vm_building': '0', u'nu' at line 1") 'UPDATE compute_nodes SET updated_at=%s, vcpus_used=%s, memory_mb_used=%s, free_ram_mb=%s, running_vms=%s, st
[Yahoo-eng-team] [Bug 1284312] [NEW] vmware driver races to create instance images
Public bug reported: Change Ia0ebd674345734e7cfa31ccd400fdba93646c554 traded one race condition for another. By ignoring all mkdir() calls that would otherwise fail because an instance directory already exists, two nodes racing to create a single image will corrupt or lose data, or fail in a strange way. This call should fail in that case, but doesn't after the recent patch was merged: https://github.com/openstack/nova/blob/master/nova/virt/vmwareapi/vmops.py#L350 ** Affects: nova Importance: High Assignee: Shawn Hartsock (hartsock) Status: New ** Affects: openstack-vmwareapi-team Importance: Critical Status: New ** Tags: vmware -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1284312 Title: vmware driver races to create instance images Status in OpenStack Compute (Nova): New Status in The OpenStack VMwareAPI subTeam: New Bug description: Change Ia0ebd674345734e7cfa31ccd400fdba93646c554 traded one race condition for another. By ignoring all mkdir() calls that would otherwise fail because an instance directory already exists, two nodes racing to create a single image will corrupt or lose data, or fail in a strange way. This call should fail in that case, but doesn't after the recent patch was merged: https://github.com/openstack/nova/blob/master/nova/virt/vmwareapi/vmops.py#L350 To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1284312/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
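For bug 1284312 above, in plain filesystem terms (hypothetical helper, not the vmwareapi code), the distinction the report asks for looks like this: treat "directory already exists" as losing the race, not as success.

    import errno
    import os

    def claim_image_dir(path):
        """Return True only if we created, and therefore own, the directory."""
        try:
            os.makedirs(path)
            return True
        except OSError as exc:
            if exc.errno == errno.EEXIST:
                # Another worker won the race; let it finish the fetch rather
                # than writing into its half-populated image directory.
                return False
            raise

    if claim_image_dir('/tmp/example-image-cache'):
        print('we own the directory; fetch the image here')
    else:
        print('another worker is fetching; wait for it to complete')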
[Yahoo-eng-team] [Bug 1250300] Re: chinese secgroup description make nova list failed
Original poster confirms this is no longer a problem ** Changed in: nova Status: Incomplete => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1250300 Title: chinese secgroup description make nova list failed Status in OpenStack Compute (Nova): Invalid Bug description: I create a secgroup with chinese description just as following: hzguanqiang@debian:/data/log/nova$ nova secgroup-list ++--+-+ | Id | Name | Description | ++--+-+ | 11 | bingoxxx | 无 | Then I create an instance with this secgroup, It report an 500 error. And when I execute 'nova list' command, it failed with such error info in nova-api.log: 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack Traceback (most recent call last): 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack File "/usr/local/lib/python2.7/dist-packages/nova/api/openstack/__init__.py", line 111, in __call__ 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack return req.get_response(self.application) 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack File "/usr/local/lib/python2.7/dist-packages/webob/request.py", line 1053, in get_response 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack application, catch_exc_info=False) 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack File "/usr/local/lib/python2.7/dist-packages/webob/request.py", line 1022, in call_application 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack app_iter = application(self.environ, start_response) 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack File "/usr/local/lib/python2.7/dist-packages/webob/dec.py", line 159, in __call__ 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack return resp(environ, start_response) 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack File "/usr/local/lib/python2.7/dist-packages/keystoneclient/middleware/auth_token.py", line 571, in __call__ 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack return self.app(env, start_response) 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack File "/usr/local/lib/python2.7/dist-packages/webob/dec.py", line 159, in __call__ 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack return resp(environ, start_response) 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack File "/usr/local/lib/python2.7/dist-packages/webob/dec.py", line 159, in __call__ 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack return resp(environ, start_response) 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack File "/usr/local/lib/python2.7/dist-packages/routes/middleware.py", line 131, in __call__ 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack response = self.app(environ, start_response) 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack File "/usr/local/lib/python2.7/dist-packages/webob/dec.py", line 159, in __call__ 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack return resp(environ, start_response) 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack File "/usr/local/lib/python2.7/dist-packages/webob/dec.py", line 147, in __call__ 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack resp = self.call_func(req, *args, **self.kwargs) 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack File "/usr/local/lib/python2.7/dist-packages/webob/dec.py", line 208, in call_func 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack return self.func(req, *args, **kwargs) 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack File 
"/usr/local/lib/python2.7/dist-packages/nova/api/openstack/wsgi.py", line 904, in __call__ 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack content_type, body, accept) 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack File "/usr/local/lib/python2.7/dist-packages/nova/api/openstack/wsgi.py", line 963, in _process_stack 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack action_result = self.dispatch(meth, request, action_args) 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack File "/usr/local/lib/python2.7/dist-packages/nova/api/openstack/wsgi.py", line 1044, in dispatch 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack return method(req=request, **action_args) 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack File "/usr/local/lib/python2.7/dist-packages/nova/api/openstack/compute/servers.py", line 505, in detail 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack servers = self._get_servers(req, is_detail=True) 2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack File "/usr/local/lib/python2.7/dist-packages/nova/api/openstack/compute/servers.py", line 567, in _get_servers 2013-11-12 11:12:24.137 26386 TRACE nova.api.op