[Yahoo-eng-team] [Bug 1503708] [NEW] InstanceV2 backports to V1 lack a context

2015-10-07 Thread Dan Smith
Public bug reported:

When we convert a V2 instance to a V1 instance, we don't provide it a
context, which could, under some circumstances, cause a failure to lazy-
load things we need to construct the older instance.
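
A rough sketch of the failure mode, with made-up class names standing in for
the real Nova objects (this is not the actual backport code):

    class OrphanedObjectError(Exception):
        """Stand-in for the error raised when lazy-loading without a context."""


    class FakeInstance(object):
        def __init__(self, context=None, **fields):
            self._context = context
            self._fields = fields

        def __getattr__(self, name):
            # Simulates obj_load_attr(): lazy-loading a missing field needs
            # a context to reach the database.
            if name.startswith('_'):
                raise AttributeError(name)
            if name not in self._fields:
                if self._context is None:
                    raise OrphanedObjectError('cannot lazy-load %s' % name)
                self._fields[name] = 'loaded-%s' % name  # pretend DB fetch
            return self._fields[name]


    def backport_to_v1(v2_instance):
        # The fix is simply to carry the context across the conversion so
        # that any lazy-load triggered while building the V1 payload works.
        return FakeInstance(context=v2_instance._context,
                            **v2_instance._fields)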

** Affects: nova
 Importance: High
 Assignee: Dan Smith (danms)
 Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1503708

Title:
  InstanceV2 backports to V1 lack a context

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  When we convert a V2 instance to a V1 instance, we don't provide it a
  context, which could, under some circumstances, cause a failure to
  lazy-load things we need to construct the older instance.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1503708/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1506089] [NEW] Nova incorrectly calculates service version

2015-10-14 Thread Dan Smith
Public bug reported:

Nova will incorrectly calculate the service version from the database,
resulting in improper upgrade decisions like automatic compute rpc
version pinning.

For a dump that looks like this:

created_at           updated_at           deleted_at  id  host                                binary          topic      report_count  disabled  deleted  disabled_reason  last_seen_up         forced_down  version
2015-10-13 23:42:34  2015-10-13 23:50:39  NULL        1   devstack-trusty-hpcloud-b2-5398906  nova-conductor  conductor  49            0         0        NULL             2015-10-13 23:50:39  0            2
2015-10-13 23:42:34  2015-10-13 23:50:39  NULL        2   devstack-trusty-hpcloud-b2-5398906  nova-cert       cert       49            0         0        NULL             2015-10-13 23:50:39  0            2
2015-10-13 23:42:34  2015-10-13 23:50:39  NULL        3   devstack-trusty-hpcloud-b2-5398906  nova-scheduler  scheduler  49            0         0        NULL             2015-10-13 23:50:39  0            2
2015-10-13 23:42:34  2015-10-13 23:50:40  NULL        4   devstack-trusty-hpcloud-b2-5398906  nova-compute    compute    49            0         0        NULL             2015-10-13 23:50:40  0            2
2015-10-13 23:42:44  2015-10-13 23:50:39  NULL        5   devstack-trusty-hpcloud-b2-5398906  nova-network    network    48            0         0        NULL             2015-10-13 23:50:39  0            2

Even though all versions are 2, this is what appears in the logs of any
service that loads the compute rpcapi module:

2015-10-13 23:56:05.149 INFO nova.compute.rpcapi [req-
d3601f93-73a2-4427-91d0-bb5964002592 None None] Automatically selected
compute RPC version 4.0 from minimum service version 0

This is clearly wrong: the minimum service_version should be 2, not 0.
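
For illustration only (this is the shape of the intended calculation, not
Nova's actual query): the minimum must be taken over the version column of
the relevant rows, and only an empty result should fall back to 0.

    def minimum_service_version(rows, binary=None):
        """Return the lowest 'version' among the (optionally filtered) rows."""
        versions = [row['version'] for row in rows
                    if binary is None or row['binary'] == binary]
        # Rows that all carry version 2 must yield 2; 0 is only correct
        # when there are no matching rows at all.
        return min(versions) if versions else 0


    rows = [{'binary': 'nova-conductor', 'version': 2},
            {'binary': 'nova-compute', 'version': 2},
            {'binary': 'nova-network', 'version': 2}]
    assert minimum_service_version(rows, binary='nova-compute') == 2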

** Affects: nova
 Importance: Medium
 Assignee: Dan Smith (danms)
 Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1506089

Title:
  Nova incorrectly calculates service version

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  Nova will incorrectly calculate the service version from the database,
  resulting in improper upgrade decisions like automatic compute rpc
  version pinning.

  For a dump that looks like this:

  created_at           updated_at           deleted_at  id  host                                binary          topic      report_count  disabled  deleted  disabled_reason  last_seen_up         forced_down  version
  2015-10-13 23:42:34  2015-10-13 23:50:39  NULL        1   devstack-trusty-hpcloud-b2-5398906  nova-conductor  conductor  49            0         0        NULL             2015-10-13 23:50:39  0            2
  2015-10-13 23:42:34  2015-10-13 23:50:39  NULL        2   devstack-trusty-hpcloud-b2-5398906  nova-cert       cert       49            0         0        NULL             2015-10-13 23:50:39  0            2
  2015-10-13 23:42:34  2015-10-13 23:50:39  NULL        3   devstack-trusty-hpcloud-b2-5398906  nova-scheduler  scheduler  49            0         0        NULL             2015-10-13 23:50:39  0            2
  2015-10-13 23:42:34  2015-10-13 23:50:40  NULL        4   devstack-trusty-hpcloud-b2-5398906  nova-compute    compute    49            0         0        NULL             2015-10-13 23:50:40  0            2
  2015-10-13 23:42:44  2015-10-13 23:50:39  NULL        5   devstack-trusty-hpcloud-b2-5398906  nova-network    network    48            0         0        NULL             2015-10-13 23:50:39  0            2

  Even though all versions are 2, this is what appears in the logs of any
  service that loads the compute rpcapi module:

  2015-10-13 23:56:05.149 INFO nova.compute.rpcapi [req-
  d3601f93-73a2-4427-91d0-bb5964002592 None None] Automatically selected
  compute RPC version 4.0 from minimum service version 0

  This is clearly wrong: the minimum service_version should be 2, not 0.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1506089/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1351020] [NEW] FloatingIP fails to load from database when not associated

2014-07-31 Thread Dan Smith
Public bug reported:

A FloatingIP may not be associated with a FixedIP, in which case its
fixed_ip field in the database model is None. Currently, FloatingIP's
_from_db_object() method always assumes the field is non-None and thus
tries to load a FixedIP from None, which fails.
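
As a rough sketch of the guard that is needed (hypothetical helper names,
not the real nova.objects code), _from_db_object() should only build a
FixedIP when the joined column is actually present:

    def build_fixed_ip(db_fixed_ip):
        """Stand-in for FixedIP._from_db_object()."""
        return {'address': db_fixed_ip['address']}


    def floating_ip_from_db_object(floating_ip, db_floatingip):
        """Stand-in for FloatingIP._from_db_object() with the None guard."""
        floating_ip['address'] = db_floatingip['address']
        fixed = db_floatingip.get('fixed_ip')
        # Unassociated floating IPs have fixed_ip=None in the DB model, so
        # only build a FixedIP object when there is something to load.
        floating_ip['fixed_ip'] = build_fixed_ip(fixed) if fixed else None
        return floating_ip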

** Affects: nova
 Importance: Undecided
 Assignee: Dan Smith (danms)
 Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1351020

Title:
  FloatingIP fails to load from database when not associated

Status in OpenStack Compute (Nova):
  In Progress

Bug description:
  A FloatingIP can be not associated with an FixedIP, which will cause
  its fixed_ip field in the database model to be None. Currently,
  FloatingIP's _from_db_object() method always assumes it's non-None and
  thus tries to load a FixedIP from None, which fails.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1351020/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1360320] [NEW] Unit tests fail in handle_schedule_error with wrong instance

2014-08-22 Thread Dan Smith
Public bug reported:

From http://logs.openstack.org/70/113270/3/check/gate-nova-python26/038b3fa/console.html:

2014-08-21 20:08:33.507 | Traceback (most recent call last):
2014-08-21 20:08:33.507 |   File "nova/tests/conductor/test_conductor.py", line 1343, in test_build_instances_scheduler_failure
2014-08-21 20:08:33.507 |     legacy_bdm=False)
2014-08-21 20:08:33.507 |   File "nova/conductor/rpcapi.py", line 415, in build_instances
2014-08-21 20:08:33.507 |     cctxt.cast(context, 'build_instances', **kw)
2014-08-21 20:08:33.508 |   File "/home/jenkins/workspace/gate-nova-python26/.tox/py26/lib/python2.6/site-packages/oslo/messaging/rpc/client.py", line 152, in call
2014-08-21 20:08:33.508 |     retry=self.retry)
2014-08-21 20:08:33.508 |   File "/home/jenkins/workspace/gate-nova-python26/.tox/py26/lib/python2.6/site-packages/oslo/messaging/transport.py", line 90, in _send
2014-08-21 20:08:33.508 |     timeout=timeout, retry=retry)
2014-08-21 20:08:33.508 |   File "/home/jenkins/workspace/gate-nova-python26/.tox/py26/lib/python2.6/site-packages/oslo/messaging/_drivers/impl_fake.py", line 194, in send
2014-08-21 20:08:33.508 |     return self._send(target, ctxt, message, wait_for_reply, timeout)
2014-08-21 20:08:33.508 |   File "/home/jenkins/workspace/gate-nova-python26/.tox/py26/lib/python2.6/site-packages/oslo/messaging/_drivers/impl_fake.py", line 181, in _send
2014-08-21 20:08:33.508 |     raise failure
2014-08-21 20:08:33.509 | UnexpectedMethodCallError: Unexpected method call.  unexpected:-  expected:+
2014-08-21 20:15:52.443 | - handle_schedule_error.__call__(, NoValidHost(u'No valid host was found. fake-reason',), '8eb9d649-0985-4350-8946-570ce100534c', {'instance_properties': Instance(access_ip_v4=None,access_ip_v6=None,architecture=None,auto_disk_config=False,availability_zone=None,cell_name=None,cleaned=False,config_drive=None,created_at=1955-11-05T00:00:00Z,default_ephemeral_device=None,default_swap_device=None,deleted=False,deleted_at=None,disable_terminate=False,display_description=None,display_name=None,ephemeral_gb=0,ephemeral_key_uuid=None,fault=,host='fake-host',hostname=None,id=1,image_ref=None,info_cache=,instance_type_id=None,kernel_id=None,key_data=None,key_name=None,launch_index=None,launched_at=None,launched_on=None,locked=False,locked_by=None,memory_mb=None,metadata=,node=None,os_type=None,pci_devices=,power_state=None,progress=None,project_id='fake-project',ramdisk_id=None,reservation_id=None,root_device_name=None,root_gb=0,scheduled_at=None,security_groups=,shutdown_terminate=False,system_metadata=,task_state=None,terminated_at=None,updated_at=None,user_data=None,user_id='fake-user',uuid=8eb9d649-0985-4350-8946-570ce100534c,vcpus=None,vm_mode=None,vm_state=None), 'fake': 'specs'}) -> None
libvir:  error : internal error could not initialize domain event timer
2014-08-21 20:16:46.065 | Exception TypeError: "'NoneType' object is not callable" in > ignored
2014-08-21 20:19:45.254 | + handle_schedule_error.__call__(, NoValidHost(u'No valid host was found. fake-reason',), '712006a3-7ca5-4350-8f7f-028a9e4c78b2', {'instance_properties': Instance(access_ip_v4=None,access_ip_v6=None,architecture=None,auto_disk_config=False,availability_zone=None,cell_name=None,cleaned=False,config_drive=None,created_at=1955-11-05T00:00:00Z,default_ephemeral_device=None,default_swap_device=None,deleted=False,deleted_at=None,disable_terminate=False,display_description=None,display_name=None,ephemeral_gb=0,ephemeral_key_uuid=None,fault=,host='fake-host',hostname=None,id=1,image_ref=None,info_cache=,instance_type_id=None,kernel_id=None,key_data=None,key_name=None,launch_index=None,launched_at=None,launched_on=None,locked=False,locked_by=None,memory_mb=None,metadata=,node=None,os_type=None,pci_devices=,power_state=None,progress=None,project_id='fake-project',ramdisk_id=None,reservation_id=None,root_device_name=None,root_gb=0,scheduled_at=None,security_groups=,shutdown_terminate=False,system_metadata=,task_state=None,terminated_at=None,updated_at=None,user_data=None,user_id='fake-user',uuid=8eb9d649-0985-4350-8946-570ce100534c,vcpus=None,vm_mode=None,vm_state=None), 'fake': 'specs'}) -> None

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1360320

Title:
  Unit tests fail in handle_schedule_error with wrong instance

Status in OpenStack Compute (Nova):
  New

Bug description:
  From http://logs.openstack.org/70/113270/3/check/gate-nova-python26/038b3fa/console.html:

  2014-08-21 20:08:33.507 | Traceback (most recent call last):
  2014-08-21 20:08:33.507 |   File "nova/tests/conductor/test_conductor.py", line 1343, in test_build_instances_scheduler_failure
  2014-08-21 20:08:33.507 |     legacy_bdm=False)
  2014-08-21 20:08:33.507 |   File "nova/conductor/rpca

[Yahoo-eng-team] [Bug 1360333] [NEW] Object hash test fails to detect changes when serialize_args is used

2014-08-22 Thread Dan Smith
Public bug reported:

The object hash test will fail to detect method signature changes when
something like the serialize_args decorator is used. The test needs to
drill down until it finds the remotable level and do the calculation
there.
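
A sketch of the "drill down" idea, assuming decorators expose the function
they wrap via __wrapped__ (as functools.wraps does); the real fix would key
off Nova's own remotable wrappers rather than __wrapped__:

    import inspect


    def innermost(func):
        # Walk through decorator layers (e.g. serialize_args) until we
        # reach the underlying function whose signature actually matters.
        while hasattr(func, '__wrapped__'):
            func = func.__wrapped__
        return func


    def signature_fingerprint(func):
        # Hashing the outermost wrapper would only ever see (*args, **kwargs),
        # so a signature change on the real method would go undetected.
        return str(inspect.signature(innermost(func)))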

** Affects: nova
 Importance: Low
 Assignee: Dan Smith (danms)
 Status: Confirmed


** Tags: testing unified-objects

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1360333

Title:
  Object hash test fails to detect changes when serialize_args is used

Status in OpenStack Compute (Nova):
  Confirmed

Bug description:
  The object hash test will fail to detect method signature changes when
  something like the serialize_args decorator is used. The test needs to
  drill down until it finds the remotable level and do the calculation
  there.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1360333/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1361683] [NEW] Instance pci_devices and security_groups refreshing can break backporting

2014-08-26 Thread Dan Smith
Public bug reported:

In the Instance object, on a remotable operation such as save(), we
refresh the pci_devices and security_groups with the information we get
back from the database. Since this *replaces* the objects currently
attached to the instance object (which might be backlevel) with current
versions, an older client could get a failure upon deserializing the
result.

We need to figure out some way to either backport the results of
remotable methods, or put matching backlevel objects into the instance
during the refresh in the first place.
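
A self-contained sketch of the second option, using hypothetical classes;
downgrade() stands in for the real obj_make_compatible() machinery:

    class FakePciDeviceList(object):
        def __init__(self, version, devices):
            self.version = version
            self.devices = devices

        def downgrade(self, target_version):
            # The real code would strip any fields the target version does
            # not know about; here we just relabel the object.
            return FakePciDeviceList(target_version, self.devices)


    def refresh_field(instance, field, fresh, client_version):
        """Attach `fresh` to `instance`, downgraded to what the client sent."""
        if client_version < fresh.version:
            fresh = fresh.downgrade(client_version)
        setattr(instance, field, fresh)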

** Affects: nova
 Importance: Medium
 Assignee: Dan Smith (danms)
 Status: Confirmed


** Tags: unified-objects

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1361683

Title:
  Instance pci_devices and security_groups refreshing can break
  backporting

Status in OpenStack Compute (Nova):
  Confirmed

Bug description:
  In the Instance object, on a remotable operation such as save(), we
  refresh the pci_devices and security_groups with the information we
  get back from the database. Since this *replaces* the objects
  currently attached to the instance object (which might be backlevel)
  with current versions, an older client could get a failure upon
  deserializing the result.

  We need to figure out some way to either backport the results of
  remotable methods, or put matching backlevel objects into the
  instance during the refresh in the first place.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1361683/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1155800] Re: Cannot delete / confirm / revert resize an instance if nova-compute crashes after VERIFY_RESIZE

2014-09-16 Thread Dan Smith
This is super old, lots has changed since then, and several folks have
not been able to reproduce. Please re-open if this is still valid.

** Changed in: nova
   Importance: High => Undecided

** Changed in: nova
   Status: Triaged => Invalid

** Changed in: nova
 Assignee: Dan Smith (danms) => (unassigned)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1155800

Title:
  Cannot delete / confirm / revert resize an instance if nova-compute
  crashes after VERIFY_RESIZE

Status in OpenStack Compute (Nova):
  Invalid

Bug description:
  How to reproduce the bug:

  nova boot ... vm1
  nova migrate vm1 (or resize)

  wait for the vm status to reach VERIFY_RESIZE

  stop nova-compute on the host where vm1 is running

  nova delete vm1

  Error: The server has either erred or is incapable of performing the
  requested operation. (HTTP 500) (Request-ID: req-be1379bc-6a5b-
  41f5-a554-60e02acfdb79)

  quickly restart the nova-compute service, before its status becomes "XXX" in:
  nova-manage service list

  Note: the vm is still running on the hypervisor.

  nova show vm1
  VM status is still: VERIFY_RESIZE

  nova resize-confirm vm1

  ERROR: Cannot 'confirmResize' while instance is in task_state deleting
  (HTTP 409) (Request-ID: req-9660c776-ebc3-4397-a8e2-7ad83e8b6a0f)

  nova resize-revert vm1

  ERROR: Cannot 'revertResize' while instance is in task_state deleting
  (HTTP 409) (Request-ID: req-3cf0141b-ee3d-478f-8aa0-89091028a227)

  nova delete vm1

  The server has either erred or is incapable of performing the
  requested operation. (HTTP 500) (Request-ID: req-2cb17333-6cc9-42ca-
  baaa-da88ec90153f)

  nova-api log when running nova delete:
  http://paste.openstack.org/show/33783/

  Notes:

  Tests have been performed using the Hyper-V driver, but the issue
  seems to be unrelated to the driver.

  After stopping nova-compute and waiting long enough for the service to
  be marked as XXX in "nova-manage service list", issuing "nova delete
  vm1" succeeds.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1155800/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1370536] [NEW] DB migrations can go unchecked

2014-09-17 Thread Dan Smith
Public bug reported:

Currently DB migrations can be added to the tree without the
corresponding migration tests. This is bad and means that we have some
that are untested in the tree already.
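
A sketch of the kind of guard test that would keep these in sync; the
repository path and the _check_NNN naming convention below are assumptions
about how such a test could be wired up, not the eventual implementation:

    import os
    import re
    import unittest

    MIGRATE_REPO = 'nova/db/sqlalchemy/migrate_repo/versions'  # assumed path


    class TestMigrationsAreChecked(unittest.TestCase):
        def test_every_migration_has_a_check(self):
            scripts = [f for f in os.listdir(MIGRATE_REPO)
                       if re.match(r'^\d+_.+\.py$', f)]
            missing = [s for s in scripts
                       if not hasattr(self, '_check_%s' % s.split('_', 1)[0])]
            self.assertEqual([], missing,
                             'migrations without a _check_NNN test: %s'
                             % missing)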

** Affects: nova
 Importance: Medium
 Assignee: Dan Smith (danms)
 Status: In Progress


** Tags: db

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1370536

Title:
  DB migrations can go unchecked

Status in OpenStack Compute (Nova):
  In Progress

Bug description:
  Currently DB migrations can be added to the tree without the
  corresponding migration tests. This is bad and means that we have some
  that are untested in the tree already.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1370536/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1198142] Re: server fails to start after a "stop" action

2013-09-16 Thread Dan Smith
I'm marking this as invalid given my last findings and the lack of any
response. We can reopen it if new details become available.

** Changed in: nova
   Status: Triaged => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1198142

Title:
  server fails to start after a "stop" action

Status in OpenStack Compute (Nova):
  Invalid

Bug description:
  After a shutoff operation, the server fails to start, even though
  "Success: Start Instance" is reported in the GUI. The issue is
  reproducible via the CLI as well.

  Steps to reproduce:

  1. Stop the server using command

  # nova stop 

  2. Start the server back after the server status shows "SHUTOFF"
  # nova start 

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1198142/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1201784] Re: Resize doesn't fail when the operation doesn't complete

2013-09-17 Thread Dan Smith
This *is* by design because the call to start the resize is cast-ed
(like almost everything else) from the api node and returns immediately.
We don't know that it failed until potentially much later. I'm going to
mark this as invalid, but if I'm missing something, please reopen.

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1201784

Title:
  Resize doesn't fail when the operation doesn't complete

Status in OpenStack Compute (Nova):
  Invalid

Bug description:
  I've noticed nova resize doesn't fail on the client side when the
  server doesn't actually do the resize. 2 examples:

   * resizing to a flavor with too much RAM. The scheduler can't find a
  host, but the command line call succeeds, and the server state stays
  the same.

   * resizing a shutdown server, where nothing seems to be happening.

  Using devstack and latest master.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1201784/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1201873] Re: dnsmasq does not use -h, so /etc/hosts sends folks to loopback when they look up the machine it's running on

2013-09-17 Thread Dan Smith
This sounds like a JuJu problem to me :)

IMHO, /etc/hosts should not redirect $HOSTNAME to anything other than a
routable external interface in a real environment with working DNS.
Assuming your machine is not called "localhost" I think that this is a
configuration issue.

** Changed in: nova
   Status: New => Opinion

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1201873

Title:
  dnsmasq does not use -h, so /etc/hosts sends folks to loopback when
  they look up the machine it's running on

Status in OpenStack Compute (Nova):
  Opinion

Bug description:
   from dnsmasq(8):

-h, --no-hosts
Don't read the hostnames in /etc/hosts.

  
  I reliably get bit by this during certain kinds of deployments, where my 
nova-network/dns host has an entry in /etc/hosts such as:

  127.0.1.1    hostname.example.com hostname

  I keep having to edit /etc/hosts on that machine to use a real IP,
  because juju gets really confused when it looks up certain openstack
  hostnames and gets sent to its own instance!

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1201873/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1258256] [NEW] Live upgrade from Havana broken by commit 62e9829

2013-12-05 Thread Dan Smith
Public bug reported:

Commit 62e9829 inadvertently broke live upgrades from Havana to master.
This was not really related to the patch itself, other than that it
bumped the Instance version which uncovered a bunch of issues in the
object infrastructure that weren't yet ready to handle this properly.

** Affects: nova
 Importance: Medium
 Assignee: Dan Smith (danms)
 Status: Confirmed


** Tags: unified-objects

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1258256

Title:
  Live upgrade from Havana broken by commit 62e9829

Status in OpenStack Compute (Nova):
  Confirmed

Bug description:
  Commit 62e9829 inadvertently broke live upgrades from Havana to
  master. This was not really related to the patch itself, other than
  that it bumped the Instance version which uncovered a bunch of issues
  in the object infrastructure that weren't yet ready to handle this
  properly.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1258256/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1265607] [NEW] Instance.refresh() sends new info_cache objects

2014-01-02 Thread Dan Smith
Public bug reported:

If an older node does an Instance.refresh() it will fail because
conductor will overwrite the info_cache field with a new
InstanceInfoCache object. This happens during the LifecycleEvent handler
in nova-compute.

** Affects: nova
 Importance: Undecided
 Assignee: Dan Smith (danms)
 Status: Confirmed


** Tags: unified-objects

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1265607

Title:
  Instance.refresh() sends new info_cache objects

Status in OpenStack Compute (Nova):
  Confirmed

Bug description:
  If an older node does an Instance.refresh() it will fail because
  conductor will overwrite the info_cache field with a new
  InstanceInfoCache object. This happens during the LifecycleEvent
  handler in nova-compute.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1265607/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1265618] [NEW] image_snapshot_pending state breaks havana nodes

2014-01-02 Thread Dan Smith
Public bug reported:

Icehouse introduced a task state called image_snapshot_pending which Havana
nodes do not understand. If they call save() with
expected_task_state="image_snapshot", they will crash on the new state.
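
A minimal illustration of the compatibility problem (simplified constants
and a save() stand-in, not the real objects code): the new node moves the
instance into a task state that an old node cannot name in
expected_task_state, so the old node's save() blows up.

    IMAGE_SNAPSHOT = 'image_snapshot'
    IMAGE_SNAPSHOT_PENDING = 'image_snapshot_pending'


    class UnexpectedTaskStateError(Exception):
        pass


    def save(instance, expected_task_state):
        """Stand-in for Instance.save() with an expected_task_state check."""
        if not isinstance(expected_task_state, (list, tuple)):
            expected_task_state = [expected_task_state]
        if instance['task_state'] not in expected_task_state:
            raise UnexpectedTaskStateError(instance['task_state'])
        # ... persist the instance ...


    # A Havana node only knows the old constant, so once an Icehouse node
    # has set the new state the call below raises, as in the traceback
    # that follows.
    instance = {'task_state': IMAGE_SNAPSHOT_PENDING}
    try:
        save(instance, expected_task_state=IMAGE_SNAPSHOT)
    except UnexpectedTaskStateError:
        pass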

2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File 
"/opt/upstack/nova-havana/local/lib/python2.7/site-packages/nova/compute/manager.py",
 line 2341, in _snapshot_instance
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp 
update_task_state)
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File 
"/opt/upstack/nova-havana/local/lib/python2.7/site-packages/nova/virt/libvirt/driver.py",
 line 1386, in snapshot
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp 
update_task_state(task_state=task_states.IMAGE_PENDING_UPLOAD)
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File 
"/opt/upstack/nova-havana/local/lib/python2.7/site-packages/nova/compute/manager.py",
 line 2338, in update_task_state
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp 
instance.save(expected_task_state=expected_state)
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File 
"/opt/upstack/nova-havana/local/lib/python2.7/site-packages/nova/objects/base.py",
 line 139, in wrapper
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp ctxt, self, 
fn.__name__, args, kwargs)
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File 
"/opt/upstack/nova-havana/local/lib/python2.7/site-packages/nova/conductor/rpcapi.py",
 line 497, in object_action
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp 
objmethod=objmethod, args=args, kwargs=kwargs)
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File 
"/opt/upstack/nova-havana/local/lib/python2.7/site-packages/nova/rpcclient.py", 
line 85, in call
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp return 
self._invoke(self.proxy.call, ctxt, method, **kwargs)
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File 
"/opt/upstack/nova-havana/local/lib/python2.7/site-packages/nova/rpcclient.py", 
line 63, in _invoke
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp return 
cast_or_call(ctxt, msg, **self.kwargs)
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File 
"/opt/upstack/nova-havana/local/lib/python2.7/site-packages/nova/openstack/common/rpc/proxy.py",
 line 126, in call
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp result = 
rpc.call(context, real_topic, msg, timeout)
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File 
"/opt/upstack/nova-havana/local/lib/python2.7/site-packages/nova/openstack/common/rpc/__init__.py",
 line 139, in call
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp return 
_get_impl().call(CONF, context, topic, msg, timeout)
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File 
"/opt/upstack/nova-havana/local/lib/python2.7/site-packages/nova/openstack/common/rpc/impl_kombu.py",
 line 816, in call
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp 
rpc_amqp.get_connection_pool(conf, Connection))
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File 
"/opt/upstack/nova-havana/local/lib/python2.7/site-packages/nova/openstack/common/rpc/amqp.py",
 line 574, in call
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp rv = list(rv)
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File 
"/opt/upstack/nova-havana/local/lib/python2.7/site-packages/nova/openstack/common/rpc/amqp.py",
 line 539, in __iter__
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp raise result
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp 
UnexpectedTaskStateError_Remote: Unexpected task state: expecting 
(u'image_snapshot',) but the actual state is image_snapshot_pending
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp Traceback (most 
recent call last):
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp 
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File 
"/opt/upstack/nova/nova/conductor/manager.py", line 576, in _object_dispatch
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp return 
getattr(target, method)(context, *args, **kwargs)
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp 
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File 
"/opt/upstack/nova/nova/objects/base.py", line 152, in wrapper
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp return 
fn(self, ctxt, *args, **kwargs)
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp 
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp   File 
"/opt/upstack/nova/nova/objects/instance.py", line 459, in save
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp 
columns_to_join=_expected_cols(expected_attrs))
2014-01-02 11:58:46.766 TRACE nova.openstack.common.rpc.amqp

[Yahoo-eng-team] [Bug 981263] Re: Nova API should present deleted flavors (instance_types) in some cases

2013-02-08 Thread Dan Smith
This was fixed at some point, probably after several recent changes, and
is no longer an issue according to the reporter.

** Changed in: nova
   Status: Triaged => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/981263

Title:
  Nova API should present deleted flavors (instance_types) in some cases

Status in OpenStack Compute (Nova):
  Invalid

Bug description:
  In certain cases Nova API should return instance flavors
  (instance_types) that are deleted.  Notably if there is an instance
  that is "active" and the flavor has been deleted, we need to pull the
  instance_type data down to ensure that we can apply network specifics
  attached to that instance_type on startup of nova-compute.

  The second case that a deleted flavor should be returned is if the
  instance_type is being requested by ID, as IDs should not be reused.
  This is important for Horizon to be able to properly retrieve
  "instances" for a given project (in Nova Dashboard and Syspanel
  Dashboard).

  Example traceback you can see if you delete a flavor and restart nova
  compute:

  resource: 'NoneType' object is not subscriptable
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi Traceback (most recent call 
last):
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi   File 
"/opt/stack/nova/nova/api/openstack/wsgi.py", line 851, in _process_stack
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi action_result = 
self.dispatch(meth, request, action_args)
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi   File 
"/opt/stack/nova/nova/api/openstack/wsgi.py", line 926, in dispatch
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi return 
method(req=request, **action_args)
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi   File 
"/opt/stack/nova/nova/api/openstack/compute/servers.py", line 382, in detail
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi servers = 
self._get_servers(req, is_detail=True)
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi   File 
"/opt/stack/nova/nova/api/openstack/compute/servers.py", line 465, in 
_get_servers
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi return 
self._view_builder.detail(req, limited_list)
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi   File 
"/opt/stack/nova/nova/api/openstack/compute/views/servers.py", line 123, in 
detail
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi return 
self._list_view(self.show, request, instances)
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi   File 
"/opt/stack/nova/nova/api/openstack/compute/views/servers.py", line 127, in 
_list_view
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi server_list = 
[func(request, server)["server"] for server in servers]
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi   File 
"/opt/stack/nova/nova/api/openstack/compute/views/servers.py", line 61, in 
wrapped
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi return func(self, 
request, instance)
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi   File 
"/opt/stack/nova/nova/api/openstack/compute/views/servers.py", line 97, in show
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi "flavor": 
self._get_flavor(request, instance),
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi   File 
"/opt/stack/nova/nova/api/openstack/compute/views/servers.py", line 172, in 
_get_flavor
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi flavor_id = 
instance["instance_type"]["flavorid"]
  2012-04-13 19:31:18 TRACE nova.api.openstack.wsgi TypeError: 'NoneType' 
object is not subscriptable

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/981263/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1089386] Re: destroying an instance not possible if a broken cinder volume is attached

2013-02-25 Thread Dan Smith
Unable to reproduce and original submitter unable to provide more
information.

** Changed in: nova
   Status: Incomplete => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1089386

Title:
  destroying an instance not possible if a broken cinder volume is
  attached

Status in OpenStack Compute (Nova):
  Invalid

Bug description:
  I just managed to get an instance (in state SHUTOFF) in nova with an
  attached cinder volume that is no longer available.

  It's not possible to destroy this instance; I get an exception in
  class ComputeManager in method _shutdown_instance (file
  nova/compute/manager.py).

  The problem is the call to cinder to detach the volume, which fails
  because the volume no longer exists.

  The exception (cinder's ClientException) is not handled in the try-
  except block, and handling for it should be added.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1089386/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1119873] Re: nova-compute crashes if restarted with an instance in VERIFY_RESIZE state

2013-03-15 Thread Dan Smith
** Changed in: nova
   Status: Incomplete => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1119873

Title:
  nova-compute crashes if restarted with an instance in VERIFY_RESIZE
  state

Status in OpenStack Compute (Nova):
  Invalid

Bug description:
  steps to reproduce the issue:

  boot a vm

  nova migrate vm1 (or nova resize)

  wait for the vm to reach the VERIFY_RESIZE state

  stop nova-compute (kill -9 or similar)
   
  restart nova-compute

  The process will terminate after a few seconds with the following error:
  http://paste.openstack.org/show/30836/

  The only workaround I found consists of changing the VM status in the
  database; running "nova delete" before starting the service is not enough.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1119873/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1161538] Re: migrate fails with 'ProcessExecutionError'

2013-03-28 Thread Dan Smith
The log shows you're out of space on the disk that the migration is
trying to copy data to.

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1161538

Title:
  migrate fails with 'ProcessExecutionError'

Status in OpenStack Compute (Nova):
  Invalid

Bug description:
  I applied the latest build to my vs347 system in an attempt to verify
  a fix for Bug 1160489.

  I ended up hitting a different error:

  nova boot --image be8b6475-26d8-410f-aaa5-8b278d98c8f9 --flavor 1
  MIGRATE1

  [root@vs347 ~]# nova show MIGRATE1
  
+-+--+
  | Property| Value 
   |
  
+-+--+
  | status  | BUILD 
   |
  | updated | 2013-03-28T16:48:22Z  
   |
  | OS-EXT-STS:task_state   | networking
   |
  | OS-EXT-SRV-ATTR:host| vs342.rch.kstart.ibm.com  
   |
  | key_name| None  
   |
  | image   | Rhel6MasterFile 
(be8b6475-26d8-410f-aaa5-8b278d98c8f9)   |
  | hostId  | 
7ddf45b44e3e8078fa9401525a630083670fdf5a5792784c506a73f7 |
  | OS-EXT-STS:vm_state | building  
   |
  | OS-EXT-SRV-ATTR:instance_name   | bvt-instance-003d 
   |
  | OS-EXT-SRV-ATTR:hypervisor_hostname | olyblade02.rch.stglabs.ibm.com
   |
  | flavor  | m1.tiny (1)   
   |
  | id  | 29931d39-8d18-4a25-b733-59e894d94731  
   |
  | security_groups | [{u'name': u'default'}]   
   |
  | user_id | 3ccf55fd609b45319f24fe681338886d  
   |
  | name| MIGRATE1  
   |
  | created | 2013-03-28T16:48:20Z  
   |
  | tenant_id   | 67b1c37f4ca64283908c7077e9e59997  
   |
  | OS-DCF:diskConfig   | MANUAL
   |
  | metadata| {}
   |
  | accessIPv4  |   
   |
  | accessIPv6  |   
   |
  | progress| 0 
   |
  | OS-EXT-STS:power_state  | 0 
   |
  | OS-EXT-AZ:availability_zone | nova  
   |
  | config_drive|   
   |
  
+-+--+

  [root@vs347 ~]# nova list
  
+--+--++---+
  | ID   | Name | Status | Networks 
 |
  
+--+--++---+
  | 29931d39-8d18-4a25-b733-59e894d94731 | MIGRATE1 | ACTIVE | 
demonet=172.0.0.5 |
  
+--+--++---+

  [root@vs347 ~]# nova migrate MIGRATE1
  [root@vs347 ~]#

  * State as of 11:53 a.m.

  [root@vs347 ~]# nova list
  
+--+--++---+
  | ID   | Name | Status | Networks 
 |
  
+--+--++---+
  | 29931d39-8d18-4a25-b733-59e894d94731 | MIGRATE1 | RESIZE | 
demonet=172.0.0.5 |
  
+--+--++---+

  [root@vs347 ~]# nova list
  
+--+--++---+
  | ID   | Name | Status | Networks 
 |
  
+--+--++---+
  | 29931d39-8d18-4a25-b733-59e894d94731 | MIGRATE1 | ERROR  | 
demonet=172.0.

[Yahoo-eng-team] [Bug 1161496] Re: Boot from volume will attach the VM to all networks

2013-03-28 Thread Dan Smith
OP realized this is a dupe

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1161496

Title:
  Boot from volume will attach the VM to all networks

Status in OpenStack Compute (Nova):
  Invalid

Bug description:
  When launching a new instance with the option 'Boot from volume' the vm will 
be attached to all the networks available for the tenant.
  I'm launching the instance through Horizon and using Quantum for the network.

  I found a question related to this bug
  https://answers.launchpad.net/nova/+question/217379.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1161496/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1161709] Re: confirm-resize failed, after migration. "KeyError: 'old_instance_type_memory_mb'"

2013-04-01 Thread Dan Smith
Yes, that's the fix I'm talking about. I'm going to mark this bug as
invalid since it has already been fixed.

** Changed in: nova
   Status: Incomplete => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1161709

Title:
  confirm-resize failed, after migration. "KeyError:
  'old_instance_type_memory_mb'"

Status in OpenStack Compute (Nova):
  Invalid

Bug description:
  confirm-resize fails after a migration with "KeyError:
  'old_instance_type_memory_mb'", because after a migration (not a resize)
  no "old_*" information exists in sys_meta (old_* and new_* entries both
  exist only after resize operations).

  ---
  2013-03-28 01:24:50.716 ERROR nova.api.openstack.compute.servers 
[req-cb15c1c5-3045-479e-a921-3f05a94c27be e9d9c977a94c4204b59192689347c126 
e30341b47c714bf8b5f92b531cea9caf] Error in confirm-resize 
'old_instance_type_memory_mb'
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers 
Traceback (most recent call last):
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers   File 
"/usr/lib/python2.6/site-packages/nova/api/openstack/compute/servers.py", line 
1051, in _action_confirm_resize
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers 
self.compute_api.confirm_resize(context, instance)
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers   File 
"/usr/lib/python2.6/site-packages/nova/compute/api.py", line 174, in wrapped
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers 
return func(self, context, target, *args, **kwargs)
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers   File 
"/usr/lib/python2.6/site-packages/nova/compute/api.py", line 164, in inner
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers 
return function(self, context, instance, *args, **kwargs)
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers   File 
"/usr/lib/python2.6/site-packages/nova/compute/api.py", line 145, in inner
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers 
return f(self, context, instance, *args, **kw)
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers   File 
"/usr/lib/python2.6/site-packages/nova/compute/api.py", line 1868, in 
confirm_resize
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers 
deltas = self._downsize_quota_delta(context, instance)
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers   File 
"/usr/lib/python2.6/site-packages/nova/compute/api.py", line 1948, in 
_downsize_quota_delta
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers 
'old_')
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers   File 
"/usr/lib/python2.6/site-packages/nova/compute/instance_types.py", line 250, in 
extract_instance_type
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers 
instance_type[key] = type_fn(sys_meta[type_key])
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers 
KeyError: 'old_instance_type_memory_mb'
  2013-03-28 01:24:50.716 16798 TRACE nova.api.openstack.compute.servers

  

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1161709/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1165895] Re: image-create/snapshot image_state property/metadata always 'available'

2013-05-08 Thread Dan Smith
** Changed in: nova
   Importance: Undecided => Wishlist

** Changed in: nova
   Status: New => Opinion

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1165895

Title:
  image-create/snapshot image_state property/metadata always 'available'

Status in OpenStack Compute (Nova):
  Opinion

Bug description:
  
https://github.com/openstack/nova/blob/147eebe613d5d1756ce4f11066c62474eabb6076/nova/virt/libvirt/driver.py#L1113

  'image_state': 'available' property added to every libvirt snapshot.

  I do not see the reason behind this "constant" property.

  Similar property removed by:
  
https://github.com/openstack/nova/commit/c3b7cce8101548428b64abb23ab88482bc79c36e

  Example glance output:
   glance image-show cd0bd937-e2b3-4e3e-b22f-3bdb58c63755 
  
+---+--+
  | Property  | Value   
 |
  
+---+--+
  | Property 'base_image_ref' | 
d943f775-b228-4dde-b8e8-076a9fc60351 |
  | Property 'image_location' | snapshot
 |
  | Property 'image_state'| available   
 |
  | Property 'image_type' | snapshot
 |
  | Property 'instance_type_ephemeral_gb' | 0   
 |
  | Property 'instance_type_flavorid' | 3   
 |
  | Property 'instance_type_id'   | 1   
 |
  | Property 'instance_type_memory_mb'| 4096
 |
  | Property 'instance_type_name' | m1.medium   
 |
  | Property 'instance_type_root_gb'  | 40  
 |
  | Property 'instance_type_rxtx_factor'  | 1   
 |
  | Property 'instance_type_swap' | 0   
 |
  | Property 'instance_type_vcpu_weight'  | None
 |
  | Property 'instance_type_vcpus'| 2   
 |
  | Property 'instance_uuid'  | 
f2d9f28a-24a3-4068-8ee0-15f55122faef |
  | Property 'owner_id'   | b8bc1464db39459d9c3f814b908ae079
 |
  | Property 'user_id'| 695c21ac81c6499d851f3a560516f19c
 |
  | checksum  | fca8b0fb9346ea0c4ea167a7a7d9ce45
 |
  | container_format  | bare
 |
  | created_at| 2013-04-07T21:13:33 
 |
  | deleted   | False   
 |
  | disk_format   | qcow2   
 |
  | id| 
cd0bd937-e2b3-4e3e-b22f-3bdb58c63755 |
  | is_public | False   
 |
  | min_disk  | 0   
 |
  | min_ram   | 0   
 |
  | name  | snap_2Gb_urandom
 |
  | owner | b8bc1464db39459d9c3f814b908ae079
 |
  | protected | False   
 |
  | size  | 2991521792  
 |
  | status| active  
 |
  | updated_at| 2013-04-07T21:15:19 
 |
  
+---+--+

  
  nova image-show cd0bd937-e2b3-4e3e-b22f-3bdb58c63755
  +-+--+
  | Property| Value|
  +-+--+
  | metadata owner_id   | b8bc1464db39459d9c3f814b908ae079 |
  | minDisk | 0|
  | metadata instance_type_name | m1.medium|
  | metadata instance_type_swap | 0|
  | metadata instance_type_memory_mb| 4096 |
  | id  | cd0bd937-e2b3-4e3e-b22f-3bdb58c63755 |
  | metadata instance_type_rxtx_factor  | 1|
  | metadata image_state| available|
  | metadata image_location | snapshot |
  | minRam  | 0  

[Yahoo-eng-team] [Bug 1180618] Re: fault['message'] needs to be updated with exception message

2013-06-05 Thread Dan Smith
I don't think this bug is valid. Isn't the problem just that you're
failing to schedule both times and ending up with the same error
message?

** Changed in: nova
   Status: In Progress => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1180618

Title:
  fault['message'] needs to be updated with exception message

Status in OpenStack Compute (Nova):
  Invalid

Bug description:
  The current implementation in nova/compute/utils.py does not update
  fault['message'] with the message thrown from the exception class.

  Here are steps taken to produce the defect:

  1. Created a fake glance image:
  glance image-create --name=Fake_Image --is-public=true --container-format=ovf 
--min-ram=4000 --disk-format=raw < /mnt/download/test2.raw
  (test2.raw is only a txt file, not an image file).
  2. The image Fake_Image shown:
  ubuntu@osee0221:/mnt/download$ nova image-list
  
+--+-+++
  | ID   | Name| 
Status | Server |
  
+--+-+++
  | f474b249-e7fb-45de-adad-c5338fa53c53 | cirros-0.3.1-x86_64-uec | 
ACTIVE ||
  | 4664b408-ad40-4fba-9d71-20c217189090 | cirros-0.3.1-x86_64-uec-kernel  | 
ACTIVE ||
  | bec25ebd-6543-4599-a8bf-c97b7ad3a649 | cirros-0.3.1-x86_64-uec-ramdisk | 
ACTIVE ||
  | 97bcfabc-5fab-4dd0-9d55-233613c0fdea | Fake_Image| 
ACTIVE ||
  
+--+-+++
  ubuntu@osee0221:/mnt/download$

  3. Now boot that Fake_Image:
  ubuntu@osee0221:/mnt/download$ nova boot --flavor 3 --image Fake_Image 
97bcfabc-5fab-4dd0-9d55-233613c0fdea
  +-+--+
  | Property| Value|
  +-+--+
  | OS-EXT-STS:task_state   | scheduling   |
  | image   | Fake_Image  |
  | OS-EXT-STS:vm_state | building |
  | OS-EXT-SRV-ATTR:instance_name   | instance-0002|
  | flavor  | m1.medium|
  | id  | bcae969f-ece0-4c20-8738-354fb3a7cf68 |
  | security_groups | [{u'name': u'default'}]  |
  | user_id | 8a6aac216f3241bba8b6cfda8255 |
  | OS-DCF:diskConfig   | MANUAL   |
  | accessIPv4  |  |
  | accessIPv6  |  |
  | progress| 0|
  | OS-EXT-STS:power_state  | 0|
  | OS-EXT-AZ:availability_zone | nova |
  | config_drive|  |
  | status  | BUILD|
  | updated | 2013-05-15T22:38:28Z |
  | hostId  |  |
  | OS-EXT-SRV-ATTR:host| None |
  | key_name| None |
  | OS-EXT-SRV-ATTR:hypervisor_hostname | None |
  | name| 97bcfabc-5fab-4dd0-9d55-233613c0fdea |
  | adminPass   | cs4ULhBnb545 |
  | tenant_id   | 97ba217a35a14b5aa09fefe9c95610c0 |
  | created | 2013-05-15T22:38:28Z |
  | metadata| {}   |
  +-+--+

  4. see the servers: (in Error state)
  ubuntu@osee0221:/mnt/download$ nova list
  
+--+--++--+
  | ID   | Name 
| Status | Networks |
  
+--+--++--+
  | 16c7fc43-8cab-48e4-be63-03c9305807d8 | 4664b408-ad40-4fba-9d71-20c217189090 
| ACTIVE | private=10.0.0.2 |
  | bcae969f-ece0-4c20-8738-354fb3a7cf68 | 97bcfabc-5fab-4dd0-9d55-233613c0fdea 
| ERROR  | private=10.0.0.3 |
  
+

[Yahoo-eng-team] [Bug 1932337] [NEW] Cinder store migration will fail if first GET'er is not the owner

2021-06-17 Thread Dan Smith
Public bug reported:

During an upgrade to Xena, cinder-backed image locations are migrated to
include the store name in the URL field. This is lazily done on the
first GET of the image. The problem is that the first user to GET an
image after the migration may not be an admin or the owner of the image,
as would be the case for a public or shared image. If that happens, the
user gets a 404 for a valid image because the DB layer refuses the
modify operation. This is logged:

2021-06-17 08:50:06,559 WARNING [glance.db.sqlalchemy.api] Attempted to
modify image user did not own.

The lazy migration code needs to tolerate this and allow someone else to
perform the migration without breaking non-owner GET operations until
the migration is complete.
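
A rough sketch of the tolerant behaviour (hypothetical names; the real code
sits in Glance's cinder location migration path and must catch whatever the
DB layer raises for the ownership refusal):

    class Forbidden(Exception):
        """Stand-in for the DB layer refusing a non-owner modification."""


    def maybe_migrate_location(image, db_update, new_url):
        """Attempt the lazy URL migration, but never break a plain GET."""
        try:
            db_update(image['id'], {'locations': [new_url]})
            image['locations'] = [new_url]
        except Forbidden:
            # A non-admin, non-owner reader cannot perform the migration;
            # serve the image unmodified and let the owner (or an admin)
            # migrate it on a later GET.
            pass
        return image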

** Affects: glance
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1932337

Title:
  Cinder store migration will fail if first GET'er is not the owner

Status in Glance:
  New

Bug description:
  During an upgrade to Xena, cinder-backed image locations are migrated
  to include the store name in the URL field. This is lazily done on the
  first GET of the image. The problem is that the first user to GET an
  image after the migration may not be an admin or the owner of the
  image, as would be the case for a public or shared image. If that
  happens, the user gets a 404 for a valid image because the DB layer
  refuses the modify operation. This is logged:

  2021-06-17 08:50:06,559 WARNING [glance.db.sqlalchemy.api] Attempted
  to modify image user did not own.

  The lazy migration code needs to tolerate this and allow someone else
  to perform the migration without breaking non-owner GET operations
  until the migration is complete.

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1932337/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1933360] [NEW] Test (and enforcement?) for os_hidden mutability on queued images is wrong

2021-06-23 Thread Dan Smith
Public bug reported:

The test
glance.tests.unit.v2.test_images_resource.TestImagesController.test_update_queued_image_with_hidden
seems to be looking to confirm that queued images cannot be marked as
hidden. However, if that was the case, it should be checking for
BadRequest (or similar) and not Forbidden. Currently it appears that the
authorization "everything is immutable if not the owner" layer is what
is triggering the Forbidden response.

If we want to assert that os_hidden cannot be modified for queued
images, we need to do that (as it does not appear to actually be
enforced anywhere). In that case, the test needs to be modified to check
for the proper return code as well.
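
A sketch of what explicit enforcement could look like (hypothetical helper,
not current Glance code): reject the change on queued images with a
400-style error rather than relying on the ownership layer's 403.

    class BadRequest(Exception):
        """Stand-in for webob.exc.HTTPBadRequest."""


    def validate_os_hidden_change(image_status, new_os_hidden):
        # Queued images have no data yet; hiding them is (per this
        # proposal) a client error, not a permissions problem.
        if image_status == 'queued' and new_os_hidden:
            raise BadRequest('os_hidden cannot be set on a queued image')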

** Affects: glance
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1933360

Title:
  Test (and enforcement?) for os_hidden mutability on queued images is
  wrong

Status in Glance:
  New

Bug description:
  The test
  
glance.tests.unit.v2.test_images_resource.TestImagesController.test_update_queued_image_with_hidden
  seems to be looking to confirm that queued images cannot be marked as
  hidden. However, if that was the case, it should be checking for
  BadRequest (or similar) and not Forbidden. Currently it appears that
  the authorization "everything is immutable if not the owner" layer is
  what is triggering the Forbidden response.

  If we want to assert that os_hidden cannot be modified for queued
  images, we need to do that (as it does not appear to actually be
  enforced anywhere). In that case, the test needs to be modified to
  check for the proper return code as well.

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1933360/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1940460] [NEW] ORM fixes broke opportunistic testing on py36

2021-08-18 Thread Dan Smith
Public bug reported:

The patch 9e002a77f2131d3594a2a4029a147beaf37f5b17 which is aimed at
fixing things in advance of SQLAlchemy 2.0 seems to have broken our
opportunistic testing of DB migrations on py36 only. This manifests as a
total lockup of one worker during functional tests, which fails to
report anything and eventually times out the job.

** Affects: glance
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1940460

Title:
  ORM fixes broke opportunistic testing on py36

Status in Glance:
  New

Bug description:
  The patch 9e002a77f2131d3594a2a4029a147beaf37f5b17 which is aimed at
  fixing things in advance of SQLAlchemy 2.0 seems to have broken our
  opportunistic testing of DB migrations on py36 only. This manifests as
  a total lockup of one worker during functional tests, which fails to
  report anything and eventually times out the job.

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1940460/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1958883] [NEW] Service version check breaks FFU

2022-01-24 Thread Dan Smith
Public bug reported:

As reported on the mailing list:

http://lists.openstack.org/pipermail/openstack-discuss/2022-January/026603.html

The service version check at startup can prevent FFUs from being
possible without hacking the database. As implemented here:

https://review.opendev.org/c/openstack/nova/+/738482

We currently filter "forced down" computes from the check, but we should
probably also eliminate those down long enough due to missed heartbeats
(i.e. offline during the upgrade). However, a fast-moving FFU where
everything is switched from an old container to a new one would easily
still find computes that are considered "up" and effectively force a
wait.
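
As a rough illustration of the additional filtering suggested above
(field names assumed from the services table; this is not the actual
Nova implementation):

from oslo_utils import timeutils

def services_for_version_check(services, service_down_time):
    for svc in services:
        if svc.forced_down:
            continue  # already excluded today
        if svc.last_seen_up and timeutils.is_older_than(
                svc.last_seen_up, service_down_time):
            # Missed heartbeats long enough to be considered down, i.e.
            # likely offline during the upgrade: skip it as well.
            continue
        yield svc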

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1958883

Title:
  Service version check breaks FFU

Status in OpenStack Compute (nova):
  New

Bug description:
  As reported on the mailing list:

  http://lists.openstack.org/pipermail/openstack-discuss/2022-January/026603.html

  The service version check at startup can prevent FFUs from being
  possible without hacking the database. As implemented here:

  https://review.opendev.org/c/openstack/nova/+/738482

  We currently filter "forced down" computes from the check, but we
  should probably also eliminate those down long enough due to missed
  heartbeats (i.e. offline during the upgrade). However, a fast-moving
  FFU where everything is switched from an old container to a new one
  would easily still find computes that are considered "up" and
  effectively force a wait.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1958883/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1820125] [NEW] Libvirt driver ungracefully explodes if unsupported arch is found

2019-03-14 Thread Dan Smith
Public bug reported:

If a new libvirt exposes an arch name that nova does not support, we
fail to gracefully skip it during the instance capability gathering:

2019-03-14 19:11:31.709 6 ERROR nova.compute.manager 
[req-4e626631-fefc-4c58-a1cd-5207c9384a1b - - - - -] Error updating resources 
for node primary.: InvalidArchitectureName: Architecture name 'armv6l' is not 
recognised
2019-03-14 19:11:31.709 6 ERROR nova.compute.manager Traceback (most recent 
call last):
2019-03-14 19:11:31.709 6 ERROR nova.compute.manager   File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/compute/manager.py",
 line 7956, in _update_available_resource_for_node
2019-03-14 19:11:31.709 6 ERROR nova.compute.manager startup=startup)
2019-03-14 19:11:31.709 6 ERROR nova.compute.manager   File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/compute/resource_tracker.py",
 line 727, in update_available_resource
2019-03-14 19:11:31.709 6 ERROR nova.compute.manager resources = 
self.driver.get_available_resource(nodename)
2019-03-14 19:11:31.709 6 ERROR nova.compute.manager   File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/virt/libvirt/driver.py",
 line 7070, in get_available_resource
2019-03-14 19:11:31.709 6 ERROR nova.compute.manager 
data["supported_instances"] = self._get_instance_capabilities()
2019-03-14 19:11:31.709 6 ERROR nova.compute.manager   File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/virt/libvirt/driver.py",
 line 5943, in _get_instance_capabilities
2019-03-14 19:11:31.709 6 ERROR nova.compute.manager 
fields.Architecture.canonicalize(g.arch),
2019-03-14 19:11:31.709 6 ERROR nova.compute.manager   File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/objects/fields.py", 
line 200, in canonicalize
2019-03-14 19:11:31.709 6 ERROR nova.compute.manager raise 
exception.InvalidArchitectureName(arch=name)
2019-03-14 19:11:31.709 6 ERROR nova.compute.manager InvalidArchitectureName: 
Architecture name 'armv6l' is not recognised
2019-03-14 19:11:31.709 6 ERROR nova.compute.manager
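
A minimal sketch of the graceful handling this implies (structure
assumed; not the actual driver code):

from nova import exception
from nova.objects import fields

def canonical_arches(guest_caps):
    arches = []
    for g in guest_caps:
        try:
            arches.append(fields.Architecture.canonicalize(g.arch))
        except exception.InvalidArchitectureName:
            # Unknown to nova (e.g. 'armv6l'): skip this guest type
            # instead of aborting the whole resource update.
            continue
    return arches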

** Affects: nova
 Importance: Undecided
 Assignee: Dan Smith (danms)
 Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1820125

Title:
  Libvirt driver ungracefully explodes if unsupported arch is found

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  If a new libvirt exposes an arch name that nova does not support, we
  fail to gracefully skip it during the instance capability gathering:

  2019-03-14 19:11:31.709 6 ERROR nova.compute.manager 
[req-4e626631-fefc-4c58-a1cd-5207c9384a1b - - - - -] Error updating resources 
for node primary.: InvalidArchitectureName: Architecture name 'armv6l' is not 
recognised
  2019-03-14 19:11:31.709 6 ERROR nova.compute.manager Traceback (most recent 
call last):
  2019-03-14 19:11:31.709 6 ERROR nova.compute.manager   File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/compute/manager.py",
 line 7956, in _update_available_resource_for_node
  2019-03-14 19:11:31.709 6 ERROR nova.compute.manager startup=startup)
  2019-03-14 19:11:31.709 6 ERROR nova.compute.manager   File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/compute/resource_tracker.py",
 line 727, in update_available_resource
  2019-03-14 19:11:31.709 6 ERROR nova.compute.manager resources = 
self.driver.get_available_resource(nodename)
  2019-03-14 19:11:31.709 6 ERROR nova.compute.manager   File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/virt/libvirt/driver.py",
 line 7070, in get_available_resource
  2019-03-14 19:11:31.709 6 ERROR nova.compute.manager 
data["supported_instances"] = self._get_instance_capabilities()
  2019-03-14 19:11:31.709 6 ERROR nova.compute.manager   File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/virt/libvirt/driver.py",
 line 5943, in _get_instance_capabilities
  2019-03-14 19:11:31.709 6 ERROR nova.compute.manager 
fields.Architecture.canonicalize(g.arch),
  2019-03-14 19:11:31.709 6 ERROR nova.compute.manager   File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/nova/objects/fields.py", 
line 200, in canonicalize
  2019-03-14 19:11:31.709 6 ERROR nova.compute.manager raise 
exception.InvalidArchitectureName(arch=name)
  2019-03-14 19:11:31.709 6 ERROR nova.compute.manager InvalidArchitectureName: 
Architecture name 'armv6l' is not recognised
  2019-03-14 19:11:31.709 6 ERROR nova.compute.manager

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1820125/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1888713] [NEW] Async tasks, image import not supported in pure-WSGI mode

2020-07-23 Thread Dan Smith
Public bug reported:

The wsgi_app.py file in the tree allows operators to run Glance API as a
proper WSGI app. This has been the default devstack deployment for some
time and multiple real clouds in the wild deploy like this. However, an
attempt to start an import will be met with an image state of "queued"
forever and no tasks will ever start, run, or complete.

(note that this has been a known issue and the Glance team prescribes
running standalone eventlet-based glance-api for deployments that need
import to work).
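
One possible direction (illustrative only, using the futurist library;
not necessarily how Glance ultimately addressed this) is to run import
tasks on a native thread pool that does not depend on a standalone
eventlet server:

import futurist

# A small, process-local pool for async import work under WSGI.
_executor = futurist.ThreadPoolExecutor(max_workers=4)

def submit_import_task(task_fn, *args, **kwargs):
    return _executor.submit(task_fn, *args, **kwargs)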

** Affects: glance
 Importance: Undecided
 Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1888713

Title:
  Async tasks, image import not supported in pure-WSGI mode

Status in Glance:
  In Progress

Bug description:
  The wsgi_app.py file in the tree allows operators to run Glance API as
  a proper WSGI app. This has been the default devstack deployment for
  some time and multiple real clouds in the wild deploy like this.
  However, an attempt to start an import will be met with an image state
  of "queued" forever and no tasks will ever start, run, or complete.

  (note that this has been a known issue and the Glance team prescribes
  running standalone eventlet-based glance-api for deployments that need
  import to work).

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1888713/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1891190] [NEW] test_reload() functional test causes hang and jobs TIMED_OUT

2020-08-11 Thread Dan Smith
Public bug reported:

The glance.tests.functional.test_reload.TestReload.test_reload() test
has been causing spurious deadlocks in functional test jobs, resulting
in TIMED_OUT job statuses due to the global timeout expiring. This can
be reproduced locally with lots of exposure, but Zuul runs things enough
to hit it fairly often.

I have tracked this down to the test_reload() test: when I reproduce
this locally, I find it stuck in an infinite waitpid() on the API master
process that the FunctionalTest base class has started for it. The test
tracks child PIDs of the master as it initiates several SIGHUP
operations. Upon exit, the FunctionalTest.cleanup() routine runs and
ends up waitpid()ing on the master process forever. A process list shows
all the other stestr workers in Z state waiting for the final worker to
complete. The final worker, stuck on waitpid(), still has the master
process and both worker processes running. Upon killing the master,
stestr frees up, reports status from the test, and exits normally.

Stack trace of the hung test process after signaling the master it is
waiting for manually is:

Traceback (most recent call last):
  File "/usr/lib64/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
  File "/usr/lib64/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
  File 
"/home/dan/glance/.tox/functional/lib/python3.7/site-packages/stestr/subunit_runner/run.py",
 line 93, in 
main()
  File 
"/home/dan/glance/.tox/functional/lib/python3.7/site-packages/stestr/subunit_runner/run.py",
 line 89, in main
testRunner=partial(runner, stdout=sys.stdout))
  File 
"/home/dan/glance/.tox/functional/lib/python3.7/site-packages/stestr/subunit_runner/program.py",
 line 185, in __init__
self.runTests()
  File 
"/home/dan/glance/.tox/functional/lib/python3.7/site-packages/stestr/subunit_runner/program.py",
 line 226, in runTests
self.result = testRunner.run(self.test)
  File 
"/home/dan/glance/.tox/functional/lib/python3.7/site-packages/stestr/subunit_runner/run.py",
 line 52, in run
test(result)
  File "/usr/lib64/python3.7/unittest/suite.py", line 84, in __call__
return self.run(*args, **kwds)
  File "/usr/lib64/python3.7/unittest/suite.py", line 122, in run
test(result)
  File "/usr/lib64/python3.7/unittest/suite.py", line 84, in __call__
return self.run(*args, **kwds)
  File "/usr/lib64/python3.7/unittest/suite.py", line 122, in run
test(result)
  File "/usr/lib64/python3.7/unittest/suite.py", line 84, in __call__
return self.run(*args, **kwds)
  File "/usr/lib64/python3.7/unittest/suite.py", line 122, in run
test(result)
  File 
"/home/dan/glance/.tox/functional/lib/python3.7/site-packages/unittest2/case.py",
 line 673, in __call__
return self.run(*args, **kwds)
  File 
"/home/dan/glance/.tox/functional/lib/python3.7/site-packages/testtools/testcase.py",
 line 675, in run
return run_test.run(result)
  File 
"/home/dan/glance/.tox/functional/lib/python3.7/site-packages/testtools/runtest.py",
 line 80, in run
return self._run_one(actual_result)
  File 
"/home/dan/glance/.tox/functional/lib/python3.7/site-packages/testtools/runtest.py",
 line 94, in _run_one
return self._run_prepared_result(ExtendedToOriginalDecorator(result))
  File 
"/home/dan/glance/.tox/functional/lib/python3.7/site-packages/testtools/runtest.py",
 line 119, in _run_prepared_result
raise e
  File 
"/home/dan/glance/.tox/functional/lib/python3.7/site-packages/testtools/runtest.py",
 line 191, in _run_user
return fn(*args, **kwargs)
  File "/home/dan/glance/glance/tests/functional/__init__.py", line 881, in 
cleanup
s.stop()
  File "/home/dan/glance/glance/tests/functional/__init__.py", line 293, in stop
rc = test_utils.wait_for_fork(self.process_pid, raise_error=False)
  File "/home/dan/glance/glance/tests/utils.py", line 294, in wait_for_fork
(pid, rc) = os.waitpid(pid, 0)
  File 
"/home/dan/glance/.tox/functional/lib/python3.7/site-packages/eventlet/green/os.py",
 line 96, in waitpid
greenthread.sleep(0.01)
  File 
"/home/dan/glance/.tox/functional/lib/python3.7/site-packages/eventlet/greenthread.py",
 line 36, in sleep
hub.switch()
  File 
"/home/dan/glance/.tox/functional/lib/python3.7/site-packages/eventlet/hubs/hub.py",
 line 298, in switch
return self.greenlet.switch()
  File 
"/home/dan/glance/.tox/functional/lib/python3.7/site-packages/eventlet/hubs/hub.py",
 line 350, in run
self.wait(sleep_time)
  File 
"/home/dan/glance/.tox/functional/lib/python3.7/site-packages/eventlet/hubs/poll.py",
 line 80, in wait
presult = self.do_poll(seconds)
  File 
"/home/dan/glance/.tox/functional/lib/python3.7/site-packages/eventlet/hubs/epolls.py",
 line 31, in do_poll
return self.poll.poll(seconds)
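
One way to make the cleanup robust would be to bound the wait and reap
the master forcibly if it never exits; a sketch of such a replacement
for wait_for_fork() (hypothetical, not the actual fix):

import os
import signal
import time

def wait_for_fork(pid, timeout=30):
    deadline = time.time() + timeout
    while time.time() < deadline:
        waited, status = os.waitpid(pid, os.WNOHANG)
        if waited == pid:
            return os.WEXITSTATUS(status)
        time.sleep(0.1)
    # The master never exited; kill it so the test worker cannot hang.
    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)
    raise RuntimeError('pid %d did not exit within %ds' % (pid, timeout))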

** Affects: glance
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
h

[Yahoo-eng-team] [Bug 1891352] [NEW] Failed import of one store will remain in progress forever if all_stores_must_succeed=True

2020-08-12 Thread Dan Smith
Public bug reported:

If import is called with all_stores_must_succeed=True and a store fails
during set_image_data(), the store will remain in
os_glance_importing_stores forever, never going into the
os_glance_failed_import list. This means a polling client will never
notice that the import failed. Further, if multiple stores are included
in the import, and the failure happens in the later stores, the revert
process will remove the earlier stores (after they had already been
reported as available in stores). This means a polling client doing an
import on an image already in store1 to store2,store3,store4 will see
the following progression:

stores=store1;os_glance_importing_to_stores=store2,store3,store4

stores=store1,store2;os_glance_importing_to_stores=store3,store4

stores=store1,store2,store3;os_glance_importing_to_stores=store4

stores=store1,store2;os_glance_importing_to_stores=store4

stores=store1;os_glance_importing_to_stores=store4

The client will see the last line forever, and will never see anything
in os_glance_failed_import.
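
A rough sketch of the bookkeeping the revert path appears to be missing
(props is assumed to be the image's extra-properties mapping; the helper
name is hypothetical): mark the store as failed without undoing stores
that already imported successfully.

def record_failed_store(props, failed_store):
    importing = [s for s in props.get(
        'os_glance_importing_to_stores', '').split(',') if s]
    failed = [s for s in props.get(
        'os_glance_failed_import', '').split(',') if s]

    # Move the failed store from "importing" to "failed" so a polling
    # client can observe the failure; leave the stores list alone.
    if failed_store in importing:
        importing.remove(failed_store)
    if failed_store not in failed:
        failed.append(failed_store)

    props['os_glance_importing_to_stores'] = ','.join(importing)
    props['os_glance_failed_import'] = ','.join(failed)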

** Affects: glance
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1891352

Title:
  Failed import of one store will remain in progress forever if
  all_stores_must_succeed=True

Status in Glance:
  New

Bug description:
  If import is called with all_stores_must_succeed=True and a store
  fails during set_image_data(), the store will remain in
  os_glance_importing_stores forever, never going into the
  os_glance_failed_import list. This means a polling client will never
  notice that the import failed. Further, if multiple stores are
  included in the import, and the failure happens in the later stores,
  the revert process will remove the earlier stores (after they had
  already been reported as available in stores). This means a polling
  client doing an import on an image already in store1 to
  store2,store3,store4 will see the following progression:

  stores=store1;os_glance_importing_to_stores=store2,store3,store4

  stores=store1,store2;os_glance_importing_to_stores=store3,store4

  stores=store1,store2,store3;os_glance_importing_to_stores=store4

  stores=store1,store2;os_glance_importing_to_stores=store4

  stores=store1;os_glance_importing_to_stores=store4

  The client will see the last line forever, and will never see anything
  in os_glance_failed_import.

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1891352/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1897907] [NEW] DELETE fails on StaleDataError when updating image_properties

2020-09-30 Thread Dan Smith
Public bug reported:

During the MultiStoresImportTest module in tempest, when we go to clean
up images during tearDown, we occasionally get a 500 from the delete,
which yields this from the test:

ft1.1: tearDownClass 
(tempest.api.image.v2.test_images.MultiStoresImportImagesTest)testtools.testresult.real._StringException:
 Traceback (most recent call last):
  File "/opt/stack/tempest/tempest/test.py", line 242, in tearDownClass
six.reraise(etype, value, trace)
  File "/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/six.py", 
line 703, in reraise
raise value
  File "/opt/stack/tempest/tempest/test.py", line 214, in tearDownClass
teardown()
  File "/opt/stack/tempest/tempest/test.py", line 585, in resource_cleanup
raise testtools.MultipleExceptions(*cleanup_errors)
testtools.runtest.MultipleExceptions: ((, Got server fault
Details: The server has either erred or is incapable of performing the 
requested operation.


, ), (, Request timed out
Details: (MultiStoresImportImagesTest:tearDownClass) Failed to delete image 
9c4bba30-c244-4712-9995-86446a38eed8 within the required time (300 s)., 
))


The corresponding g-api.log message shows that we're failing to delete 
something from image_properties, I'm guessing because something has changed the 
image underneath us between fetch and delete.


Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR 
glance.common.wsgi [None req-4d353638-2da8-4a8b-8c6b-fb879b27c90b 
tempest-MultiStoresImportImagesTest-208757482 
tempest-MultiStoresImportImagesTest-208757482] Caught error: UPDATE statement 
on table 'image_properties' expected to update 1 row(s); 0 were matched.: 
sqlalchemy.orm.exc.StaleDataError: UPDATE statement on table 'image_properties' 
expected to update 1 row(s); 0 were matched.
Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR 
glance.common.wsgi Traceback (most recent call last):
Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR 
glance.common.wsgi   File "/opt/stack/glance/glance/common/wsgi.py", line 1347, 
in __call__
Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR 
glance.common.wsgi action_result = self.dispatch(self.controller, action,
Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR 
glance.common.wsgi   File "/opt/stack/glance/glance/common/wsgi.py", line 1391, 
in dispatch
Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR 
glance.common.wsgi return method(*args, **kwargs)
Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR 
glance.common.wsgi   File "/opt/stack/glance/glance/common/utils.py", line 416, 
in wrapped
Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR 
glance.common.wsgi return func(self, req, *args, **kwargs)
Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR 
glance.common.wsgi   File "/opt/stack/glance/glance/api/v2/images.py", line 
664, in delete
Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR 
glance.common.wsgi image_repo.remove(image)
Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR 
glance.common.wsgi   File "/opt/stack/glance/glance/domain/proxy.py", line 104, 
in remove
Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR 
glance.common.wsgi result = self.base.remove(base_item)
Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR 
glance.common.wsgi   File "/opt/stack/glance/glance/notifier.py", line 542, in 
remove
Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR 
glance.common.wsgi super(ImageRepoProxy, self).remove(image)
Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR 
glance.common.wsgi   File "/opt/stack/glance/glance/domain/proxy.py", line 104, 
in remove
Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR 
glance.common.wsgi result = self.base.remove(base_item)
Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR 
glance.common.wsgi   File "/opt/stack/glance/glance/domain/proxy.py", line 104, 
in remove
Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR 
glance.common.wsgi result = self.base.remove(base_item)
Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR 
glance.common.wsgi   File "/opt/stack/glance/glance/domain/proxy.py", line 104, 
in remove
Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR 
glance.common.wsgi result = self.base.remove(base_item)
Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR 
glance.common.wsgi   [Previous line repeated 1 more time]
Sep 30 09:52:44.240675 ubuntu-focal-rax-iad-0020118352 glance-api[94242]: ERROR 
gl

[Yahoo-eng-team] [Bug 1912001] [NEW] glance allows reserved properties during create()

2021-01-15 Thread Dan Smith
Public bug reported:

Certain image properties are reserved for internal glance usage, such as
os_glance_import_task. Changing these properties is disallowed during
PATCH. However, glance does not enforce that they are not present in an
image POST. It should.

This command:

openstack --debug image create --container-format bare --disk-format qcow2 \
  --property os_glance_import_task=foobar test

succeeds in creating an image with os_glance_import_task set.
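
A minimal sketch of the missing check (hypothetical helper, not the
actual Glance code path), applying the same restriction on POST that
PATCH already enforces:

import webob.exc

RESERVED_PREFIX = 'os_glance_'

def reject_reserved_properties(extra_properties):
    reserved = [k for k in extra_properties
                if k.startswith(RESERVED_PREFIX)]
    if reserved:
        raise webob.exc.HTTPForbidden(
            explanation='Attribute(s) %s are reserved.'
                        % ', '.join(sorted(reserved)))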

** Affects: glance
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1912001

Title:
  glance allows reserved properties during create()

Status in Glance:
  New

Bug description:
  Certain image properties are reserved for internal glance usage, such
  as os_glance_import_task. Changing these properties is disallowed
  during PATCH. However, glance does not enforce that they are not
  present in an image POST. It should.

  This command:

  openstack --debug image create --container-format bare --disk-format qcow2 \
--property os_glance_import_task=foobar test

  succeeds in creating an image with os_glance_import_task set.

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1912001/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1913625] [NEW] Glance will leak staging data

2021-01-28 Thread Dan Smith
Public bug reported:

In various situations, glance will leak (potentially very large)
temporary files in the staging store.

One example is doing a web-download import, where glance initially
downloads the image to its staging store. If the worker doing that
activity crashes, loses power, etc, the user may delete the image and
try again on another worker. When the crashed worker resumes, the
staging data will remain but nothing will ever clean it up.

Another example would be a misconfigured glance that uses local staging
directories, but glance-direct is used, where the user stages data, and
then deletes the image from another worker.

Even in a situation where shared staging is properly configured, a
failure to access the staging location during the delete call will
result in the image being deleted, but the staging file not being
purged.

IMHO, glance workers should clean their staging directories at startup,
purging any data that is attributable to a previous image having been
deleted.

Another option is to add a store location for each staged image, and
make sure the scrubber can clean those things from the staging directory
periodically (this requires also running the scrubber on each node,
which may not be common practice currently).
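
For the first of those options (cleaning at startup), a rough
illustration; it assumes staged files are named by image id and that
image_exists() is a hypothetical lookup against the glance database:

import os

def clean_orphaned_staging(staging_dir, image_exists):
    for name in os.listdir(staging_dir):
        image_id = name.split('.')[0]
        if not image_exists(image_id):
            # The image was deleted while this worker was down; its
            # staged data will never be used again.
            os.unlink(os.path.join(staging_dir, name))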

** Affects: glance
 Importance: Undecided
 Status: Invalid

** Changed in: glance
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1913625

Title:
  Glance will leak staging data

Status in Glance:
  Invalid

Bug description:
  In various situations, glance will leak (potentially very large)
  temporary files in the staging store.

  One example is doing a web-download import, where glance initially
  downloads the image to its staging store. If the worker doing that
  activity crashes, loses power, etc, the user may delete the image and
  try again on another worker. When the crashed worker resumes, the
  staging data will remain but nothing will ever clean it up.

  Another example would be a misconfigured glance that uses local
  staging directories, but glance-direct is used, where the user stages
  data, and then deletes the image from another worker.

  Even in a situation where shared staging is properly configured, a
  failure to access the staging location during the delete call will
  result in the image being deleted, but the staging file not being
  purged.

  IMHO, glance workers should clean their staging directories at
  startup, purging any data that is attributable to a previous image
  having been deleted.

  Another option is to add a store location for each staged image, and
  make sure the scrubber can clean those things from the staging
  directory periodically (this requires also running the scrubber on
  each node, which may not be common practice currently).

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1913625/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1914664] [NEW] QEMU monitor read failure in ServerStableDeviceRescueTest

2021-02-04 Thread Dan Smith
Public bug reported:

Seeing this failure in the gate:

https://zuul.opendev.org/t/openstack/build/7c71502b04fe47039b87f76fbe04fe56/log/controller/logs/screen-n-cpu.txt#33096


Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 
nova-compute[90163]: ERROR nova.virt.libvirt.driver 
[req-77f51485-cbc2-4f2a-8a6d-4a8ed910e585 
req-a221d4f9-401e-420a-911e-8d32536a1d23 service nova] [instance: 
7174e97c-8cf4-46c7-9498-2c5dbc452431] detaching network adapter failed.: 
libvirt.libvirtError: internal error: End of file from qemu monitor

Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova-
compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c-
8cf4-46c7-9498-2c5dbc452431] Traceback (most recent call last):

Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova-
compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c-
8cf4-46c7-9498-2c5dbc452431]   File
"/opt/stack/nova/nova/virt/libvirt/driver.py", line 2210, in
detach_interface

Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova-
compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c-
8cf4-46c7-9498-2c5dbc452431] wait_for_detach =
guest.detach_device_with_retry(

Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova-
compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c-
8cf4-46c7-9498-2c5dbc452431]   File
"/opt/stack/nova/nova/virt/libvirt/guest.py", line 423, in
detach_device_with_retry

Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova-
compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c-
8cf4-46c7-9498-2c5dbc452431] _try_detach_device(conf, persistent,
live)

Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova-
compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c-
8cf4-46c7-9498-2c5dbc452431]   File
"/opt/stack/nova/nova/virt/libvirt/guest.py", line 412, in
_try_detach_device

Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova-
compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c-
8cf4-46c7-9498-2c5dbc452431] ctx.reraise = True

Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova-
compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c-
8cf4-46c7-9498-2c5dbc452431]   File "/usr/local/lib/python3.8/dist-
packages/oslo_utils/excutils.py", line 220, in __exit__

Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova-
compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c-
8cf4-46c7-9498-2c5dbc452431] self.force_reraise()

Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova-
compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c-
8cf4-46c7-9498-2c5dbc452431]   File "/usr/local/lib/python3.8/dist-
packages/oslo_utils/excutils.py", line 196, in force_reraise

Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova-
compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c-
8cf4-46c7-9498-2c5dbc452431] six.reraise(self.type_, self.value,
self.tb)

Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova-
compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c-
8cf4-46c7-9498-2c5dbc452431]   File "/usr/local/lib/python3.8/dist-
packages/six.py", line 703, in reraise

Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova-
compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c-
8cf4-46c7-9498-2c5dbc452431] raise value

Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova-
compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c-
8cf4-46c7-9498-2c5dbc452431]   File
"/opt/stack/nova/nova/virt/libvirt/guest.py", line 398, in
_try_detach_device

Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova-
compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c-
8cf4-46c7-9498-2c5dbc452431] self.detach_device(conf,
persistent=persistent, live=live)

Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova-
compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c-
8cf4-46c7-9498-2c5dbc452431]   File
"/opt/stack/nova/nova/virt/libvirt/guest.py", line 473, in detach_device

Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova-
compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c-
8cf4-46c7-9498-2c5dbc452431]
self._domain.detachDeviceFlags(device_xml, flags=flags)

Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova-
compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c-
8cf4-46c7-9498-2c5dbc452431]   File "/usr/local/lib/python3.8/dist-
packages/eventlet/tpool.py", line 190, in doit

Feb 04 20:54:32.857198 ubuntu-focal-limestone-regionone-0022873642 nova-
compute[90163]: ERROR nova.virt.libvirt.driver [instance: 7174e97c-
8cf4-46c7-9498-2c5dbc452431] result = proxy_call(self._autowrap, f,
*arg

[Yahoo-eng-team] [Bug 1914665] [NEW] Cinder Multistore job hits Cinder Quota error

2021-02-04 Thread Dan Smith
Public bug reported:

Noticed during a cinder multistore test run, we hit a quota not found
error. It looks like we don't handle this well, which causes nova to see
a 503: Proxy Error. I dunno if there's anything better we can do than raise
a 5xx, but we should probably explain in the error what happened when we
know, as we clearly do here.

From this:

https://cbff25b854b00bc0ff99-8ce5690b0835baabd00baac02d43f418.ssl.cf5.rackcdn.com/770629/5/check/glance-multistore-cinder-import/7c71502/controller/logs/screen-g-api.txt

this log text (see the end):

Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 
devstack@g-api.service[93292]: ERROR glance.common.wsgi Traceback (most recent 
call last):
Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 
devstack@g-api.service[93292]: ERROR glance.common.wsgi   File 
"/opt/stack/glance/glance/common/wsgi.py", line 1347, in __call__
Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 
devstack@g-api.service[93292]: ERROR glance.common.wsgi action_result = 
self.dispatch(self.controller, action,
Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 
devstack@g-api.service[93292]: ERROR glance.common.wsgi   File 
"/opt/stack/glance/glance/common/wsgi.py", line 1391, in dispatch
Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 
devstack@g-api.service[93292]: ERROR glance.common.wsgi return 
method(*args, **kwargs)
Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 
devstack@g-api.service[93292]: ERROR glance.common.wsgi   File 
"/opt/stack/glance/glance/common/utils.py", line 416, in wrapped
Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 
devstack@g-api.service[93292]: ERROR glance.common.wsgi return func(self, 
req, *args, **kwargs)
Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 
devstack@g-api.service[93292]: ERROR glance.common.wsgi   File 
"/opt/stack/glance/glance/api/v2/image_data.py", line 299, in upload
Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 
devstack@g-api.service[93292]: ERROR glance.common.wsgi 
self._restore(image_repo, image)
Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 
devstack@g-api.service[93292]: ERROR glance.common.wsgi   File 
"/usr/local/lib/python3.8/dist-packages/oslo_utils/excutils.py", line 220, in 
__exit__
Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 
devstack@g-api.service[93292]: ERROR glance.common.wsgi self.force_reraise()
Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 
devstack@g-api.service[93292]: ERROR glance.common.wsgi   File 
"/usr/local/lib/python3.8/dist-packages/oslo_utils/excutils.py", line 196, in 
force_reraise
Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 
devstack@g-api.service[93292]: ERROR glance.common.wsgi 
six.reraise(self.type_, self.value, self.tb)
Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 
devstack@g-api.service[93292]: ERROR glance.common.wsgi   File 
"/usr/local/lib/python3.8/dist-packages/six.py", line 703, in reraise
Feb 04 21:07:13.368998 ubuntu-focal-limestone-regionone-0022873642 
devstack@g-api.service[93292]: ERROR glance.common.wsgi raise value
Feb 04 21:07:13.370209 ubuntu-focal-limestone-regionone-0022873642 
devstack@g-api.service[93292]: ERROR glance.common.wsgi   File 
"/opt/stack/glance/glance/api/v2/image_data.py", line 164, in upload
Feb 04 21:07:13.370209 ubuntu-focal-limestone-regionone-0022873642 
devstack@g-api.service[93292]: ERROR glance.common.wsgi 
image.set_data(data, size, backend=backend)
Feb 04 21:07:13.370209 ubuntu-focal-limestone-regionone-0022873642 
devstack@g-api.service[93292]: ERROR glance.common.wsgi   File 
"/opt/stack/glance/glance/domain/proxy.py", line 208, in set_data
Feb 04 21:07:13.370209 ubuntu-focal-limestone-regionone-0022873642 
devstack@g-api.service[93292]: ERROR glance.common.wsgi 
self.base.set_data(data, size, backend=backend, set_active=set_active)
Feb 04 21:07:13.370209 ubuntu-focal-limestone-regionone-0022873642 
devstack@g-api.service[93292]: ERROR glance.common.wsgi   File 
"/opt/stack/glance/glance/notifier.py", line 501, in set_data
Feb 04 21:07:13.370209 ubuntu-focal-limestone-regionone-0022873642 
devstack@g-api.service[93292]: ERROR glance.common.wsgi 
_send_notification(notify_error, 'image.upload', msg)
Feb 04 21:07:13.370209 ubuntu-focal-limestone-regionone-0022873642 
devstack@g-api.service[93292]: ERROR glance.common.wsgi   File 
"/usr/local/lib/python3.8/dist-packages/oslo_utils/excutils.py", line 220, in 
__exit__
Feb 04 21:07:13.370209 ubuntu-focal-limestone-regionone-0022873642 
devstack@g-api.service[93292]: ERROR glance.common.wsgi self.force_reraise()
Feb 04 21:07:13.370209 ubuntu-focal-limestone-regionone-0022873642 
devstack@g-api.service[93292]: ERROR glance.common.wsgi   File 
"/usr/local/lib/python3.8/dist-packages/oslo_utils/ex

[Yahoo-eng-team] [Bug 1914826] [NEW] web-download with invalid url does not report error

2021-02-05 Thread Dan Smith
Public bug reported:

In my testing, if I provide a URL to web-download that yields an error
from urlopen(), I never see the store listed in the
os_glance_failed_import list, and the store remains in
os_glance_importing_to_stores. The image status does not change, which
means there's no way for the API client to know that the import failed.

I found this when debugging a gate issue where occasionally the tempest
web-download test fails. It ends up waiting many minutes for the import
to complete, even though it failed long before that. In that case, the
cirros link we use for testing web-download raised a timeout.

From my log, here is what we log as returning to the user just before we
start the import:

Feb 05 20:18:02 guaranine devstack@g-api.service[1008592]: DEBUG
oslo_policy.policy [-] enforce: rule="modify_image" creds={"domain_id":
null, "is_admin_project": true, "project_domain_id": "default",
"project_id": "59a5997403484e97803cac28b7aa7366", "roles": ["reader",
"member"], "service_project_domain_id": null, "service_project_id":
null, "service_roles": [], "service_user_domain_id": null,
"service_user_id": null, "system_scope": null, "user_domain_id":
"default", "user_id": "10e5d60c60e54ab3889bcd57e367fe01"}
target={"checksum": null, "container_format": "bare", "created_at":
"2021-02-05T20:18:03.00", "disk_format": "raw", "extra_properties":
{}, "image_id": "70917fce-bfc6-4d57-aa54-58235d09cf24", "locations": [],
"min_disk": 0, "min_ram": 0, "name": "test", "os_glance_failed_import":
"", "os_glance_import_task": "e2cb5441-8c92-45c6-9363-f4b7915401e1",
"os_glance_importing_to_stores": "cheap", "os_hash_algo": null,
"os_hash_value": null, "os_hidden": false, "owner":
"59a5997403484e97803cac28b7aa7366", "protected": false, "size": null,
"status": "importing", "tags": [], "updated_at":
"2021-02-05T20:18:03.00", "virtual_size": null, "visibility":
"shared"} {{(pid=1008592) enforce /usr/local/lib/python3.8/dist-
packages/oslo_policy/policy.py:994}}

Note that os_glance_importing_to_stores="cheap" and
os_glance_failed_import="". Shortly after this, the web-download task
fails:

Feb 05 20:18:03 guaranine devstack@g-api.service[1008592]: ERROR
glance.async_.flows._internal_plugins.web_download [-] Task
e2cb5441-8c92-45c6-9363-f4b7915401e1 failed with exception : urllib.error.URLError:


Here's where the task is fully reverted:

Feb 05 20:18:03 guaranine devstack@g-api.service[1008592]: WARNING 
glance.async_.taskflow_executor [-] Task 'api_image_import-WebDownlo
ad-e2cb5441-8c92-45c6-9363-f4b7915401e1' (bc722b5c-ddd4-404b-9c09-8625ed9c5941) 
transitioned into state 'REVERTED' from state 'REVERTIN
G' with result 'None'

And after that, here's what we're still returning to the user:

Feb 05 20:18:03 guaranine devstack@g-api.service[1008592]: DEBUG 
oslo_policy.policy [-] enforce: rule="get_image" creds={"domain_id": n
ull, "is_admin_project": true, "project_domain_id": "default", "project_id": 
"59a5997403484e97803cac28b7aa7366", "roles": ["reader", "m
ember"], "service_project_domain_id": null, "service_project_id": null, 
"service_roles": [], "service_user_domain_id": null, "service_u
ser_id": null, "system_scope": null, "user_domain_id": "default", "user_id": 
"10e5d60c60e54ab3889bcd57e367fe01"} target={"checksum": nu
ll, "container_format": "bare", "created_at": "2021-02-05T20:18:03.00", 
"disk_format": "raw", "extra_properties": {}, "image_id": "
70917fce-bfc6-4d57-aa54-58235d09cf24", "locations": [], "min_disk": 0, 
"min_ram": 0, "name": "test", "os_glance_failed_import": "", "os
_glance_import_task": "e2cb5441-8c92-45c6-9363-f4b7915401e1", 
"os_glance_importing_to_stores": "cheap", "os_hash_algo": null, "os_hash_
value": null, "os_hidden": false, "owner": "59a5997403484e97803cac28b7aa7366", 
"protected": false, "size": null, "status": "queued", "t
ags": [], "updated_at": "2021-02-05T20:18:03.00", "virtual_size": null, 
"visibility": "shared"} {{(pid=1008592) enforce /usr/local/
lib/python3.8/dist-packages/oslo_policy/policy.py:994}}

Note that os_glance_importing_to_stores="cheap" and
os_glance_failed_import="". In this case, "cheap" should have moved from
"importing" to "failed".

I wrote a tempest negative test for this situation using a totally bogus
URL, which is here:

https://review.opendev.org/c/openstack/tempest/+/774303

** Affects: glance
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1914826

Title:
  web-download with invalid url does not report error

Status in Glance:
  New

Bug description:
  In my testing, if I provide a URL to web-download that yields an error
  from urlopen(), I never see the store listed in the
  os_glance_failed_import list, and the store remains in
  os_glance_importing_to_stores. The image status does not change, which
  means there's no way for the API client to know that the import
  failed.

  

[Yahoo-eng-team] [Bug 1915543] [NEW] Glance returns 403 instead of 404 when images are not found

2021-02-12 Thread Dan Smith
Public bug reported:

Glance is translating "Not Found" errors from the DB layer into "Not
Authorized" errors in policy, which it should not be doing. In general,
we should always return 404 when something either does not exist, or
when permissions do not allow you to know if that thing exists.

Glance is actually translating both cases into "not authorized", which
is confusing and runs counter to the goal.
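
A minimal sketch of the desired behavior (the helper is hypothetical;
the exception types are the usual glance.common ones):

import webob.exc
from glance.common import exception

def get_image_or_404(image_repo, image_id):
    try:
        return image_repo.get(image_id)
    except (exception.NotFound, exception.Forbidden):
        # Whether the image does not exist, or the caller is not allowed
        # to know that it exists, the answer is the same: 404.
        raise webob.exc.HTTPNotFound(explanation='Image not found')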

** Affects: glance
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1915543

Title:
  Glance returns 403 instead of 404 when images are not found

Status in Glance:
  New

Bug description:
  Glance is translating "Not Found" errors from the DB layer into "Not
  Authorized" errors in policy, which it should not be doing. In
  general, we should always return 404 when something either does not
  exist, or when permissions do not allow you to know if that thing
  exists.

  Glance is actually translating both cases into "not authorized", which
  is confusing and runs counter to the goal.

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1915543/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1913625] Re: Glance will leak staging data

2021-02-22 Thread Dan Smith
** Changed in: glance
   Status: Invalid => Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1913625

Title:
  Glance will leak staging data

Status in Glance:
  Confirmed

Bug description:
  In various situations, glance will leak (potentially very large)
  temporary files in the staging store.

  One example is doing a web-download import, where glance initially
  downloads the image to its staging store. If the worker doing that
  activity crashes, loses power, etc, the user may delete the image and
  try again on another worker. When the crashed worker resumes, the
  staging data will remain but nothing will ever clean it up.

  Another example would be a misconfigured glance that uses local
  staging directories, but glance-direct is used, where the user stages
  data, and then deletes the image from another worker.

  Even in a situation where shared staging is properly configured, a
  failure to access the staging location during the delete call will
  result in the image being deleted, but the staging file not being
  purged.

  IMHO, glance workers should clean their staging directories at
  startup, purging any data that is attributable to a previous image
  having been deleted.

  Another option is to add a store location for each staged image, and
  make sure the scrubber can clean those things from the staging
  directory periodically (this requires also running the scrubber on
  each node, which may not be common practice currently).

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1913625/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1921399] [NEW] check_instance_shared_storage RPC call is broken

2021-03-25 Thread Dan Smith
Public bug reported:

We broke check_instance_shared_storage() in this change:

https://review.opendev.org/c/openstack/nova/+/761452/13..15/nova/compute/rpcapi.py

Where we re-ordered the rpcapi client signature without adjusting the
caller. This leads to this failure:

Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova-
compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c-
90a8-cd7f654b28ef] Traceback (most recent call last):

Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova-
compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c-
90a8-cd7f654b28ef]   File "/opt/stack/new/nova/nova/compute/manager.py",
line 797, in _is_instance_storage_shared

Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova-
compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c-
90a8-cd7f654b28ef] instance, data, host=host))

Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova-
compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c-
90a8-cd7f654b28ef]   File "/opt/stack/new/nova/nova/compute/rpcapi.py",
line 618, in check_instance_shared_storage

Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova-
compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c-
90a8-cd7f654b28ef] return cctxt.call(ctxt,
'check_instance_shared_storage', **msg_args)

Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova-
compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c-
90a8-cd7f654b28ef]   File "/usr/local/lib/python3.6/dist-
packages/oslo_messaging/rpc/client.py", line 179, in call

Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova-
compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c-
90a8-cd7f654b28ef] transport_options=self.transport_options)

Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova-
compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c-
90a8-cd7f654b28ef]   File "/usr/local/lib/python3.6/dist-
packages/oslo_messaging/transport.py", line 128, in _send

Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova-
compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c-
90a8-cd7f654b28ef] transport_options=transport_options)

Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova-
compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c-
90a8-cd7f654b28ef]   File "/usr/local/lib/python3.6/dist-
packages/oslo_messaging/_drivers/amqpdriver.py", line 682, in send

Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova-
compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c-
90a8-cd7f654b28ef] transport_options=transport_options)

Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova-
compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c-
90a8-cd7f654b28ef]   File "/usr/local/lib/python3.6/dist-
packages/oslo_messaging/_drivers/amqpdriver.py", line 672, in _send

Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova-
compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c-
90a8-cd7f654b28ef] raise result

Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova-
compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c-
90a8-cd7f654b28ef] AttributeError: 'Instance' object has no attribute
'filename'

Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova-
compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c-
90a8-cd7f654b28ef] Traceback (most recent call last):

Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova-
compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c-
90a8-cd7f654b28ef]

Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova-
compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c-
90a8-cd7f654b28ef]   File "/usr/local/lib/python3.6/dist-
packages/oslo_messaging/rpc/server.py", line 165, in _process_incoming

Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova-
compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c-
90a8-cd7f654b28ef] res = self.dispatcher.dispatch(message)

Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova-
compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c-
90a8-cd7f654b28ef]

Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova-
compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c-
90a8-cd7f654b28ef]   File "/usr/local/lib/python3.6/dist-
packages/oslo_messaging/rpc/dispatcher.py", line 309, in dispatch

Mar 25 13:46:28.041587 ubuntu-bionic-vexxhost-ca-ymq-1-0023683006 nova-
compute[8570]: ERROR nova.compute.manager [instance: 20d48d76-f93c-4b3c-
90a8-cd7f654b28ef] return self._do_dispa

[Yahoo-eng-team] [Bug 1922928] [NEW] Image tasks API excludes in-progress tasks

2021-04-07 Thread Dan Smith
Public bug reported:

The glance /images/$uuid/tasks API is excluding in-progress tasks,
leading to test failures like this one:


Traceback (most recent call last):
  File "/opt/stack/tempest/tempest/api/image/v2/test_images.py", line 111, in 
test_image_glance_direct_import
self.assertEqual(1, len(tasks['tasks']))
  File 
"/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/testtools/testcase.py",
 line 415, in assertEqual
self.assertThat(observed, matcher, message)
  File 
"/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/testtools/testcase.py",
 line 502, in assertThat
raise mismatch_error
testtools.matchers._impl.MismatchError: 1 != 0


This is caused by the fact that we assert that the task is not expired by 
comparing the expires_at column to the current time. However, if the task is 
not completed yet, the expires_at will be NULL and never pass that test.
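
The implied fix, sketched with SQLAlchemy (model and column names are
assumed for illustration): treat a NULL expires_at as "not yet expired"
instead of excluding the row.

from oslo_utils import timeutils
from sqlalchemy import or_

def unexpired_tasks(session, Task, image_id):
    now = timeutils.utcnow()
    return session.query(Task).filter(
        Task.image_id == image_id,
        or_(Task.expires_at.is_(None),   # in-progress task: keep it
            Task.expires_at >= now),
    ).all()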

** Affects: glance
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1922928

Title:
  Image tasks API excludes in-progress tasks

Status in Glance:
  New

Bug description:
  The glance /images/$uuid/tasks API is excluding in-progress tasks,
  leading to test failures like this one:

  
  Traceback (most recent call last):
File "/opt/stack/tempest/tempest/api/image/v2/test_images.py", line 111, in 
test_image_glance_direct_import
  self.assertEqual(1, len(tasks['tasks']))
File 
"/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/testtools/testcase.py",
 line 415, in assertEqual
  self.assertThat(observed, matcher, message)
File 
"/opt/stack/tempest/.tox/tempest/lib/python3.8/site-packages/testtools/testcase.py",
 line 502, in assertThat
  raise mismatch_error
  testtools.matchers._impl.MismatchError: 1 != 0

  
  This is caused by the fact that we assert that the task is not expired by 
comparing the expires_at column to the current time. However, if the task is 
not completed yet, the expires_at will be NULL and never pass that test.

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1922928/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 2018612] [NEW] Guest kernel crashes with GPF on volume attach

2023-05-05 Thread Dan Smith
Public bug reported:

This isn't really a bug in nova, but it's something that we're hitting
in CI quite a bit, so I'm filing here to record the details and so I can
recheck against it. The actual bug is either in the guest (cirros 0.5.2)
kernel, QEMU, or something similar. In tests where we attach a volume to
a running guest, we occasionally get a guest kernel crash and stack
trace that pretty much prevents anything else from working later in the
test.

Here's what the trace looks like:

[   10.152160] virtio_blk virtio2: [vda] 2093056 512-byte logical blocks (1.07 
GB/1022 MiB)
[   10.198313] GPT:Primary header thinks Alt. header is not at the end of the 
disk.
[   10.199033] GPT:229375 != 2093055
[   10.199278] GPT:Alternate GPT header not at the end of the disk.
[   10.199632] GPT:229375 != 2093055
[   10.199857] GPT: Use GNU Parted to correct GPT errors.
[   11.291631] random: fast init done
[   11.312007] random: crng init done
[   11.419215] general protection fault:  [#1] SMP PTI
[   11.420843] CPU: 0 PID: 199 Comm: modprobe Not tainted 5.3.0-26-generic 
#28~18.04.1-Ubuntu
[   11.421917] Hardware name: OpenStack Foundation OpenStack Nova, BIOS 
1.13.0-1ubuntu1.1 04/01/2014
[   11.424732] RIP: 0010:__kmalloc_track_caller+0xa1/0x250
[   11.425934] Code: 65 49 8b 50 08 65 4c 03 05 b4 48 37 6f 4d 8b 38 4d 85 ff 
0f 84 77 01 00 00 41 8b 59 20 49 8b 39 48 8d 4a 01 4c 89 f8 4c 01 fb <48> 33 1b 
49 33 99 70 01 00 00 65 48 0f c7 0f 0f 94 c0 84 c0 74 bd
[   11.428460] RSP: 0018:b524801afaf0 EFLAGS: 0206
[   11.429261] RAX: 51f2a72f63305b11 RBX: 51f2a72f63305b11 RCX: 2b7e
[   11.430205] RDX: 2b7d RSI: 0cc0 RDI: 0002f040
[   11.431123] RBP: b524801afb28 R08: 90480762f040 R09: 904807001c40
[   11.432032] R10: b524801afc28 R11: 0001 R12: 0cc0
[   11.432953] R13: 0004 R14: 904807001c40 R15: 51f2a72f63305b11
[   11.434125] FS:  7fb31d2486a0() GS:90480760() 
knlGS:
[   11.435139] CS:  0010 DS:  ES:  CR0: 80050033
[   11.435909] CR2: 00abf9a8 CR3: 027c2000 CR4: 06f0
[   11.437208] Call Trace:
[   11.438716]  ? kstrdup_const+0x24/0x30
[   11.439170]  kstrdup+0x31/0x60
[   11.439668]  kstrdup_const+0x24/0x30
[   11.440036]  kvasprintf_const+0x86/0xa0
[   11.440397]  kobject_set_name_vargs+0x23/0x90
[   11.440791]  kobject_set_name+0x49/0x70
[   11.452382]  bus_register+0x80/0x270
[   11.462448]  ? 0xc033b000
[   11.471469]  hid_init+0x2b/0x62 [hid]
[   11.480198]  do_one_initcall+0x4a/0x1fa
[   11.487738]  ? _cond_resched+0x19/0x40
[   11.495227]  ? kmem_cache_alloc_trace+0x1ff/0x210
[   11.502700]  do_init_module+0x5f/0x227
[   11.510944]  load_module+0x1b96/0x2140
[   11.517993]  __do_sys_finit_module+0xfc/0x120
[   11.525101]  ? __do_sys_finit_module+0xfc/0x120
[   11.533182]  __x64_sys_finit_module+0x1a/0x20
[   11.542123]  do_syscall_64+0x5a/0x130
[   11.549183]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   11.557921] RIP: 0033:0x7fb31cbaba7d
[   11.565182] Code: 48 89 57 30 48 8b 04 24 48 89 47 38 e9 79 9e 02 00 48 89 
f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 
f0 ff ff 0f 83 3a fd ff ff c3 48 c7 c6 01 00 00 00 e9 a1
[   11.581697] RSP: 002b:7ffdf6793c18 EFLAGS: 0206 ORIG_RAX: 
0139
[   11.589245] RAX: ffda RBX:  RCX: 7fb31cbaba7d
[   11.597913] RDX:  RSI: 004ab235 RDI: 0003
[   11.605694] RBP: 004ab235 R08: 00c7 R09: 7fb31cbeba5f
[   11.613566] R10:  R11: 0206 R12: 0003
[   11.620772] R13: 00ab3c70 R14: 00ab3cc0 R15: 
[   11.628586] Modules linked in: hid(+) virtio_rng virtio_gpu drm_kms_helper 
syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm virtio_scsi virtio_net 
net_failover failover virtio_input virtio_blk qemu_fw_cfg 9pnet_virtio 9pnet 
pcnet32 8139cp mii ne2k_pci 8390 e1000
[   11.654944] ---[ end trace 9a9e8eebda38a127 ]---
[   11.663441] RIP: 0010:__kmalloc_track_caller+0xa1/0x250
[   11.671942] Code: 65 49 8b 50 08 65 4c 03 05 b4 48 37 6f 4d 8b 38 4d 85 ff 
0f 84 77 01 00 00 41 8b 59 20 49 8b 39 48 8d 4a 01 4c 89 f8 4c 01 fb <48> 33 1b 
49 33 99 70 01 00 00 65 48 0f c7 0f 0f 94 c0 84 c0 74 bd
[   11.689167] RSP: 0018:b524801afaf0 EFLAGS: 0206
[   11.698903] RAX: 51f2a72f63305b11 RBX: 51f2a72f63305b11 RCX: 2b7e
[   11.707107] RDX: 2b7d RSI: 0cc0 RDI: 0002f040
[   11.715748] RBP: b524801afb28 R08: 90480762f040 R09: 904807001c40
[   11.724372] R10: b524801afc28 R11: 0001 R12: 0cc0
[   11.735147] R13: 0004 R14: 904807001c40 R15: 51f2a72f63305b11
[   11.747065] FS:  7fb31d2486a0() GS:90480760() 
knlGS:
[   11.755136] CS:  0010 DS:  ES:  

[Yahoo-eng-team] [Bug 2033393] Re: Nova does not update libvirts instance name after server rename

2023-09-05 Thread Dan Smith
The instance name in the XML is not the instance name according to nova.
It is generated based on a template by the compute driver and is not
otherwise mutable. So this is operating as designed.
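
For illustration, a minimal sketch of how that name gets derived, assuming the
default instance_name_template of 'instance-%08x' (the option and its default
are real, but the helper below is not the actual driver code):

instance_name_template = 'instance-%08x'

def libvirt_domain_name(instance_db_id):
    # The name is rendered once from immutable data (the database id), so a
    # later display-name change never touches the libvirt domain name.
    return instance_name_template % instance_db_id

assert libvirt_domain_name(74) == 'instance-0000004a'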

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2033393

Title:
  Nova does not update libvirts instance name after server rename

Status in OpenStack Compute (nova):
  Won't Fix

Bug description:
  Description
  ===

  When renaming an OpenStack instance, the change is not reflected in
  the Libvirt XML configuration, leading to inconsistency between the
  instance name in OpenStack and the name stored in the Libvirt
  configuration.

  
  Steps to reproduce
  ==

  * Launch an instance
  * Verify the instance name is correct: virsh dumpxml instance-00xx | grep '<name>'
  * Rename the instance: openstack server set --name NEW <server>
  * Check Libvirt config again

  
  Expected result
  ===
  The instance name change should be synchronized across all components, 
including the underlying Libvirt configuration.

  
  Actual result
  =
  The instance name is only changed in the database. The change is not 
propagated to the Libvirt configuration.

  
  Environment
  ===

  Kolla Containers
  Version: Xena
  Hypervisor Type: Libvirt KVM

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2033393/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 2038840] [NEW] CPU state management fails if cpu0 is in dedicated set

2023-10-09 Thread Dan Smith
Public bug reported:

If an operator configures cpu0 in the dedicated set and enables state
management, nova-compute will fail on startup with this obscure error:

Oct 06 20:08:43.195137 np0035436890 nova-compute[104711]: ERROR
oslo_service.service nova.exception.FileNotFound: File
/sys/devices/system/cpu/cpu0/online could not be found.

The problem is that cpu0 is not hot-pluggable and thus has no online
knob. Nova should log a better error message in this case, at least.
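
A minimal sketch of the kind of guard that would produce a clearer error,
assuming the standard sysfs layout; this is illustrative, not nova's actual
CPU power-management code:

import os

def set_cpu_online(cpu_id, online):
    path = '/sys/devices/system/cpu/cpu%d/online' % cpu_id
    if not os.path.exists(path):
        # cpu0 (and any CPU the kernel does not expose as hot-pluggable) has
        # no 'online' knob, so fail with an explicit message instead of a
        # bare FileNotFound.
        raise RuntimeError(
            'CPU %d cannot be onlined/offlined: %s does not exist; '
            'exclude it from the managed dedicated set' % (cpu_id, path))
    with open(path, 'w') as f:
        f.write('1' if online else '0')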

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2038840

Title:
  CPU state management fails if cpu0 is in dedicated set

Status in OpenStack Compute (nova):
  New

Bug description:
  If an operator configures cpu0 in the dedicated set and enables state
  management, nova-compute will fail on startup with this obscure error:

  Oct 06 20:08:43.195137 np0035436890 nova-compute[104711]: ERROR
  oslo_service.service nova.exception.FileNotFound: File
  /sys/devices/system/cpu/cpu0/online could not be found.

  The problem is that cpu0 is not hot-pluggable and thus has no online
  knob. Nova should log a better error message in this case, at least.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2038840/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 2039463] [NEW] live migration jobs failing missing lxml

2023-10-16 Thread Dan Smith
Public bug reported:

Our jobs that run the evacuate post hook are failing due to not being
able to run the ansible virt module because of a missing lxml library:

2023-10-16 14:38:57.818847 | TASK [run-evacuate-hook : Register running domains 
on subnode]
2023-10-16 14:38:58.598524 | controller -> 172.99.67.184 | ERROR
2023-10-16 14:38:58.598912 | controller -> 172.99.67.184 | {
2023-10-16 14:38:58.598981 | controller -> 172.99.67.184 |   "msg": "The `lxml` 
module is not importable. Check the requirements."
2023-10-16 14:38:58.599046 | controller -> 172.99.67.184 | }

Not sure why this is coming up now, but it's likely related to the
recent switch to global venv for our services and some other dep change
that no longer gets us this on the host for free.

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2039463

Title:
  live migration jobs failing missing lxml

Status in OpenStack Compute (nova):
  New

Bug description:
  Our jobs that run the evacuate post hook are failing due to not being
  able to run the ansible virt module because of a missing lxml library:

  2023-10-16 14:38:57.818847 | TASK [run-evacuate-hook : Register running 
domains on subnode]
  2023-10-16 14:38:58.598524 | controller -> 172.99.67.184 | ERROR
  2023-10-16 14:38:58.598912 | controller -> 172.99.67.184 | {
  2023-10-16 14:38:58.598981 | controller -> 172.99.67.184 |   "msg": "The 
`lxml` module is not importable. Check the requirements."
  2023-10-16 14:38:58.599046 | controller -> 172.99.67.184 | }

  Not sure why this is coming up now, but it's likely related to the
  recent switch to global venv for our services and some other dep
  change that no longer gets us this on the host for free.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2039463/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 2051108] Re: Support for the "bring your own keys" approach for Cinder

2024-01-30 Thread Dan Smith
** Also affects: cinder
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2051108

Title:
  Support for the "bring your own keys" approach for Cinder

Status in Cinder:
  New
Status in OpenStack Compute (nova):
  New

Bug description:
  Description
  ===
  Cinder currently lacks support for an API to create a volume with a predefined 
(e.g. already stored in Barbican) encryption key. This feature would be useful 
for use cases where end users should be able to store keys that are later used 
to encrypt volumes.

  The workflow would be as follows:
  1. End user creates a new key and stores it in OpenStack Barbican
  2. User requests a new volume with volume type "LUKS" and gives an 
"encryption_reference_key_id" (or just "key_id").
  3. Internally the key is copied (like in 
volume_utils.clone_encryption_key_()) and a new "encryption_key_id" is created 
for the volume.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cinder/+bug/2051108/+subscriptions


-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 2079850] [NEW] Ephemeral with vfat format fails inspection

2024-09-06 Thread Dan Smith
Public bug reported:

When configured to format ephemerals as vfat, we get this failure:

Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.358 2 DEBUG 
oslo_utils.imageutils.format_inspector [None 
req-fcf3a278-3417-4a6d-8b10-66e91ca1677d 60ed4d3e522640b6ad19633b28c5b5bb 
ae43aec9c3c242a785c8256abdda1747 - - default default] Format inspector failed, 
aborting: Signature KDMV not found: b'\xebX\x90m' _process_chunk 
/usr/lib/python3.9/site-packages/oslo_utils/imageutils/format_inspector.py:1302
Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.365 2 DEBUG 
oslo_utils.imageutils.format_inspector [None 
req-fcf3a278-3417-4a6d-8b10-66e91ca1677d 60ed4d3e522640b6ad19633b28c5b5bb 
ae43aec9c3c242a785c8256abdda1747 - - default default] Format inspector failed, 
aborting: Region signature not found at 3 _process_chunk 
/usr/lib/python3.9/site-packages/oslo_utils/imageutils/format_inspector.py:1302
Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.366 2 
WARNING oslo_utils.imageutils.format_inspector [None 
req-fcf3a278-3417-4a6d-8b10-66e91ca1677d 60ed4d3e522640b6ad19633b28c5b5bb 
ae43aec9c3c242a785c8256abdda1747 - - default default] Safety check mbr on gpt 
failed because GPT MBR has no partitions defined: 
oslo_utils.imageutils.format_inspector.SafetyViolation: GPT MBR has no 
partitions defined
Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.366 2 
WARNING nova.virt.libvirt.imagebackend [None 
req-fcf3a278-3417-4a6d-8b10-66e91ca1677d 60ed4d3e522640b6ad19633b28c5b5bb 
ae43aec9c3c242a785c8256abdda1747 - - default default] Base image 
/var/lib/nova/instances/_base/ephemeral_1_0706d66 failed safety check: Safety 
checks failed: mbr: oslo_utils.imageutils.format_inspector.SafetyCheckFailed: 
Safety checks failed: mbr
Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2 ERROR 
nova.compute.manager [None req-fcf3a278-3417-4a6d-8b10-66e91ca1677d 
60ed4d3e522640b6ad19633b28c5b5bb ae43aec9c3c242a785c8256abdda1747 - - default 
default] [instance: 263ccd01-10b1-46a6-9f81-a6fc27c7177a] Instance failed to 
spawn: nova.exception.InvalidDiskInfo: Disk info file is invalid: Base image 
failed safety check
Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2 ERROR 
nova.compute.manager [instance: 263ccd01-10b1-46a6-9f81-a6fc27c7177a] Traceback 
(most recent call last):
Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2 ERROR 
nova.compute.manager [instance: 263ccd01-10b1-46a6-9f81-a6fc27c7177a]   File 
"/usr/lib/python3.9/site-packages/nova/virt/libvirt/imagebackend.py", line 685, 
in create_image
Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2 ERROR 
nova.compute.manager [instance: 263ccd01-10b1-46a6-9f81-a6fc27c7177a] 
inspector.safety_check()
Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2 ERROR 
nova.compute.manager [instance: 263ccd01-10b1-46a6-9f81-a6fc27c7177a]   File 
"/usr/lib/python3.9/site-packages/oslo_utils/imageutils/format_inspector.py", 
line 430, in safety_check
Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2 ERROR 
nova.compute.manager [instance: 263ccd01-10b1-46a6-9f81-a6fc27c7177a] raise 
SafetyCheckFailed(failures)
Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2 ERROR 
nova.compute.manager [instance: 263ccd01-10b1-46a6-9f81-a6fc27c7177a] 
oslo_utils.imageutils.format_inspector.SafetyCheckFailed: Safety checks failed: 
mbr
Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2 ERROR 
nova.compute.manager [instance: 263ccd01-10b1-46a6-9f81-a6fc27c7177a] 
Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2 ERROR 
nova.compute.manager [instance: 263ccd01-10b1-46a6-9f81-a6fc27c7177a] During 
handling of the above exception, another exception occurred:
Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2 ERROR 
nova.compute.manager [instance: 263ccd01-10b1-46a6-9f81-a6fc27c7177a] 
Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2 ERROR 
nova.compute.manager [instance: 263ccd01-10b1-46a6-9f81-a6fc27c7177a] Traceback 
(most recent call last):
Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2 ERROR 
nova.compute.manager [instance: 263ccd01-10b1-46a6-9f81-a6fc27c7177a]   File 
"/usr/lib/python3.9/site-packages/nova/compute/manager.py", line 2894, in 
_build_resources
Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2 ERROR 
nova.compute.manager [instance: 263ccd01-10b1-46a6-9f81-a6fc27c7177a] yield 
resources
Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2 ERROR 
nova.compute.manager [instance: 263ccd01-10b1-46a6-9f81-a6fc27c7177a]   File 
"/usr/lib/python3.9/site-packages/nova/compute/manager.py", line 2641, in 
_build_and_run_instance
Sep 03 17:34:28 compute-2 nova_compute[133243]: 2024-09-03 17:34:28.367 2 

[Yahoo-eng-team] [Bug 2012530] [NEW] nova-scheduler will crash at startup if placement is not up

2023-03-22 Thread Dan Smith
Public bug reported:

This is the same problem as https://bugs.launchpad.net/nova/+bug/1846820
but for scheduler. Because we initialize our placement client during
manager init, we will crash (and loop) on startup if keystone or
placement are down. Example trace:

Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR 
nova.scheduler.client.report [None req-edf5-6f86-4910-a458-72decae8e451 
None None] Failed to initialize placement client (is keystone available?): 
openstack.exceptions.NotSupported: The placement service for 
192.168.122.154:RegionOne exists but does not have any supported versions.
Mar 22 15:54:39 jammy nova-scheduler[119746]: CRITICAL nova [None 
req-edf5-6f86-4910-a458-72decae8e451 None None] Unhandled error: 
openstack.exceptions.NotSupported: The placement service for 
192.168.122.154:RegionOne exists but does not have any supported versions.
Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova Traceback (most recent 
call last):
Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File 
"/usr/local/bin/nova-scheduler", line 10, in 
Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova sys.exit(main())
Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File 
"/opt/stack/nova/nova/cmd/scheduler.py", line 47, in main
Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova server = 
service.Service.create(
Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File 
"/opt/stack/nova/nova/service.py", line 252, in create
Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova service_obj = 
cls(host, binary, topic, manager,
Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File 
"/opt/stack/nova/nova/service.py", line 116, in __init__
Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova self.manager = 
manager_class(host=self.host, *args, **kwargs)
Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File 
"/opt/stack/nova/nova/scheduler/manager.py", line 70, in __init__
Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova 
self.placement_client = report.report_client_singleton()
Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File 
"/opt/stack/nova/nova/scheduler/client/report.py", line 91, in 
report_client_singleton
Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova PLACEMENTCLIENT = 
SchedulerReportClient()
Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File 
"/opt/stack/nova/nova/scheduler/client/report.py", line 234, in __init__
Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova self._client = 
self._create_client()
Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File 
"/opt/stack/nova/nova/scheduler/client/report.py", line 277, in _create_client
Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova client = 
self._adapter or utils.get_sdk_adapter('placement')
Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File 
"/opt/stack/nova/nova/utils.py", line 984, in get_sdk_adapter
Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova return 
getattr(conn, service_type)
Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File 
"/usr/local/lib/python3.10/dist-packages/openstack/service_description.py", 
line 87, in __get__
Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova proxy = 
self._make_proxy(instance)
Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova   File 
"/usr/local/lib/python3.10/dist-packages/openstack/service_description.py", 
line 266, in _make_proxy
Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova raise 
exceptions.NotSupported(
Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova 
openstack.exceptions.NotSupported: The placement service for 
192.168.122.154:RegionOne exists but does not have any supported versions.
Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR nova
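
A minimal sketch of one way to avoid the startup crash, deferring client
construction until first use; the names below are illustrative rather than
nova's actual manager code:

class SchedulerManagerSketch:
    """Hedged sketch: build the placement client on first use, not in
    __init__, so the service can start while keystone/placement are down."""

    def __init__(self, client_factory):
        self._client_factory = client_factory
        self._placement_client = None

    @property
    def placement_client(self):
        if self._placement_client is None:
            # Any NotSupported/connection error now surfaces per request
            # (and can be retried) instead of killing the service at boot.
            self._placement_client = self._client_factory()
        return self._placement_client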

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2012530

Title:
  nova-scheduler will crash at startup if placement is not up

Status in OpenStack Compute (nova):
  New

Bug description:
  This is the same problem as
  https://bugs.launchpad.net/nova/+bug/1846820 but for scheduler.
  Because we initialize our placement client during manager init, we
  will crash (and loop) on startup if keystone or placement are down.
  Example trace:

  Mar 22 15:54:39 jammy nova-scheduler[119746]: ERROR 
nova.scheduler.client.report [None req-edf5-6f86-4910-a458-72decae8e451 
None None] Failed to initialize placement client (is keystone available?): 
openstack.exceptions.NotSupported: The placement service for 
192.168.122.154:RegionOne exists but does not have any supported versions.
  Mar 22 15:54:39 jammy nova-scheduler[119746]: CRITICAL nova [None 
req-edf5-6f86-4910-a458-72decae8e451 None None] Unhandled error: 
openstack.exception

[Yahoo-eng-team] [Bug 1853048] Re: Nova not updating VM's XML in KVM

2019-11-18 Thread Dan Smith
Nova does not even call down to the compute node when attributes like
display_name are changed. The next time the xml is updated would be when
it is regenerated, like during a lifecycle event (hard reboot) or
migration. Ceilometer scraping that information out of the libvirt XML
underneath nova is, as expected, not reliable.

Changing this would require a new RPC call, and would add load to
rabbit, the compute, and introduce additional traffic between nova and
libvirt. If there was some strong use-case for this, maybe that would be
worthwhile, but I don't think ceilometer wanting to scrape those
metadata items from the libvirt XML is strong enough.

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1853048

Title:
  Nova not updating VM's XML in KVM

Status in Ceilometer:
  New
Status in OpenStack Compute (nova):
  Invalid

Bug description:
  Ceilometer was causing resources to have a huge amount of revisions on Gnocchi
  (1000+), because the compute pollsters were constantly pushing outdated
attributes. This can happen when users, for example, update the name of 
VMs.
  The name is not updated in the VM's XML that is stored in the KVM host.
  This causes the Ceilometer compute pollster to constantly push outdated 
attributes
  that trigger resource revisions on Gnocchi (if we have other pollsters 
pushing the right
  attribute value that is gathered from OpenStack API).

  We are using OpenStack Rocky, and Nova version is 18.0.1.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ceilometer/+bug/1853048/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1858877] [NEW] Silent wasted storage with multiple RBD backends

2020-01-08 Thread Dan Smith
Public bug reported:

Nova does not currently support multiple rbd backends. However, Glance
does and an operator may point Nova at a Glance with access to multiple
RBD clusters. If this happens, Nova will silently download the image
from Glance, flatten it, and upload it to the local RBD cluster named
privately to the image. If another instance is booted from the same
image, this will happen again, using more network resources and
duplicating the image on ceph for the second and subsequent instances.
When configuring Nova and Glance for shared RBD, the expectation is that
instances are fast-cloned from Glance base images, so this silent
behavior of using a lot of storage would be highly undesirable and
unexpected. Since operators control the backend config, but users upload
images (and currently only to one backend), it is the users that would
trigger this additional consumption of storage.

This isn't really a bug in Nova per se, since Nova does not claim to
support multiple backends and is downloading and uploading the image in the
same way it would if the image were located on any other not-the-same-as-
my-RBD-cluster location. It is, however, unexpected and undesirable
behavior.

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1858877

Title:
  Silent wasted storage with multiple RBD backends

Status in OpenStack Compute (nova):
  New

Bug description:
  Nova does not currently support multiple rbd backends. However, Glance
  does and an operator may point Nova at a Glance with access to
  multiple RBD clusters. If this happens, Nova will silently download
  the image from Glance, flatten it, and upload it to the local RBD
  cluster named privately to the image. If another instance is booted
  from the same image, this will happen again, using more network
  resources and duplicating the image on ceph for the second and
  subsequent instances. When configuring Nova and Glance for shared RBD,
  the expectation is that instances are fast-cloned from Glance base
  images, so this silent behavior of using a lot of storage would be
  highly undesirable and unexpected. Since operators control the backend
  config, but users upload images (and currently only to one backend),
  it is the users that would trigger this additional consumption of
  storage.

  This isn't really a bug in Nova per se, since Nova does not claim to
  support multiple backends and is downloading and uploading the image in the
  same way it would if the image were located on any other not-the-same-
  as-my-RBD-cluster location. It is, however, unexpected and undesirable
  behavior.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1858877/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1884587] [NEW] image import copy-to-store API should reflect proper authorization

2020-06-22 Thread Dan Smith
Public bug reported:

In testing the image import copy-to-store mechanism from Nova, I hit an
issue that seems clearly to be a bug. Scenario:

A user boots an instance from an image they have permission to see. Nova
uses their credentials to start an image import copy-to-store operation,
which succeeds:

"POST /v2/images/e6b1a7d0-ccd8-4be3-bef7-69c68fca4313/import HTTP/1.1" 202 211 
0.481190
Task [888e97e5-496d-4b94-b530-218d633f866a] status changing from pending to 
processing

Note the 202 return code. My code polls for a $timeout period, waiting
for the image to either arrive at the new store, or be marked as error,
which never happens ($timeout=600s). The glance log shows (trace
truncated):

glance-api[14039]:   File 
"/opt/stack/glance/glance/async_/flows/api_image_import.py", line 481, in 
get_flow
glance-api[14039]: stores if
glance-api[14039]:   File "/opt/stack/glance/glance/api/authorization.py", line 
296, in forbidden_key
glance-api[14039]: raise exception.Forbidden(message % key)
glance-api[14039]: glance.common.exception.Forbidden: You are not permitted to 
modify 'os_glance_importing_to_stores' on this image.

So apparently Nova is unable to use the user's credentials to initiate a
copy-to-store operation. That surprises me and I think it likely isn't
the access control we should be enforcing. However, if we're going to
reject the operation, we should reject it at the time the HTTP response
is sent, not later async, since we can check authorization right then
and there.

The problem in this case is that from the outside, I have no way of
knowing that the task fails subsequently. I receive a 202, which means I
should start polling for completion. The task fails to load/run and thus
can't update any status on the image, and I'm left to wait for 600s
before I give up.

So, at the very least, we're not checking the same set of permissions
during the HTTP POST call, and we should be. I also would tend to argue
that the user should be allowed to copy the image and not require an
admin to do it, perhaps with some additional policy element to control
that. However, I have to be able to determine when and when not to wait
for 600s.
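
For reference, here is a sketch of the client-side polling loop described
above; get_image() and the image properties it checks are assumptions for
illustration, not a confirmed glance contract:

import time

def _as_list(value):
    # glance may report these as comma-separated strings; normalize either way.
    if not value:
        return []
    return value if isinstance(value, list) else value.split(',')

def wait_for_copy(get_image, image_id, store, timeout=600, interval=5):
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        image = get_image(image_id)
        if store in _as_list(image.get('stores')):
            return True   # the copy landed in the requested store
        if store in _as_list(image.get('os_glance_failed_import')):
            return False  # glance marked the copy as failed
        time.sleep(interval)
    # This is the 600s dead wait described above: nothing ever tells the
    # client that the import task failed to even start.
    raise TimeoutError('copy of %s to %s did not complete' % (image_id, store))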

** Affects: glance
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1884587

Title:
  image import copy-to-store API should reflect proper authorization

Status in Glance:
  New

Bug description:
  In testing the image import copy-to-store mechanism from Nova, I hit
  an issue that seems clearly to be a bug. Scenario:

  A user boots an instance from an image they have permission to see.
  Nova uses their credentials to start an image import copy-to-store
  operation, which succeeds:

  "POST /v2/images/e6b1a7d0-ccd8-4be3-bef7-69c68fca4313/import HTTP/1.1" 202 
211 0.481190
  Task [888e97e5-496d-4b94-b530-218d633f866a] status changing from pending to 
processing

  Note the 202 return code. My code polls for a $timeout period, waiting
  for the image to either arrive at the new store, or be marked as
  error, which never happens ($timeout=600s). The glance log shows
  (trace truncated):

  glance-api[14039]:   File 
"/opt/stack/glance/glance/async_/flows/api_image_import.py", line 481, in 
get_flow
  glance-api[14039]: stores if
  glance-api[14039]:   File "/opt/stack/glance/glance/api/authorization.py", 
line 296, in forbidden_key
  glance-api[14039]: raise exception.Forbidden(message % key)
  glance-api[14039]: glance.common.exception.Forbidden: You are not permitted 
to modify 'os_glance_importing_to_stores' on this image.

  So apparently Nova is unable to use the user's credentials to initiate
  a copy-to-store operation. That surprises me and I think it likely
  isn't the access control we should be enforcing. However, if we're
  going to reject the operation, we should reject it at the time the
  HTTP response is sent, not later async, since we can check
  authorization right then and there.

  The problem in this case is that from the outside, I have no way of
  knowing that the task fails subsequently. I receive a 202, which means
  I should start polling for completion. The task fails to load/run and
  thus can't update any status on the image, and I'm left to wait for
  600s before I give up.

  So, at the very least, we're not checking the same set of permissions
  during the HTTP POST call, and we should be. I also would tend to
  argue that the user should be allowed to copy the image and not
  require an admin to do it, perhaps with some additional policy element
  to control that. However, I have to be able to determine when and when
  not to wait for 600s.

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1884587/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https:

[Yahoo-eng-team] [Bug 1884596] [NEW] image import copy-to-store will start multiple importing threads due to race condition

2020-06-22 Thread Dan Smith
Public bug reported:

I'm filing this bug a little prematurely because Abhi and I didn't get a
chance to fully discuss it. However, looking at the code and the
behavior I'm seeing due to another bug (1884587), I feel rather
confident.

Especially in a situation where glance is running on multiple control
plane nodes (i.e. any real-world situation), I believe there is a race
condition whereby two closely-timed requests to copy an image to a store
will result in two copy operations in glance proceeding in parallel. I
believe this to be the case due to a common "test-and-set that isn't
atomic" error.

In the API layer, glance checks that an import copy-to-store operation
isn't already in progress here:

https://github.com/openstack/glance/blob/e6db0b10a703037f754007bef6f56451086850cd/glance/api/v2/images.py#L167

And if that passes, it proceeds to setup the task as a thread here:

https://github.com/openstack/glance/blob/e6db0b10a703037f754007bef6f56451086850cd/glance/api/v2/images.py#L197

which may start running immediately or sometime in the future. Once
running, that code updates a property on the image to indicate that the
task is running here:

https://github.com/openstack/glance/blob/e6db0b10a703037f754007bef6f56451086850cd/glance/async_/flows/api_image_import.py#L479-L484

Between those two events, if another API user makes the same request,
glance will not realize that a thread is already running to complete the
initial task and will start another. In a situation where a user spawns
a thousand new instances to a thousand compute nodes in a single
operation where the image needs copying first, it's highly plausible to
have _many_ duplicate glance operations going, impacting write
performance on the rbd cluster at the very least.

As evidence that this can happen, we see an abnormally extended race
window because of the aforementioned bug (1884587) where we fail to
update the property that indicates the task is running. In a test we see
a large number of them get started, followed by a cascade of failures
when they fail to update that image property, implying that many such
threads are running. If this situation is allowed to happen when the
property does *not* fail to update, I believe we would end up with
glance copying the image to the destination in multiple threads
simultaneously. That is much harder to simulate in practice in a
development environment, but the other bug makes it happen every time
since we never update the image property to prevent it and thus the
window is long.

Abhi also brought up the case where if this race occurs on the same
node, the second attempt *may* actually start copying the partial image
in the staging directory to the destination, finish early, and then mark
the image as "copied to $store" such that nova will attempt to use the
partial image immediately, resulting in a corrupted disk and various
levels of failure after that. Note that it's not clear if that's really
possible or not, but I'm putting it here so the glance gurus can
validate.

The use of the os_glance_importing_to_stores property to "lock" a copy
to a particular store is good, except that updating that list atomically
means that the above mentioned race will not have anything to check
after the update to see if it was the race loser. I don't see any checks
in the persistence layer to ensure that an UPDATE to the row with this
property doesn't already have a given store in it, or do any kind of
merge. This also leads me to worry that two parallel requests to copy an
image to two different stores may result in clobbering the list of
stores-in-progress and potentially also the final list of stores at
rest. This is just conjecture at this point, I just haven't seen
anywhere that situation is accounted for.
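
To make the fix shape concrete, here is a sketch of doing the claim as a
single conditional UPDATE, against a toy key/value properties table; the
schema and names are illustrative, not glance's actual persistence layer:

import sqlite3

def try_claim_store(conn, image_id, store):
    # One conditional UPDATE is both the test and the set: of two racing
    # callers, exactly one sees rowcount == 1; the loser backs off instead
    # of starting a second copy thread.
    cur = conn.execute(
        "UPDATE image_properties "
        "   SET value = CASE WHEN value = '' THEN ? "
        "                    ELSE value || ',' || ? END "
        " WHERE image_id = ? AND name = 'os_glance_importing_to_stores' "
        "   AND (',' || value || ',') NOT LIKE '%,' || ? || ',%'",
        (store, store, image_id, store))
    conn.commit()
    return cur.rowcount == 1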

** Affects: glance
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1884596

Title:
  image import copy-to-store will start multiple importing threads due
  to race condition

Status in Glance:
  New

Bug description:
  I'm filing this bug a little prematurely because Abhi and I didn't get
  a chance to fully discuss it. However, looking at the code and the
  behavior I'm seeing due to another bug (1884587), I feel rather
  confident.

  Especially in a situation where glance is running on multiple control
  plane nodes (i.e. any real-world situation), I believe there is a race
  condition whereby two closely-timed requests to copy an image to a
  store will result in two copy operations in glance proceeding in
  parallel. I believe this to be the case due to a common "test-and-set
  that isn't atomic" error.

  In the API layer, glance checks that an import copy-to-store operation
  isn't already in progress here:

  
https://github.com/openstack/glance/blob/e6db0b10a703037f754007bef6f56451086850cd/glance/api/v2/images.py#L167

  And if that passes, 

[Yahoo-eng-team] [Bug 1885003] [NEW] Interrupted copy-to-store may corrupt a subsequent operation

2020-06-24 Thread Dan Smith
Public bug reported:

This is a hypothetical (but very possible) scenario that will result in
a corrupted image stored by glance. I don't have code to reproduce it,
but discussion seems to indicate that it is possible.

Scenario:

1. Upload image to glance to one store, everything is good
2. Start an image_import(method='copy-to-store') to copy the image to another 
store
3. Power failure, network failure, or `killall -9 glance-api`
4. After the failure, re-request the copy-to-store
5. That glance worker will see the residue of the image in the staging 
directory, which is only partial because the process never finished, and will 
start uploading that to the new store
6. Upon completion, the image will appear in two stores, but one of them will 
be quietly corrupted

** Affects: glance
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1885003

Title:
  Interrupted copy-to-store may corrupt a subsequent operation

Status in Glance:
  New

Bug description:
  This is a hypothetical (but very possible) scenario that will result
  in a corrupted image stored by glance. I don't have code to reproduce
  it, but discussion seems to indicate that it is possible.

  Scenario:

  1. Upload image to glance to one store, everything is good
  2. Start an image_import(method='copy-to-store') to copy the image to another 
store
  3. Power failure, network failure, or `killall -9 glance-api`
  4. After the failure, re-request the copy-to-store
  5. That glance worker will see the residue of the image in the staging 
directory, which is only partial because the process never finished, and will 
start uploading that to the new store
  6. Upon completion, the image will appear in two stores, but one of them will 
be quietly corrupted

To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1885003/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1746294] [NEW] Scheduler requests unlimited results from placement

2018-01-30 Thread Dan Smith
Public bug reported:

The scheduler will request an infinitely-large host set from placement
during scheduling operations. This may be very large on big clouds and
makes for a huge JSON response from placement to scheduler each time a
single host needs to be selected.
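
A sketch of the obvious mitigation, capping the query; placement does support
a limit query parameter in newer microversions, but the call below is
illustrative rather than nova's report-client code:

def get_allocation_candidates(session, resources, max_results=1000):
    # resources is a placement-style string, e.g. 'VCPU:1,MEMORY_MB:2048'.
    # Bounding the result set keeps the JSON payload (and the scheduler's
    # filtering work) proportional to max_results, not to the cloud size.
    url = ('/allocation_candidates?resources=%s&limit=%d'
           % (resources, max_results))
    return session.get(url).json()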

** Affects: nova
 Importance: Medium
 Assignee: Dan Smith (danms)
 Status: In Progress


** Tags: queens-rc-potential scheduler

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1746294

Title:
  Scheduler requests unlimited results from placement

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  The scheduler will request an infinitely-large host set from placement
  during scheduling operations. This may be very large on big clouds and
  makes for a huge JSON response from placement to scheduler each time a
  single host needs to be selected.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1746294/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1755602] [NEW] Ironic computes may not be discovered when node count is less than compute count

2018-03-13 Thread Dan Smith
Public bug reported:

In an ironic deployment being built from day zero, there is an ordering
problem, which generates a race condition for operators. Consider this
common example:

At config time, you create and start three nova-compute services
pointing at your ironic deployment. These three will be HA using the
ironic driver's hash ring functionality. At config time, there are no
ironic nodes present yet, which means running discover_hosts will create
no host mappings.

Next, a single ironic node is added, which is owned by one of the
computes per the hash rules. At this point, you can run discover_hosts
and whatever compute owns that node will get a host mapping. Then you
add a second ironic node, which causes all three nova-computes to
rebalance the hash ring. One or more of the ironic nodes will definitely
land on one of the other nova-computes and will suddenly be unreachable
because there is no host mapping until the next time discover_hosts is
run. Since we track the "mapped" bit on compute nodes, and compute nodes
move between hosts with ironic, we won't even notice that the new owner
nova-compute needs a host mapping. In fact, we won't notice until we get
lucky enough to land a never-mapped ironic node on a nova-compute for
the first time and then run discover_hosts after that point.

For an automated config management system, this is a lot of complexity
to handle in order to generate a stable output of a working system. In
many cases where you're using ironic to bootstrap another deployment
(i.e. tripleo) the number of nodes may be small (less than the computes)
for quite some time.

There are a couple obvious options I see:

1. Add a --and-services flag to nova-manage, which will also look for
all nova-compute services in the cell and make sure those have mappings.
This is ideal because we could get all services mapped at config time
without even having to have an ironic node in place yet (which is not
possible today). We can't do this efficiently right away because
nova.services does not have a mapped flag, and thus the scheduler
periodic should _not_ include services.

2. We could unset compute_node.mapped any time we re-home an ironic node
to a different nova-compute. This would cause our scheduler periodic to
notice the change and create a host mapping if it happens to move to an
unmapped nova-compute. This generates extra work during normal operating
state and also still leaves us with an interval of time where a
previously-usable ironic node becomes unusable until the host discovery
periodic task runs again.

IMHO, we should do #1. It's a backportable change, and it's actually a
better workflow for config automation tools than what we have today,
even discounting this race. We can do what we did before, which is do it
once for backports, and then add a mapped bit in master to make it more
efficient, allowing it to be included in the scheduler periodic task.
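
A rough sketch of what option #1 amounts to, with the lookup/create helpers
passed in so no nova internals are assumed:

def discover_hosts_and_services(ctxt, cell, list_compute_services,
                                get_host_mapping, create_host_mapping):
    """Hedged sketch of --and-services: map every nova-compute service in
    the cell, whether or not it owns any (ironic) compute node yet."""
    added = []
    for service in list_compute_services(ctxt, cell):
        if get_host_mapping(ctxt, service.host) is None:
            create_host_mapping(ctxt, cell, service.host)
            added.append(service.host)
    return added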

** Affects: nova
     Importance: Medium
 Assignee: Dan Smith (danms)
 Status: Confirmed


** Tags: cells

** Changed in: nova
   Importance: Undecided => Medium

** Changed in: nova
   Status: New => Confirmed

** Bug watch added: Red Hat Bugzilla #1554460
   https://bugzilla.redhat.com/show_bug.cgi?id=1554460

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1755602

Title:
  Ironic computes may not be discovered when node count is less than
  compute count

Status in OpenStack Compute (nova):
  Confirmed

Bug description:
  In an ironic deployment being built from day zero, there is an
  ordering problem, which generates a race condition for operators.
  Consider this common example:

  At config time, you create and start three nova-compute services
  pointing at your ironic deployment. These three will be HA using the
  ironic driver's hash ring functionality. At config time, there are no
  ironic nodes present yet, which means running discover_hosts will
  create no host mappings.

  Next, a single ironic node is added, which is owned by one of the
  computes per the hash rules. At this point, you can run discover_hosts
  and whatever compute owns that node will get a host mapping. Then you
  add a second ironic node, which causes all three nova-computes to
  rebalance the hash ring. One or more of the ironic nodes will
  definitely land on one of the other nova-computes and will suddenly be
  unreachable because there is no host mapping until the next time
  discover_hosts is run. Since we track the "mapped" bit on compute
  nodes, and compute nodes move between hosts with ironic, we won't even
  notice that the new owner nova-compute needs a host mapping. In fact,
  we won't notice until we get lucky enough to land a never-mapped
  ironic node on a nova-compute for the first time and then run
  discover_hosts a

[Yahoo-eng-team] [Bug 1798158] [NEW] Non-templated transport_url will fail if not defined in config

2018-10-16 Thread Dan Smith
Public bug reported:

If transport_url is not defined in the config, we will fail to format a
non-templated transport_url in the database like this:

ERROR nova.objects.cell_mapping [None req-34831485-adf4-4a0d-bb20-e1736d93a451 
None None] Failed to parse [DEFAULT]/transport_url to format cell mapping: 
AttributeError: 'NoneType' object has no attribute 'find'
ERROR nova.objects.cell_mapping Traceback (most recent call last):
ERROR nova.objects.cell_mapping   File 
"/opt/stack/nova/nova/objects/cell_mapping.py", line 150, in _format_mq_url
ERROR nova.objects.cell_mapping return CellMapping._format_url(url, 
CONF.transport_url)
ERROR nova.objects.cell_mapping   File 
"/opt/stack/nova/nova/objects/cell_mapping.py", line 101, in _format_url
ERROR nova.objects.cell_mapping default_url = urlparse.urlparse(default)
ERROR nova.objects.cell_mapping   File "/usr/lib/python2.7/urlparse.py", line 
143, in urlparse
ERROR nova.objects.cell_mapping tuple = urlsplit(url, scheme, 
allow_fragments)
ERROR nova.objects.cell_mapping   File "/usr/lib/python2.7/urlparse.py", line 
182, in urlsplit
ERROR nova.objects.cell_mapping i = url.find(':')
ERROR nova.objects.cell_mapping AttributeError: 'NoneType' object has no 
attribute 'find'
ERROR nova.objects.cell_mapping
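
A minimal defensive sketch (illustrative, not the merged fix): skip the
formatting step when there is no configured default to format against:

from urllib.parse import urlparse

def format_cell_mq_url(stored_url, configured_default):
    # If [DEFAULT]/transport_url is unset there is nothing to merge the
    # cell's stored URL with, so return it untouched instead of handing None
    # to urlparse and failing with "'NoneType' object has no attribute 'find'".
    if not configured_default or stored_url is None:
        return stored_url
    default = urlparse(configured_default)
    # The real code would substitute template fields (credentials, host) from
    # `default` into stored_url; that part is elided in this sketch.
    return stored_url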

** Affects: nova
 Importance: Undecided
 Assignee: Dan Smith (danms)
 Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1798158

Title:
  Non-templated transport_url will fail if not defined in config

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  If transport_url is not defined in the config, we will fail to format
  a non-templated transport_url in the database like this:

  ERROR nova.objects.cell_mapping [None 
req-34831485-adf4-4a0d-bb20-e1736d93a451 None None] Failed to parse 
[DEFAULT]/transport_url to format cell mapping: AttributeError: 'NoneType' 
object has no attribute 'find'
  ERROR nova.objects.cell_mapping Traceback (most recent call last):
  ERROR nova.objects.cell_mapping   File 
"/opt/stack/nova/nova/objects/cell_mapping.py", line 150, in _format_mq_url
  ERROR nova.objects.cell_mapping return CellMapping._format_url(url, 
CONF.transport_url)
  ERROR nova.objects.cell_mapping   File 
"/opt/stack/nova/nova/objects/cell_mapping.py", line 101, in _format_url
  ERROR nova.objects.cell_mapping default_url = urlparse.urlparse(default)
  ERROR nova.objects.cell_mapping   File "/usr/lib/python2.7/urlparse.py", line 
143, in urlparse
  ERROR nova.objects.cell_mapping tuple = urlsplit(url, scheme, 
allow_fragments)
  ERROR nova.objects.cell_mapping   File "/usr/lib/python2.7/urlparse.py", line 
182, in urlsplit
  ERROR nova.objects.cell_mapping i = url.find(':')
  ERROR nova.objects.cell_mapping AttributeError: 'NoneType' object has no 
attribute 'find'
  ERROR nova.objects.cell_mapping

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1798158/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1719966] [NEW] Microversion 2.47 punches nova in its special place

2017-09-27 Thread Dan Smith
Public bug reported:

Testing with 500 instances in ACTIVE, and 500 in ERROR state, using curl
to pull all 1000 instances ten times in a row, 2.47 clearly shows a knee
in the curve on average response time:

https://imgur.com/a/2lmiw

We should...fix that and stuff.

** Affects: nova
 Importance: High
 Status: Confirmed

** Affects: nova/pike
 Importance: Undecided
 Status: New


** Tags: api performance

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1719966

Title:
  Microversion 2.47 punches nova in its special place

Status in OpenStack Compute (nova):
  Confirmed
Status in OpenStack Compute (nova) pike series:
  New

Bug description:
  Testing with 500 instances in ACTIVE, and 500 in ERROR state, using
  curl to pull all 1000 instances ten times in a row, 2.47 clearly shows
  a knee in the curve on average response time:

  https://imgur.com/a/2lmiw

  We should...fix that and stuff.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1719966/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1738094] [NEW] TEXT is not large enough to store RequestSpec

2017-12-13 Thread Dan Smith
Public bug reported:

This error occurs during Newton's online_data_migration phase:

error: (pymysql.err.DataError) (1406, u"Data too long for column 'spec'
at row 1") [SQL: u'INSERT INTO request_specs

Which comes from RequestSpec.instance_group.members being extremely
large
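
A sketch of the kind of schema change that addresses this on MySQL (widening
the column from TEXT to MEDIUMTEXT); the table and column names come from the
error above, the helper itself is illustrative:

from sqlalchemy import text

def widen_request_spec_column(engine):
    # TEXT caps at 64 KiB on MySQL; MEDIUMTEXT (16 MiB) comfortably holds a
    # RequestSpec whose instance_group.members list is very large.
    if engine.dialect.name == 'mysql':
        with engine.begin() as conn:
            conn.execute(
                text('ALTER TABLE request_specs MODIFY spec MEDIUMTEXT'))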

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1738094

Title:
  TEXT is not large enough to store RequestSpec

Status in OpenStack Compute (nova):
  New

Bug description:
  This error occurs during Newton's online_data_migration phase:

  error: (pymysql.err.DataError) (1406, u"Data too long for column
  'spec' at row 1") [SQL: u'INSERT INTO request_specs

  Which comes from RequestSpec.instance_group.members being extremely
  large

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1738094/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1648840] [NEW] libvirt driver leaves interface residue after failed start

2016-12-09 Thread Dan Smith
Public bug reported:

When the libvirt driver fails to start a VM due to reasons other than
neutron plug timeout, it leaves interfaces on the system from the vif
plugging. If a subsequent delete is performed and completes
successfully, these will be removed. However, in cases where
connectivity is preventing a normal delete, a local delete will be
performed at the api level and the interfaces will remain.

In at least one real world situation I have observed, a script was
creating test instances which were failing and leaving residue. After
the residue interface count reached about 6,000 on the system, VM
creates started failing with "Argument list too long" as libvirt was
choking on enumerating the interfaces it had left behind.
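
A minimal sketch of the cleanup shape (illustrative, not the actual libvirt
driver patch): unplug whatever was plugged before propagating the spawn
failure:

def spawn_with_vif_cleanup(plug_vifs, create_domain, unplug_vifs,
                           instance, network_info):
    # If domain creation fails after the vifs were plugged, unplug them
    # before re-raising so repeated failed boots do not leave thousands of
    # stale host interfaces behind.
    plug_vifs(instance, network_info)
    try:
        create_domain(instance, network_info)
    except Exception:
        unplug_vifs(instance, network_info)
        raise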

** Affects: nova
 Importance: Medium
 Assignee: Dan Smith (danms)
 Status: In Progress

** Affects: nova/newton
 Importance: Undecided
 Status: New

** Changed in: nova
   Status: New => Confirmed

** Changed in: nova
   Importance: Undecided => Medium

** Changed in: nova
 Assignee: (unassigned) => Dan Smith (danms)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1648840

Title:
  libvirt driver leaves interface residue after failed start

Status in OpenStack Compute (nova):
  In Progress
Status in OpenStack Compute (nova) newton series:
  New

Bug description:
  When the libvirt driver fails to start a VM due to reasons other than
  neutron plug timeout, it leaves interfaces on the system from the vif
  plugging. If a subsequent delete is performed and completes
  successfully, these will be removed. However, in cases where
  connectivity is preventing a normal delete, a local delete will be
  performed at the api level and the interfaces will remain.

  In at least one real world situation I have observed, a script was
  creating test instances which were failing and leaving residue. After
  the residue interface count reached about 6,000 on the system, VM
  creates started failing with "Argument list too long" as libvirt was
  choking on enumerating the interfaces it had left behind.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1648840/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1652233] Re: mitaka is incompatible with newton - IncompatibleObjectVersion Version 2.1 of InstanceList is not supported

2017-01-03 Thread Dan Smith
Yeah, mixed-version controllers aren't supported. We've made some
progress towards being able to support it in master, but it's definitely
not going to work in mitaka/newton.

You have to upgrade your controllers simultaneously (well, most
critically, your conductor services), and then you can have any mix of
versions among your computes that you want.

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1652233

Title:
  mitaka is incompatible with newton - IncompatibleObjectVersion Version
  2.1 of InstanceList is not supported

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  Description
  ===

  I get an error after upgrade half of my cluster. Can't place any VMs.
  "RemoteError: Remote error: IncompatibleObjectVersion Version 2.1 of 
InstanceList is not supported"

  Steps to reproduce
  ==

  1) Install 4 nodes with mitaka
  2) Disable 2 nodes (1 api controller and 1 compute): nova service-disable
  3) Upgrade to newton on the disable nodes
  4) compute=mitaka to [upgrade_levels]
  5) db sync
  6) Start newton
  7) Try to place any VMs, it will fail
  8) See nova-compute.log on the mitaka nodes

  Expected result
  ===

  Successful upgrade one half of cluster, then another half

  Actual result
  =

  Nova can't place any VMs.

  Compute logs:

  2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task 
[req-41e6df10-b33b-47f5-be0c-86793cbcae6e - - - - -] Error during 
ComputeManager._sync_scheduler_instance_info
  2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task Traceback 
(most recent call last):
  2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task   File 
"/usr/lib/python2.7/site-packages/oslo_service/periodic_task.py", line 220, in 
run_periodic_tasks
  2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task task(self, 
context)
  2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task   File 
"/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1637, in 
_sync_scheduler_instance_info
  2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task 
use_slave=True)
  2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task   File 
"/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 177, in 
wrapper
  2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task args, 
kwargs)
  2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task   File 
"/usr/lib/python2.7/site-packages/nova/conductor/rpcapi.py", line 236, in 
object_class_action_versions
  2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task args=args, 
kwargs=kwargs)
  2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task   File 
"/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 169, in 
call
  2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task 
retry=self.retry)
  2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task   File 
"/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 97, in 
_send
  2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task 
timeout=timeout, retry=retry)
  2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task   File 
"/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 
464, in send
  2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task 
retry=retry)
  2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task   File 
"/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 
455, in _send
  2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task raise 
result
  2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task RemoteError: 
Remote error: IncompatibleObjectVersion Version 2.1 of InstanceList is not 
supported
  2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task [u'Traceback 
(most recent call last):\n', u'  File 
"/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 138, 
in _dispatch_and_reply\n', u'  File 
"/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 185, 
in _dispatch\n', u'  File 
"/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 127, 
in _do_dispatch\n:param incoming: incoming message\n', u'  File 
"/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 92, in 
object_class_action_versions\nobjname, object_versions[objname])\n', u'  
File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 
374, in obj_class_from_name\nsupported=latest_ver)\n', 
u'IncompatibleObjectVersion: Version 2.1 of InstanceList is not supported\n'].
  2016-12-23 07:26:11.434 15392 ERROR oslo_service.periodic_task

  nova-conductor:

  2016-12-23 08:01:00.489 9958 ERROR oslo_messaging.rpc.di

[Yahoo-eng-team] [Bug 1655494] [NEW] Newton scheduler clients should keep trying to report

2017-01-10 Thread Dan Smith
Public bug reported:

Newton scheduler clients will stop reporting any time they encounter a
setup-related error, which isn't very operator-friendly for the ocata
upgrade process.
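
For illustration only (the class and method names below are made up, not the
real scheduler report client API), the fix amounts to treating setup failures
as retryable on the next periodic pass instead of latching a disabled state:

    import logging

    LOG = logging.getLogger(__name__)

    class SetupNotReady(Exception):
        pass

    class ReportClientSketch(object):
        """Toy stand-in for the scheduler report client."""

        def __init__(self, placement_ready=False):
            self.placement_ready = placement_ready

        def _ensure_setup(self):
            # Placeholder for whatever can fail mid-upgrade, e.g. the
            # placement service not being deployed yet.
            if not self.placement_ready:
                raise SetupNotReady('placement not available yet')

        def report(self, resources):
            try:
                self._ensure_setup()
            except SetupNotReady as exc:
                # Do not permanently disable reporting; skip this pass and
                # let the next periodic task try again.
                LOG.warning('Reporting setup failed (%s); will retry', exc)
                return
            LOG.info('Reported resources: %s', resources)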

** Affects: nova
 Importance: Medium
 Assignee: Dan Smith (danms)
 Status: Confirmed


** Tags: newton-backport-potential

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1655494

Title:
  Newton scheduler clients should keep trying to report

Status in OpenStack Compute (nova):
  Confirmed

Bug description:
  Newton scheduler clients will stop reporting any time they encounter a
  setup-related error, which isn't very operator-friendly for the ocata
  upgrade process.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1655494/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1693911] [NEW] compute node statistics will lie if service records are deleted

2017-05-26 Thread Dan Smith
Public bug reported:

If a compute node references a deleted service, we will include it in
the compute node statistics output. This happens even if the compute
node record _is_ deleted, because of our join of the services table,
which causes us to get back rows anyway. This results in the stats
showing more resource than actually exists, and disagreeing with the sum
of all the individual hypervisor-show operations.

This is the query we're doing:

MariaDB [nova]> SELECT SUM(memory_mb) FROM compute_nodes JOIN services ON 
compute_nodes.host=services.host WHERE services.binary="nova-compute" AND 
compute_nodes.deleted=0;
+----------------+
| SUM(memory_mb) |
+----------------+
|        1047917 |
+----------------+
1 row in set (0.00 sec)
 
And this is what we *should* be doing

MariaDB [nova]> SELECT SUM(memory_mb) FROM compute_nodes JOIN services ON 
compute_nodes.host=services.host WHERE services.binary="nova-compute" AND 
compute_nodes.deleted=0 AND services.deleted=0;
+----------------+
| SUM(memory_mb) |
+----------------+
|         655097 |
+----------------+
1 row in set (0.00 sec)

The second value is correct for the database in question.
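
For illustration, the same corrected query expressed with SQLAlchemy Core,
roughly how the fix would look in the DB API layer. The table definitions are
trimmed to the columns used here and this is not the actual nova code:

    from sqlalchemy import (MetaData, Table, Column, Integer, String,
                            and_, func, select)

    meta = MetaData()
    compute_nodes = Table('compute_nodes', meta,
                          Column('memory_mb', Integer),
                          Column('host', String(255)),
                          Column('deleted', Integer))
    services = Table('services', meta,
                     Column('host', String(255)),
                     Column('binary', String(255)),
                     Column('deleted', Integer))

    query = select([func.sum(compute_nodes.c.memory_mb)]).select_from(
        compute_nodes.join(services, compute_nodes.c.host == services.c.host)
    ).where(and_(services.c.binary == 'nova-compute',
                 compute_nodes.c.deleted == 0,
                 # The missing condition: ignore deleted service records.
                 services.c.deleted == 0))

    print(query)  # renders SQL equivalent to the corrected statement above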

** Affects: nova
 Importance: Undecided
 Status: Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1693911

Title:
  compute node statistics will lie if service records are deleted

Status in OpenStack Compute (nova):
  Won't Fix

Bug description:
  If a compute node references a deleted service, we will include it in
  the compute node statistics output. This happens even if the compute
  node record _is_ deleted, because of our join of the services table,
  which causes us to get back rows anyway. This results in the stats
  showing more resource than actually exists, and disagreeing with the
  sum of all the individual hypervisor-show operations.

  This is the query we're doing:

  MariaDB [nova]> SELECT SUM(memory_mb) FROM compute_nodes JOIN services ON 
compute_nodes.host=services.host WHERE services.binary="nova-compute" AND 
compute_nodes.deleted=0;
  +----------------+
  | SUM(memory_mb) |
  +----------------+
  |        1047917 |
  +----------------+
  1 row in set (0.00 sec)
   
  And this is what we *should* be doing

  MariaDB [nova]> SELECT SUM(memory_mb) FROM compute_nodes JOIN services ON 
compute_nodes.host=services.host WHERE services.binary="nova-compute" AND 
compute_nodes.deleted=0 AND services.deleted=0;
  +----------------+
  | SUM(memory_mb) |
  +----------------+
  |         655097 |
  +----------------+
  1 row in set (0.00 sec)

  The second value is correct for the database in question.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1693911/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1693911] Re: compute node statistics will lie if service records are deleted

2017-05-26 Thread Dan Smith
Dupe of 1692397

** Changed in: nova
   Status: New => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1693911

Title:
  compute node statistics will lie if service records are deleted

Status in OpenStack Compute (nova):
  Won't Fix

Bug description:
  If a compute node references a deleted service, we will include it in
  the compute node statistics output. This happens even if the compute
  node record _is_ deleted, because of our join of the services table,
  which causes us to get back rows anyway. This results in the stats
  showing more resource than actually exists, and disagreeing with the
  sum of all the individual hypervisor-show operations.

  This is the query we're doing:

  MariaDB [nova]> SELECT SUM(memory_mb) FROM compute_nodes JOIN services ON 
compute_nodes.host=services.host WHERE services.binary="nova-compute" AND 
compute_nodes.deleted=0;
  +----------------+
  | SUM(memory_mb) |
  +----------------+
  |        1047917 |
  +----------------+
  1 row in set (0.00 sec)
   
  And this is what we *should* be doing

  MariaDB [nova]> SELECT SUM(memory_mb) FROM compute_nodes JOIN services ON 
compute_nodes.host=services.host WHERE services.binary="nova-compute" AND 
compute_nodes.deleted=0 AND services.deleted=0;
  +----------------+
  | SUM(memory_mb) |
  +----------------+
  |         655097 |
  +----------------+
  1 row in set (0.00 sec)

  The second value is correct for the database in question.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1693911/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1696125] Re: Detach interface failed - Unable to detach from guest transient domain (pike)

2017-06-09 Thread Dan Smith
** Changed in: nova
   Status: Fix Released => Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1696125

Title:
  Detach interface failed - Unable to detach from guest transient domain
  (pike)

Status in OpenStack Compute (nova):
  Confirmed

Bug description:
  Seeing this in Tempest runs on master (pike):

  http://logs.openstack.org/24/471024/2/check/gate-tempest-dsvm-neutron-
  linuxbridge-ubuntu-
  xenial/6b98d38/logs/screen-n-cpu.txt.gz?level=TRACE#_Jun_06_02_16_02_855503

  Jun 06 02:16:02.855503 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: 
WARNING nova.compute.manager [None req-b4a50024-a2fd-4279-b284-340d2074f1c1 
tempest-TestNetworkBasicOps-1479445685 tempest-TestNetworkBasicOps-1479445685] 
[instance: 2668bcb9-b13d-4b5b-8ee5-edbdee3b15a8] Detach interface failed, 
port_id=3843caa3-ab04-45f1-94d8-f330390e40fe, reason: Device detach failed for 
fa:16:3e:ab:e3:3f: Unable to detach from guest transient domain.: 
DeviceDetachFailed: Device detach failed for fa:16:3e:ab:e3:3f: Unable to 
detach from guest transient domain.
  Jun 06 02:16:02.884007 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: 
ERROR oslo_messaging.rpc.server [None req-b4a50024-a2fd-4279-b284-340d2074f1c1 
tempest-TestNetworkBasicOps-1479445685 tempest-TestNetworkBasicOps-1479445685] 
Exception during message handling: InterfaceDetachFailed: Failed to detach 
network adapter device from 2668bcb9-b13d-4b5b-8ee5-edbdee3b15a8
  Jun 06 02:16:02.884180 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: 
ERROR oslo_messaging.rpc.server Traceback (most recent call last):
  Jun 06 02:16:02.884286 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: 
ERROR oslo_messaging.rpc.server   File 
"/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 
157, in _process_incoming
  Jun 06 02:16:02.884395 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: 
ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
  Jun 06 02:16:02.884538 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: 
ERROR oslo_messaging.rpc.server   File 
"/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 
213, in dispatch
  Jun 06 02:16:02.884669 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: 
ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, 
ctxt, args)
  Jun 06 02:16:02.884777 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: 
ERROR oslo_messaging.rpc.server   File 
"/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 
183, in _do_dispatch
  Jun 06 02:16:02.884869 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: 
ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args)
  Jun 06 02:16:02.884968 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: 
ERROR oslo_messaging.rpc.server   File 
"/opt/stack/new/nova/nova/exception_wrapper.py", line 77, in wrapped
  Jun 06 02:16:02.885069 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: 
ERROR oslo_messaging.rpc.server function_name, call_dict, binary)
  Jun 06 02:16:02.885171 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: 
ERROR oslo_messaging.rpc.server   File 
"/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in 
__exit__
  Jun 06 02:16:02.885272 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: 
ERROR oslo_messaging.rpc.server self.force_reraise()
  Jun 06 02:16:02.885367 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: 
ERROR oslo_messaging.rpc.server   File 
"/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in 
force_reraise
  Jun 06 02:16:02.885461 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: 
ERROR oslo_messaging.rpc.server six.reraise(self.type_, self.value, self.tb)
  Jun 06 02:16:02.885554 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: 
ERROR oslo_messaging.rpc.server   File 
"/opt/stack/new/nova/nova/exception_wrapper.py", line 68, in wrapped
  Jun 06 02:16:02.885649 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: 
ERROR oslo_messaging.rpc.server return f(self, context, *args, **kw)
  Jun 06 02:16:02.885755 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: 
ERROR oslo_messaging.rpc.server   File 
"/opt/stack/new/nova/nova/compute/manager.py", line 214, in decorated_function
  Jun 06 02:16:02.885856 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: 
ERROR oslo_messaging.rpc.server kwargs['instance'], e, sys.exc_info())
  Jun 06 02:16:02.885950 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: 
ERROR oslo_messaging.rpc.server   File 
"/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in 
__exit__
  Jun 06 02:16:02.886053 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: 
ERROR oslo_messaging.rpc.server self.force_reraise()
  Jun 06 02:16:02.886143 ubuntu-xenial-ovh-bhs1-9149075 nova-compute[24118]: 
ERRO

[Yahoo-eng-team] [Bug 1698383] [NEW] Resource tracker regressed reporting negative memory

2017-06-16 Thread Dan Smith
Public bug reported:

Nova's resource tracker is expected to publish negative values to the
scheduler when resources are overcommitted. Nova's scheduler expects
this:

https://github.com/openstack/nova/blob/a43dbba2b8feea063ed2d0c79780b4c3507cf89b/nova/scheduler/host_manager.py#L215

In change https://review.openstack.org/#/c/306670, these values were
filtered to never drop below zero, which is incorrect. That change was
making a complex alteration for ironic and cells, specifically to avoid
resources from ironic nodes showing up as negative when they were
unavailable. That was a cosmetic fix, which I believe has been corrected
for ironic only in this patch:

https://review.openstack.org/#/c/230487/

Regardless, since the scheduler does the same calculation to determine
available resources on the node, if the node reports 0 when the
scheduler calculates -100 for a given resource, the scheduler will
assume the node still has room (due to oversubscription) and will send
builds there destined to fail.
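
A toy calculation (not the real filter code) showing how the clamping skews
the math a RAM filter does once an allocation ratio is applied:

    total_mb = 4096          # physical RAM on the node
    used_mb = 4196           # actually consumed; node is oversubscribed
    ratio = 1.5              # ram_allocation_ratio

    real_free = total_mb - used_mb            # -100, what should be reported
    clamped_free = max(0, real_free)          # 0, what the regression reports

    # Scheduler-side math, roughly: usable = total * ratio - (total - free)
    usable_real = total_mb * ratio - (total_mb - real_free)        # 1948.0
    usable_clamped = total_mb * ratio - (total_mb - clamped_free)  # 2048.0
    # The clamped report makes the host look 100 MB roomier than it is, and
    # the error grows the further the node is oversubscribed.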

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1698383

Title:
  Resource tracker regressed reporting negative memory

Status in OpenStack Compute (nova):
  New

Bug description:
  Nova's resource tracker is expected to publish negative values to the
  scheduler when resources are overcommitted. Nova's scheduler expects
  this:

  
https://github.com/openstack/nova/blob/a43dbba2b8feea063ed2d0c79780b4c3507cf89b/nova/scheduler/host_manager.py#L215

  In change https://review.openstack.org/#/c/306670, these values were
  filtered to never drop below zero, which is incorrect. That change was
  making a complex alteration for ironic and cells, specifically to
  avoid resources from ironic nodes showing up as negative when they
  were unavailable. That was a cosmetic fix, which I believe has been
  corrected for ironic only in this patch:

  https://review.openstack.org/#/c/230487/

  Regardless, since the scheduler does the same calculation to determine
  available resources on the node, if the node reports 0 when the
  scheduler calculates -100 for a given resource, the scheduler will
  assume the node still has room (due to oversubscription) and will send
  builds there destined to fail.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1698383/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1707071] [NEW] Compute nodes will fight over allocations during migration

2017-07-27 Thread Dan Smith
Public bug reported:

As far back as Ocata, compute nodes that manage allocations will end up
overwriting allocations from other compute nodes when doing a migration.
This stems from the fact that the Resource Tracker was designed to
manage a per-compute-node set of accounting, but placement is per-
instance accounting. When we try to create/update/delete allocations for
instances on compute nodes from the existing resource tracker code
paths, we end up deleting allocations that apply to other compute nodes
in the process.

For example, when an instance A is running against compute1, there is an
allocation for its resources against that node. When migrating that
instance to compute2, the target compute (or scheduler) may create
allocations for instance A against compute2, which overwrite those for
compute1. Then, compute1's periodic healing task runs, and deletes the
allocation for instance A against compute2, replacing it with one for
compute1. When migration completes, compute2 heals again and overwrites
the allocation with one for the new home of the instance. Then, compute1
may the allocation it thinks it owns, followed finally by another heal
on compute2. While this is going on, the scheduler (via placement) does
not have a consistent view of resources to make proper decisions.

In order to fix this, we need a combination of changes:

1. There should be allocations against both compute nodes for an instance 
during a migration
2. Compute nodes should respect the double claim, and not delete allocations 
for instances they used to own, if the allocation has no resources for their 
resource provider
3. Compute nodes should not delete allocations for instances unless they own 
the instance _and_ the instance is in DELETED/SHELVED_OFFLOADED state
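
Schematically, the first change means the consumer's allocation holds
resources against both providers for the duration of the migration instead of
flip-flopping between them. Exact placement API payloads vary by microversion;
the provider UUIDs and amounts below are placeholders:

    doubled_allocation = {
        'allocations': [
            {'resource_provider': {'uuid': 'RP-UUID-OF-COMPUTE1'},
             'resources': {'VCPU': 2, 'MEMORY_MB': 4096, 'DISK_GB': 20}},
            {'resource_provider': {'uuid': 'RP-UUID-OF-COMPUTE2'},
             'resources': {'VCPU': 2, 'MEMORY_MB': 4096, 'DISK_GB': 20}},
        ],
    }
    # With this in place, neither compute's periodic healing should remove
    # the other provider's entry; each only manages its own share.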

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1707071

Title:
  Compute nodes will fight over allocations during migration

Status in OpenStack Compute (nova):
  New

Bug description:
  As far back as Ocata, compute nodes that manage allocations will end
  up overwriting allocations from other compute nodes when doing a
  migration. This stems from the fact that the Resource Tracker was
  designed to manage a per-compute-node set of accounting, but placement
  is per-instance accounting. When we try to create/update/delete
  allocations for instances on compute nodes from the existing resource
  tracker code paths, we end up deleting allocations that apply to other
  compute nodes in the process.

  For example, when an instance A is running against compute1, there is
  an allocation for its resources against that node. When migrating that
  instance to compute2, the target compute (or scheduler) may create
  allocations for instance A against compute2, which overwrite those for
  compute1. Then, compute1's periodic healing task runs, and deletes the
  allocation for instance A against compute2, replacing it with one for
  compute1. When migration completes, compute2 heals again and
  overwrites the allocation with one for the new home of the instance.
  Then, compute1 may delete the allocation it thinks it owns, followed finally
  by another heal on compute2. While this is going on, the scheduler
  (via placement) does not have a consistent view of resources to make
  proper decisions.

  In order to fix this, we need a combination of changes:

  1. There should be allocations against both compute nodes for an instance 
during a migration
  2. Compute nodes should respect the double claim, and not delete allocations 
for instances they used to own, if the allocation has no resources for their 
resource provider
  3. Compute nodes should not delete allocations for instances unless they own 
the instance _and_ the instance is in DELETED/SHELVED_OFFLOADED state

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1707071/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1713095] [NEW] Nova compute driver init happens before conductor is ready

2017-08-25 Thread Dan Smith
Public bug reported:

In nova/service.py we poll for conductor readiness before we allow
normal service startup behavior. The ironic driver does RPC to conductor
in its _refresh_hash_ring() code, which may expect conductor be up
before it's not. If so, we'll fail to start up because we called to
conductor, waited a long time, and then timed out.

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1713095

Title:
  Nova compute driver init happens before conductor is ready

Status in OpenStack Compute (nova):
  New

Bug description:
  In nova/service.py we poll for conductor readiness before we allow
  normal service startup behavior. The ironic driver does RPC to
  conductor in its _refresh_hash_ring() code, which may expect conductor
  be up before it's not. If so, we'll fail to start up because we called
  to conductor, waited a long time, and then timed out.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1713095/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1660160] Re: No host-to-cell mapping found for selected host

2017-01-29 Thread Dan Smith
Something in your config has been preventing compute nodes from creating
their compute node records for much longer than the referenced patch has
been in place. I picked a random older run and found the same compute
node record create failure:

 http://logs.openstack.org/95/422795/4/check/gate-tripleo-ci-
centos-7-undercloud/9d4dda4/logs/var/log/nova/nova-
compute.txt.gz#_2017-01-20_15_58_59_030

The referenced patch does require those compute node records, just like
many other pieces of nova (your resource tracking will be wrong without
it) but it is only related in as much as it requires them to be there in
order to boot an instance. The ComputeNode records are very fundamental
to Nova and have been for years, before cellsv2 was even a thing.

Without the compute node records, the discover_hosts step will not be
able to create HostMapping records for the compute nodes, which is what
the "No host-to-cell mapping" message is about.

So, this is, IMHO, not a Nova bug but just something config-related on
the tripleo side. I'm not sure what exactly would cause that compute
node record create failure, but I expect it's something minor.

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1660160

Title:
  No host-to-cell mapping found for selected host

Status in OpenStack Compute (nova):
  Invalid
Status in tripleo:
  Triaged

Bug description:
  This report is maybe not a bug but I found useful to share what happens in 
TripleO since this commit:
  https://review.openstack.org/#/c/319379/

  We are unable to deploy the overcloud nodes anymore (in other words,
  create servers with Nova / Ironic).

  Nova Conductor sends this message:
  "No host-to-cell mapping found for selected host"
  
http://logs.openstack.org/31/426231/1/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/915aeba/logs/undercloud/var/log/nova/nova-conductor.txt.gz#_2017-01-27_19_21_56_348

  And it sounds like the compute host is not registered:
  
http://logs.openstack.org/31/426231/1/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/915aeba/logs/undercloud/var/log/nova/nova-compute.txt.gz#_2017-01-27_18_56_56_543

  Nova Config is available here:
  
http://logs.openstack.org/31/426231/1/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/915aeba/logs/etc/nova/nova.conf.txt.gz

  That's all the details I have now, feel free for more details if
  needed.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1660160/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1661312] [NEW] Evacuation will corrupt instance allocations

2017-02-02 Thread Dan Smith
Public bug reported:

The following sequence of events will result in a corrupted instance
allocation in placement:

1. Instance running on host A, placement has allocations for instance on host A
2. Host A goes down
3. Instance is evacuated to host B, host B creates duplicated allocations in 
placement for instance
4. Host A comes up, notices that instance is gone, deletes all allocations for 
instance on both hosts A and B
5. Instance now has no allocations for a period
6. Eventually, host B will re-create the allocations for the instance

The period between #4 and #6 will have the scheduler making bad
decisions because it thinks host B is less loaded than it is.
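
Illustrative only (plain dicts, not the real report client API): the cleanup
in step 4 needs to be scoped to host A's own resource provider rather than
wiping the consumer's entire allocation:

    def cleanup_after_evacuation(allocations, my_rp_uuid):
        """allocations: dict mapping resource provider uuid -> resources."""
        if my_rp_uuid not in allocations:
            # Nothing of ours left; do not touch the target host's entry.
            return allocations
        remaining = dict(allocations)
        del remaining[my_rp_uuid]      # drop only host A's share
        return remaining               # host B's allocation survives intact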

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1661312

Title:
  Evacuation will corrupt instance allocations

Status in OpenStack Compute (nova):
  New

Bug description:
  The following sequence of events will result in a corrupted instance
  allocation in placement:

  1. Instance running on host A, placement has allocations for instance on host 
A
  2. Host A goes down
  3. Instance is evacuated to host B, host B creates duplicated allocations in 
placement for instance
  4. Host A comes up, notices that instance is gone, deletes all allocations 
for instance on both hosts A and B
  5. Instance now has no allocations for a period
  6. Eventually, host B will re-create the allocations for the instance

  The period between #4 and #6 will have the scheduler making bad
  decisions because it thinks host B is less loaded than it is.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1661312/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1341420] Re: gap between scheduler selection and claim causes spurious failures when the instance is the last one to fit

2017-02-03 Thread Dan Smith
What you describe is fundamental to how nova works right now. We
speculate in the scheduler, and if we race between two, we handle it
with a reschedule. Nova specifically states that scheduling every last
resource is out of scope. When trying to do that (which is often the use
case for ironic) you're likely to hit this race as you run out of
capacity:

https://github.com/openstack/nova/blob/master/doc/source/project_scope.rst#iaas-not-batch-processing

In the next few cycles we plan to move the claim process to the
placement engine, which will eliminate most of these race-to-claim type
issues, and in that situation things will be better for this kind of
arrangement.

Until that point, this is not a bug though, because it's specifically
how nova is designed to work.

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1341420

Title:
  gap between scheduler selection and claim causes spurious failures
  when the instance is the last one to fit

Status in OpenStack Compute (nova):
  Invalid
Status in tripleo:
  New

Bug description:
  There is a race between the scheduler in select_destinations, which
  selects a set of hosts, and the nova compute manager, which claims
  resources on those hosts when building the instance. The race is
  particularly noticable with Ironic, where every request will consume a
  full host, but can turn up on libvirt etc too. Multiple schedulers
  will likely exacerbate this too unless they are in a version of python
  with randomised dictionary ordering, in which case they will make it
  better :).

  I've put https://review.openstack.org/106677 up to remove a comment
  which comes from before we introduced this race.

  One mitigating aspect to the race in the filter scheduler _schedule
  method attempts to randomly select hosts to avoid returning the same
  host in repeated requests, but the default minimum set it selects from
  is size 1 - so when heat requests a single instance, the same
  candidate is chosen every time. Setting that number higher can avoid
  all concurrent requests hitting the same host, but it will still be a
  race, and still likely to fail fairly hard at near-capacity situations
  (e.g. deploying all machines in a cluster with Ironic and Heat).

  Folk wanting to reproduce this: take a decent size cloud - e.g. 5 or
  10 hypervisor hosts (KVM is fine). Deploy up to 1 VM left of capacity
  on each hypervisor. Then deploy a bunch of VMs one at a time but very
  close together - e.g. use the python API to get cached keystone
  credentials, and boot 5 in a loop.

  If using Ironic you will want https://review.openstack.org/106676 to
  let you see which host is being returned from the selection.

  Possible fixes:
   - have the scheduler be a bit smarter about returning hosts - e.g. track 
destination selection counts since the last refresh and weight hosts by that 
count as well
   - reinstate actioning claims into the scheduler, allowing the audit to 
correct any claimed-but-not-started resource counts asynchronously
   - special case the retry behaviour if there are lots of resources available 
elsewhere in the cluster.

  Stats wise, I was just testing a 29 instance deployment with ironic and a
  heat stack, with 45 machines to deploy onto (so 45 hosts in the
  scheduler set) and 4 failed with this race - which means they
  rescheduled and failed 3 times each - or 12 cases of scheduler racing
  *at minimum*.

  background chat

  15:43 < lifeless> mikal: around? I need to sanity check something
  15:44 < lifeless> ulp, nope, am sure of it. filing a bug.
  15:45 < mikal> lifeless: ok
  15:46 < lifeless> mikal: oh, you're here, I will run it past you :)
  15:46 < lifeless> mikal: if you have ~5m
  15:46 < mikal> Sure
  15:46 < lifeless> so, symptoms
  15:46 < lifeless> nova boot <...> --num-instances 45 -> works fairly 
reliably. Some minor timeout related things to fix but nothing dramatic.
  15:47 < lifeless> heat create-stack <...> with a stack with 45 instances in 
it -> about 50% of instances fail to come up
  15:47 < lifeless> this is with Ironic
  15:47 < mikal> Sure
  15:47 < lifeless> the failure on all the instances is the retry-three-times 
failure-of-death
  15:47 < lifeless> what I believe is happening is this
  15:48 < lifeless> the scheduler is allocating the same weighed list of hosts 
for requests that happen close enough together
  15:49 < lifeless> and I believe its able to do that because the target hosts 
(from select_destinations) need to actually hit the compute node manager and 
have
  15:49 < lifeless> with rt.instance_claim(context, instance, 
limits):
  15:49 < lifeless> happen in _build_and_run_instance
  15:49 < lifeless> before the resource usage is assigned
  15:49 < mikal> Is heat making 45 separate requests to the nova API?
  15:49 < lifeless> eys
 

[Yahoo-eng-team] [Bug 1659391] Re: Server list API does not show scheduled servers that are not assigned to any cell

2017-02-06 Thread Dan Smith
Cells are not optional in Nova as of Ocata. Since cells are required,
you should not see instances that are not assigned to a cell, because
such a thing is not possible (post-scheduling).

Creating an instance before nova is fully setup is not valid either.

These two things combined are doubly invalid.

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1659391

Title:
  Server list API does not show scheduled servers that are not assigned
  to any cell

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  After merge of commit [1], the command "nova list --all-" started returning
  only servers that are assigned to some cell. Reverting this change makes
  the API return ALL servers, including scheduled ones without an assigned
  cell. If a server failed at the scheduling step and hasn't been assigned
  to any cell, we will never see it using the "list servers" API.

  But the "list" operation should always show ALL servers.

  Steps to reproduce:
  1) install latest nova that contains commit [1], not configuring cell service 
and not creating default cell.
  2) create VM
  3) run any of following commands:
  $ nova list --all-
  $ openstack server list --all
  $ openstack server show %name-of-server%
  $ nova show %name-of-server%

  Expected: we see data of server we created on second step.
  Actual: our server is absent in "list" command results or "NotFound" error on 
"show" command using "name" of server.

  There can be other approach for reproducing it, but we need to use
  "pdb" before step where we assign cell to server.

  [1] https://review.openstack.org/#/c/396775/

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1659391/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1663729] [NEW] CellsV1 regression introduced with flavor migration to api database

2017-02-10 Thread Dan Smith
Public bug reported:

In Newton we migrated flavors to the api database, which requires using
the Flavor object for proper compatibility. A piece of cellsv1 was
missed which would cause it to start reporting resources incorrectly
after the migration happened and the flavors are removed from the main
database.

** Affects: nova
 Importance: Undecided
 Assignee: Dan Smith (danms)
 Status: In Progress


** Tags: newton-backport-potential ocata-backport-potential

** Tags added: newton-backport-potential

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1663729

Title:
  CellsV1 regression introduced with flavor migration to api database

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  In Newton we migrated flavors to the api database, which requires
  using the Flavor object for proper compatibility. A piece of cellsv1
  was missed which would cause it to start reporting resources
  incorrectly after the migration happened and the flavors are removed
  from the main database.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1663729/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1668310] [NEW] PCI device migration cannot continue with old deleted service records

2017-02-27 Thread Dan Smith
Public bug reported:

If deleted service records are present in the database, the Service
minimum version calculation should ignore them, but it does not. One
manifestation of this is the PCI device migration from mitaka/newton
will never complete, emitting an error message like this:

2017-02-27 07:40:19.665 ERROR nova.db.sqlalchemy.api [req-ad21480f-613a-
445b-a913-c54532b64ffa None None] Data migrations for PciDevice are not
safe, likely because not all services that access the DB directly are
updated to the latest version
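
A toy illustration of the intended calculation (not the actual query): the
minimum must be taken over non-deleted service records only, otherwise one
stale row pins the whole deployment to an old version:

    services = [
        {'binary': 'nova-compute', 'version': 16, 'deleted': 0},
        {'binary': 'nova-compute', 'version': 16, 'deleted': 0},
        {'binary': 'nova-compute', 'version': 7,  'deleted': 1},  # stale row
    ]

    buggy_minimum = min(s['version'] for s in services)                        # 7
    correct_minimum = min(s['version'] for s in services if not s['deleted'])  # 16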

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1668310

Title:
  PCI device migration cannot continue with old deleted service records

Status in OpenStack Compute (nova):
  New

Bug description:
  If deleted service records are present in the database, the Service
  minimum version calculation should ignore them, but it does not. One
  manifestation of this is the PCI device migration from mitaka/newton
  will never complete, emitting an error message like this:

  2017-02-27 07:40:19.665 ERROR nova.db.sqlalchemy.api [req-ad21480f-
  613a-445b-a913-c54532b64ffa None None] Data migrations for PciDevice
  are not safe, likely because not all services that access the DB
  directly are updated to the latest version

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1668310/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1670525] [NEW] Nova logs CellMapping objects at DEBUG

2017-03-06 Thread Dan Smith
Public bug reported:

This could contain credentials for the DB and MQ

** Affects: nova
 Importance: Undecided
 Assignee: Dan Smith (danms)
 Status: In Progress


** Tags: newton-backport-potential

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1670525

Title:
  Nova logs CellMapping objects at DEBUG

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  This could contain credentials for the DB and MQ

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1670525/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1672625] Re: Instance stuck in schedule state in Ocata release

2017-04-17 Thread Dan Smith
The missed steps are documented here:

https://docs.openstack.org/developer/nova/cells.html#first-time-setup

That should get you a cell record created, hosts discovered, and back on
track.
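
Roughly, that first-time setup boils down to commands like the following, run
after the API database is in place (check the linked doc for the exact flags
and ordering on your release):

  nova-manage cell_v2 map_cell0
  nova-manage cell_v2 create_cell --name cell1
  nova-manage cell_v2 discover_hosts

Some releases also provide "nova-manage cell_v2 simple_cell_setup", which
combines these steps.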

** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1672625

Title:
  Instance stuck in schedule state in Ocata release

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  I have built devstack multinode setup on ocata release. Unable to
  launch any instances. All the instances are stuck into "scheduling"
  state. The only error in n-api log is:

  n-api.log:1720:2017-03-14 11:48:59.805 25488 ERROR nova.compute.api 
[req-9a0f533b-b5af-4061-9b88-d00012762131 vks vks] No cells are configured, 
unable to list instances
  n-api.log:1730:2017-03-14 11:49:21.641 25488 ERROR nova.compute.api 
[req-5bdcd955-ddec-4283-9b07-ccd1221912b5 vks vks] No cells are configured, 
unable to list instances
  n-api.log:1740:2017-03-14 11:49:26.659 25488 ERROR nova.compute.api 
[req-8f9e858b-d80a-4770-a727-f94dcebcc986 vks vks] No cells are configured, 
unable to list instances
  n-api.log:1816:2017-03-14 11:49:48.611 25487 ERROR nova.compute.api 
[req-da86d769-49d7-4eec-b389-0dca123b7e16 vks vks] No cells are configured, 
unable to list instances
  n-api.log:1899:2017-03-14 11:51:04.481 25487 INFO nova.api.openstack.ws

  n-sch has no error log.

  o/p: of service list

  stack@stack:~$ nova service-list
  /usr/local/lib/python2.7/dist-packages/novaclient/client.py:278: UserWarning: 
The 'tenant_id' argument is deprecated in Ocata and its use may result in 
errors in future releases. As 'project_id' is provided, the 'tenant_id' 
argument will be ignored.
warnings.warn(msg)
  
  +----+------------------+-------+----------+---------+-------+------------------------+-----------------+
  | Id | Binary           | Host  | Zone     | Status  | State | Updated_at             | Disabled Reason |
  +----+------------------+-------+----------+---------+-------+------------------------+-----------------+
  | 3  | nova-conductor   | stack | internal | enabled | up    | 2017-03-14T07:21:38.00 | -               |
  | 5  | nova-scheduler   | stack | internal | enabled | up    | 2017-03-14T07:21:37.00 | -               |
  | 6  | nova-consoleauth | stack | internal | enabled | up    | 2017-03-14T07:21:44.00 | -               |
  | 7  | nova-compute     | nfp   | nova     | enabled | up    | 2017-03-14T07:21:38.00 | -               |
  +----+------------------+-------+----------+---------+-------+------------------------+-----------------+

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1672625/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1684861] Re: Database online_data_migrations in newton fail due to missing keypairs

2017-04-20 Thread Dan Smith
** Changed in: nova
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1684861

Title:
  Database online_data_migrations in newton fail due to missing keypairs

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  Upgrading the deployment from Mitaka to Newton.
  This bug blocks people from upgrading to Ocata because the database migration 
for nova fails.

  Running nova newton 14.0.5, the database is 334

  root@moby:/backups# nova-manage db online_data_migrations
  Option "verbose" from group "DEFAULT" is deprecated for removal.  Its value 
may be silently ignored in the future.
  Running batches of 50 until complete
  50 rows matched query migrate_flavors, 50 migrated
  20 rows matched query migrate_flavors, 20 migrated
  Error attempting to run 
  30 rows matched query migrate_instances_add_request_spec, 30 migrated
  Error attempting to run 
  50 rows matched query migrate_instances_add_request_spec, 50 migrated
  Error attempting to run 
  50 rows matched query migrate_instances_add_request_spec, 50 migrated
  Error attempting to run 
  50 rows matched query migrate_instances_add_request_spec, 50 migrated
  Error attempting to run 
  50 rows matched query migrate_instances_add_request_spec, 50 migrated
  Error attempting to run 
  50 rows matched query migrate_instances_add_request_spec, 50 migrated
  Error attempting to run 
  50 rows matched query migrate_instances_add_request_spec, 50 migrated
  Error attempting to run 
  50 rows matched query migrate_instances_add_request_spec, 50 migrated
  Error attempting to run 
  50 rows matched query migrate_instances_add_request_spec, 50 migrated
  Error attempting to run 
  50 rows matched query migrate_instances_add_request_spec, 50 migrated
  Error attempting to run 
  50 rows matched query migrate_instances_add_request_spec, 50 migrated
  Error attempting to run 
  /usr/lib/python2.7/dist-packages/pkg_resources/__init__.py:188: 
RuntimeWarning: You have iterated over the result of 
pkg_resources.parse_version. This is a legacy behavior which is inconsistent 
with the new version class introduced in setuptools 8.0. In most cases, 
conversion to a tuple is unnecessary. For comparison of versions, sort the 
Version instances directly. If you have another use case requiring the tuple, 
please file a bug with the setuptools project describing that need.
stacklevel=1,
  50 rows matched query migrate_instances_add_request_spec, 5 migrated
  2017-04-20 14:48:36.586 396 ERROR nova.objects.keypair 
[req-565cbe62-030b-4b00-b9db-5ee82117889b - - - - -] Some instances are still 
missing keypair information. Unable to run keypair migration at this time.
  5 rows matched query migrate_aggregates, 5 migrated
  5 rows matched query migrate_instance_groups_to_api_db, 5 migrated
  2 rows matched query delete_build_requests_with_no_instance_uuid, 2 migrated
  Error attempting to run 
  50 rows matched query migrate_instances_add_request_spec, 0 migrated
  2017-04-20 14:48:40.620 396 ERROR nova.objects.keypair 
[req-565cbe62-030b-4b00-b9db-5ee82117889b - - - - -] Some instances are still 
missing keypair information. Unable to run keypair migration at this time.
  root@moby:/backups#

  Adding a 'raise' after 
https://github.com/openstack/nova/blob/stable/newton/nova/cmd/manage.py#L896
  you can see:

  root@moby:/backups# nova-manage db online_data_migrations
  Option "verbose" from group "DEFAULT" is deprecated for removal.  Its value 
may be silently ignored in the future.
  Running batches of 50 until complete
  Error attempting to run 
  error: 'NoneType' object has no attribute 'key_name'

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1684861/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1686744] [NEW] Unable to add compute host to aggregate if no ironic nodes present

2017-04-27 Thread Dan Smith
Public bug reported:

After the cell-ification of the aggregates API, it is not possible to
add a compute to an aggregate if that compute does not expose any
ComputeNode objects. This can happen  if the hash ring does not allocate
any ironic nodes to one of the computes (i.e. more services than ironic
nodes) or if there are not yet any nodes present in ironic. You get the
following message:

 openstack aggregate add host baremetal-hosts overcloud-
controller-0.localdomain

 Result:
 Host 'overcloud-controller-0.localdomain' is not mapped to any cell (HTTP 404) 
(Request-ID: req-
 42525c1d-c419-4ea4-bb7c-7caa1d57a613)

This is confusing because the service is exposed in service-list and
should be a candidate for adding to an aggregate.

** Affects: nova
 Importance: Medium
 Assignee: Dan Smith (danms)
 Status: Confirmed

** Changed in: nova
 Assignee: (unassigned) => Dan Smith (danms)

** Changed in: nova
   Importance: Undecided => Medium

** Changed in: nova
   Status: New => Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1686744

Title:
  Unable to add compute host to aggregate if no ironic nodes present

Status in OpenStack Compute (nova):
  Confirmed

Bug description:
  After the cell-ification of the aggregates API, it is not possible to
  add a compute to an aggregate if that compute does not expose any
  ComputeNode objects. This can happen  if the hash ring does not
  allocate any ironic nodes to one of the computes (i.e. more services
  than ironic nodes) or if there are not yet any nodes present in
  ironic. You get the following message:

   openstack aggregate add host baremetal-hosts overcloud-
  controller-0.localdomain

   Result:
   Host 'overcloud-controller-0.localdomain' is not mapped to any cell (HTTP 
404) (Request-ID: req-
   42525c1d-c419-4ea4-bb7c-7caa1d57a613)

  This is confusing because the service is exposed in service-list and
  should be a candidate for adding to an aggregate.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1686744/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1539271] [NEW] Libvirt live migration stalls

2016-01-28 Thread Dan Smith
Public bug reported:

The following message in nova gate test logs shows that libvirt live
migration can stall on some sort of deadlock:

2016-01-28 16:53:20.878 INFO nova.virt.libvirt.driver [req-692a1f4f-
16aa-4d93-a694-1f7eef4df9f6 tempest-
LiveBlockMigrationTestJSON-1471114638 tempest-
LiveBlockMigrationTestJSON-1937054400] [instance:
7b1bc0e2-a6a9-4d85-a3f9-4568d52d1f1b] Migration running for 30 secs,
memory 100% remaining; (bytes processed=0, remaining=0, total=0)

Additionally, the libvirt logger thread seems to be deadlocked before
this happens.

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1539271

Title:
  Libvirt live migration stalls

Status in OpenStack Compute (nova):
  New

Bug description:
  The following message in nova gate test logs shows that libvirt live
  migration can stall on some sort of deadlock:

  2016-01-28 16:53:20.878 INFO nova.virt.libvirt.driver [req-692a1f4f-
  16aa-4d93-a694-1f7eef4df9f6 tempest-
  LiveBlockMigrationTestJSON-1471114638 tempest-
  LiveBlockMigrationTestJSON-1937054400] [instance:
  7b1bc0e2-a6a9-4d85-a3f9-4568d52d1f1b] Migration running for 30 secs,
  memory 100% remaining; (bytes processed=0, remaining=0, total=0)

  Additionally, the libvirt logger thread seems to be deadlocked before
  this happens.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1539271/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1540526] [NEW] Too many lazy-loads in predictable situations

2016-02-01 Thread Dan Smith
Public bug reported:

During a normal tempest run, way (way) too many object lazy-loads are
being triggered, which causes extra RPC and database traffic. In a given
tempest run, we should be able to pretty much prevent any lazy-loads in
that predictable situation. The only case where we might want to have
some is where we are iterating objects and conditionally taking action
that needs to load extra information.

On a random devstack-tempest job run sampled on 1-Feb-2016, a lot of
lazy loads were seen:

  grep 'Lazy-loading' screen-n-cpu.txt.gz  -c
  624

We should be able to vastly reduce this number without much work.
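
The usual pattern to avoid this looks roughly like the following. This is a
fragment, with ctxt/host/do_something standing in for real callers and with
exact signatures varying by release: request the fields you know you will
touch up front instead of letting each attribute access trigger its own
round trip.

    instances = objects.InstanceList.get_by_host(
        ctxt, host,
        expected_attrs=['flavor', 'info_cache', 'system_metadata'])

    for inst in instances:
        # These accesses no longer log "Lazy-loading ..." because the fields
        # were loaded in the single query above.
        do_something(inst.flavor, inst.system_metadata)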

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1540526

Title:
  Too many lazy-loads in predictable situations

Status in OpenStack Compute (nova):
  New

Bug description:
  During a normal tempest run, way (way) too many object lazy-loads are
  being triggered, which causes extra RPC and database traffic. In a
  given tempest run, we should be able to pretty much prevent any lazy-
  loads in that predictable situation. The only case where we might want
  to have some is where we are iterating objects and conditionally
  taking action that needs to load extra information.

  On a random devstack-tempest job run sampled on 1-Feb-2016, a lot of
  lazy loads were seen:

grep 'Lazy-loading' screen-n-cpu.txt.gz  -c
624

  We should be able to vastly reduce this number without much work.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1540526/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1470153] [NEW] Nova object relationships ignore List objects

2015-06-30 Thread Dan Smith
Public bug reported:

In nova/tests/objects/test_objects.py, we have an important test called
test_relationships(). This ensures that we have version mappings between
objects that depend on each other, and that those versions and
relationships are bumped when one object changes versions.

That test currently excludes any objects that are based on the List
mixin, which obscures dependencies that do things like
Foo->BarList->Bar.

The test needs to be modified to not exclude List-based objects, and the
relationship map needs to be updated for the List objects that are
currently excluded.

** Affects: nova
 Importance: Low
 Assignee: Ryan Rossiter (rlrossit)
 Status: Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1470153

Title:
  Nova object relationships ignore List objects

Status in OpenStack Compute (Nova):
  Confirmed

Bug description:
  In nova/tests/objects/test_objects.py, we have an important test
  called test_relationships(). This ensures that we have version
  mappings between objects that depend on each other, and that those
  versions and relationships are bumped when one object changes
  versions.

  That test currently excludes any objects that are based on the List
  mixin, which obscures dependencies that do things like
  Foo->BarList->Bar.

  The test needs to be modified to not exclude List-based objects, and
  the relationship map needs to be updated for the List objects that are
  currently excluded.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1470153/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1470154] [NEW] List objects should use obj_relationships

2015-06-30 Thread Dan Smith
Public bug reported:

Nova's List-based objects have something called child_versions, which is
a naive mapping of the objects field and the version relationships
between the list object and the content object. This was created before
we generalized the work in obj_relationships, which normal objects now
use. The list-based objects still use child_versions, which means we
need a separate test and separate developer behaviors when updating
these.

For consistency, we should replace child_versions on all the list
objects with obj_relationships, remove the list-specific test in
test_objects.py, and make sure that the generalized tests properly cover
list objects and relationships between list and non-list objects.
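
Schematically, the difference looks like this (version numbers are
placeholders and the historical syntax may differ slightly):

    # Old, list-only mapping: list object version -> contained object version
    child_versions = {
        '1.0': '1.0',
        '1.1': '1.1',
    }

    # Generalized mapping used by regular objects, keyed per field
    obj_relationships = {
        'objects': [('1.0', '1.0'), ('1.1', '1.1')],
    }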

** Affects: nova
 Importance: Low
 Assignee: Ryan Rossiter (rlrossit)
 Status: Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1470154

Title:
  List objects should use obj_relationships

Status in OpenStack Compute (Nova):
  Confirmed

Bug description:
  Nova's List-based objects have something called child_versions, which
  is a naive mapping of the objects field and the version relationships
  between the list object and the content object. This was created
  before we generalized the work in obj_relationships, which normal
  objects now use. The list-based objects still use child_versions,
  which means we need a separate test and separate developer behaviors
  when updating these.

  For consistency, we should replace child_versions on all the list
  objects with obj_relationships, remove the list-specific test in
  test_objects.py, and make sure that the generalized tests properly
  cover list objects and relationships between list and non-list
  objects.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1470154/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1471887] [NEW] nova-compute will delete all instances if hostname changes

2015-07-06 Thread Dan Smith
Public bug reported:

The evacuate code as it is currently in nova will delete instances when
instance.host != $(hostname) of the host. This assumes that the instance
has been evacuated (because its hostname changed). In that case,
deleting the local residue is correct, but if the host's hostname
changes, then we will just delete data based on a hunch.

Nova-compute needs a better mechanism to detect if an evacuation  has
actually been requested before deleting the data.

See Blueprint robustify-evacuate
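
A sketch of the safer check (plain dicts with illustrative field values; see
the blueprint for the real design): treat an instance as evacuated only when a
matching evacuation migration record exists, not merely because instance.host
differs from our hostname.

    def was_evacuated(instance, migrations):
        return any(m['instance_uuid'] == instance['uuid'] and
                   m['migration_type'] == 'evacuation' and
                   m['status'] in ('accepted', 'done')
                   for m in migrations)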

** Affects: nova
 Importance: Undecided
 Assignee: Dan Smith (danms)
 Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1471887

Title:
  nova-compute will delete all instances if hostname changes

Status in OpenStack Compute (Nova):
  In Progress

Bug description:
  The evacuate code as it is currently in nova will delete instances
  when instance.host != $(hostname) of the host. This assumes that the
  instance has been evacuated (because its hostname changed). In that
  case, deleting the local residue is correct, but if the host's
  hostname changes, then we will just delete data based on a hunch.

  Nova-compute needs a better mechanism to detect if an evacuation  has
  actually been requested before deleting the data.

  See Blueprint robustify-evacuate

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1471887/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1478108] [NEW] Live migration should throttle itself

2015-07-24 Thread Dan Smith
Public bug reported:

Nova will accept an unbounded number of live migrations for a single
host, which will result in timeouts and failures (at least for libvirt).
Since live migrations are seriously IO intensive, allowing this to be
unlimited is just never going to be the right thing to do, especially
when we have functions in our own client to live migrate all instances
to other hosts (nova host-evacuate-live).

We recently added a build semaphore to allow capping the number of
parallel builds being attempted on a compute host for a similar reason.
This should be the same sort of thing for live migration.
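
A small standalone sketch of the throttling idea, assuming a plain
semaphore cap (the limit of 2 and the function name are made up for
illustration; nova's build cap works on the same principle):

import threading
import time

MAX_CONCURRENT_LIVE_MIGRATIONS = 2
_live_migration_semaphore = threading.Semaphore(MAX_CONCURRENT_LIVE_MIGRATIONS)


def live_migrate(instance_id):
    with _live_migration_semaphore:      # blocks once the cap is reached
        print('migrating %s' % instance_id)
        time.sleep(0.1)                  # stand-in for the real copy work
        print('finished %s' % instance_id)


if __name__ == '__main__':
    threads = [threading.Thread(target=live_migrate, args=(i,))
               for i in range(5)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()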

** Affects: nova
 Importance: Low
 Status: New

** Changed in: nova
   Importance: Undecided => Low

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1478108

Title:
  Live migration should throttle itself

Status in OpenStack Compute (nova):
  New

Bug description:
  Nova will accept an unbounded number of live migrations for a single
  host, which will result in timeouts and failures (at least for
  libvirt). Since live migrations are seriously IO intensive, allowing
  this to be unlimited is just never going to be the right thing to do,
  especially when we have functions in our own client to live migrate
  all instances to other hosts (nova host-evacuate-live).

  We recently added a build semaphore to allow capping the number of
  parallel builds being attempted on a compute host for a similar
  reason. This should be the same sort of thing for live migration.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1478108/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1493961] [NEW] nova-conductor object debug does not format

2015-09-09 Thread Dan Smith
Public bug reported:

The debug log statement in nova-conductor's object_backport_versions()
method doesn't format and looks like this:

2015-09-09 11:26:57.126 DEBUG nova.conductor.manager [req-9ff7962c-
c8b8-4579-8943-cbf2ef0be373 demo demo] Backporting %(obj)s to %(ver)s
with versions %(manifest)s from (pid=14735) object_backport_versions
/opt/stack/nova/nova/conductor/manager.py:506
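
The symptom is the classic un-interpolated %-style logging call. A
minimal standalone reproduction with the stdlib logger (the values dict
here is made up) shows the broken and working patterns:

import logging

logging.basicConfig(level=logging.DEBUG)
LOG = logging.getLogger(__name__)

values = {'obj': 'Instance', 'ver': '1.18', 'manifest': '{...}'}

# Broken pattern (what the symptom suggests): the format string is logged
# without its argument mapping, so the %(...)s placeholders come out as-is.
LOG.debug('Backporting %(obj)s to %(ver)s with versions %(manifest)s')

# Working pattern: hand the mapping to the logger so it interpolates.
LOG.debug('Backporting %(obj)s to %(ver)s with versions %(manifest)s', values)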

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1493961

Title:
  nova-conductor object debug does not format

Status in OpenStack Compute (nova):
  New

Bug description:
  The debug log statement in nova-conductor's object_backport_versions()
  method doesn't format and looks like this:

  2015-09-09 11:26:57.126 DEBUG nova.conductor.manager [req-9ff7962c-
  c8b8-4579-8943-cbf2ef0be373 demo demo] Backporting %(obj)s to %(ver)s
  with versions %(manifest)s from (pid=14735) object_backport_versions
  /opt/stack/nova/nova/conductor/manager.py:506

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1493961/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1387244] Re: Increasing number of InstancePCIRequests.get_by_instance_uuid RPC calls during compute host auditing

2015-09-14 Thread Dan Smith
** Changed in: nova
   Status: Confirmed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1387244

Title:
  Increasing number of InstancePCIRequests.get_by_instance_uuid RPC
  calls during compute host auditing

Status in OpenStack Compute (nova):
  Fix Released
Status in nova package in Ubuntu:
  Triaged

Bug description:
  Environment: Ubuntu 14.04/OpenStack Juno Release

  The periodic auditing on a compute node becomes very RPC-call intensive
  when a large number of instances are running on a cloud; the
  InstancePCIRequests.get_by_instance_uuid call is made for every
  instance running on the hypervisor. When this is multiplied across a
  large number of hypervisors, it puts growing load back on the
  conductor processes as they try to service an increasing number of RPC
  calls over time.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1387244/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1498023] [NEW] _cleanup_incomplete_migrations() does not check for shared storage

2015-09-21 Thread Dan Smith
Public bug reported:

The newly-added periodic task to cleanup residue from failed migrations
does not properly consider shared storage before deleting instance
files. This could easily lead to data loss in such an environment
following a failed migration.
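
A sketch of the guard the bug asks for, with hypothetical helper names;
the real shared-storage detection is driver specific and not shown here.

import shutil

def _instance_files_on_shared_storage(instance_path):
    """Stand-in for a driver-level shared storage check."""
    return True  # assume shared storage for the example


def cleanup_failed_migration(instance_path):
    if _instance_files_on_shared_storage(instance_path):
        # The files may still be in use by the instance on its new host;
        # deleting them here would be the data loss described above.
        return
    shutil.rmtree(instance_path, ignore_errors=True)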

** Affects: nova
 Importance: High
 Assignee: Dan Smith (danms)
 Status: Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1498023

Title:
  _cleanup_incomplete_migrations() does not check for shared storage

Status in OpenStack Compute (nova):
  Confirmed

Bug description:
  The newly-added periodic task to cleanup residue from failed
  migrations does not properly consider shared storage before deleting
  instance files. This could easily lead to data loss in such an
  environment following a failed migration.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1498023/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1498023] Re: _cleanup_incomplete_migrations() does not check for shared storage

2015-09-21 Thread Dan Smith
** Changed in: nova
   Importance: High => Undecided

** Changed in: nova
   Status: In Progress => Invalid

** Changed in: nova
Milestone: liberty-rc1 => None

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1498023

Title:
  _cleanup_incomplete_migrations() does not check for shared storage

Status in OpenStack Compute (nova):
  Invalid

Bug description:
  The newly-added periodic task to cleanup residue from failed
  migrations does not properly consider shared storage before deleting
  instance files. This could easily lead to data loss in such an
  environment following a failed migration.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1498023/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1555287] [NEW] Libvirt driver broken for non-disk-image backends

2016-03-09 Thread Dan Smith
Public bug reported:

Recently the ceph job (and any other configuration that doesn't use disk
image as the backend storage) started failing like this:


2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher Traceback 
(most recent call last):
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher   File 
"/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 
138, in _dispatch_and_reply
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher 
incoming.message))
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher   File 
"/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 
183, in _dispatch
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher return 
self._do_dispatch(endpoint, method, ctxt, args)
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher   File 
"/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 
127, in _do_dispatch
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher result = 
func(ctxt, **new_args)
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher   File 
"/opt/stack/new/nova/nova/exception.py", line 110, in wrapped
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher payload)
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher   File 
"/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in 
__exit__
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher 
self.force_reraise()
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher   File 
"/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in 
force_reraise
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher 
six.reraise(self.type_, self.value, self.tb)
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher   File 
"/opt/stack/new/nova/nova/exception.py", line 89, in wrapped
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher return 
f(self, context, *args, **kw)
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher   File 
"/opt/stack/new/nova/nova/compute/manager.py", line 359, in decorated_function
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher 
LOG.warning(msg, e, instance=instance)
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher   File 
"/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in 
__exit__
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher 
self.force_reraise()
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher   File 
"/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in 
force_reraise
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher 
six.reraise(self.type_, self.value, self.tb)
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher   File 
"/opt/stack/new/nova/nova/compute/manager.py", line 328, in decorated_function
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher return 
function(self, context, *args, **kwargs)
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher   File 
"/opt/stack/new/nova/nova/compute/manager.py", line 409, in decorated_function
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher return 
function(self, context, *args, **kwargs)
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher   File 
"/opt/stack/new/nova/nova/compute/manager.py", line 316, in decorated_function
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher 
migration.instance_uuid, exc_info=True)
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher   File 
"/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in 
__exit__
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher 
self.force_reraise()
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher   File 
"/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in 
force_reraise
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher 
six.reraise(self.type_, self.value, self.tb)
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher   File 
"/opt/stack/new/nova/nova/compute/manager.py", line 293, in decorated_function
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher return 
function(self, context, *args, **kwargs)
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher   File 
"/opt/stack/new/nova/nova/compute/manager.py", line 387, in decorated_function
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher 
kwargs['instance'], e, sys.exc_info())
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.dispatcher   File 
"/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in 
__exit__
2016-03-09 14:47:29.102 17597 ERROR oslo_messaging.rpc.

[Yahoo-eng-team] [Bug 1435586] [NEW] trigger security group refresh gives 'dict' object has no attribute 'uuid'

2015-03-23 Thread Dan Smith
Public bug reported:

During trigger_rules_refresh(), we get this from compute manager:

2015-03-23 03:50:49.677 ERROR oslo_messaging.rpc.dispatcher 
[req-117e72d4-8c12-4805-9c65-695b62fad491 alt_demo alt_demo] Exception during 
message handling: 'dict' object has no attribute 'uuid'
2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher Traceback 
(most recent call last):
2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 
142, in _dispatch_and_reply
2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher 
executor_callback))
2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 
186, in _dispatch
2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher 
executor_callback)
2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 
130, in _do_dispatch
2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher result = 
func(ctxt, **new_args)
2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher   File 
"/opt/stack/new/nova/nova/exception.py", line 88, in wrapped
2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher payload)
2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 85, in 
__exit__
2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher 
six.reraise(self.type_, self.value, self.tb)
2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher   File 
"/opt/stack/new/nova/nova/exception.py", line 71, in wrapped
2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher return 
f(self, context, *args, **kw)
2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher   File 
"/opt/stack/new/nova/nova/compute/manager.py", line 1301, in 
refresh_instance_security_rules
2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher 
@utils.synchronized(instance.uuid)
2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher 
AttributeError: 'dict' object has no attribute 'uuid'

This happens because we're passing non-object instances to
refresh_instance_security_rules()
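
A toy reproduction of the mismatch, with made-up class and field names:
the manager-side code expects an object exposing .uuid, but the caller
handed it a plain dict. A defensive conversion along these lines would
paper over it until every caller sends real objects.

class FakeInstance(object):
    def __init__(self, uuid):
        self.uuid = uuid


def refresh_rules(instance):
    # Accept either form while callers are being converted.
    if isinstance(instance, dict):
        instance = FakeInstance(instance['uuid'])
    return instance.uuid  # the attribute access that blew up in the trace


print(refresh_rules({'uuid': 'abc-123'}))
print(refresh_rules(FakeInstance('abc-123')))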

** Affects: nova
 Importance: Undecided
 Assignee: Dan Smith (danms)
 Status: Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1435586

Title:
  trigger security group refresh gives 'dict' object has no attribute
  'uuid'

Status in OpenStack Compute (Nova):
  Confirmed

Bug description:
  During trigger_rules_refresh(), we get this from compute manager:

  2015-03-23 03:50:49.677 ERROR oslo_messaging.rpc.dispatcher 
[req-117e72d4-8c12-4805-9c65-695b62fad491 alt_demo alt_demo] Exception during 
message handling: 'dict' object has no attribute 'uuid'
  2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher Traceback 
(most recent call last):
  2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 
142, in _dispatch_and_reply
  2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher 
executor_callback))
  2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 
186, in _dispatch
  2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher 
executor_callback)
  2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 
130, in _do_dispatch
  2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher result 
= func(ctxt, **new_args)
  2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher   File 
"/opt/stack/new/nova/nova/exception.py", line 88, in wrapped
  2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher payload)
  2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher   File 
"/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 85, in 
__exit__
  2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher 
six.reraise(self.type_, self.value, self.tb)
  2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher   File 
"/opt/stack/new/nova/nova/exception.py", line 71, in wrapped
  2015-03-23 03:50:49.677 32448 TRACE oslo_messaging.rpc.dispatcher return 

[Yahoo-eng-team] [Bug 1441243] [NEW] EnumField can be None and thus unrestricted

2015-04-07 Thread Dan Smith
Public bug reported:

The Enum objects field can be passed valid_values=None, which disables
the enum checking and defeats the whole purpose of the field.
This would allow something unversioned to creep into our RPC API, which
would be bad.
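
A toy field class, not the real nova/oslo implementation, showing the
shape of the fix: refuse to construct an enum without a concrete value
set instead of silently accepting everything when valid_values is None.

class Enum(object):
    def __init__(self, valid_values=None):
        if not valid_values:
            raise ValueError('Enum fields require a fixed set of values')
        self._valid_values = tuple(valid_values)

    def coerce(self, value):
        if value not in self._valid_values:
            raise ValueError('%r is not a valid value' % (value,))
        return value


states = Enum(valid_values=['building', 'active', 'error'])
print(states.coerce('active'))       # ok
# Enum(valid_values=None) now raises instead of allowing anything.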

** Affects: nova
 Importance: High
 Assignee: Dan Smith (danms)
 Status: In Progress


** Tags: unified-objects

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1441243

Title:
  EnumField can be None and thus unrestricted

Status in OpenStack Compute (Nova):
  In Progress

Bug description:
  The Enum objects field can be passed valid_values=None, which disables
  the enum checking and defeats the whole purpose of the field.
  This would allow something unversioned to creep into our RPC API,
  which would be bad.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1441243/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1442236] [NEW] Bump compute RPC API to 4.0

2015-04-09 Thread Dan Smith
Public bug reported:

We badly need to bump the compute RPC version to 4.0 BEFORE we release
kilo.

** Affects: nova
 Importance: Critical
 Assignee: Dan Smith (danms)
 Status: Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1442236

Title:
  Bump compute RPC API to 4.0

Status in OpenStack Compute (Nova):
  Confirmed

Bug description:
  We badly need to bump the compute RPC version to 4.0 BEFORE we release
  kilo.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1442236/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1450624] [NEW] Nova waits for events from neutron on resize-revert that aren't coming

2015-04-30 Thread Dan Smith
er.py",
 line 6095, in finish_revert_migration
2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher 
block_device_info, power_on)
2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher   File 
"/opt/bbc/openstack-10.0-bbc40/nova/local/lib/python2.7/site-packages/nova/virt/libvirt/driver.py",
 line 4446, in _create_domain_and_network
2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher raise 
exception.VirtualInterfaceCreateException()
2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher 
VirtualInterfaceCreateException: Virtual Interface creation failed

** Affects: nova
 Importance: High
 Assignee: Dan Smith (danms)
 Status: In Progress


** Tags: juno-backport-potential kilo-backport-potential libvirt neutron

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1450624

Title:
  Nova waits for events from neutron on resize-revert that aren't coming

Status in OpenStack Compute (Nova):
  In Progress

Bug description:
  On resize-revert, the original host was waiting for plug events from
  neutron before restarting the instance. These aren't sent since we
  don't ever unplug the vifs. Thus, we'll always fail like this:

  
  2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher Traceback 
(most recent call last):
  2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher   File 
"/opt/bbc/openstack-10.0-bbc40/nova/local/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py",
 line 134, in _dispatch_and_reply
  2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher 
incoming.message))
  2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher   File 
"/opt/bbc/openstack-10.0-bbc40/nova/local/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py",
 line 177, in _dispatch
  2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher return 
self._do_dispatch(endpoint, method, ctxt, args)
  2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher   File 
"/opt/bbc/openstack-10.0-bbc40/nova/local/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py",
 line 123, in _do_dispatch
  2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher result 
= getattr(endpoint, method)(ctxt, **new_args)
  2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher   File 
"/opt/bbc/openstack-10.0-bbc40/nova/local/lib/python2.7/site-packages/nova/exception.py",
 line 88, in wrapped
  2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher payload)
  2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher   File 
"/opt/bbc/openstack-10.0-bbc40/nova/local/lib/python2.7/site-packages/nova/openstack/common/excutils.py",
 line 82, in __exit__
  2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher 
six.reraise(self.type_, self.value, self.tb)
  2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher   File 
"/opt/bbc/openstack-10.0-bbc40/nova/local/lib/python2.7/site-packages/nova/exception.py",
 line 71, in wrapped
  2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher return 
f(self, context, *args, **kw)
  2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher   File 
"/opt/bbc/openstack-10.0-bbc40/nova/local/lib/python2.7/site-packages/nova/compute/manager.py",
 line 298, in decorated_function
  2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher pass
  2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher   File 
"/opt/bbc/openstack-10.0-bbc40/nova/local/lib/python2.7/site-packages/nova/openstack/common/excutils.py",
 line 82, in __exit__
  2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher 
six.reraise(self.type_, self.value, self.tb)
  2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher   File 
"/opt/bbc/openstack-10.0-bbc40/nova/local/lib/python2.7/site-packages/nova/compute/manager.py",
 line 284, in decorated_function
  2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher return 
function(self, context, *args, **kwargs)
  2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher   File 
"/opt/bbc/openstack-10.0-bbc40/nova/local/lib/python2.7/site-packages/nova/compute/manager.py",
 line 348, in decorated_function
  2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher return 
function(self, context, *args, **kwargs)
  2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher   File 
"/opt/bbc/openstack-10.0-bbc40/nova/local/lib/python2.7/site-packages/nova/compute/manager.py",
 line 326, in decorated_function
  2015-04-30 19:45:42.602 23513 TRACE oslo.messaging.rpc.dispatcher 
kwargs['

[Yahoo-eng-team] [Bug 1465799] [NEW] Instance object always re-saves flavor info

2015-06-16 Thread Dan Smith
Public bug reported:

Due to a bug in the logic during Instance._save_flavor(), the
instance.extra.flavor field will be saved every time we call
Instance.save(), even if no changes have been made. This generates more
database traffic for no reason.
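
A standalone sketch of the "only save what changed" guard. Nova's
versioned objects track this via obj_what_changed(); this toy class
avoids those imports and just mimics the idea.

class ToyInstance(object):
    def __init__(self, flavor):
        self._flavor = flavor
        self._changed = set()

    @property
    def flavor(self):
        return self._flavor

    @flavor.setter
    def flavor(self, value):
        self._flavor = value
        self._changed.add('flavor')

    def save(self):
        # Buggy behaviour: always write instance_extra.flavor.
        # Fixed behaviour: write it only when it is actually dirty.
        if 'flavor' in self._changed:
            print('UPDATE instance_extra SET flavor = ...')
            self._changed.discard('flavor')
        else:
            print('no flavor write needed')


inst = ToyInstance(flavor='m1.small')
inst.save()                 # no flavor write needed
inst.flavor = 'm1.large'
inst.save()                 # UPDATE instance_extra SET flavor = ...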

** Affects: nova
 Importance: Medium
 Status: New

** Changed in: nova
   Importance: Undecided => Medium

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1465799

Title:
  Instance object always re-saves flavor info

Status in OpenStack Compute (Nova):
  New

Bug description:
  Due to a bug in the logic during Instance._save_flavor(), the
  instance.extra.flavor field will be saved every time we call
  Instance.save(), even if no changes have been made. This generates
  more database traffic for no reason.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1465799/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1373106] Re: jogo and sdague are making me sad

2014-09-23 Thread Dan Smith
** Changed in: nova
   Status: Opinion => Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1373106

Title:
  jogo and sdague are making me sad

Status in OpenStack Compute (Nova):
  Confirmed

Bug description:
  Just like when my parents would fight pre-separation...

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1373106/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1396324] [NEW] Instance object has no attribute get_flavor()

2014-11-25 Thread Dan Smith
Public bug reported:

The notifications code in nova is receiving a SQLAlchemy object when
trying to send state update notifications, resulting in this in the
conductor log:

2014-11-25 03:13:40.200 ERROR nova.notifications 
[req-1a9ed96d-7ce2-4c7d-a409-a6959852ce6a AggregatesAdminTestXML-569323565 
AggregatesAdminTestXML-1788648791] [instance: 
74bb24d3-ba69-41e2-b99a-1c35a2331c1b] Failed to send state update notification
2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 
74bb24d3-ba69-41e2-b99a-1c35a2331c1b] Traceback (most recent call last):
2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 
74bb24d3-ba69-41e2-b99a-1c35a2331c1b]   File 
"/opt/stack/new/nova/nova/notifications.py", line 146, in send_update
2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 
74bb24d3-ba69-41e2-b99a-1c35a2331c1b] old_display_name=old_display_name)
2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 
74bb24d3-ba69-41e2-b99a-1c35a2331c1b]   File 
"/opt/stack/new/nova/nova/notifications.py", line 226, in 
_send_instance_update_notification
2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 
74bb24d3-ba69-41e2-b99a-1c35a2331c1b] payload = info_from_instance(context, 
instance, None, None)
2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 
74bb24d3-ba69-41e2-b99a-1c35a2331c1b]   File 
"/opt/stack/new/nova/nova/notifications.py", line 369, in info_from_instance
2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 
74bb24d3-ba69-41e2-b99a-1c35a2331c1b] instance_type = instance.get_flavor()
2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 
74bb24d3-ba69-41e2-b99a-1c35a2331c1b] AttributeError: 'Instance' object has no 
attribute 'get_flavor'
2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 
74bb24d3-ba69-41e2-b99a-1c35a2331c1b]

** Affects: nova
 Importance: Medium
 Assignee: Dan Smith (danms)
 Status: Confirmed

** Changed in: nova
   Importance: Undecided => Medium

** Changed in: nova
   Status: New => Confirmed

** Changed in: nova
Milestone: None => kilo-1

** Changed in: nova
 Assignee: (unassigned) => Dan Smith (danms)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1396324

Title:
  Instance object has no attribute get_flavor()

Status in OpenStack Compute (Nova):
  Confirmed

Bug description:
  The notifications code in nova is receiving a SQLAlchemy object when
  trying to send state update notifications, resulting in this in the
  conductor log:

  2014-11-25 03:13:40.200 ERROR nova.notifications 
[req-1a9ed96d-7ce2-4c7d-a409-a6959852ce6a AggregatesAdminTestXML-569323565 
AggregatesAdminTestXML-1788648791] [instance: 
74bb24d3-ba69-41e2-b99a-1c35a2331c1b] Failed to send state update notification
  2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 
74bb24d3-ba69-41e2-b99a-1c35a2331c1b] Traceback (most recent call last):
  2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 
74bb24d3-ba69-41e2-b99a-1c35a2331c1b]   File 
"/opt/stack/new/nova/nova/notifications.py", line 146, in send_update
  2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 
74bb24d3-ba69-41e2-b99a-1c35a2331c1b] old_display_name=old_display_name)
  2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 
74bb24d3-ba69-41e2-b99a-1c35a2331c1b]   File 
"/opt/stack/new/nova/nova/notifications.py", line 226, in 
_send_instance_update_notification
  2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 
74bb24d3-ba69-41e2-b99a-1c35a2331c1b] payload = info_from_instance(context, 
instance, None, None)
  2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 
74bb24d3-ba69-41e2-b99a-1c35a2331c1b]   File 
"/opt/stack/new/nova/nova/notifications.py", line 369, in info_from_instance
  2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 
74bb24d3-ba69-41e2-b99a-1c35a2331c1b] instance_type = instance.get_flavor()
  2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 
74bb24d3-ba69-41e2-b99a-1c35a2331c1b] AttributeError: 'Instance' object has no 
attribute 'get_flavor'
  2014-11-25 03:13:40.200 27090 TRACE nova.notifications [instance: 
74bb24d3-ba69-41e2-b99a-1c35a2331c1b]

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1396324/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1403162] [NEW] fake_notifier: ValueError: Circular reference detected

2014-12-16 Thread Dan Smith
Public bug reported:

The fake_notifier code is using anyjson, which today is failing to
serialize something in a notification payload. Failure looks like this:

Traceback (most recent call last):
  File "/home/dan/nova/.tox/py27/lib/python2.7/site-packages/mock.py", line 
1201, in patched
return func(*args, **keywargs)
  File "nova/tests/unit/compute/test_compute.py", line 2774, in 
test_reboot_fail
self._test_reboot(False, fail_reboot=True)
  File "nova/tests/unit/compute/test_compute.py", line 2744, in _test_reboot
reboot_type=reboot_type)
  File 
"/home/dan/nova/.tox/py27/lib/python2.7/site-packages/testtools/testcase.py", 
line 420, in assertRaises
self.assertThat(our_callable, matcher)
  File 
"/home/dan/nova/.tox/py27/lib/python2.7/site-packages/testtools/testcase.py", 
line 431, in assertThat
mismatch_error = self._matchHelper(matchee, matcher, message, verbose)
  File 
"/home/dan/nova/.tox/py27/lib/python2.7/site-packages/testtools/testcase.py", 
line 481, in _matchHelper
mismatch = matcher.match(matchee)
  File 
"/home/dan/nova/.tox/py27/lib/python2.7/site-packages/testtools/matchers/_exception.py",
 line 108, in match
mismatch = self.exception_matcher.match(exc_info)
  File 
"/home/dan/nova/.tox/py27/lib/python2.7/site-packages/testtools/matchers/_higherorder.py",
 line 62, in match
mismatch = matcher.match(matchee)
  File 
"/home/dan/nova/.tox/py27/lib/python2.7/site-packages/testtools/testcase.py", 
line 412, in match
reraise(*matchee)
  File 
"/home/dan/nova/.tox/py27/lib/python2.7/site-packages/testtools/matchers/_exception.py",
 line 101, in match
result = matchee()
  File 
"/home/dan/nova/.tox/py27/lib/python2.7/site-packages/testtools/testcase.py", 
line 965, in __call__
return self._callable_object(*self._args, **self._kwargs)
  File "nova/exception.py", line 88, in wrapped
payload)
  File "nova/tests/unit/fake_notifier.py", line 57, in _notify
anyjson.serialize(payload)
  File 
"/home/dan/nova/.tox/py27/lib/python2.7/site-packages/anyjson/__init__.py", 
line 141, in dumps
return implementation.dumps(value)
  File 
"/home/dan/nova/.tox/py27/lib/python2.7/site-packages/anyjson/__init__.py", 
line 87, in dumps
return self._encode(data)
  File 
"/home/dan/nova/.tox/py27/lib/python2.7/site-packages/oslo/serialization/jsonutils.py",
 line 186, in dumps
return json.dumps(obj, default=default, **kwargs)
  File "/usr/lib64/python2.7/json/__init__.py", line 250, in dumps
sort_keys=sort_keys, **kw).encode(obj)
  File "/usr/lib64/python2.7/json/encoder.py", line 207, in encode
chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib64/python2.7/json/encoder.py", line 270, in iterencode
return _iterencode(o, 0)
ValueError: Circular reference detected

** Affects: nova
 Importance: Critical
 Assignee: Dan Smith (danms)
 Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1403162

Title:
  fake_notifier: ValueError: Circular reference detected

Status in OpenStack Compute (Nova):
  In Progress

Bug description:
  The fake_notifier code is using anyjson, which today is failing to
  serialize something in a notification payload. Failure looks like
  this:

  Traceback (most recent call last):
File "/home/dan/nova/.tox/py27/lib/python2.7/site-packages/mock.py", 
line 1201, in patched
  return func(*args, **keywargs)
File "nova/tests/unit/compute/test_compute.py", line 2774, in 
test_reboot_fail
  self._test_reboot(False, fail_reboot=True)
File "nova/tests/unit/compute/test_compute.py", line 2744, in 
_test_reboot
  reboot_type=reboot_type)
File 
"/home/dan/nova/.tox/py27/lib/python2.7/site-packages/testtools/testcase.py", 
line 420, in assertRaises
  self.assertThat(our_callable, matcher)
File 
"/home/dan/nova/.tox/py27/lib/python2.7/site-packages/testtools/testcase.py", 
line 431, in assertThat
  mismatch_error = self._matchHelper(matchee, matcher, message, verbose)
File 
"/home/dan/nova/.tox/py27/lib/python2.7/site-packages/testtools/testcase.py", 
line 481, in _matchHelper
  mismatch = matcher.match(matchee)
File 
"/home/dan/nova/.tox/py27/lib/python2.7/site-packages/testtools/matchers/_exception.py",
 line 108, in match
  mismatch = self.exception_matcher.match(exc_info)
File 
"/home/dan/nova/.tox/py27/lib/python2.7/site-pack

[Yahoo-eng-team] [Bug 1275875] [NEW] Virt drivers should use standard image properties

2014-02-03 Thread Dan Smith
Public bug reported:

Several virt drivers are using non-standard driver-specific image
metadata properties. This creates an API contract between the external
user and the driver implementation. These non-standard ones should be
marked as deprecated in some way, enforced in v3, etc. We need a global
whitelist of keys and values that are allowed so that we can make sure
others don't leak in.

Examples:

nova/virt/vmwareapi/vmops.py:os_type = 
image_properties.get("vmware_ostype", "otherGuest")
nova/virt/vmwareapi/vmops.py:adapter_type = 
image_properties.get("vmware_adaptertype",
nova/virt/vmwareapi/vmops.py:disk_type = 
image_properties.get("vmware_disktype",
nova/virt/vmwareapi/vmops.py:vif_model = 
image_properties.get("hw_vif_model", "VirtualE1000")

nova/virt/xenapi/vm_utils.py:device_id =
image_properties.get('xenapi_device_id')

I think it's important to try to get this fixed (or as close as
possible) before the icehouse release.
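
A sketch of what such a whitelist check could look like; the allowed
prefixes and the exact policy below are assumptions, and only the
deprecated names come from the examples above.

ALLOWED_EXACT = {'hw_vif_model'}
ALLOWED_PREFIXES = ('hw_', 'os_')
DEPRECATED = {'vmware_ostype', 'vmware_adaptertype', 'vmware_disktype',
              'xenapi_device_id'}


def check_image_properties(properties):
    unknown = []
    for key in properties:
        if key in ALLOWED_EXACT or key.startswith(ALLOWED_PREFIXES):
            continue
        if key in DEPRECATED:
            print('deprecated image property: %s' % key)
            continue
        unknown.append(key)
    if unknown:
        raise ValueError('unknown image properties: %s' % ', '.join(unknown))


check_image_properties({'hw_vif_model': 'VirtualE1000',
                        'vmware_ostype': 'otherGuest'})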

** Affects: nova
 Importance: Medium
 Status: Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1275875

Title:
  Virt drivers should use standard image properties

Status in OpenStack Compute (Nova):
  Confirmed

Bug description:
  Several virt drivers are using non-standard driver-specific image
  metadata properties. This creates an API contract between the external
  user and the driver implementation. These non-standard ones should be
  marked as deprecated in some way, enforced in v3, etc. We need a
  global whitelist of keys and values that are allowed so that we can
  make sure others don't leak in.

  Examples:

  nova/virt/vmwareapi/vmops.py:os_type = 
image_properties.get("vmware_ostype", "otherGuest")
  nova/virt/vmwareapi/vmops.py:adapter_type = 
image_properties.get("vmware_adaptertype",
  nova/virt/vmwareapi/vmops.py:disk_type = 
image_properties.get("vmware_disktype",
  nova/virt/vmwareapi/vmops.py:vif_model = 
image_properties.get("hw_vif_model", "VirtualE1000")

  nova/virt/xenapi/vm_utils.py:device_id =
  image_properties.get('xenapi_device_id')

  I think it's important to try to get this fixed (or as close as
  possible) before the icehouse release.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1275875/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1276731] [NEW] simple_tenant_usage extension should not rely on looking up flavors

2014-02-05 Thread Dan Smith
Public bug reported:

The simple_tenant_usage extension gets the flavor data from the instance
and then looks up the flavor from the database to return usage
information. Since we now store all of the flavor data in the instance
itself, we should use that information instead of what the flavor
currently says is right. This both (a) makes it more accurate and (b)
avoids us failing to return usage info if a flavor disappears.
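
A sketch of the proposed direction, reading the sizing data recorded on
the instance itself rather than re-fetching the flavor; the field names
used here are illustrative assumptions.

def usage_from_instance(instance):
    # Values captured on the instance at boot time survive even if the
    # flavor is later deleted or the instance's flavor record changes.
    return {
        'vcpus': instance['vcpus'],
        'memory_mb': instance['memory_mb'],
        'local_gb': instance['root_gb'] + instance['ephemeral_gb'],
    }


print(usage_from_instance({'vcpus': 2, 'memory_mb': 4096,
                           'root_gb': 40, 'ephemeral_gb': 0}))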

** Affects: nova
 Importance: Medium
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1276731

Title:
  simple_tenant_usage extension should not rely on looking up flavors

Status in OpenStack Compute (Nova):
  New

Bug description:
  The simple_tenant_usage extension gets the flavor data from the
  instance and then looks up the flavor from the database to return
  usage information. Since we now store all of the flavor data in the
  instance itself, we should use that information instead of what the
  flavor currently says is right. This both (a) makes it more accurate
  and (b) avoids us failing to return usage info if a flavor disappears.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1276731/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1280034] [NEW] compute_node_update broken with havana compute nodes

2014-02-13 Thread Dan Smith
Public bug reported:

This change:

https://review.openstack.org/#/c/66469

Changed the format of the data in the "values" dictionary of
compute_node_update. This causes an icehouse conductor to generate a
broken SQL query when called from a havana compute node:

http://logs.openstack.org/75/64075/13/check/check-grenade-
dsvm/b70c839/logs/new/screen-n-cond.txt.gz?level=TRACE

executors.base [-] Exception during message handling: (ProgrammingError) (1064, 
"You have an error in your SQL syntax; check the manual that corresponds to 
your MySQL server version for the right syntax to use near ': '2', 
u'io_workload': '0', u'num_instances': '2', u'num_vm_building': '0', u'nu' at 
line 1") 'UPDATE compute_nodes SET updated_at=%s, vcpus_used=%s, 
memory_mb_used=%s, free_ram_mb=%s, running_vms=%s, stats=%s WHERE 
compute_nodes.id = %s' (datetime.datetime(2014, 2, 12, 21, 2, 12, 395978), 4, 
1216, 6737, 4, {u'num_task_None': 2, u'io_workload': 0, u'num_instances': 2, 
u'num_vm_active': 1, u'num_task_scheduling': 0, u'num_vm_building': 0, 
u'num_proj_d0e1e781676f4fe5b1b81e31b8ae87de': 1, u'num_vcpus_used': 2, 
u'num_proj_a8a2f9c3e3bd44edb1c5fd2ae4cc7b3c': 1, u'num_os_type_None': 2, 
u'num_vm_error': 1}, 1)
2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base Traceback 
(most recent call last):
2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base   File 
"/opt/stack/new/oslo.messaging/oslo/messaging/_executors/base.py", line 36, in 
_dispatch
2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base 
incoming.reply(self.callback(incoming.ctxt, incoming.message))
2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base   File 
"/opt/stack/new/oslo.messaging/oslo/messaging/rpc/dispatcher.py", line 134, in 
__call__
2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base return 
self._dispatch(endpoint, method, ctxt, args)
2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base   File 
"/opt/stack/new/oslo.messaging/oslo/messaging/rpc/dispatcher.py", line 104, in 
_dispatch
2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base result = 
getattr(endpoint, method)(ctxt, **new_args)
2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base   File 
"/opt/stack/new/nova/nova/conductor/manager.py", line 458, in 
compute_node_update
2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base result = 
self.db.compute_node_update(context, node['id'], values)
2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base   File 
"/opt/stack/new/nova/nova/db/api.py", line 228, in compute_node_update
2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base return 
IMPL.compute_node_update(context, compute_id, values)
2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base   File 
"/opt/stack/new/nova/nova/db/sqlalchemy/api.py", line 110, in wrapper
2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base return 
f(*args, **kwargs)
2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base   File 
"/opt/stack/new/nova/nova/db/sqlalchemy/api.py", line 166, in wrapped
2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base return 
f(*args, **kwargs)
2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base   File 
"/opt/stack/new/nova/nova/db/sqlalchemy/api.py", line 614, in 
compute_node_update
2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base 
compute_ref.update(values)
2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base   File 
"/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 456, 
in __exit__
2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base 
self.commit()
2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base   File 
"/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 368, 
in commit
2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base 
self._prepare_impl()
2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base   File 
"/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 347, 
in _prepare_impl
2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base 
self.session.flush()
2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base   File 
"/opt/stack/new/nova/nova/openstack/common/db/sqlalchemy/session.py", line 616, 
in _wrap
2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base raise 
exception.DBError(e)
2014-02-12 21:02:12.401 18598 TRACE oslo.messaging._executors.base DBError: 
(ProgrammingError) (1064, "You have an error in your SQL syntax; check the 
manual that corresponds to your MySQL server version for the right syntax to 
use near ': '2', u'io_workload': '0', u'num_instances': '2', 
u'num_vm_building': '0', u'nu' at line 1") 'UPDATE compute_nodes SET 
updated_at=%s, vcpus_used=%s, memory_mb_used=%s, free_ram_mb=%s, 
running_vms=%s, st

[Yahoo-eng-team] [Bug 1284312] [NEW] vmware driver races to create instance images

2014-02-24 Thread Dan Smith
Public bug reported:

Change Ia0ebd674345734e7cfa31ccd400fdba93646c554 traded one race
condition for another. By ignoring all mkdir() calls that would
otherwise fail because an instance directory already exists, two nodes
racing to create a single image will corrupt or lose data, or fail in a
strange way. This call should fail in that case, but doesn't after the
recent patch was merged:

https://github.com/openstack/nova/blob/master/nova/virt/vmwareapi/vmops.py#L350
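
A sketch of the distinction the bug is about, using plain os/errno
rather than nova's utility code: directory creation should surface
"already exists" to the caller instead of swallowing it.

import errno
import os

def mkdir_expecting_new(path):
    try:
        os.makedirs(path)
    except OSError as exc:
        if exc.errno == errno.EEXIST:
            # Another node won the race; fail loudly instead of writing
            # into (and possibly corrupting) a directory we do not own.
            raise RuntimeError('instance directory already exists: %s' % path)
        raise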

** Affects: nova
 Importance: High
 Assignee: Shawn Hartsock (hartsock)
 Status: New

** Affects: openstack-vmwareapi-team
 Importance: Critical
 Status: New


** Tags: vmware

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1284312

Title:
  vmware driver races to create instance images

Status in OpenStack Compute (Nova):
  New
Status in The OpenStack VMwareAPI subTeam:
  New

Bug description:
  Change Ia0ebd674345734e7cfa31ccd400fdba93646c554 traded one race
  condition for another. By ignoring all mkdir() calls that would
  otherwise fail because an instance directory already exists, two nodes
  racing to create a single image will corrupt or lose data, or fail in
  a strange way. This call should fail in that case, but doesn't after
  the recent patch was merged:

  
https://github.com/openstack/nova/blob/master/nova/virt/vmwareapi/vmops.py#L350

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1284312/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1250300] Re: chinese secgroup description make nova list failed

2014-03-14 Thread Dan Smith
Original poster confirms this is no longer a problem

** Changed in: nova
   Status: Incomplete => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1250300

Title:
  chinese secgroup description make nova list failed

Status in OpenStack Compute (Nova):
  Invalid

Bug description:
  I created a secgroup with a Chinese description, as follows:
  hzguanqiang@debian:/data/log/nova$ nova secgroup-list
  +----+----------+-------------+
  | Id | Name     | Description |
  +----+----------+-------------+
  | 11 | bingoxxx | 无          |
  +----+----------+-------------+

  Then I created an instance with this secgroup, and it reported a 500
  error.

  And when I executed the 'nova list' command, it failed with the
  following error in nova-api.log:

  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack Traceback (most recent 
call last):
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack   File 
"/usr/local/lib/python2.7/dist-packages/nova/api/openstack/__init__.py", line 
111, in __call__
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack return 
req.get_response(self.application)
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack   File 
"/usr/local/lib/python2.7/dist-packages/webob/request.py", line 1053, in 
get_response
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack application, 
catch_exc_info=False)
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack   File 
"/usr/local/lib/python2.7/dist-packages/webob/request.py", line 1022, in 
call_application
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack app_iter = 
application(self.environ, start_response)
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack   File 
"/usr/local/lib/python2.7/dist-packages/webob/dec.py", line 159, in __call__
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack return 
resp(environ, start_response)
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack   File 
"/usr/local/lib/python2.7/dist-packages/keystoneclient/middleware/auth_token.py",
 line 571, in __call__
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack return 
self.app(env, start_response)
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack   File 
"/usr/local/lib/python2.7/dist-packages/webob/dec.py", line 159, in __call__
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack return 
resp(environ, start_response)
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack   File 
"/usr/local/lib/python2.7/dist-packages/webob/dec.py", line 159, in __call__
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack return 
resp(environ, start_response)
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack   File 
"/usr/local/lib/python2.7/dist-packages/routes/middleware.py", line 131, in 
__call__
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack response = 
self.app(environ, start_response)
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack   File 
"/usr/local/lib/python2.7/dist-packages/webob/dec.py", line 159, in __call__
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack return 
resp(environ, start_response)
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack   File 
"/usr/local/lib/python2.7/dist-packages/webob/dec.py", line 147, in __call__
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack resp = 
self.call_func(req, *args, **self.kwargs)
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack   File 
"/usr/local/lib/python2.7/dist-packages/webob/dec.py", line 208, in call_func
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack return 
self.func(req, *args, **kwargs)
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack   File 
"/usr/local/lib/python2.7/dist-packages/nova/api/openstack/wsgi.py", line 904, 
in __call__
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack content_type, 
body, accept)
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack   File 
"/usr/local/lib/python2.7/dist-packages/nova/api/openstack/wsgi.py", line 963, 
in _process_stack
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack action_result = 
self.dispatch(meth, request, action_args)
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack   File 
"/usr/local/lib/python2.7/dist-packages/nova/api/openstack/wsgi.py", line 1044, 
in dispatch
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack return 
method(req=request, **action_args)
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack   File 
"/usr/local/lib/python2.7/dist-packages/nova/api/openstack/compute/servers.py", 
line 505, in detail
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack servers = 
self._get_servers(req, is_detail=True)
  2013-11-12 11:12:24.137 26386 TRACE nova.api.openstack   File 
"/usr/local/lib/python2.7/dist-packages/nova/api/openstack/compute/servers.py", 
line 567, in _get_servers
  2013-11-12 11:12:24.137 26386 TRACE nova.api.op
