[Yahoo-eng-team] [Bug 1348103] [NEW] nova to neutron port notification fails in cells environment

2014-07-24 Thread Liam Young
Public bug reported:

When deploying OpenStack Icehouse on Ubuntu Trusty in a cells configuration,
the callback from neutron to nova that notifies nova when a port for an
instance is ready to be used seems to be lost. This causes the spawning
instance to go into an ERROR state and the following to appear in
nova-compute.log:

Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1714, 
in _spawn
block_device_info)
  File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 
2266, in spawn
block_device_info)
  File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 
3681, in _create_domain_and_network
raise exception.VirtualInterfaceCreateException()
VirtualInterfaceCreateException: Virtual Interface creation failed


Adding "vif_plugging_is_fatal = False" and "vif_plugging_timeout = 5" to the 
compute nodes stops the missing message from being fatal and guests can then be 
spawned normally and accessed over the network.
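
For reference, a minimal nova.conf sketch of that workaround on the compute
nodes (assuming the options live in the standard [DEFAULT] group, as they do
on this release):

    [DEFAULT]
    # Do not put the instance into ERROR if the vif-plugged event never arrives.
    vif_plugging_is_fatal = False
    # Wait only 5 seconds for the event instead of the default 300.
    vif_plugging_timeout = 5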

This issue doesn't present itself when deploying in a non-cell
configuration.

I'll attach logs from attempting to spawn a new guest (at about 07:52)
with:

nova boot --image precise --flavor m1.small --key_name test --nic net-id=b77ca278-6e00-4530-94fe-c946a6046acf server075238

where dc31c58f-e455-4a1a-b825-6777ccb8d3c1 is the resulting guest id

nova-cells                        1:2014.1.1-0ubuntu1
nova-api-ec2                      1:2014.1.1-0ubuntu1
nova-api-os-compute               1:2014.1.1-0ubuntu1
nova-cert                         1:2014.1.1-0ubuntu1
nova-common                       1:2014.1.1-0ubuntu1
nova-conductor                    1:2014.1.1-0ubuntu1
nova-objectstore                  1:2014.1.1-0ubuntu1
nova-scheduler                    1:2014.1.1-0ubuntu1
neutron-common                    1:2014.1.1-0ubuntu2
neutron-plugin-ml2                1:2014.1.1-0ubuntu2
neutron-server                    1:2014.1.1-0ubuntu2
neutron-plugin-openvswitch-agent  1:2014.1.1-0ubuntu2
openvswitch-common                2.0.1+git20140120-0ubuntu2
openvswitch-switch                2.0.1+git20140120-0ubuntu2

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1348103

Title:
  nova to neutron port notification fails in cells environment

Status in OpenStack Compute (Nova):
  New

Bug description:
  When deploying OpenStack Icehouse on Ubuntu Trusty in a cells configuration,
  the callback from neutron to nova that notifies nova when a port for an
  instance is ready to be used seems to be lost. This causes the spawning
  instance to go into an ERROR state and the following to appear in
  nova-compute.log:

  Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1714, 
in _spawn
  block_device_info)
File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 
2266, in spawn
  block_device_info)
File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 
3681, in _create_domain_and_network
  raise exception.VirtualInterfaceCreateException()
  VirtualInterfaceCreateException: Virtual Interface creation failed

  
  Adding "vif_plugging_is_fatal = False" and "vif_plugging_timeout = 5" to the 
compute nodes stops the missing message from being fatal and guests can then be 
spawned normally and accessed over the network.

  This issue doesn't present itself when deploying in a non-cell
  configuration.

  I'll attach logs from attempting to spawn a new guest (at about
  07:52) with:

  nova  boot --image precise --flavor m1.small --key_name test --nic
  net-id=b77ca278-6e00-4530-94fe-c946a6046acf server075238

  where dc31c58f-e455-4a1a-b825-6777ccb8d3c1 is the resulting guest id

  nova-cells           1:2014.1.1-0ubuntu1
  nova-api-ec2         1:2014.1.1-0ubuntu1
  nova-api-os-compute  1:2014.1.1-0ubuntu1
  nova-cert            1:2014.1.1-0ubuntu1
  nova-common          1:2014.1.1-0ubuntu1
  nova-conductor       1:2014.1.1-0ubuntu1
  nova-objectstore     1:2014.1.1-0ubuntu1
  nova-scheduler       1:2014.1.1-0ubuntu1
  neutron-common       1:2014.1.1-0ubuntu2
  neutron-plugin-ml2   1:2014.1.1-0ubuntu2
  neutron-server       1:2014.1.1-0ubuntu2

[Yahoo-eng-team] [Bug 1359805] [NEW] 'Requested operation is not valid: domain is not running' from check-tempest-dsvm-neutron-full

2014-08-21 Thread Liam Young
Public bug reported:

I received the following error from the check-tempest-dsvm-neutron-full
test suite after submitting a nova patch:

2014-08-21 14:11:25.059 | Captured traceback:
2014-08-21 14:11:25.059 | ~~~
2014-08-21 14:11:25.059 | Traceback (most recent call last):
2014-08-21 14:11:25.059 |   File 
"tempest/api/compute/servers/test_server_actions.py", line 407, in 
test_suspend_resume_server
2014-08-21 14:11:25.059 | 
self.client.wait_for_server_status(self.server_id, 'SUSPENDED')
2014-08-21 14:11:25.059 |   File 
"tempest/services/compute/xml/servers_client.py", line 390, in 
wait_for_server_status
2014-08-21 14:11:25.059 | raise_on_error=raise_on_error)
2014-08-21 14:11:25.059 |   File "tempest/common/waiters.py", line 77, in 
wait_for_server_status
2014-08-21 14:11:25.059 | server_id=server_id)
2014-08-21 14:11:25.059 | BuildErrorException: Server 
a29ec7be-be83-4247-b7db-49bd4727d206 failed to build and is in ERROR status
2014-08-21 14:11:25.059 | Details: {'message': 'Requested operation is not 
valid: domain is not running', 'code': '500', 'details': 'None', 'created': 
'2014-08-21T13:49:49Z'}

** Affects: neutron
 Importance: Undecided
 Status: New

** Attachment added: "check-tempest-dsvm-neutron-full-console.txt"
   
https://bugs.launchpad.net/bugs/1359805/+attachment/4183601/+files/check-tempest-dsvm-neutron-full-console.txt

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1359805

Title:
  'Requested operation is not valid: domain is not running' from check-
  tempest-dsvm-neutron-full

Status in OpenStack Neutron (virtual network service):
  New

Bug description:
  I received the following error from the check-tempest-dsvm-neutron-
  full test suite after submitting a nova patch:

  2014-08-21 14:11:25.059 | Captured traceback:
  2014-08-21 14:11:25.059 | ~~~
  2014-08-21 14:11:25.059 | Traceback (most recent call last):
  2014-08-21 14:11:25.059 |   File 
"tempest/api/compute/servers/test_server_actions.py", line 407, in 
test_suspend_resume_server
  2014-08-21 14:11:25.059 | 
self.client.wait_for_server_status(self.server_id, 'SUSPENDED')
  2014-08-21 14:11:25.059 |   File 
"tempest/services/compute/xml/servers_client.py", line 390, in 
wait_for_server_status
  2014-08-21 14:11:25.059 | raise_on_error=raise_on_error)
  2014-08-21 14:11:25.059 |   File "tempest/common/waiters.py", line 77, in 
wait_for_server_status
  2014-08-21 14:11:25.059 | server_id=server_id)
  2014-08-21 14:11:25.059 | BuildErrorException: Server 
a29ec7be-be83-4247-b7db-49bd4727d206 failed to build and is in ERROR status
  2014-08-21 14:11:25.059 | Details: {'message': 'Requested operation is 
not valid: domain is not running', 'code': '500', 'details': 'None', 'created': 
'2014-08-21T13:49:49Z'}

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1359805/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1314677] Re: nova-cells fails when using JSON file to store cell information

2014-09-12 Thread Liam Young
** Also affects: nova (Ubuntu)
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1314677

Title:
  nova-cells fails when using JSON file to store cell information

Status in OpenStack Compute (Nova):
  Fix Released
Status in “nova” package in Ubuntu:
  New
Status in “nova” source package in Trusty:
  New

Bug description:
  As recommended in http://docs.openstack.org/havana/config-
  reference/content/section_compute-cells.html#cell-config-optional-json
  I'm creating the nova-cells config with the cell information stored in
  a json file. However, when I do this nova-cells fails to start with
  this error in the logs:

  2014-04-29 11:52:05.240 16759 CRITICAL nova [-] __init__() takes exactly 3 
arguments (1 given)
  2014-04-29 11:52:05.240 16759 TRACE nova Traceback (most recent call last):
  2014-04-29 11:52:05.240 16759 TRACE nova   File "/usr/bin/nova-cells", line 10, in <module>
  2014-04-29 11:52:05.240 16759 TRACE nova sys.exit(main())
  2014-04-29 11:52:05.240 16759 TRACE nova   File 
"/usr/lib/python2.7/dist-packages/nova/cmd/cells.py", line 40, in main
  2014-04-29 11:52:05.240 16759 TRACE nova manager=CONF.cells.manager)
  2014-04-29 11:52:05.240 16759 TRACE nova   File 
"/usr/lib/python2.7/dist-packages/nova/service.py", line 257, in create
  2014-04-29 11:52:05.240 16759 TRACE nova db_allowed=db_allowed)
  2014-04-29 11:52:05.240 16759 TRACE nova   File 
"/usr/lib/python2.7/dist-packages/nova/service.py", line 139, in __init__
  2014-04-29 11:52:05.240 16759 TRACE nova self.manager = 
manager_class(host=self.host, *args, **kwargs)
  2014-04-29 11:52:05.240 16759 TRACE nova   File 
"/usr/lib/python2.7/dist-packages/nova/cells/manager.py", line 87, in __init__
  2014-04-29 11:52:05.240 16759 TRACE nova self.state_manager = 
cell_state_manager()
  2014-04-29 11:52:05.240 16759 TRACE nova TypeError: __init__() takes exactly 
3 arguments (1 given)

  
  I have had a dig into the code and it appears that CellsManager creates an
  instance of CellStateManager with no arguments. CellStateManager.__new__ runs
  and creates an instance of CellStateManagerFile, whose __new__ and __init__
  run with cell_state_cls and cells_config_path set. At this point __new__
  returns the CellStateManagerFile instance, so Python invokes that instance's
  __init__ method (CellStateManagerFile.__init__) again with the original
  arguments (there weren't any), which then results in the stack trace.
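
  The behaviour is standard Python rather than anything nova-specific: when
  __new__ returns an instance of (a subclass of) the class being constructed,
  Python calls that instance's __init__ with the arguments from the original
  call. A minimal, self-contained sketch of the failure mode (illustrative
  only, not the actual nova code):

      class Manager(object):
          def __new__(cls, state_cls=None, config_path=None):
              if cls is Manager:
                  # Pick the concrete implementation and build it with the
                  # extra argument it needs.
                  return FileManager(state_cls, '/etc/nova/cells.json')
              return super(Manager, cls).__new__(cls)

      class FileManager(Manager):
          def __init__(self, state_cls, config_path):
              self.config_path = config_path

      # Python re-runs FileManager.__init__ on the returned instance with the
      # original (empty) argument list:
      Manager()  # TypeError: __init__() takes exactly 3 arguments (1 given)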

  It seems reasonable for CellStateManagerFile to derive the
  cells_config_path info for itself so I've patched it locally with

  === modified file 'state.py'
  --- state.py  2014-04-30 15:10:16 +
  +++ state.py  2014-04-30 15:10:26 +
  @@ -155,7 +155,7 @@
               config_path = CONF.find_file(cells_config)
               if not config_path:
                   raise cfg.ConfigFilesNotFoundError(config_files=[cells_config])
  -            return CellStateManagerFile(cell_state_cls, config_path)
  +            return CellStateManagerFile(cell_state_cls)
  
           return CellStateManagerDB(cell_state_cls)
  
  @@ -450,7 +450,9 @@
  
  
   class CellStateManagerFile(CellStateManager):
  -    def __init__(self, cell_state_cls, cells_config_path):
  +    def __init__(self, cell_state_cls=None):
  +        cells_config = CONF.cells.cells_config
  +        cells_config_path = CONF.find_file(cells_config)
           self.cells_config_path = cells_config_path
           super(CellStateManagerFile, self).__init__(cell_state_cls)
   

  
  Ubuntu: 14.04
  nova-cells: 1:2014.1-0ubuntu1

  nova.conf:

  [DEFAULT]
  dhcpbridge_flagfile=/etc/nova/nova.conf
  dhcpbridge=/usr/bin/nova-dhcpbridge
  logdir=/var/log/nova
  state_path=/var/lib/nova
  lock_path=/var/lock/nova
  force_dhcp_release=True
  iscsi_helper=tgtadm
  libvirt_use_virtio_for_bridges=True
  connection_type=libvirt
  root_helper=sudo nova-rootwrap /etc/nova/rootwrap.conf
  verbose=True
  ec2_private_dns_show_ip=True
  api_paste_config=/etc/nova/api-paste.ini
  volumes_path=/var/lib/nova/volumes
  enabled_apis=ec2,osapi_compute,metadata
  auth_strategy=keystone
  compute_driver=libvirt.LibvirtDriver
  quota_driver=nova.quota.NoopQuotaDriver

  
  [cells]
  enable=True
  name=cell
  cell_type=compute
  cells_config=/etc/nova/cells.json

  
  cells.json:
  {
      "parent": {
          "name": "parent",
          "api_url": "http://api.example.com:8774",
          "transport_url": "rabbit://rabbit.example.com",
          "weight_offset": 0.0,
          "weight_scale": 1.0,
          "is_parent": true
      }
  }

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1314677/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.l

[Yahoo-eng-team] [Bug 1943863] Re: DPDK instances are failing to start: Failed to bind socket to /run/libvirt-vhost-user/vhu3ba44fdc-7c: No such file or directory

2021-09-22 Thread Liam Young
https://github.com/openstack-charmers/charm-layer-ovn/pull/52

** Also affects: neutron
   Importance: Undecided
   Status: New

** No longer affects: neutron

** No longer affects: neutron (Ubuntu)

** Also affects: charm-layer-ovn
   Importance: Undecided
   Status: New

** Changed in: charm-layer-ovn
   Status: New => Confirmed

** Changed in: charm-layer-ovn
   Importance: Undecided => High

** Changed in: charm-layer-ovn
 Assignee: (unassigned) => Liam Young (gnuoy)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1943863

Title:
  DPDK instances are failing to start: Failed to bind socket to
  /run/libvirt-vhost-user/vhu3ba44fdc-7c: No such file or directory

Status in charm-layer-ovn:
  Confirmed
Status in OpenStack nova-compute charm:
  Invalid

Bug description:
  == Env
  focal/ussuri + ovn, latest stable charms
  juju status: https://paste.ubuntu.com/p/2725tV47ym/
  Hardware: Huawei CH121 V5 with MZ532,4*25GE Mezzanine Card,PCIE 3.0 X16 NICs 
+ manually installed PMD for DPDK enablement (librte-pmd-hinic20.0 package)
   
  == Problem description

  A DPDK instance can't be launched after a fresh deployment
  (focal/ussuri + OVN, latest stable charms), raising the error below:

  $ os server show dpdk-test-instance -f yaml
  OS-DCF:diskConfig: MANUAL
  OS-EXT-AZ:availability_zone: ''
  OS-EXT-SRV-ATTR:host: null
  OS-EXT-SRV-ATTR:hypervisor_hostname: null
  OS-EXT-SRV-ATTR:instance_name: instance-0218
  OS-EXT-STS:power_state: NOSTATE
  OS-EXT-STS:task_state: null
  OS-EXT-STS:vm_state: error
  OS-SRV-USG:launched_at: null
  OS-SRV-USG:terminated_at: null
  accessIPv4: ''
  accessIPv6: ''
  addresses: ''
  config_drive: 'True'
  created: '2021-09-15T18:51:00Z'
  fault:
    code: 500
    created: '2021-09-15T18:52:01Z'
    details: "Traceback (most recent call last):\n  File 
\"/usr/lib/python3/dist-packages/nova/conductor/manager.py\"\
  , line 651, in build_instances\nscheduler_utils.populate_retry(\n  
File \"\
  /usr/lib/python3/dist-packages/nova/scheduler/utils.py\", line 919, in 
populate_retry\n\
  \raise 
exception.MaxRetriesExceeded(reason=msg)\nnova.exception.MaxRetriesExceeded:\
  \ Exceeded maximum number of retries. Exceeded max scheduling attempts 3 
for instance\
  \ 1bb2d1b7-e2e9-4d76-a346-a9b06ff22c73. Last exception: internal error: 
process\
  \ exited while connecting to monitor: 2021-09-15T18:51:53.485265Z 
qemu-system-x86_64:\
  \ -chardev 
socket,id=charnet0,path=/run/libvirt-vhost-user/vhu3ba44fdc-7c,server:\
  \ Failed to bind socket to /run/libvirt-vhost-user/vhu3ba44fdc-7c: No 
such file\
  \ or directory\n"
    message: 'Exceeded maximum number of retries. Exceeded max scheduling 
attempts 3
  for instance 1bb2d1b7-e2e9-4d76-a346-a9b06ff22c73. Last exception: 
internal error:
  process exited while connecting to monitor: 2021-09-15T18:51:53.485265Z 
qemu-system-x86_64:
  -chardev '
  flavor: m1.medium.project.dpdk (4f452aa3-2b2c-4f2e-8465-5e3c2d8ec3f1)
  hostId: ''
  id: 1bb2d1b7-e2e9-4d76-a346-a9b06ff22c73
  image: auto-sync/ubuntu-bionic-18.04-amd64-server-20210907-disk1.img 
(3851450e-e73d-489b-a356-33650690ed7a)
  key_name: ubuntu-keypair
  name: dpdk-test-instance
  project_id: cdade870811447a89e2f0199373a0d95
  properties: ''
  status: ERROR
  updated: '2021-09-15T18:52:01Z'
  user_id: 13a0e7862c6641eeaaebbde1ae096f9e
  volumes_attached: ''

  For the record, a "generic" instances (e.g non-DPDK/non-SRIOV) are
  scheduling/starting without any issues.

  == Steps to reproduce

  openstack network create --external --provider-network-type vlan 
--provider-segment xxx --provider-physical-network dpdkfabric ext_net_dpdk
  openstack subnet create --allocation-pool start=,end= 
--network ext_net_dpdk --subnet-range /23 --gateway  
--no-dhcp ext_net_dpdk_subnet

  openstack aggregate create --zone nova dpdk
  openstack aggregate set --property dpdk=true dpdk

  openstack aggregate add host dpdk 

  openstack aggregate show dpdk --max-width=80

  openstack flavor set --property
  aggregate_instance_extra_specs:dpdk=true --property
  hw:mem_page_size=large m1.medium.dpdk

  openstack server create --config-drive true --network ext_net_dpdk
  --key-name ubuntu-keypair --image focal --flavor m1.medium.dpdk dpdk-
  test-instance

  == Analysis
  [before redeployment] nova-compute log : 
https://pastebin.canonical.com/p/FgPYNb3bPj/
  [fresh deployment] juju crashdump: 
https://drive.google.com/file/d/1W_w3CAUq4ggp4alDnpCk08mSaCL6Uaxk/view?usp=sharing

  

  # ovs-vsctl get open_vswitch . other_config
  {dpdk-extra="--pci-whitelist :3e:00.0 --pci-whitelist :40:00.0", 
dpdk-init="true", dpdk-lco

[Yahoo-eng-team] [Bug 1964117] [NEW] Unable to contact to IPv6 instance using ml2 ovs with ovs 2.16

2022-03-08 Thread Liam Young
Public bug reported:

Connectivity is fine with OVS 2.15 but after upgrading ovs, connectivity
is lost to remote units over ipv6. The traffic appears to be lost while
being processed by the openflow firewall associated with br-int.

The description below uses connectivity between Octavia units and
amphora to illustrate the issue but I don't think this issue is related
to Octavia.

OS: Ubuntu Focal
OVS: 2.16.0-0ubuntu2.1~cloud0
Kernel: 5.4.0-100-generic

With a fresh install of xena or after an upgrade of OVS from 2.15 (wallaby) to 
2.16 (xena) connectivity from the octavia units to the amphora is broken.
* Wallaby works as expected
* Disabling port security on the octavia units'
octavia-health-manager-octavia-N-listen-port ports restores connectivity.
* The flows on br-int and br-tun are the same after the upgrade from 2.15 to 
2.16
* Manually inserting permissive flows into the br-int flow table also restores 
connectivity.
* Testing environment is Openstack on top of Openstack.
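
A minimal sketch of the port-security workaround mentioned above (the port is
identified by name here; clearing the security groups first is normally
required before port security can be disabled):

    openstack port set --no-security-group --disable-port-security \
        octavia-health-manager-octavia-0-listen-port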

Text below is reproduced here https://pastebin.ubuntu.com/p/hRWMx7d9HG/
as it may be easier to read in a pastebin.

Below is a reproduction of the issue, first deploying wallaby to validate
connectivity before upgrading openvswitch.

Amphora:

$ openstack loadbalancer amphora list
+--------------------------------------+--------------------------------------+-----------+--------+-----------------------------------------+-------------+
| id                                   | loadbalancer_id                      | status    | role   | lb_network_ip                           | ha_ip       |
+--------------------------------------+--------------------------------------+-----------+--------+-----------------------------------------+-------------+
| 30afe97a-bcd4-4537-a621-830de87568b0 | ae840c86-768d-4aae-b804-8fddf2880c78 | ALLOCATED | MASTER | fc00:92e3:d18a:36ed:f816:3eff:fed2:32e0 | 10.42.0.254 |
| 61e66eff-e83b-4a21-bc1f-1e1a0037b191 | ae840c86-768d-4aae-b804-8fddf2880c78 | ALLOCATED | BACKUP | fc00:92e3:d18a:36ed:f816:3eff:fe69:c85b | 10.42.0.254 |
+--------------------------------------+--------------------------------------+-----------+--------+-----------------------------------------+-------------+

$ openstack router show lb-mgmt -c name -c interfaces_info
+-----------------+----------------------------------------------------------------------------------------------------------------------------------------------------+
| Field           | Value                                                                                                                                              |
+-----------------+----------------------------------------------------------------------------------------------------------------------------------------------------+
| interfaces_info | [{"port_id": "191a2d27-9b15-4938-a818-b48fc405a27a", "ip_address": "fc00:92e3:d18a:36ed::", "subnet_id": "8b4307a7-08a1-4f2b-a7e0-ce45a7ad0b03"}] |
| name            | lb-mgmt                                                                                                                                            |
+-----------------+----------------------------------------------------------------------------------------------------------------------------------------------------+

Looking at ports on that subnet there is a port for each of the octavia units 
(named octavia-health-manager-octavia-N-listen-port ), a port on
each of the amphora listed above and a port for the lb-mgmt router.

$ openstack port list | grep 8b4307a7-08a1-4f2b-a7e0-ce45a7ad0b03
| 0943521f-2c1f-4152-8250-48d310e3918f | 
octavia-health-manager-octavia-1-listen-port | fa:16:3e:70:70:c9 | 
ip_address='fc00:92e3:d18a:36ed:f816:3eff:fe70:70c9', 
subnet_id='8b4307a7-08a1-4f2b-a7e0-ce45a7ad0b03' | ACTIVE |
| 160b8854-0f20-471b-9ac4-53f8891f4edb |
  | fa:16:3e:45:7a:a6 | 
ip_address='fc00:92e3:d18a:36ed:f816:3eff:fe45:7aa6', 
subnet_id='8b4307a7-08a1-4f2b-a7e0-ce45a7ad0b03' | ACTIVE |
| 191a2d27-9b15-4938-a818-b48fc405a27a |
  | fa:16:3e:3e:bd:45 | ip_address='fc00:92e3:d18a:36ed::', 
subnet_id='8b4307a7-08a1-4f2b-a7e0-ce45a7ad0b03'   | ACTIVE |
| 2428b1d4-0cb2-420b-81a5-5e6ae34e4557 | 
octavia-health-manager-octavia-2-listen-port | fa:16:3e:05:f3:2a | 
ip_address='fc00:92e3:d18a:36ed:f816:3eff:fe05:f32a', 
subnet_id='8b4307a7-08a1-4f2b-a7e0-ce45a7ad0b03' | ACTIVE |
| 2ea37e19-bd60-43cb-8191-aaf179667b1a |
  | fa:16:3e:d2:32:e0 | 
ip_address='fc00:92e3:d18a:36ed:f816:3eff:fed2:32e0', 
subnet_id='8b4307a7-08a1-4f2b-a7e0-ce45a7ad0b03' | ACTIVE |
| 76742ab6-39ee-4b06-a37d-f2ecad2c892a | 
octavia-health-manager-octavia-0-listen-port | fa:16:3e:79:b6:46 | 
ip_address='fc00:92e3:d18a:36ed:f816:3eff:fe79:b646', 
subnet_id='8b4307a7-08a1-4f2b-a7e0-ce45a7ad0b03' | ACTIVE |
| ffb3d106-7a14-4b4e-8300-2dd9ec9bc

[Yahoo-eng-team] [Bug 1964117] Re: Unable to contact to IPv6 instance using ml2 ovs with ovs 2.16

2022-03-11 Thread Liam Young
The issue seems to be in ovs, specifically this commit
https://github.com/openvswitch/ovs/commit/355fef6f2ccbcf78797b938421cb4cef9b59af13
. I have created a ppa
https://launchpad.net/~gnuoy/+archive/ubuntu/focal-xena/+packages that
has a copy of the openvswitch package from the xena-proposed UCA. The
only change I have made is backing out that commit (and temporarily
disabling auto pkg tests).

The following pastebin shows:
1) checking connectivity with ovs 2.15
2) upgrading to 2.16 and seeing that connectivity is broken
3) upgrading to 2.16 with 355fef6f2 reverted and seeing connectivity is restored


https://paste.ubuntu.com/p/nSHjRZzbmp/

** Also affects: openvswitch
   Importance: Undecided
   Status: New

** Changed in: neutron
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1964117

Title:
  Unable to contact to IPv6 instance using ml2 ovs with ovs 2.16

Status in neutron:
  Invalid
Status in openvswitch:
  New

Bug description:
  Connectivity is fine with OVS 2.15 but after upgrading ovs,
  connectivity is lost to remote units over ipv6. The traffic appears to
  be lost while being processed by the openflow firewall associated with
  br-int.

  The description below uses connectivity between Octavia units and
  amphora to illustrate the issue but I don't think this issue is
  related to Octavia.

  OS: Ubuntu Focal
  OVS: 2.16.0-0ubuntu2.1~cloud0
  Kernel: 5.4.0-100-generic

  With a fresh install of xena or after an upgrade of OVS from 2.15 (wallaby) 
to 2.16 (xena) connectivity from the octavia units to the amphora is broken.
  * Wallaby works as expected
  * Disabling port security on the octavia units 
octavia-health-manager-octavia-N-listen-port restores connectivity.
  * The flows on br-int and br-tun are the same after the upgrade from 2.15 to 
2.16
  * Manually inserting permissive flows into the br-int flow table also 
restores connectivity.
  * Testing environment is Openstack on top of Openstack.

  Text below is reproduced here
  https://pastebin.ubuntu.com/p/hRWMx7d9HG/ as it may be easier to read
  in a pastebin.

  Below is a reproduction of the issue, first deploying wallaby to validate
  connectivity before upgrading openvswitch.

  Amphora:

  $ openstack loadbalancer amphora list
  
  +--------------------------------------+--------------------------------------+-----------+--------+-----------------------------------------+-------------+
  | id                                   | loadbalancer_id                      | status    | role   | lb_network_ip                           | ha_ip       |
  +--------------------------------------+--------------------------------------+-----------+--------+-----------------------------------------+-------------+
  | 30afe97a-bcd4-4537-a621-830de87568b0 | ae840c86-768d-4aae-b804-8fddf2880c78 | ALLOCATED | MASTER | fc00:92e3:d18a:36ed:f816:3eff:fed2:32e0 | 10.42.0.254 |
  | 61e66eff-e83b-4a21-bc1f-1e1a0037b191 | ae840c86-768d-4aae-b804-8fddf2880c78 | ALLOCATED | BACKUP | fc00:92e3:d18a:36ed:f816:3eff:fe69:c85b | 10.42.0.254 |
  +--------------------------------------+--------------------------------------+-----------+--------+-----------------------------------------+-------------+

  $ openstack router show lb-mgmt -c name -c interfaces_info
  
  +-----------------+----------------------------------------------------------------------------------------------------------------------------------------------------+
  | Field           | Value                                                                                                                                              |
  +-----------------+----------------------------------------------------------------------------------------------------------------------------------------------------+
  | interfaces_info | [{"port_id": "191a2d27-9b15-4938-a818-b48fc405a27a", "ip_address": "fc00:92e3:d18a:36ed::", "subnet_id": "8b4307a7-08a1-4f2b-a7e0-ce45a7ad0b03"}] |
  | name            | lb-mgmt                                                                                                                                            |
  +-----------------+----------------------------------------------------------------------------------------------------------------------------------------------------+

  Looking at ports on that subnet there is a port for each of the octavia units 
(named octavia-health-manager-octavia-N-listen-port ), a port on
  each of the amphora listed above and a port for the lb-mgmt router.

  $ openstack port list | grep 8b4307a7-08a1-4f2b-a7e0-ce45a7ad0b03
  | 0943521f-2c1f-4152-8250-48d310e3918f | 
octavia-health-manager-octavia-1-listen-port | fa:16:3e:70:70:c9 | 
ip_address='fc00:92e3:d18a:36ed:f816:3eff:fe70:70c9', 
subnet_id='8b4307a7-08a1-4f2b-a7e0-ce45a7ad0b03' | ACTIVE |
  | 160b8854-0f20-471b-9ac4-53f8891f4edb | 

[Yahoo-eng-team] [Bug 1826382] Re: Updates to placement api fail if placement endpoint changes

2019-04-25 Thread Liam Young
pute.manager 
self._update_to_placement(context, compute_node)
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager   File 
"/usr/lib/python3/dist-packages/nova/compute/resource_tracker.py", line 912, in 
_update_to_placement
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager context, 
compute_node.uuid, name=compute_node.hypervisor_hostname)
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager   File 
"/usr/lib/python3/dist-packages/nova/scheduler/client/__init__.py", line 35, in 
__run_method
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager return 
getattr(self.instance, __name)(*args, **kwargs)
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager   File 
"/usr/lib/python3/dist-packages/nova/scheduler/client/report.py", line 1006, in 
get_provider_tree_and_ensure_root
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager 
parent_provider_uuid=parent_provider_uuid)
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager   File 
"/usr/lib/python3/dist-packages/nova/scheduler/client/report.py", line 668, in 
_ensure_resource_provider
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager rps_to_refresh = 
self._get_providers_in_tree(context, uuid)
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager   File 
"/usr/lib/python3/dist-packages/nova/scheduler/client/report.py", line 74, in 
wrapper
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager return f(self, 
*a, **k)
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager   File 
"/usr/lib/python3/dist-packages/nova/scheduler/client/report.py", line 535, in 
_get_providers_in_tree
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager raise 
exception.ResourceProviderRetrievalFailed(uuid=uuid)
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager 
nova.exception.ResourceProviderRetrievalFailed: Failed to get resource provider 
with UUID 4f7c6844-d3b8-4710-be2c-8691a93fb58b
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager

** Also affects: nova
   Importance: Undecided
   Status: New

** Changed in: nova
 Assignee: (unassigned) => Liam Young (gnuoy)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1826382

Title:
  Updates to placement api fail if placement endpoint changes

Status in OpenStack nova-compute charm:
  New
Status in OpenStack Compute (nova):
  New

Bug description:
  If the url of the placement api changes after nova-compute has been
  started then placement updates fail as nova-compute appears to cache
  the old endpoint url.

  To reproduce, update the placement endpoint to something incorrect in
  keystone and restart nova-compute. Errors contacting the placement api
  will be reported every minute or so. Now, correct the entry in
  keystone. The errors will continue despite the catalogue now being
  correct. Restarting nova-compute fixes the issue.
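
  For illustration, the catalogue change can be made with
  python-openstackclient (the endpoint ID and URL below are placeholders, not
  values from this deployment):

      openstack endpoint list --service placement
      openstack endpoint set --url https://placement.example.com:8778 <endpoint-id>

  Only a restart of nova-compute makes it pick the corrected URL up.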

  In my deployment this occurred when the placement end point switched
  from http to https after the nova-compute node had started. This
  resulted in the following in the nova-compute log:

  2019-04-25 09:58:12.175 31793 ERROR nova.scheduler.client.report 
[req-18b4f522-e702-4ee1-ba85-e565c8e9ac1e - - - - -] [None] Failed to retrieve 
resource provider tree from placement API for UUID 
4f7c6844-d3b8-4710-be2c-8691a93fb58b. Got 400: 
  
  400 Bad Request
  
  Bad Request
  Your browser sent a request that this server could not understand.
  Reason: You're speaking plain HTTP to an SSL-enabled server port.
   Instead use the HTTPS scheme to access this URL, please.
  
  
  Apache/2.4.29 (Ubuntu) Server at 10.5.0.36 Port 443
  
  .
  2019-04-25 09:58:12.176 31793 DEBUG oslo_concurrency.lockutils 
[req-18b4f522-e702-4ee1-ba85-e565c8e9ac1e - - - - -] Lock "compute_resources" 
released by 
"nova.compute.resource_tracker.ResourceTracker._update_available_resource" :: 
held 0.099s inner 
/usr/lib/python3/dist-packages/oslo_concurrency/lockutils.py:285
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager 
[req-18b4f522-e702-4ee1-ba85-e565c8e9ac1e - - - - -] Error updating resources 
for node juju-7a9f5c-zaza-19a393f3689b-16.project.serverstack.: 
nova.exception.ResourceProviderRetrievalFailed: Failed to get resource provider 
with UUID 4f7c6844-d3b8-4710-be2c-8691a93fb58b
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager Traceback (most 
recent call last):
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager   File 
"/usr/lib/python3/dist-packages/nova/compute/manager.py", line 7778, in 
_update_available_resource_for_node
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager 
rt.update_available_resource(context, nodename)
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager   File 
"/usr/lib/python3/dist-packages/

[Yahoo-eng-team] [Bug 1826382] Re: Updates to placement api fail if placement endpoint changes

2020-10-19 Thread Liam Young
Work around in the charm committed here:

https://review.opendev.org/#/c/755089
https://review.opendev.org/#/c/755076/

** Also affects: charm-keystone
   Importance: Undecided
   Status: New

** Also affects: charm-nova-cloud-controller
   Importance: Undecided
   Status: New

** Changed in: nova
 Assignee: Liam Young (gnuoy) => (unassigned)

** Changed in: charm-keystone
 Assignee: (unassigned) => Liam Young (gnuoy)

** Changed in: charm-nova-cloud-controller
 Assignee: (unassigned) => Liam Young (gnuoy)

** Changed in: charm-nova-compute
   Status: Triaged => Invalid

** Changed in: charm-keystone
   Status: New => Fix Committed

** Changed in: charm-nova-cloud-controller
   Status: New => Fix Committed

** Changed in: charm-keystone
   Importance: Undecided => High

** Changed in: charm-nova-cloud-controller
   Importance: Undecided => High

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1826382

Title:
  Updates to placement api fail if placement endpoint changes

Status in OpenStack keystone charm:
  Fix Committed
Status in OpenStack nova-cloud-controller charm:
  Fix Committed
Status in OpenStack nova-compute charm:
  Invalid
Status in OpenStack Compute (nova):
  Triaged

Bug description:
  If the url of the placement api changes after nova-compute has been
  started then placement updates fail as nova-compute appears to cache
  the old endpoint url.

  To reproduce, update the placement endpoint to something incorrect in
  keystone and restart nova-compute. Errors contacting the placement api
  will be reported every minute or so. Now, correct the entry in
  keystone. The errors will continue despite the catalogue now being
  correct. Restarting nova-compute fixes the issue.

  In my deployment this occurred when the placement end point switched
  from http to https after the nova-compute node had started. This
  resulted in the following in the nova-compute log:

  2019-04-25 09:58:12.175 31793 ERROR nova.scheduler.client.report 
[req-18b4f522-e702-4ee1-ba85-e565c8e9ac1e - - - - -] [None] Failed to retrieve 
resource provider tree from placement API for UUID 
4f7c6844-d3b8-4710-be2c-8691a93fb58b. Got 400: 
  
  400 Bad Request
  
  Bad Request
  Your browser sent a request that this server could not understand.
  Reason: You're speaking plain HTTP to an SSL-enabled server port.
   Instead use the HTTPS scheme to access this URL, please.
  
  
  Apache/2.4.29 (Ubuntu) Server at 10.5.0.36 Port 443
  
  .
  2019-04-25 09:58:12.176 31793 DEBUG oslo_concurrency.lockutils 
[req-18b4f522-e702-4ee1-ba85-e565c8e9ac1e - - - - -] Lock "compute_resources" 
released by 
"nova.compute.resource_tracker.ResourceTracker._update_available_resource" :: 
held 0.099s inner 
/usr/lib/python3/dist-packages/oslo_concurrency/lockutils.py:285
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager 
[req-18b4f522-e702-4ee1-ba85-e565c8e9ac1e - - - - -] Error updating resources 
for node juju-7a9f5c-zaza-19a393f3689b-16.project.serverstack.: 
nova.exception.ResourceProviderRetrievalFailed: Failed to get resource provider 
with UUID 4f7c6844-d3b8-4710-be2c-8691a93fb58b
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager Traceback (most 
recent call last):
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager   File 
"/usr/lib/python3/dist-packages/nova/compute/manager.py", line 7778, in 
_update_available_resource_for_node
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager 
rt.update_available_resource(context, nodename)
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager   File 
"/usr/lib/python3/dist-packages/nova/compute/resource_tracker.py", line 721, in 
update_available_resource
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager 
self._update_available_resource(context, resources)
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager   File 
"/usr/lib/python3/dist-packages/oslo_concurrency/lockutils.py", line 274, in 
inner
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager return f(*args, 
**kwargs)
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager   File 
"/usr/lib/python3/dist-packages/nova/compute/resource_tracker.py", line 798, in 
_update_available_resource
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager 
self._update(context, cn)
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager   File 
"/usr/lib/python3/dist-packages/retrying.py", line 49, in wrapped_f
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager return 
Retrying(*dargs, **dkw).call(f, *args, **kw)
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager   File 
"/usr/lib/python3/dist-packages/retrying.py", line 206, in call
  2019-04-25 09:58:12.177 31793 ERROR nova.compute.manager   

[Yahoo-eng-team] [Bug 1896603] Re: ovn-octavia-provider: Cannot create listener due to alowed_cidrs validation

2021-01-27 Thread Liam Young
** Also affects: ovn-octavia-provider (Ubuntu)
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1896603

Title:
  ovn-octavia-provider: Cannot create listener due to alowed_cidrs
  validation

Status in neutron:
  Fix Released
Status in ovn-octavia-provider package in Ubuntu:
  New

Bug description:
  Kuryr-Kubernetes tests running with ovn-octavia-provider started to
  fail with "Provider 'ovn' does not support a requested option: OVN
  provider does not support allowed_cidrs option" showing up in the
  o-api logs.

  We've tracked that to check [1] getting introduced. Apparently it's
  broken and makes the request explode even if the property isn't set at
  all. Please take a look at output from python-openstackclient [2]
  where body I used is just '{"listener": {"loadbalancer_id": "faca9a1b-
  30dc-45cb-80ce-2ab1c26b5521", "protocol": "TCP", "protocol_port": 80,
  "admin_state_up": true}}'.

  Also this is all over your gates as well, see o-api log [3]. Somehow
  ovn-octavia-provider tests skip 171 results there, so that's why it's
  green.

  [1] 
https://opendev.org/openstack/ovn-octavia-provider/src/branch/master/ovn_octavia_provider/driver.py#L142
  [2] http://paste.openstack.org/show/798197/
  [3] 
https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_4ba/751085/7/gate/ovn-octavia-provider-v2-dsvm-scenario/4bac575/controller/logs/screen-o-api.txt

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1896603/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1785235] [NEW] metadata retrieval fails when using a global nova-api-metadata service

2018-08-03 Thread Liam Young
Public bug reported:

Description
===
The nova-api-metadata service fails to provide metadata to guests
when it is providing metadata for multiple cells.


Steps to reproduce
==
Deploy an environment with multiple cells and a single nova-api-metadata
service. Requests by the guests for metadata will fail.

Expected result
===
Guests would get metadata.

Actual result
=
Guests do not get metadata, they get a 404.
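
For illustration, from inside a guest (169.254.169.254 is the standard
metadata address; the exact path queried is just an example):

    curl -i http://169.254.169.254/openstack/latest/meta_data.json
    # returns a 404 instead of the instance metadata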


Environment
===
1. Exact version of OpenStack you are running.

$ dpkg -l | grep nova-comm
ii  nova-common  2:17.0.5-0ubuntu1~cloud0   
all  OpenStack Compute - common files


2. Which hypervisor did you use?
Libvirt + KVM

2. Which storage type did you use?
n/a

3. Which networking type did you use?
Neutron with OpenVSwitch

** Affects: nova
 Importance: Undecided
 Assignee: Liam Young (gnuoy)
 Status: New

** Changed in: nova
 Assignee: (unassigned) => Liam Young (gnuoy)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1785235

Title:
  metadata retrieval fails when using a global  nova-api-metadata
  service

Status in OpenStack Compute (nova):
  New

Bug description:
  Description
  ===
  The nova-api-metadata service fails to provide metadata to guests
  when it is providing metadata for multiple cells.

  
  Steps to reproduce
  ==
  Deploy an environment with multiple cells and a single
  nova-api-metadata service. Requests by the guests for metadata will fail.

  Expected result
  ===
  Guests would get metadata.

  Actual result
  =
  Guests do not get metadata, they get a 404.

  
  Environment
  ===
  1. Exact version of OpenStack you are running.

  $ dpkg -l | grep nova-comm
  ii  nova-common  2:17.0.5-0ubuntu1~cloud0 
  all  OpenStack Compute - common files

  
  2. Which hypervisor did you use?
  Libvirt + KVM

  2. Which storage type did you use?
  n/a

  3. Which networking type did you use?
  Neutron with OpenVSwitch

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1785235/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1785237] [NEW] The section on the Neutron Metadata API proxy in cellsv2-layout.html is confusing and possibly wrong

2018-08-03 Thread Liam Young
Public bug reported:

I found it hard to understand what configuration was needed when
reading:

"The Neutron metadata API proxy should be global across all cells, and
thus be configured as an API-level service with access to the
[api_database]/connection information."

Which service is it referring to ns-metadata-proxy, neutron-metadata-agent or nova-api-metadata?

Given that the 'api_database' section is only valid for nova, that would
suggest it's the nova-api-metadata, but the nova-api-metadata receives all
its data via RPC (as far as I can tell), so it doesn't seem to need the
api_database section.

** Affects: nova
 Importance: Undecided
 Assignee: Liam Young (gnuoy)
 Status: In Progress

** Changed in: nova
 Assignee: (unassigned) => Liam Young (gnuoy)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1785237

Title:
  The section on the Neutron Metadata API proxy in cellsv2-layout.html
  is confusing and possibly wrong

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  I found it hard to understand what configuration was needed when
  reading:

  "The Neutron metadata API proxy should be global across all cells, and
  thus be configured as an API-level service with access to the
  [api_database]/connection information."

  Which service is it referring to ns-metadata-proxy, neutron-metadata-agent or nova-api-metadata?

  Given that the 'api_database' section is only valid for nova, that would
  suggest it's the nova-api-metadata, but the nova-api-metadata receives
  all its data via RPC (as far as I can tell), so it doesn't seem to need
  the api_database section.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1785237/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1815844] Re: iscsi multipath dm-N device only used on first volume attachment

2019-02-14 Thread Liam Young
I don't think this is related to the charm, it looks like a bug in
upstream nova.

** Also affects: nova (Ubuntu)
   Importance: Undecided
   Status: New

** No longer affects: nova (Ubuntu)

** Also affects: nova
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1815844

Title:
  iscsi multipath dm-N device only used on first volume attachment

Status in OpenStack nova-compute charm:
  New
Status in OpenStack Compute (nova):
  New

Bug description:
  With nova-compute from cloud:xenial-queens and use-multipath=true,
  iscsi multipath is configured and the dm-N devices are used on the first
  attachment, but subsequent attachments only use a single path.

  The back-end storage is a Purestorage array.
  The multipath.conf is attached
  The issue is easily reproduced as shown below:

  jog@pnjostkinfr01:~⟫ openstack volume create pure2 --size 10 --type pure
  +-+--+
  | Field   | Value|
  +-+--+
  | attachments | []   |
  | availability_zone   | nova |
  | bootable| false|
  | consistencygroup_id | None |
  | created_at  | 2019-02-13T23:07:40.00   |
  | description | None |
  | encrypted   | False|
  | id  | e286161b-e8e8-47b0-abe3-4df411993265 |
  | migration_status| None |
  | multiattach | False|
  | name| pure2|
  | properties  |  |
  | replication_status  | None |
  | size| 10   |
  | snapshot_id | None |
  | source_volid| None |
  | status  | creating |
  | type| pure |
  | updated_at  | None |
  | user_id | c1fa4ae9a0b446f2ba64eebf92705d53 |
  +-+--+

  jog@pnjostkinfr01:~⟫ openstack volume show pure2
  ++--+
  | Field  | Value|
  ++--+
  | attachments| []   |
  | availability_zone  | nova |
  | bootable   | false|
  | consistencygroup_id| None |
  | created_at | 2019-02-13T23:07:40.00   |
  | description| None |
  | encrypted  | False|
  | id | e286161b-e8e8-47b0-abe3-4df411993265 |
  | migration_status   | None |
  | multiattach| False|
  | name   | pure2|
  | os-vol-host-attr:host  | cinder@cinder-pure#cinder-pure   |
  | os-vol-mig-status-attr:migstat | None |
  | os-vol-mig-status-attr:name_id | None |
  | os-vol-tenant-attr:tenant_id   | 9be499fd1eee48dfb4dc6faf3cc0a1d7 |
  | properties |  |
  | replication_status | None |
  | size   | 10   |
  | snapshot_id| None |
  | source_volid   | None |
  | status | available|
  | type   | pure |
  | updated_at | 2019-02-13T23:07:41.00   |
  | user_id| c1fa4ae9a0b446f2ba64eebf92705d53 |
  ++--+

  Add the volume to an instance:
  jog@pnjostkinfr01:~⟫ openstack server add volume T1 pure2
  jog@pnjostkinfr01:~⟫ openstack server show T1 
 

[Yahoo-eng-team] [Bug 1742421] [NEW] Cells Layout (v2) in nova doc misleading about upcalls

2018-01-10 Thread Liam Young
Public bug reported:


- [X] This doc is inaccurate in this way: Documentation suggests nova v2 cells 
do not make 'upcalls' but they do when talking to the placement api.
- [ ] This is a doc addition request.
- [ ] I have a fix to the document that I can paste below including example: 
input and output. 


It is important to note that services in the lower cell boxes
only have the ability to call back to the placement API and no other
API-layer services via RPC, nor do they have access to the API database
for global visibility of resources across the cloud. This is intentional
and provides security and failure domain isolation benefits, but also has 
impacts on some things that would otherwise require this any-to-any
communication style. Check the release notes for the version of Nova you 
are using for the most up-to-date information about any caveats that may be
present due to this limitation.


---
Release: 17.0.0.0b3.dev323 on 2018-01-09 21:52
SHA: 90a92d33edaea2b7411a5fd528f3159a486e1fd0
Source: 
https://git.openstack.org/cgit/openstack/nova/tree/doc/source/user/cellsv2-layout.rst
URL: https://docs.openstack.org/nova/latest/user/cellsv2-layout.html

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1742421

Title:
  Cells Layout (v2) in nova doc misleading about upcalls

Status in OpenStack Compute (nova):
  New

Bug description:
  
  - [X] This doc is inaccurate in this way: Documentation suggests nova v2 
cells do not make 'upcalls' but they do when talking to the placement api.
  - [ ] This is a doc addition request.
  - [ ] I have a fix to the document that I can paste below including example: 
input and output. 

  
  It is important to note that services in the lower cell boxes
  only have the ability to call back to the placement API and no other
  API-layer services via RPC, nor do they have access to the API database
  for global visibility of resources across the cloud. This is intentional
  and provides security and failure domain isolation benefits, but also has 
  impacts on some things that would otherwise require this any-to-any
  communication style. Check the release notes for the version of Nova you 
  are using for the most up-to-date information about any caveats that may be
  present due to this limitation.

  
  ---
  Release: 17.0.0.0b3.dev323 on 2018-01-09 21:52
  SHA: 90a92d33edaea2b7411a5fd528f3159a486e1fd0
  Source: 
https://git.openstack.org/cgit/openstack/nova/tree/doc/source/user/cellsv2-layout.rst
  URL: https://docs.openstack.org/nova/latest/user/cellsv2-layout.html

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1742421/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1742649] [NEW] map_instances default batch size is too small.

2018-01-11 Thread Liam Young
Public bug reported:

Description
===

map_instances seemingly hung for hours on a cloud with ~19 instance
records. I think the following fixes are valid (in order of preference):

1) nova_manage should examine the amount of instances that need mapping and 
make an informed choice about batch size if max_count is not set.
2) max_count's default should be raised. It is currently 50 and I cannot imagine
what use case 50 is a good default for. For small clouds the max_count is
almost irrelevant; for medium/large clouds 50 is far too low.
3) Update max_count description. It currently reads "Maximum number of 
instances to map" but I think it should also point out that this is the batch 
size that instances will be processed in.

Steps to reproduce
==

Fire up a large number of instances on a cloud and run map_instances
without max_count set:

nova-manage --config-file /etc/nova/nova.conf cell_v2 map_instances
--cell_uuid 
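
For comparison, a sketch of the same command run with an explicit, much larger
batch size (the flag spelling assumes the --max-count option of
"nova-manage cell_v2 map_instances"; 10000 is an arbitrary illustrative value):

nova-manage --config-file /etc/nova/nova.conf cell_v2 map_instances \
    --cell_uuid <cell-uuid> --max-count 10000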

Expected result
===

The command should complete in a reasonable time (under an hour)

Actual result
=

Command runs for over three hours

Environment
===
1. Exact version of OpenStack you are running. See the following
  list for all releases: http://docs.openstack.org/releases/

   If this is from a distro please provide

# dpkg -l | grep nova
ii  nova-api-os-compute  2:16.0.3-0ubuntu1~cloud0   
all  OpenStack Compute - OpenStack Compute API frontend
ii  nova-common  2:16.0.3-0ubuntu1~cloud0   
all  OpenStack Compute - common files
ii  nova-conductor   2:16.0.3-0ubuntu1~cloud0   
all  OpenStack Compute - conductor service
ii  nova-placement-api   2:16.0.3-0ubuntu1~cloud0   
all  OpenStack Compute - placement API frontend
ii  nova-scheduler   2:16.0.3-0ubuntu1~cloud0   
all  OpenStack Compute - virtual machine scheduler
ii  python-nova  2:16.0.3-0ubuntu1~cloud0   
all  OpenStack Compute Python libraries

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1742649

Title:
  map_instances default batch size is too small.

Status in OpenStack Compute (nova):
  New

Bug description:
  Description
  ===

  map_instances seemingly hung for hours on a cloud with ~19
  instance records. I think the following fixes are valid (in order of
  preference):

  1) nova_manage should examine the amount of instances that need mapping and 
make an informed choice about batch size if max_count is not set.
  2) max_count's default should be raised. It is currently 50 and I cannot
  imagine what use case 50 is a good default for. For small clouds the max_count
  is almost irrelevant; for medium/large clouds 50 is far too low.
  3) Update max_count description. It currently reads "Maximum number of 
instances to map" but I think it should also point out that this is the batch 
size that instances will be processed in.

  Steps to reproduce
  ==

  Fire up a large number of instances on a cloud and run map_instances
  without max_count set:

  nova-manage --config-file /etc/nova/nova.conf cell_v2 map_instances
  --cell_uuid 

  Expected result
  ===

  The command should complete in a reasonable time (under an hour)

  Actual result
  =

  Command runs for over three hours

  Environment
  ===
  1. Exact version of OpenStack you are running. See the following
list for all releases: http://docs.openstack.org/releases/

 If this is from a distro please provide

  # dpkg -l | grep nova
  ii  nova-api-os-compute  2:16.0.3-0ubuntu1~cloud0 
  all  OpenStack Compute - OpenStack Compute API frontend
  ii  nova-common  2:16.0.3-0ubuntu1~cloud0 
  all  OpenStack Compute - common files
  ii  nova-conductor   2:16.0.3-0ubuntu1~cloud0 
  all  OpenStack Compute - conductor service
  ii  nova-placement-api   2:16.0.3-0ubuntu1~cloud0 
  all  OpenStack Compute - placement API frontend
  ii  nova-scheduler   2:16.0.3-0ubuntu1~cloud0 
  all  OpenStack Compute - virtual machine scheduler
  ii  python-nova  2:16.0.3-0ubuntu1~cloud0 
  all  OpenStack Compute Python libraries

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1742649/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://

[Yahoo-eng-team] [Bug 1472712] Re: Using SSL with rabbitmq prevents communication between nova-compute and conductor after latest nova updates

2015-07-31 Thread Liam Young
** Also affects: python-oslo.messaging (Ubuntu)
   Importance: Undecided
   Status: New

** Changed in: oslo.messaging
   Status: Confirmed => Invalid

** Changed in: nova
   Status: New => Invalid

** Changed in: python-oslo.messaging (Ubuntu)
   Status: New => Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1472712

Title:
  Using SSL with rabbitmq prevents communication between nova-compute
  and conductor after latest nova updates

Status in OpenStack Compute (nova):
  Invalid
Status in oslo.messaging:
  Invalid
Status in python-oslo.messaging package in Ubuntu:
  Confirmed

Bug description:
  On the latest update of the Ubuntu OpenStack packages, it was
  discovered that the nova-compute/nova-conductor
  (1:2014.1.4-0ubuntu2.1) packages encountered a bug with using SSL to
  connect to rabbitmq.

  When this problem occurs, the compute node cannot connect to the
  controller, and this message is constantly displayed:

  WARNING nova.conductor.api [req-4022395c-9501-47cf-bf8e-476e1cc58772
  None None] Timed out waiting for nova-conductor. Is it running? Or did
  this service start before nova-conductor?

  Investigation revealed that having rabbitmq configured with SSL was
  the root cause of this problem.  This seems to have been introduced
  with the current version of the nova packages.   Rabbitmq was not
  updated as part of this distribution update, but the messaging library
  (python-oslo.messaging 1.3.0-0ubuntu1.1) was updated.   So the problem
  could exist in any of these components.
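
  For context, enabling SSL for the rabbit driver at this release is typically
  done with oslo.messaging options like the following in nova.conf (a sketch
  only; the certificate path is a placeholder and the exact options used on
  the affected nodes are not shown in this report):

      [DEFAULT]
      rabbit_use_ssl = True
      kombu_ssl_ca_certs = /etc/nova/rabbit-ca.pem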

  Versions installed:
  Openstack version: Icehouse
  Ubuntu 14.04.2 LTS
  nova-conductor1:2014.1.4-0ubuntu2.1
  nova-compute1:2014.1.4-0ubuntu2.1
  rabbitmq-server  3.2.4-1
  openssl:amd64/trusty-security   1.0.1f-1ubuntu2.15

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1472712/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1327218] Re: Volume detach failure because of invalid bdm.connection_info

2015-09-07 Thread Liam Young
The fix went into 2015.1.0 and 2015.1.1 is now in the cloud archive.

** Changed in: nova (Ubuntu)
   Status: New => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1327218

Title:
  Volume detach failure because of invalid bdm.connection_info

Status in OpenStack Compute (nova):
  Fix Released
Status in nova package in Ubuntu:
  Fix Released
Status in nova source package in Trusty:
  New

Bug description:
  Example of this here:

  http://logs.openstack.org/33/97233/1/check/check-grenade-
  dsvm/f7b8a11/logs/old/screen-n-cpu.txt.gz?level=TRACE#_2014-06-02_14_13_51_125

     File "/opt/stack/old/nova/nova/compute/manager.py", line 4153, in 
_detach_volume
   connection_info = jsonutils.loads(bdm.connection_info)
     File "/opt/stack/old/nova/nova/openstack/common/jsonutils.py", line 164, 
in loads
   return json.loads(s)
     File "/usr/lib/python2.7/json/__init__.py", line 326, in loads
   return _default_decoder.decode(s)
     File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
   obj, end = self.raw_decode(s, idx=_w(s, 0).end())
   TypeError: expected string or buffer

  and this was in grenade with stable/icehouse nova commit 7431cb9

  There's nothing unusual about the test which triggers this - simply
  attaches a volume to an instance, waits for it to show up in the
  instance and then tries to detach it

  logstash query for this:

    message:"Exception during message handling" AND message:"expected
  string or buffer" AND message:"connection_info =
  jsonutils.loads(bdm.connection_info)" AND tags:"screen-n-cpu.txt"

  but it seems to be very rare

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1327218/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1314677] [NEW] nova-cells fails when using JSON file to store cell information

2014-04-30 Thread Liam Young
Public bug reported:

As recommended in http://docs.openstack.org/havana/config-
reference/content/section_compute-cells.html#cell-config-optional-json
I'm creating the nova-cells config with the cell information stored in a
json file. However, when I do this nova-cells fails to start with this
error in the logs:

2014-04-29 11:52:05.240 16759 CRITICAL nova [-] __init__() takes exactly 3 
arguments (1 given)
2014-04-29 11:52:05.240 16759 TRACE nova Traceback (most recent call last):
2014-04-29 11:52:05.240 16759 TRACE nova   File "/usr/bin/nova-cells", line 10, in <module>
2014-04-29 11:52:05.240 16759 TRACE nova sys.exit(main())
2014-04-29 11:52:05.240 16759 TRACE nova   File 
"/usr/lib/python2.7/dist-packages/nova/cmd/cells.py", line 40, in main
2014-04-29 11:52:05.240 16759 TRACE nova manager=CONF.cells.manager)
2014-04-29 11:52:05.240 16759 TRACE nova   File 
"/usr/lib/python2.7/dist-packages/nova/service.py", line 257, in create
2014-04-29 11:52:05.240 16759 TRACE nova db_allowed=db_allowed)
2014-04-29 11:52:05.240 16759 TRACE nova   File 
"/usr/lib/python2.7/dist-packages/nova/service.py", line 139, in __init__
2014-04-29 11:52:05.240 16759 TRACE nova self.manager = 
manager_class(host=self.host, *args, **kwargs)
2014-04-29 11:52:05.240 16759 TRACE nova   File 
"/usr/lib/python2.7/dist-packages/nova/cells/manager.py", line 87, in __init__
2014-04-29 11:52:05.240 16759 TRACE nova self.state_manager = 
cell_state_manager()
2014-04-29 11:52:05.240 16759 TRACE nova TypeError: __init__() takes exactly 3 
arguments (1 given)


I have had a dig into the code and it appears that CellsManager creates an
instance of CellStateManager with no arguments. CellStateManager.__new__ runs
and creates an instance of CellStateManagerFile, whose __new__ and __init__
run with cell_state_cls and cells_config_path set. At this point __new__
returns the CellStateManagerFile instance, so Python invokes that instance's
__init__ method (CellStateManagerFile.__init__) again with the original
arguments (there weren't any), which then results in the stack trace.

It seems reasonable for CellStateManagerFile to derive the
cells_config_path info for itself so I've patched it locally with

=== modified file 'state.py'
--- state.py  2014-04-30 15:10:16 +
+++ state.py  2014-04-30 15:10:26 +
@@ -155,7 +155,7 @@
             config_path = CONF.find_file(cells_config)
             if not config_path:
                 raise cfg.ConfigFilesNotFoundError(config_files=[cells_config])
-            return CellStateManagerFile(cell_state_cls, config_path)
+            return CellStateManagerFile(cell_state_cls)
 
         return CellStateManagerDB(cell_state_cls)
 
@@ -450,7 +450,9 @@
 
 
 class CellStateManagerFile(CellStateManager):
-    def __init__(self, cell_state_cls, cells_config_path):
+    def __init__(self, cell_state_cls=None):
+        cells_config = CONF.cells.cells_config
+        cells_config_path = CONF.find_file(cells_config)
         self.cells_config_path = cells_config_path
         super(CellStateManagerFile, self).__init__(cell_state_cls)
 


Ubuntu: 14.04
nova-cells: 1:2014.1-0ubuntu1

nova.conf:

[DEFAULT]
dhcpbridge_flagfile=/etc/nova/nova.conf
dhcpbridge=/usr/bin/nova-dhcpbridge
logdir=/var/log/nova
state_path=/var/lib/nova
lock_path=/var/lock/nova
force_dhcp_release=True
iscsi_helper=tgtadm
libvirt_use_virtio_for_bridges=True
connection_type=libvirt
root_helper=sudo nova-rootwrap /etc/nova/rootwrap.conf
verbose=True
ec2_private_dns_show_ip=True
api_paste_config=/etc/nova/api-paste.ini
volumes_path=/var/lib/nova/volumes
enabled_apis=ec2,osapi_compute,metadata
auth_strategy=keystone
compute_driver=libvirt.LibvirtDriver
quota_driver=nova.quota.NoopQuotaDriver


[cells]
enable=True
name=cell
cell_type=compute
cells_config=/etc/nova/cells.json


cells.json:
{
    "parent": {
        "name": "parent",
        "api_url": "http://api.example.com:8774",
        "transport_url": "rabbit://rabbit.example.com",
        "weight_offset": 0.0,
        "weight_scale": 1.0,
        "is_parent": true
    }
}

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1314677

Title:
  nova-cells fails when using JSON file to store cell information

Status in OpenStack Compute (Nova):
  New

Bug description:
  As recommended in http://docs.openstack.org/havana/config-
  reference/content/section_compute-cells.html#cell-config-optional-json
  I'm creating the nova-cells config with the cell information stored in
  a json file. However, when I do this nova-cells fails to start with
  this error in the logs:

  2014-04-29 11:52:05.240 16759 CRITICAL nova [-] __init__() takes exactly 3 
arguments (1 given)
  2014-04-29 11:52:05.240 16759 TRACE nova Traceback (most recent call last):
  2014-04-29 11:52:05.240 16759 TRACE nova   File "/usr/bin/nova-cell