Reviewed:  https://review.opendev.org/665634
Committed: https://git.openstack.org/cgit/openstack/kolla/commit/?id=84a07102e2db92333ce416a520151f1a1856f162
Submitter: Zuul
Branch:    master

commit 84a07102e2db92333ce416a520151f1a1856f162
Author: Mark Goddard <[email protected]>
Date:   Mon Jun 17 10:49:21 2019 +0100

Run nova api_db sync before db sync

Backport: Stein, Rocky

Since cells v2 was introduced, nova operators must run two commands to
migrate the database schemas of nova's databases - nova-manage api_db
sync and nova-manage db sync. It is necessary to run them in this order,
since the db sync may depend on schema changes made to the api database
in the api_db sync. Executing the db sync first may fail, for example
with the following seen in a Queens to Rocky upgrade:

nova-manage db sync
ERROR: Could not access cell0.
Has the nova_api database been created?
Has the nova_cell0 database been created?
Has "nova-manage api_db sync" been run?
Has "nova-manage cell_v2 map_cell0" been run?
Is [api_database]/connection set in nova.conf?
Is the cell0 database connection URL correct?
Error: (pymysql.err.InternalError) (1054, u"Unknown column 'cell_mappings.disabled' in 'field list'") [SQL: u'SELECT cell_mappings.created_at AS cell_mappings_created_at, cell_mappings.updated_at AS cell_mappings_updated_at, cell_mappings.id AS cell_mappings_id, cell_mappings.uuid AS cell_mappings_uuid, cell_mappings.name AS cell_mappings_name, cell_mappings.transport_url AS cell_mappings_transport_url, cell_mappings.database_connection AS cell_mappings_database_connection, cell_mappings.disabled AS cell_mappings_disabled \nFROM cell_mappings \nWHERE cell_mappings.uuid = %(uuid_1)s \n LIMIT %(param_1)s'] [parameters: {u'uuid_1': '00000000-0000-0000-0000-000000000000', u'param_1': 1}] (Background on this error at: http://sqlalche.me/e/2j85)

Despite this error, the command actually exits zero, so kolla does not
notice the failure.

This change reorders these two commands to run the api_db sync before
the db sync.

Change-Id: I0514c02aaee1a413a7979c3a426e0484df839861
Closes-Bug: #1832860

** Changed in: kolla
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1832860

Title: Failed instances stuck in BUILD state after Rocky upgrade

Status in kolla: Fix Released
Status in kolla rocky series: New
Status in kolla stein series: New
Status in kolla train series: Fix Released
Status in OpenStack Compute (nova): In Progress

Bug description:

Steps to reproduce
==================

Starting with a cloud running the Queens release, upgrade to Rocky.

Create a flavor that cannot fit on any compute node, e.g.

openstack flavor create --ram 100000000 --disk 2147483647 --vcpus 10000 huge

Then create an instance using that flavor:

openstack server create huge --flavor huge --image cirros --network demo-net

Expected
========

The instance fails to boot and ends up in the ERROR state.

Actual
======

The instance fails to boot and gets stuck in the BUILD state.

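The reproduction steps above could be scripted roughly as follows. This is a minimal sketch and not part of the original report: it assumes the `openstack` CLI is installed and authenticated, and the names `cli`, `classify` and `reproduce`, the `run` and `sleep` parameters, and the timeout values are all hypothetical. It issues the two commands from the report, then polls the server status until it leaves BUILD or the timeout expires.

```python
import subprocess
import time


def cli(args):
    """Run an openstack CLI command and return its stripped stdout."""
    result = subprocess.run(["openstack"] + args, check=True,
                            capture_output=True, text=True)
    return result.stdout.strip()


def classify(status):
    """Map a final server status onto the outcomes described in the report."""
    if status == "ERROR":
        return "expected"    # scheduling failed and the instance went to ERROR
    if status == "BUILD":
        return "bug"         # instance never left BUILD: the reported behaviour
    return "unexpected"      # e.g. ACTIVE, which should not happen with this flavor


def reproduce(run=cli, timeout=120, interval=5, sleep=time.sleep):
    """Create the 'huge' flavor and instance, then wait for a final status."""
    # Commands taken from the "Steps to reproduce" section.
    run(["flavor", "create", "--ram", "100000000", "--disk", "2147483647",
         "--vcpus", "10000", "huge"])
    run(["server", "create", "huge", "--flavor", "huge",
         "--image", "cirros", "--network", "demo-net"])

    status = "BUILD"
    waited = 0
    while waited < timeout:
        status = run(["server", "show", "huge", "-f", "value", "-c", "status"])
        if status != "BUILD":
            break
        sleep(interval)
        waited += interval
    return classify(status)
```

Calling `reproduce()` against an affected cloud would be expected to return `"bug"`; after the workaround or the fix it should return `"expected"`. The `run` and `sleep` parameters exist only so the polling logic can be exercised without a real cloud.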
From nova-conductor.log:

2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 163, in _process_incoming
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     res = self.dispatcher.dispatch(message)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 265, in dispatch
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     return self._do_dispatch(endpoint, method, ctxt, args)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 194, in _do_dispatch
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     result = func(ctxt, **new_args)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 1244, in schedule_and_build_instances
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     tags=tags)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", line 1193, in _bury_in_cell0
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     instance.create()
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_versionedobjects/base.py", line 226, in wrapper
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     return fn(self, *args, **kwargs)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/objects/instance.py", line 600, in create
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     db_inst = db.instance_create(self._context, updates)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/db/api.py", line 748, in instance_create
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     return IMPL.instance_create(context, values)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 170, in wrapper
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     return f(*args, **kwargs)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_db/api.py", line 154, in wrapper
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     ectxt.value = e.inner_exc
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     self.force_reraise()
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     six.reraise(self.type_, self.value, self.tb)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_db/api.py", line 142, in wrapper
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     return f(*args, **kwargs)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 227, in wrapped
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     return f(context, *args, **kwargs)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 1774, in instance_create
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     ec2_instance_create(context, instance_ref['uuid'])
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 170, in wrapper
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     return f(*args, **kwargs)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 227, in wrapped
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     return f(context, *args, **kwargs)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 5286, in ec2_instance_create
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     ec2_instance_ref.save(context.session)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/models.py", line 50, in save
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     session.flush()
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2254, in flush
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     self._flush(objects)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2380, in _flush
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     transaction.rollback(_capture_exception=True)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 66, in __exit__
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     compat.reraise(exc_type, exc_value, exc_tb)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2344, in _flush
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     flush_context.execute()
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 391, in execute
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     rec.execute(self)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 556, in execute
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     uow
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 181, in save_obj
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     mapper, table, insert)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 866, in _emit_insert_statements
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     execute(statement, params)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 948, in execute
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     return meth(self, multiparams, params)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/sql/elements.py", line 269, in _execute_on_connection
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     return connection._execute_clauseelement(self, multiparams, params)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1060, in _execute_clauseelement
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     compiled_sql, distilled_params
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1200, in _execute_context
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     context)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1409, in _handle_dbapi_exception
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     util.raise_from_cause(newraise, exc_info)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/compat.py", line 203, in raise_from_cause
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     reraise(type(exception), exception, tb=exc_tb, cause=cause)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1193, in _execute_context
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     context)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 507, in do_execute
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     cursor.execute(statement, parameters)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 170, in execute
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     result = self._query(query)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 328, in _query
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     conn.query(q)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 516, in query
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     self._affected_rows = self._read_query_result(unbuffered=unbuffered)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 727, in _read_query_result
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     result.read()
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1066, in read
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     first_packet = self.connection._read_packet()
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 683, in _read_packet
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     packet.check_error()
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/pymysql/protocol.py", line 220, in check_error
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     err.raise_mysql_exception(self._data)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/pymysql/err.py", line 109, in raise_mysql_exception
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server     raise errorclass(errno, errval)
2019-06-12 15:00:24.443 6 ERROR oslo_messaging.rpc.server DBError: (pymysql.err.InternalError) (1054, u"Unknown column 'trusted_certs' in 'field list'") [SQL: u'INSERT INTO instance_extra (created_at, updated_at, deleted_at, deleted, instance_uuid, device_metadata, numa_topology, pci_requests, flavor, vcpu_model, migration_context, keypairs, trusted_certs) VALUES (%(created_at)s, %(updated_at)s, %(deleted_at)s, %(deleted)s, %(instance_uuid)s, %(device_metadata)s, %(numa_topology)s, %(pci_requests)s, %(flavor)s, %(vcpu_model)s, %(migration_context)s, %(keypairs)s, %(trusted_certs)s)'] [parameters: {'instance_uuid': u'df1bd38c-67cb-4eb0-b2d2-ac08233dadae', 'keypairs': '{"nova_object.version": "1.3", "nova_object.name": "KeyPairList", "nova_object.data": {"objects": []}, "nova_object.namespace": "nova"}', 'pci_requests': '[]', 'vcpu_model': None, 'device_metadata': None, 'created_at': datetime.datetime(2019, 6, 12, 15, 0, 24, 430084), 'updated_at': None, 'numa_topology': None, 'trusted_certs': None, 'deleted': 0, 'migration_context': None, 'flavor': '{"new": null, "old": null, "cur": {"nova_object.version": "1.2", "nova_object.name": "Flavor", "nova_object.data": {"disabled": false, "root_gb": 214 ... (234 characters truncated) ...
, "swap": 0, "rxtx_factor": 1.0, "is_public": true, "deleted_at": null, "vcpu_weight": 0, "id": 6, "name": "huge"}, "nova_object.namespace": "nova"}}', 'deleted_at': None}] (Background on this error at: http://sqlalche.me/e/2j85)

Workaround
==========

On the controller, perform a nova DB sync:

docker exec -it nova_api nova-manage db sync

Despite this making no changes to the database (checked with mysqldump), it seems to 'fix' nova. New instances created using the 'huge' flavor will go to the ERROR state.

To manage notifications about this bug go to:
https://bugs.launchpad.net/kolla/+bug/1832860/+subscriptions
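
The ordering described in the commit message, together with the observation that `nova-manage db sync` can print an error and still exit zero, suggests checking the command output as well as the return code. The sketch below is illustrative only and is not the kolla change itself: the names `docker_runner`, `run_sync` and `sync_nova_databases`, the `runner` parameter, and the decision to treat any output line starting with `ERROR` as a failure are assumptions. The `docker exec nova_api` prefix follows the workaround command above, without `-it` since no terminal is needed.

```python
import subprocess

# Order matters: migrate the API database schema before the cell databases.
SYNC_COMMANDS = [
    ["nova-manage", "api_db", "sync"],
    ["nova-manage", "db", "sync"],
]


def docker_runner(cmd, container="nova_api"):
    """Run cmd inside a container; return (exit code, combined output)."""
    proc = subprocess.run(["docker", "exec", container] + cmd,
                          stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                          text=True)
    return proc.returncode, proc.stdout


def run_sync(cmd, runner=docker_runner):
    """Run one sync command; fail on a non-zero exit or on an ERROR line."""
    code, output = runner(cmd)
    # Assumption: failures are reported on lines beginning with "ERROR",
    # as in the "ERROR: Could not access cell0." message quoted above.
    error_lines = [line for line in output.splitlines()
                   if line.lstrip().startswith("ERROR")]
    if code != 0 or error_lines:
        raise RuntimeError("%s failed (exit %d): %s"
                           % (" ".join(cmd), code, "; ".join(error_lines)))


def sync_nova_databases(runner=docker_runner):
    """Run api_db sync, then db sync, stopping at the first failure."""
    for cmd in SYNC_COMMANDS:
        run_sync(cmd, runner)
```

Scanning the output is a heuristic: it would catch the silent failure shown in the commit message, but it could also flag harmless lines that happen to start with `ERROR`, so the matching rule may need tightening for real use.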

