** Changed in: nova/icehouse Status: Fix Committed => Fix Released
-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1326901 Title: ServiceBinaryExists - binary for nova-conductor already exists Status in OpenStack Compute (nova): Fix Released Status in OpenStack Compute (nova) icehouse series: Fix Released Status in nova package in Ubuntu: Fix Released Status in nova source package in Trusty: Triaged Status in nova source package in Utopic: Fix Released Bug description: We're hitting an intermittent issue where ServiceBinaryExists is raised for nova-conductor on deployment. From nova-conductor's upstart log ( /var/log/upstart/nova- conductor.log ): 2014-05-15 12:02:25.206 34494 INFO nova.openstack.common.periodic_task [-] Skipping periodic task _periodic_update_dns because its interval is negative 2014-05-15 12:02:25.241 34494 INFO nova.openstack.common.service [-] Starting 8 workers 2014-05-15 12:02:25.242 34494 INFO nova.openstack.common.service [-] Started child 34501 2014-05-15 12:02:25.244 34494 INFO nova.openstack.common.service [-] Started child 34502 2014-05-15 12:02:25.246 34494 INFO nova.openstack.common.service [-] Started child 34503 2014-05-15 12:02:25.246 34501 AUDIT nova.service [-] Starting conductor node (version 2014.1) 2014-05-15 12:02:25.247 34502 AUDIT nova.service [-] Starting conductor node (version 2014.1) 2014-05-15 12:02:25.247 34494 INFO nova.openstack.common.service [-] Started child 34504 2014-05-15 12:02:25.249 34503 AUDIT nova.service [-] Starting conductor node (version 2014.1) 2014-05-15 12:02:25.251 34504 AUDIT nova.service [-] Starting conductor node (version 2014.1) 2014-05-15 12:02:25.254 34505 AUDIT nova.service [-] Starting conductor node (version 2014.1) 2014-05-15 12:02:25.250 34494 INFO nova.openstack.common.service [-] Started child 34505 2014-05-15 12:02:25.261 34494 INFO nova.openstack.common.service [-] Started child 34506 2014-05-15 12:02:25.263 34494 INFO nova.openstack.common.service [-] Started child 34507 2014-05-15 12:02:25.266 34494 INFO nova.openstack.common.service [-] Started child 34508 2014-05-15 12:02:25.267 34507 AUDIT nova.service [-] Starting conductor node (version 2014.1) 2014-05-15 12:02:25.268 34506 AUDIT nova.service [-] Starting conductor node (version 2014.1) 2014-05-15 12:02:25.271 34508 AUDIT nova.service [-] Starting conductor node (version 2014.1) /usr/lib/python2.7/dist-packages/nova/openstack/common/db/sqlalchemy/session.py:379: DeprecationWarning: BaseException.message has been deprecated as of Python 2.6 match = pattern.match(integrity_error.message) /usr/lib/python2.7/dist-packages/nova/openstack/common/db/sqlalchemy/session.py:379: DeprecationWarning: BaseException.message has been deprecated as of Python 2.6 match = pattern.match(integrity_error.message) Traceback (most recent call last): File "/usr/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 346, in fire_timers timer() File "/usr/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line 56, in __call__ cb(*args, **kw) File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 194, in main 2014-05-15 12:02:25.862 34502 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on localhost:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 1 seconds. result = function(*args, **kwargs) File "/usr/lib/python2.7/dist-packages/nova/openstack/common/service.py", line 480, in run_service service.start() File "/usr/lib/python2.7/dist-packages/nova/service.py", line 172, in start self.service_ref = self._create_service_ref(ctxt) File "/usr/lib/python2.7/dist-packages/nova/service.py", line 224, in _create_service_ref service = self.conductor_api.service_create(context, svc_values) File "/usr/lib/python2.7/dist-packages/nova/conductor/api.py", line 202, in service_create return self._manager.service_create(context, values) File "/usr/lib/python2.7/dist-packages/nova/utils.py", line 966, in wrapper return func(*args, **kwargs) File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 461, in service_create svc = self.db.service_create(context, values) File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 139, in service_create return IMPL.service_create(context, values) File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 146, in wrapper return f(*args, **kwargs) File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 521, in service_create binary=values.get('binary')) ServiceBinaryExists: Service with host glover binary nova-conductor exists. 2014-05-15 12:02:25.864 34503 ERROR nova.openstack.common.threadgroup [-] Service with host glover binary nova-conductor exists. 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup Traceback (most recent call last): 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/openstack/common/threadgroup.py", line 117, in wait 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup x.wait() 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/openstack/common/threadgroup.py", line 49, in wait 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup return self.thread.wait() 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 168, in wait 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup return self._exit_event.wait() 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/eventlet/event.py", line 116, in wait 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup return hubs.get_hub().switch() 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 187, in switch 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup return self.greenlet.switch() 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 194, in main 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup result = function(*args, **kwargs) 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/openstack/common/service.py", line 480, in run_service 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup service.start() 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/service.py", line 172, in start 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup self.service_ref = self._create_service_ref(ctxt) 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/service.py", line 224, in _create_service_ref 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup service = self.conductor_api.service_create(context, svc_values) 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/conductor/api.py", line 202, in service_create 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup return self._manager.service_create(context, values) 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/utils.py", line 966, in wrapper 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup return func(*args, **kwargs) 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 461, in service_create 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup svc = self.db.service_create(context, values) 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 139, in service_create 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup return IMPL.service_create(context, values) 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 146, in wrapper 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup return f(*args, **kwargs) 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 521, in service_create 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup binary=values.get('binary')) 2014-05-15 12:02:25.864 34503 TRACE nova.openstack.common.threadgroup ServiceBinaryExists: Service with host glover binary nova-conductor exists. After looking into the traceback, I came across this piece of code which starts a service. Of note is the 2nd except path. This exception was introduced in January to handle races when creating a record. It seems as if we are racing similarly in our situation and might need to handle ServiceBinaryExists the same way ServiceTopicExists is being handled. I'll submit a patch for this. From nova/service.py: 158 def start(self): 159 verstr = version.version_string_with_package() 160 LOG.audit(_('Starting %(topic)s node (version %(version)s)'), 161 {'topic': self.topic, 'version': verstr}) 162 self.basic_config_check() 163 self.manager.init_host() 164 self.model_disconnected = False 165 ctxt = context.get_admin_context() 166 try: 167 self.service_ref = self.conductor_api.service_get_by_args(ctxt, 168 self.host, self.binary) 169 self.service_id = self.service_ref['id'] 170 except exception.NotFound: 171 try: 172 self.service_ref = self._create_service_ref(ctxt) 173 except exception.ServiceTopicExists: 174 # NOTE(danms): If we race to create a record with a sibling 175 # worker, don't fail here. 176 self.service_ref = self.conductor_api.service_get_by_args(ctxt, 177 self.host, self.binary) $ git blame -L 158,178 nova/service.py b3f5aba0 (Andy Smith 2010-12-09 15:25:14 -0800 158) def start(self): 481d6ff1 (Daniel P. Berrange 2012-12-17 12:17:59 +0000 159) verstr = version.version_string_with_package() 481d6ff1 (Daniel P. Berrange 2012-12-17 12:17:59 +0000 160) LOG.audit(_('Starting %(topic)s node (version %(versi 481d6ff1 (Daniel P. Berrange 2012-12-17 12:17:59 +0000 161) {'topic': self.topic, 'version': verstr}) e8386a27 (Rafi Khardalian 2013-01-23 01:55:09 +0000 162) self.basic_config_check() 065257fb (Vishvananda Ishaya 2010-09-23 12:43:41 -0700 163) self.manager.init_host() 9c98cfb4 (Vishvananda Ishaya 2010-08-30 00:55:19 -0700 164) self.model_disconnected = False 5e3da586 (Vishvananda Ishaya 2010-10-01 05:57:17 -0700 165) ctxt = context.get_admin_context() 57a103b3 (Vishvananda Ishaya 2010-09-02 14:13:22 -0700 166) try: e9d34263 (Russell Bryant 2013-01-14 17:39:29 -0500 167) self.service_ref = self.conductor_api.service_get e9d34263 (Russell Bryant 2013-01-14 17:39:29 -0500 168) self.host, self.binary) e34bc343 (Wenhao Xu 2013-01-21 19:07:34 +0800 169) self.service_id = self.service_ref['id'] 57a103b3 (Vishvananda Ishaya 2010-09-02 14:13:22 -0700 170) except exception.NotFound: f6c341b4 (Dan Smith 2014-01-27 14:03:57 -0800 171) try: f6c341b4 (Dan Smith 2014-01-27 14:03:57 -0800 172) self.service_ref = self._create_service_ref(c f6c341b4 (Dan Smith 2014-01-27 14:03:57 -0800 173) except exception.ServiceTopicExists: f6c341b4 (Dan Smith 2014-01-27 14:03:57 -0800 174) # NOTE(danms): If we race to create a record f6c341b4 (Dan Smith 2014-01-27 14:03:57 -0800 175) # worker, don't fail here. f6c341b4 (Dan Smith 2014-01-27 14:03:57 -0800 176) self.service_ref = self.conductor_api.service f6c341b4 (Dan Smith 2014-01-27 14:03:57 -0800 177) self.host, self.binary) To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1326901/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp