After changing the ownership, the engine is up! Thanks for your help! :)
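For anyone finding this thread later, here is a minimal sketch of the ownership check that was suggested, assuming Python 3 is available on the host. The function names are my own (not part of vdsm or hosted-engine), and the expected owner vdsm:kvm is taken from the advice quoted below. Pass the volume path from the quoted message as an argument.

```python
import grp
import os
import pwd
import sys


def owner_and_group(path):
    """Return (user, group) names for path, falling back to numeric IDs.

    os.stat follows symlinks, so a link under /rhev/data-center is
    resolved to the file it points at.
    """
    st = os.stat(path)
    try:
        user = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        user = str(st.st_uid)  # UID has no passwd entry
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)  # GID has no group entry
    return user, group


def has_expected_owner(path, user="vdsm", group="kvm"):
    """True if path is owned by user:group (vdsm:kvm by default)."""
    return owner_and_group(path) == (user, group)


if __name__ == "__main__":
    # Hypothetical usage: python3 check_owner.py <volume path> [...]
    for p in sys.argv[1:]:
        u, g = owner_and_group(p)
        verdict = "ok" if has_expected_owner(p) else "expected vdsm:kvm"
        print("{}: {}:{} ({})".format(p, u, g, verdict))
```

If it reports anything other than vdsm:kvm, changing the owner with `chown vdsm:kvm <path>` and then running `hosted-engine --vm-start` again is what worked here.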
On Tue, Mar 19, 2019 at 3:25 PM Simone Tiraboschi <[email protected]> wrote:

> On Tue, Mar 19, 2019 at 2:21 PM ada per <[email protected]> wrote:
>
>> Thanks for your reply.
>>
>> Can you please provide step-by-step instructions on how to upgrade
>> vdsm from a node command line?
>
> Can you please report the version of vdsm you are using?
>
> Then check the ownership of
>
> /rhev/data-center/00000000-0000-0000-0000-000000000000/05b2b2d5-a80e-4622-9410-8e1e9d362f3f/images/bb890447-f1f7-46af-8e57-543d61f0bd08/81685d19-0060-4f5d-a4cd-c5efa24aecfe
>
> If it's not vdsm:kvm, change it and then try again with
> hosted-engine --vm-start
>
>> On Tue, Mar 19, 2019 at 2:49 PM Simone Tiraboschi <[email protected]> wrote:
>>
>>> Hi Ada,
>>> here is the error:
>>>
>>> 2019-03-19 14:08:25,833+0200 INFO (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC call Host.getStorageRepoStats succeeded in 0.00 seconds (__init__:312)
>>> 2019-03-19 14:08:25,839+0200 INFO (vm/a492d2eb) [vdsm.api] FINISH prepareImage error=Volume does not exist: (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',) from=internal, task_id=dc8fbf34-8d7e-47a3-8e02-0a5e5cb90257 (api:52)
>>> 2019-03-19 14:08:25,839+0200 ERROR (vm/a492d2eb) [storage.TaskManager.Task] (Task='dc8fbf34-8d7e-47a3-8e02-0a5e5cb90257') Unexpected error (task:875)
>>> Traceback (most recent call last):
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
>>>     return fn(*args, **kargs)
>>>   File "<string>", line 2, in prepareImage
>>>   File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50, in method
>>>     ret = func(*args, **kwargs)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 3199, in prepareImage
>>>     legality = dom.produceVolume(imgUUID, volUUID).getLegality()
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 822, in produceVolume
>>>     volUUID)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 801, in __init__
>>>     self._manifest = self.manifestClass(repoPath, sdUUID, imgUUID, volUUID)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/fileVolume.py", line 71, in __init__
>>>     volUUID)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 86, in __init__
>>>     self.validate()
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line 112, in validate
>>>     self.validateVolumePath()
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/fileVolume.py", line 131, in validateVolumePath
>>>     raise se.VolumeDoesNotExist(self.volUUID)
>>> VolumeDoesNotExist: Volume does not exist: (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',)
>>> 2019-03-19 14:08:25,840+0200 INFO (vm/a492d2eb) [storage.TaskManager.Task] (Task='dc8fbf34-8d7e-47a3-8e02-0a5e5cb90257') aborting: Task is aborted: "Volume does not exist: (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',)" - code 201 (task:1181)
>>> 2019-03-19 14:08:25,840+0200 ERROR (vm/a492d2eb) [storage.Dispatcher] FINISH prepareImage error=Volume does not exist: (u'81685d19-0060-4f5d-a4cd-c5efa24aecfe',) (dispatcher:83)
>>>
>>> I think it's still https://bugzilla.redhat.com/1666795
>>> <https://bugzilla.redhat.com/show_bug.cgi?id=1666795>
>>>
>>> Can you please try updating vdsm to vdsm-4.30.10, since the bug is
>>> reported as solved in that version?
>>>
>>> On Tue, Mar 19, 2019 at 12:30 PM ada per <[email protected]> wrote:
>>>
>>>> and vdsm:
>>>>
>>>> On Tue, Mar 19, 2019 at 1:24 PM ada per <[email protected]> wrote:
>>>>
>>>>> Thank you! Please see attached files:
>>>>>
>>>>> On Tue, Mar 19, 2019 at 12:52 PM Simone Tiraboschi <[email protected]> wrote:
>>>>>
>>>>>> Can you please also check/attach
>>>>>> /var/log/ovirt-hosted-engine-ha/broker.log and /var/log/vdsm/vdsm.log?
>>>>>>
>>>>>> On Tue, Mar 19, 2019 at 11:36 AM ada per <[email protected]> wrote:
>>>>>>
>>>>>>> Hello everyone,
>>>>>>>
>>>>>>> For some reason the hosted engine went down and I cannot
>>>>>>> restart it. I tried restarting it manually without any success. Can you
>>>>>>> please advise?
>>>>>>>
>>>>>>> For all the nodes the engine status is the same as the one below.
>>>>>>> --== Host nodex. (id: 6) status ==--
>>>>>>> conf_on_shared_storage : True
>>>>>>> Status up-to-date : True
>>>>>>> Hostname : nodex
>>>>>>> Host ID : 6
>>>>>>> Engine status : {"reason": "bad vm status", "health": "bad", "vm": "down_unexpected", "detail": "Down"}
>>>>>>> Score : 3400
>>>>>>> stopped : False
>>>>>>> Local maintenance : False
>>>>>>> crc32 : 323a9f45
>>>>>>> local_conf_timestamp : 2648874
>>>>>>> Host timestamp : 2648874
>>>>>>> Extra metadata (valid at timestamp):
>>>>>>>   metadata_parse_version=1
>>>>>>>   metadata_feature_version=1
>>>>>>>   timestamp=2648874 (Tue Mar 19 12:25:44 2019)
>>>>>>>   host-id=6
>>>>>>>   score=3400
>>>>>>>   vm_conf_refresh_time=2648874 (Tue Mar 19 12:25:44 2019)
>>>>>>>   conf_on_shared_storage=True
>>>>>>>   maintenance=False
>>>>>>>   state=GlobalMaintenance
>>>>>>>   stopped=False
>>>>>>>
>>>>>>> When I try the command
>>>>>>> root@node5# hosted-engine --vm-shutdown
>>>>>>> I get the response:
>>>>>>> root@node5# Command VM.shutdown with args {'delay': '120', 'message': 'VM is shutting down!', 'vmID': 'a492d2eb-1dfd-470d-a141-3e55d2189275'} failed:(code=1, message=Virtual machine does not exist)
>>>>>>>
>>>>>>> But when I run: hosted-engine --vm-start
>>>>>>> I get the response: VM exists and is down, cleaning up and restarting
>>>>>>>
>>>>>>> Below you can see the # journalctl -u ovirt-ha-agent logs
>>>>>>>
>>>>>>> Mar 14 12:04:42 node7. ovirt-ha-agent[4134]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Unhandled monitoring loop exception
>>>>>>> Traceback (most recent call last):
>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 430, in start_monitoring
>>>>>>>     self._monitoring_loop()
>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 449, in _monitoring_loop
>>>>>>>     for old_state, state, delay in self.fsm:
>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/fsm/machine.py", line 127, in next
>>>>>>>     new_data = self.refresh(self._state.data)
>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/state_machine.py", line 81, in refresh
>>>>>>>     stats.update(self.hosted_engine.collect_stats())
>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 737, in collect_stats
>>>>>>>     all_stats = self._broker.get_stats_from_storage()
>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 143, in get_stats_from_storage
>>>>>>>     result = self._proxy.get_stats()
>>>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1233, in __call__
>>>>>>>     return self.__send(self.__name, args)
>>>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1591, in __request
>>>>>>>     verbose=self.__verbose
>>>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1273, in request
>>>>>>>     return self.single_request(host, handler, request_body, verbose)
>>>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1301, in single_request
>>>>>>>     self.send_content(h, request_body)
>>>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1448, in send_content
>>>>>>>     connection.endheaders(request_body)
>>>>>>>   File "/usr/lib64/python2.7/httplib.py", line 1037, in endheaders
>>>>>>>     self._send_output(message_body)
>>>>>>>   File "/usr/lib64/python2.7/httplib.py", line 881, in _send_output
>>>>>>>     self.send(msg)
>>>>>>>   File "/usr/lib64/python2.7/httplib.py", line 843, in send
>>>>>>>     self.connect()
>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/unixrpc.py", line 52, in connect
>>>>>>>     self.sock.connect(base64.b16decode(self.host))
>>>>>>>   File "/usr/lib64/python2.7/socket.py", line 224, in meth
>>>>>>>     return getattr(self._sock,name)(*args)
>>>>>>> error: [Errno 2] No such file or directory
>>>>>>> Mar 14 12:04:42 node7. ovirt-ha-agent[4134]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent call last):
>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 131, in _run_agent
>>>>>>>     return action(he)
>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 55, in action_proper
>>>>>>>     return he.start_monitoring()
>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 437, in start_monitoring
>>>>>>>     self.publish(stopped)
>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 337, in publish
>>>>>>>     self._push_to_storage(blocks)
>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 708, in _push_to_storage
>>>>>>>     self._broker.put_stats_on_storage(self.host_id, blocks)
>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 113, in put_stats_on_storage
>>>>>>>     self._proxy.put_stats(host_id, xmlrpclib.Binary(data))
>>>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1233, in __call__
>>>>>>>     return self.__send(self.__name, args)
>>>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1591, in __request
>>>>>>>     verbose=self.__verbose
>>>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1273, in request
>>>>>>>     return self.single_request(host, handler, request_body, verbose)
>>>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1301, in single_request
>>>>>>>     self.send_content(h, request_body)
>>>>>>>   File "/usr/lib64/python2.7/xmlrpclib.py", line 1448, in send_content
>>>>>>>     connection.endheaders(request_body)
>>>>>>>   File "/usr/lib64/python2.7/httplib.py", line 1037, in endheaders
>>>>>>>     self._send_output(message_body)
>>>>>>>   File "/usr/lib64/python2.7/httplib.py", line 881, in _send_output
>>>>>>>     self.send(msg)
>>>>>>>   File "/usr/lib64/python2.7/httplib.py", line 843, in send
>>>>>>>     self.connect()
>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/unixrpc.py", line 52, in connect
>>>>>>>     self.sock.connect(base64.b16decode(self.host))
>>>>>>>   File "/usr/lib64/python2.7/socket.py", line 224, in meth
>>>>>>>     return getattr(self._sock,name)(*args)
>>>>>>> error: [Errno 2] No such file or directory
>>>>>>> Mar 14 12:04:42 node7. ovirt-ha-agent[4134]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Trying to restart agent
>>>>>>> Mar 14 12:04:42 node7. systemd[1]: ovirt-ha-agent.service: main process exited, code=exited, status=157/n/a
>>>>>>> Mar 14 12:04:42 node7. systemd[1]: Unit ovirt-ha-agent.service entered failed state.
>>>>>>> Mar 14 12:04:42 node7. systemd[1]: ovirt-ha-agent.service failed.
>>>>>>> Mar 14 12:04:52 node7. systemd[1]: ovirt-ha-agent.service holdoff time over, scheduling restart.
>>>>>>> Mar 14 12:04:52 node7. systemd[1]: Stopped oVirt Hosted Engine High Availability Monitoring Agent.
>>>>>>> Mar 14 12:04:52 node7. systemd[1]: Started oVirt Hosted Engine High Availability Monitoring Agent.
>>>>>>> Mar 14 12:04:52 node7. ovirt-ha-agent[31765]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Failed to start necessary monitors
>>>>>>> Mar 14 12:04:52 node7. ovirt-ha-agent[31765]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent call last):
>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 131, in _run_agent
>>>>>>>     return action(he)
>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 55, in action_proper
>>>>>>>     return he.start_monitoring()
>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 413, in start_monitoring
>>>>>>>     self._initialize_broker()
>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 537, in _initialize_broker
>>>>>>>     m.get('options', {}))
>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 86, in start_monitor
>>>>>>>     ).format(t=type, o=options, e=e)
>>>>>>> RequestError: brokerlink - failed to start monitor via ovirt-ha-broker: [Errno 2] No such file or directory, [monitor: 'ping', options: {'addr': '19
>>>>>>> Mar 14 12:04:52 node7. ovirt-ha-agent[31765]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Trying to restart agent
>>>>>>> Mar 14 12:04:52 node7. systemd[1]: ovirt-ha-agent.service: main process exited, code=exited, status=157/n/a
>>>>>>> Mar 14 12:04:52 node7. systemd[1]: Unit ovirt-ha-agent.service entered failed state.
>>>>>>> Mar 14 12:04:52 node7. systemd[1]: ovirt-ha-agent.service failed.
>>>>>>> Mar 14 12:05:02 node7. systemd[1]: ovirt-ha-agent.service holdoff time over, scheduling restart.
>>>>>>> Mar 14 12:05:02 node7. systemd[1]: Stopped oVirt Hosted Engine High Availability Monitoring Agent.
>>>>>>> Mar 14 12:05:02 node7. systemd[1]: Started oVirt Hosted Engine High Availability Monitoring Agent.
>>>>>>> Mar 14 12:06:55 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Failed to stop engine vm with /usr/sbin/hosted-engine --vm-poweroff: Co
>>>>>>> (code=1, message=Virtual machine does not exist: {'vmId': u'a492d2eb-1dfd-470d-a141-3e55d2189275'})
>>>>>>> Mar 14 12:06:55 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Failed to stop engine VM: Command VM.destroy with args {'vmID': 'a492d2
>>>>>>> (code=1, message=Virtual machine does not exist: {'vmId': u'a492d2eb-1dfd-470d-a141-3e55d2189275'})
>>>>>>> Mar 15 14:28:16 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>>>> Mar 15 14:28:36 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>>>> Mar 15 14:29:00 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>>>> Mar 15 14:29:22 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>>>> Mar 15 14:29:44 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>>>> Mar 15 14:30:06 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>>>> Mar 15 14:30:28 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>>>> Mar 15 14:30:50 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>>>> Mar 15 14:31:12 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>>>> Mar 15 14:31:33 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>>>> Mar 15 14:31:56 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>>>> Mar 15 14:32:18 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>>>> Mar 15 14:32:40 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>>>> Mar 15 14:33:02 node7. ovirt-ha-agent[31822]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>>>>>>> _______________________________________________
>>>>>>> Users mailing list -- [email protected]
>>>>>>> To unsubscribe send an email to [email protected]
>>>>>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>>>>>> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
>>>>>>> List Archives: https://lists.ovirt.org/archives/list/[email protected]/message/NS2SASAK66TEO3MZQYIW64HCDLXVTIL6/

