Hi,

> Just to clarify: you mean the host_id in
> /etc/ovirt-hosted-engine/hosted-engine.conf should match the spm_id,
> correct?
Exactly. Put the cluster to global maintenance first. Or kill all
agents (has the same effect).

Martin

On Fri, Jun 30, 2017 at 12:47 PM, cmc <[email protected]> wrote:
> Just to clarify: you mean the host_id in
> /etc/ovirt-hosted-engine/hosted-engine.conf should match the spm_id,
> correct?
>
> On Fri, Jun 30, 2017 at 9:47 AM, Martin Sivak <[email protected]> wrote:
>> Hi,
>>
>> cleaning metadata won't help in this case. Try transferring the
>> spm_ids you got from the engine to the proper hosted engine hosts so
>> the hosted engine ids match the spm_ids. Then restart all hosted
>> engine services. I would actually recommend restarting all hosts after
>> this change, but I have no idea how many VMs you have running.
>>
>> Martin
>>
>> On Thu, Jun 29, 2017 at 8:27 PM, cmc <[email protected]> wrote:
>>> Tried running a 'hosted-engine --clean-metadata" as per
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1350539, since
>>> ovirt-ha-agent was not running anyway, but it fails with the following
>>> error:
>>>
>>> ERROR:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Failed
>>> to start monitoring domain
>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>>> during domain acquisition
>>> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Traceback (most recent call last):
>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 191, in _run_agent
>>>     return action(he)
>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 67, in action_clean
>>>     return he.clean(options.force_cleanup)
>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 345, in clean
>>>     self._initialize_domain_monitor()
>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 823, in _initialize_domain_monitor
>>>     raise Exception(msg)
>>> Exception: Failed to start monitoring domain
>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>>> during domain acquisition
>>> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Trying to restart agent
>>> WARNING:ovirt_hosted_engine_ha.agent.agent.Agent:Restarting agent, attempt '0'
>>> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Too many errors
>>> occurred, giving up. Please review the log and consider filing a bug.
>>> INFO:ovirt_hosted_engine_ha.agent.agent.Agent:Agent shutting down
>>>
>>> On Thu, Jun 29, 2017 at 6:10 PM, cmc <[email protected]> wrote:
>>>> Actually, it looks like sanlock problems:
>>>>
>>>> "SanlockInitializationError: Failed to initialize sanlock, the
>>>> number of errors has exceeded the limit"
>>>>
>>>> On Thu, Jun 29, 2017 at 5:10 PM, cmc <[email protected]> wrote:
>>>>> Sorry, I am mistaken, two hosts failed for the agent with the following
>>>>> error:
>>>>>
>>>>> ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine
>>>>> ERROR Failed to start monitoring domain
>>>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout
>>>>> during domain acquisition
>>>>> ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine
>>>>> ERROR Shutting down the agent because of 3 failures in a row!
>>>>>
>>>>> What could cause these timeouts? Some other service not running?
>>>>>
>>>>> On Thu, Jun 29, 2017 at 5:03 PM, cmc <[email protected]> wrote:
>>>>>> Both services are up on all three hosts.
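
A minimal sketch of the sequence described above, assuming root shell access on the hosts and on the engine VM, and assuming the engine keeps the SPM ids in the vds_spm_id_map table of the default 'engine' database (table and column names are from memory and may differ between versions):

    # on one host: stop the HA agents from acting while the ids are changed
    hosted-engine --set-maintenance --mode=global

    # on the engine VM: list the SPM id the engine assigned to each host
    sudo -u postgres psql engine -c \
        "SELECT s.vds_name, m.vds_spm_id
           FROM vds_spm_id_map m
           JOIN vds_static s ON s.vds_id = m.vds_id;"

    # on each hosted-engine host: compare with the local hosted-engine id
    grep ^host_id= /etc/ovirt-hosted-engine/hosted-engine.conf
    # edit host_id= so it matches that host's vds_spm_id, then:
    systemctl restart ovirt-ha-broker ovirt-ha-agent

    # once every host has a distinct, matching id, leave maintenance
    hosted-engine --set-maintenance --mode=none
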
The broke logs just report: >>>>>> >>>>>> Thread-6549::INFO::2017-06-29 >>>>>> 17:01:51,481::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup) >>>>>> Connection established >>>>>> Thread-6549::INFO::2017-06-29 >>>>>> 17:01:51,483::listener::186::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) >>>>>> Connection closed >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Cam >>>>>> >>>>>> On Thu, Jun 29, 2017 at 4:00 PM, Martin Sivak <[email protected]> wrote: >>>>>>> Hi, >>>>>>> >>>>>>> please make sure that both ovirt-ha-agent and ovirt-ha-broker services >>>>>>> are restarted and up. The error says the agent can't talk to the >>>>>>> broker. Is there anything in the broker.log? >>>>>>> >>>>>>> Best regards >>>>>>> >>>>>>> Martin Sivak >>>>>>> >>>>>>> On Thu, Jun 29, 2017 at 4:42 PM, cmc <[email protected]> wrote: >>>>>>>> I've restarted those two services across all hosts, have taken the >>>>>>>> Hosted Engine host out of maintenance, and when I try to migrate the >>>>>>>> Hosted Engine over to another host, it reports that all three hosts >>>>>>>> 'did not satisfy internal filter HA because it is not a Hosted Engine >>>>>>>> host'. >>>>>>>> >>>>>>>> On the host that the Hosted Engine is currently on it reports in the >>>>>>>> agent.log: >>>>>>>> >>>>>>>> ovirt-ha-agent ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR >>>>>>>> Connection closed: Connection closed >>>>>>>> Jun 29 15:22:25 kvm-ldn-03 ovirt-ha-agent[12653]: ovirt-ha-agent >>>>>>>> ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR Exception >>>>>>>> getting service path: Connection closed >>>>>>>> Jun 29 15:22:25 kvm-ldn-03 ovirt-ha-agent[12653]: ovirt-ha-agent >>>>>>>> ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent >>>>>>>> call last): >>>>>>>> File >>>>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", >>>>>>>> line 191, in _run_agent >>>>>>>> return action(he) >>>>>>>> File >>>>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", >>>>>>>> line 64, in action_proper >>>>>>>> return >>>>>>>> he.start_monitoring() >>>>>>>> File >>>>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", >>>>>>>> line 411, in start_monitoring >>>>>>>> >>>>>>>> self._initialize_sanlock() >>>>>>>> File >>>>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", >>>>>>>> line 691, in _initialize_sanlock >>>>>>>> >>>>>>>> constants.SERVICE_TYPE + constants.LOCKSPACE_EXTENSION) >>>>>>>> File >>>>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", >>>>>>>> line 162, in get_service_path >>>>>>>> .format(str(e))) >>>>>>>> RequestError: Failed >>>>>>>> to get service path: Connection closed >>>>>>>> Jun 29 15:22:25 kvm-ldn-03 ovirt-ha-agent[12653]: ovirt-ha-agent >>>>>>>> ovirt_hosted_engine_ha.agent.agent.Agent ERROR Trying to restart agent >>>>>>>> >>>>>>>> On Thu, Jun 29, 2017 at 1:25 PM, Martin Sivak <[email protected]> >>>>>>>> wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> yep, you have to restart the ovirt-ha-agent and ovirt-ha-broker >>>>>>>>> services. >>>>>>>>> >>>>>>>>> The scheduling message just means that the host has score 0 or is not >>>>>>>>> reporting score at all. >>>>>>>>> >>>>>>>>> Martin >>>>>>>>> >>>>>>>>> On Thu, Jun 29, 2017 at 1:33 PM, cmc <[email protected]> wrote: >>>>>>>>>> Thanks Martin, do I have to restart anything? 
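
If it helps, the restart being discussed is just the two HA services on each hosted-engine host; one way to do it and watch both logs while the agent reconnects to the broker (log locations assumed to be the defaults under /var/log/ovirt-hosted-engine-ha) is:

    systemctl restart ovirt-ha-broker ovirt-ha-agent
    systemctl status ovirt-ha-broker ovirt-ha-agent
    tail -f /var/log/ovirt-hosted-engine-ha/broker.log \
            /var/log/ovirt-hosted-engine-ha/agent.log
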
When I try to use the >>>>>>>>>> 'migrate' operation, it complains that the other two hosts 'did not >>>>>>>>>> satisfy internal filter HA because it is not a Hosted Engine host..' >>>>>>>>>> (even though I reinstalled both these hosts with the 'deploy hosted >>>>>>>>>> engine' option, which suggests that something needs restarting. >>>>>>>>>> Should >>>>>>>>>> I worry about the sanlock errors, or will that be resolved by the >>>>>>>>>> change in host_id? >>>>>>>>>> >>>>>>>>>> Kind regards, >>>>>>>>>> >>>>>>>>>> Cam >>>>>>>>>> >>>>>>>>>> On Thu, Jun 29, 2017 at 12:22 PM, Martin Sivak <[email protected]> >>>>>>>>>> wrote: >>>>>>>>>>> Change the ids so they are distinct. I need to check if there is a >>>>>>>>>>> way >>>>>>>>>>> to read the SPM ids from the engine as using the same numbers would >>>>>>>>>>> be >>>>>>>>>>> the best. >>>>>>>>>>> >>>>>>>>>>> Martin >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Thu, Jun 29, 2017 at 12:46 PM, cmc <[email protected]> wrote: >>>>>>>>>>>> Is there any way of recovering from this situation? I'd prefer to >>>>>>>>>>>> fix >>>>>>>>>>>> the issue rather than re-deploy, but if there is no recovery path, >>>>>>>>>>>> I >>>>>>>>>>>> could perhaps try re-deploying the hosted engine. In which case, >>>>>>>>>>>> would >>>>>>>>>>>> the best option be to take a backup of the Hosted Engine, and then >>>>>>>>>>>> shut it down, re-initialise the SAN partition (or use another >>>>>>>>>>>> partition) and retry the deployment? Would it be better to use the >>>>>>>>>>>> older backup from the bare metal engine that I originally used, or >>>>>>>>>>>> use >>>>>>>>>>>> a backup from the Hosted Engine? I'm not sure if any VMs have been >>>>>>>>>>>> added since switching to Hosted Engine. >>>>>>>>>>>> >>>>>>>>>>>> Unfortunately I have very little time left to get this working >>>>>>>>>>>> before >>>>>>>>>>>> I have to hand it over for eval (by end of Friday). 
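
On the re-deploy question above: whichever backup ends up being restored, it is worth taking a fresh one from inside the current engine VM before touching the storage; a minimal sketch (file names are only examples) is:

    # inside the engine VM
    engine-backup --mode=backup --scope=all \
        --file=/root/engine-backup-$(date +%F).tar.gz \
        --log=/root/engine-backup-$(date +%F).log
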
>>>>>>>>>>>> >>>>>>>>>>>> Here are some log snippets from the cluster that are current >>>>>>>>>>>> >>>>>>>>>>>> In /var/log/vdsm/vdsm.log on the host that has the Hosted Engine: >>>>>>>>>>>> >>>>>>>>>>>> 2017-06-29 10:50:15,071+0100 INFO (monitor/207221b) >>>>>>>>>>>> [storage.SANLock] >>>>>>>>>>>> Acquiring host id for domain 207221b2-959b-426b-b945-18e1adfed62f >>>>>>>>>>>> (id: >>>>>>>>>>>> 3) (clusterlock:282) >>>>>>>>>>>> 2017-06-29 10:50:15,072+0100 ERROR (monitor/207221b) >>>>>>>>>>>> [storage.Monitor] >>>>>>>>>>>> Error acquiring host id 3 for domain >>>>>>>>>>>> 207221b2-959b-426b-b945-18e1adfed62f (monitor:558) >>>>>>>>>>>> Traceback (most recent call last): >>>>>>>>>>>> File "/usr/share/vdsm/storage/monitor.py", line 555, in >>>>>>>>>>>> _acquireHostId >>>>>>>>>>>> self.domain.acquireHostId(self.hostId, async=True) >>>>>>>>>>>> File "/usr/share/vdsm/storage/sd.py", line 790, in acquireHostId >>>>>>>>>>>> self._manifest.acquireHostId(hostId, async) >>>>>>>>>>>> File "/usr/share/vdsm/storage/sd.py", line 449, in acquireHostId >>>>>>>>>>>> self._domainLock.acquireHostId(hostId, async) >>>>>>>>>>>> File >>>>>>>>>>>> "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", >>>>>>>>>>>> line 297, in acquireHostId >>>>>>>>>>>> raise se.AcquireHostIdFailure(self._sdUUID, e) >>>>>>>>>>>> AcquireHostIdFailure: Cannot acquire host id: >>>>>>>>>>>> ('207221b2-959b-426b-b945-18e1adfed62f', SanlockException(22, >>>>>>>>>>>> 'Sanlock >>>>>>>>>>>> lockspace add failure', 'Invalid argument')) >>>>>>>>>>>> >>>>>>>>>>>> From /var/log/ovirt-hosted-engine-ha/agent.log on the same host: >>>>>>>>>>>> >>>>>>>>>>>> MainThread::ERROR::2017-06-19 >>>>>>>>>>>> 13:30:50,592::hosted_engine::822::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_domain_monitor) >>>>>>>>>>>> Failed to start monitoring domain >>>>>>>>>>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout >>>>>>>>>>>> during domain acquisition >>>>>>>>>>>> MainThread::WARNING::2017-06-19 >>>>>>>>>>>> 13:30:50,593::hosted_engine::469::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) >>>>>>>>>>>> Error while monitoring engine: Failed to start monitoring domain >>>>>>>>>>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout >>>>>>>>>>>> during domain acquisition >>>>>>>>>>>> MainThread::WARNING::2017-06-19 >>>>>>>>>>>> 13:30:50,593::hosted_engine::472::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) >>>>>>>>>>>> Unexpected error >>>>>>>>>>>> Traceback (most recent call last): >>>>>>>>>>>> File >>>>>>>>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", >>>>>>>>>>>> line 443, in start_monitoring >>>>>>>>>>>> self._initialize_domain_monitor() >>>>>>>>>>>> File >>>>>>>>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", >>>>>>>>>>>> line 823, in _initialize_domain_monitor >>>>>>>>>>>> raise Exception(msg) >>>>>>>>>>>> Exception: Failed to start monitoring domain >>>>>>>>>>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout >>>>>>>>>>>> during domain acquisition >>>>>>>>>>>> MainThread::ERROR::2017-06-19 >>>>>>>>>>>> 13:30:50,593::hosted_engine::485::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) >>>>>>>>>>>> Shutting down the agent because of 3 failures in a row! 
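
For the 'Sanlock lockspace add failure' above, it can help to see what sanlock itself thinks it holds before changing anything; a sketch, using the ids volume path that appears in the sanlock.log entries below:

    # lockspaces and resources this host currently holds
    sanlock client status

    # dump the delta leases on the hosted-engine ids volume to see
    # which host_id each registered host has written there
    sanlock direct dump /dev/207221b2-959b-426b-b945-18e1adfed62f/ids
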
>>>>>>>>>>>> >>>>>>>>>>>> From sanlock.log: >>>>>>>>>>>> >>>>>>>>>>>> 2017-06-29 11:17:06+0100 1194149 [2530]: add_lockspace >>>>>>>>>>>> 207221b2-959b-426b-b945-18e1adfed62f:3:/dev/207221b2-959b-426b-b945-18e1adfed62f/ids:0 >>>>>>>>>>>> conflicts with name of list1 s5 >>>>>>>>>>>> 207221b2-959b-426b-b945-18e1adfed62f:1:/dev/207221b2-959b-426b-b945-18e1adfed62f/ids:0 >>>>>>>>>>>> >>>>>>>>>>>> From the two other hosts: >>>>>>>>>>>> >>>>>>>>>>>> host 2: >>>>>>>>>>>> >>>>>>>>>>>> vdsm.log >>>>>>>>>>>> >>>>>>>>>>>> 2017-06-29 10:53:47,755+0100 ERROR (jsonrpc/4) >>>>>>>>>>>> [jsonrpc.JsonRpcServer] >>>>>>>>>>>> Internal server error (__init__:570) >>>>>>>>>>>> Traceback (most recent call last): >>>>>>>>>>>> File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", >>>>>>>>>>>> line >>>>>>>>>>>> 565, in _handle_request >>>>>>>>>>>> res = method(**params) >>>>>>>>>>>> File "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line >>>>>>>>>>>> 202, in _dynamicMethod >>>>>>>>>>>> result = fn(*methodArgs) >>>>>>>>>>>> File "/usr/share/vdsm/API.py", line 1454, in >>>>>>>>>>>> getAllVmIoTunePolicies >>>>>>>>>>>> io_tune_policies_dict = self._cif.getAllVmIoTunePolicies() >>>>>>>>>>>> File "/usr/share/vdsm/clientIF.py", line 448, in >>>>>>>>>>>> getAllVmIoTunePolicies >>>>>>>>>>>> 'current_values': v.getIoTune()} >>>>>>>>>>>> File "/usr/share/vdsm/virt/vm.py", line 2803, in getIoTune >>>>>>>>>>>> result = self.getIoTuneResponse() >>>>>>>>>>>> File "/usr/share/vdsm/virt/vm.py", line 2816, in >>>>>>>>>>>> getIoTuneResponse >>>>>>>>>>>> res = self._dom.blockIoTune( >>>>>>>>>>>> File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", >>>>>>>>>>>> line >>>>>>>>>>>> 47, in __getattr__ >>>>>>>>>>>> % self.vmid) >>>>>>>>>>>> NotConnectedError: VM u'a79e6b0e-fff4-4cba-a02c-4c00be151300' was >>>>>>>>>>>> not >>>>>>>>>>>> started yet or was shut down >>>>>>>>>>>> >>>>>>>>>>>> /var/log/ovirt-hosted-engine-ha/agent.log >>>>>>>>>>>> >>>>>>>>>>>> MainThread::INFO::2017-06-29 >>>>>>>>>>>> 10:56:33,636::ovf_store::103::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) >>>>>>>>>>>> Found OVF_STORE: imgUUID:222610db-7880-4f4f-8559-a3635fd73555, >>>>>>>>>>>> volUUID:c6e0d29b-eabf-4a09-a330-df54cfdd73f1 >>>>>>>>>>>> MainThread::INFO::2017-06-29 >>>>>>>>>>>> 10:56:33,926::ovf_store::112::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) >>>>>>>>>>>> Extracting Engine VM OVF from the OVF_STORE >>>>>>>>>>>> MainThread::INFO::2017-06-29 >>>>>>>>>>>> 10:56:33,938::ovf_store::119::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) >>>>>>>>>>>> OVF_STORE volume path: >>>>>>>>>>>> /rhev/data-center/mnt/blockSD/207221b2-959b-426b-b945-18e1adfed62f/images/222610db-7880-4f4f-8559-a3635fd73555/c6e0d29b-eabf-4a09-a330-df54cfdd73f1 >>>>>>>>>>>> MainThread::INFO::2017-06-29 >>>>>>>>>>>> 10:56:33,967::config::431::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) >>>>>>>>>>>> Found an OVF for HE VM, trying to convert >>>>>>>>>>>> MainThread::INFO::2017-06-29 >>>>>>>>>>>> 10:56:33,971::config::436::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) >>>>>>>>>>>> Got vm.conf from OVF_STORE >>>>>>>>>>>> MainThread::INFO::2017-06-29 >>>>>>>>>>>> 10:56:36,736::states::678::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score) >>>>>>>>>>>> Score is 0 due to unexpected vm shutdown at Thu Jun 29 10:53:59 >>>>>>>>>>>> 2017 >>>>>>>>>>>> MainThread::INFO::2017-06-29 >>>>>>>>>>>> 
10:56:36,736::hosted_engine::453::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) >>>>>>>>>>>> Current state EngineUnexpectedlyDown (score: 0) >>>>>>>>>>>> MainThread::INFO::2017-06-29 >>>>>>>>>>>> 10:56:46,772::config::485::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_vm_conf) >>>>>>>>>>>> Reloading vm.conf from the shared storage domain >>>>>>>>>>>> >>>>>>>>>>>> /var/log/messages: >>>>>>>>>>>> >>>>>>>>>>>> Jun 29 10:53:46 kvm-ldn-02 kernel: dd: sending ioctl 80306d02 to a >>>>>>>>>>>> partition! >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> host 1: >>>>>>>>>>>> >>>>>>>>>>>> /var/log/messages also in sanlock.log >>>>>>>>>>>> >>>>>>>>>>>> Jun 29 11:01:02 kvm-ldn-01 sanlock[2400]: 2017-06-29 11:01:02+0100 >>>>>>>>>>>> 678325 [9132]: s4531 delta_acquire host_id 1 busy1 1 2 1193177 >>>>>>>>>>>> 3d4ec963-8486-43a2-a7d9-afa82508f89f.kvm-ldn-03 >>>>>>>>>>>> Jun 29 11:01:03 kvm-ldn-01 sanlock[2400]: 2017-06-29 11:01:03+0100 >>>>>>>>>>>> 678326 [24159]: s4531 add_lockspace fail result -262 >>>>>>>>>>>> >>>>>>>>>>>> /var/log/ovirt-hosted-engine-ha/agent.log: >>>>>>>>>>>> >>>>>>>>>>>> MainThread::ERROR::2017-06-27 >>>>>>>>>>>> 15:21:01,143::hosted_engine::822::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_domain_monitor) >>>>>>>>>>>> Failed to start monitoring domain >>>>>>>>>>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout >>>>>>>>>>>> during domain acquisition >>>>>>>>>>>> MainThread::WARNING::2017-06-27 >>>>>>>>>>>> 15:21:01,144::hosted_engine::469::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) >>>>>>>>>>>> Error while monitoring engine: Failed to start monitoring domain >>>>>>>>>>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout >>>>>>>>>>>> during domain acquisition >>>>>>>>>>>> MainThread::WARNING::2017-06-27 >>>>>>>>>>>> 15:21:01,144::hosted_engine::472::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) >>>>>>>>>>>> Unexpected error >>>>>>>>>>>> Traceback (most recent call last): >>>>>>>>>>>> File >>>>>>>>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", >>>>>>>>>>>> line 443, in start_monitoring >>>>>>>>>>>> self._initialize_domain_monitor() >>>>>>>>>>>> File >>>>>>>>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", >>>>>>>>>>>> line 823, in _initialize_domain_monitor >>>>>>>>>>>> raise Exception(msg) >>>>>>>>>>>> Exception: Failed to start monitoring domain >>>>>>>>>>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout >>>>>>>>>>>> during domain acquisition >>>>>>>>>>>> MainThread::ERROR::2017-06-27 >>>>>>>>>>>> 15:21:01,144::hosted_engine::485::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) >>>>>>>>>>>> Shutting down the agent because of 3 failures in a row! 
>>>>>>>>>>>> MainThread::INFO::2017-06-27 >>>>>>>>>>>> 15:21:06,717::hosted_engine::848::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) >>>>>>>>>>>> VDSM domain monitor status: PENDING >>>>>>>>>>>> MainThread::INFO::2017-06-27 >>>>>>>>>>>> 15:21:09,335::hosted_engine::776::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_domain_monitor) >>>>>>>>>>>> Failed to stop monitoring domain >>>>>>>>>>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f): Storage domain is >>>>>>>>>>>> member of pool: u'domain=207221b2-959b-426b-b945-18e1adfed62f' >>>>>>>>>>>> MainThread::INFO::2017-06-27 >>>>>>>>>>>> 15:21:09,339::agent::144::ovirt_hosted_engine_ha.agent.agent.Agent::(run) >>>>>>>>>>>> Agent shutting down >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks for any help, >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Cam >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Wed, Jun 28, 2017 at 11:25 AM, cmc <[email protected]> wrote: >>>>>>>>>>>>> Hi Martin, >>>>>>>>>>>>> >>>>>>>>>>>>> yes, on two of the machines they have the same host_id. The other >>>>>>>>>>>>> has >>>>>>>>>>>>> a different host_id. >>>>>>>>>>>>> >>>>>>>>>>>>> To update since yesterday: I reinstalled and deployed Hosted >>>>>>>>>>>>> Engine on >>>>>>>>>>>>> the other host (so all three hosts in the cluster now have it >>>>>>>>>>>>> installed). The second one I deployed said it was able to host the >>>>>>>>>>>>> engine (unlike the first I reinstalled), so I tried putting the >>>>>>>>>>>>> host >>>>>>>>>>>>> with the Hosted Engine on it into maintenance to see if it would >>>>>>>>>>>>> migrate over. It managed to move all hosts but the Hosted Engine. >>>>>>>>>>>>> And >>>>>>>>>>>>> now the host that said it was able to host the engine says >>>>>>>>>>>>> 'unavailable due to HA score'. The host that it was trying to move >>>>>>>>>>>>> from is now in 'preparing for maintenance' for the last 12 hours. >>>>>>>>>>>>> >>>>>>>>>>>>> The summary is: >>>>>>>>>>>>> >>>>>>>>>>>>> kvm-ldn-01 - one of the original, pre-Hosted Engine hosts, >>>>>>>>>>>>> reinstalled >>>>>>>>>>>>> with 'Deploy Hosted Engine'. No icon saying it can host the Hosted >>>>>>>>>>>>> Hngine, host_id of '2' in >>>>>>>>>>>>> /etc/ovirt-hosted-engine/hosted-engine.conf. >>>>>>>>>>>>> 'add_lockspace' fails in sanlock.log >>>>>>>>>>>>> >>>>>>>>>>>>> kvm-ldn-02 - the other host that was pre-existing before Hosted >>>>>>>>>>>>> Engine >>>>>>>>>>>>> was created. Reinstalled with 'Deploy Hosted Engine'. Had an icon >>>>>>>>>>>>> saying that it was able to host the Hosted Engine, but after >>>>>>>>>>>>> migration >>>>>>>>>>>>> was attempted when putting kvm-ldn-03 into maintenance, it >>>>>>>>>>>>> reports: >>>>>>>>>>>>> 'unavailable due to HA score'. It has a host_id of '1' in >>>>>>>>>>>>> /etc/ovirt-hosted-engine/hosted-engine.conf. No errors in >>>>>>>>>>>>> sanlock.log >>>>>>>>>>>>> >>>>>>>>>>>>> kvm-ldn-03 - this was the host I deployed Hosted Engine on, which >>>>>>>>>>>>> was >>>>>>>>>>>>> not part of the original cluster. I restored the bare-metal engine >>>>>>>>>>>>> backup in the Hosted Engine on this host when deploying it, >>>>>>>>>>>>> without >>>>>>>>>>>>> error. It currently has the Hosted Engine on it (as the only VM >>>>>>>>>>>>> after >>>>>>>>>>>>> I put that host into maintenance to test the HA of Hosted Engine). >>>>>>>>>>>>> Sanlock log shows conflicts >>>>>>>>>>>>> >>>>>>>>>>>>> I will look through all the logs for any other errors. 
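
Given that summary, a quick way to see all three ids side by side (assuming root ssh from one convenient machine to the hosts named above):

    for h in kvm-ldn-01 kvm-ldn-02 kvm-ldn-03; do
        printf '%s: ' "$h"
        ssh root@"$h" grep ^host_id= /etc/ovirt-hosted-engine/hosted-engine.conf
    done
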
Please let >>>>>>>>>>>>> me >>>>>>>>>>>>> know if you need any logs or other clarification/information. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Campbell >>>>>>>>>>>>> >>>>>>>>>>>>> On Wed, Jun 28, 2017 at 9:25 AM, Martin Sivak <[email protected]> >>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>> >>>>>>>>>>>>>> can you please check the contents of >>>>>>>>>>>>>> /etc/ovirt-hosted-engine/hosted-engine.conf or >>>>>>>>>>>>>> /etc/ovirt-hosted-engine-ha/agent.conf (I am not sure which one >>>>>>>>>>>>>> it is >>>>>>>>>>>>>> right now) and search for host-id? >>>>>>>>>>>>>> >>>>>>>>>>>>>> Make sure the IDs are different. If they are not, then there is >>>>>>>>>>>>>> a bug somewhere. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Martin >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Tue, Jun 27, 2017 at 6:26 PM, cmc <[email protected]> wrote: >>>>>>>>>>>>>>> I see this on the host it is trying to migrate in >>>>>>>>>>>>>>> /var/log/sanlock: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2017-06-27 17:10:40+0100 527703 [2407]: s3528 lockspace >>>>>>>>>>>>>>> 207221b2-959b-426b-b945-18e1adfed62f:1:/dev/207221b2-959b-426b-b945-18e1adfed62f/ids:0 >>>>>>>>>>>>>>> 2017-06-27 17:13:00+0100 527843 [27446]: s3528 delta_acquire >>>>>>>>>>>>>>> host_id 1 >>>>>>>>>>>>>>> busy1 1 2 1042692 >>>>>>>>>>>>>>> 3d4ec963-8486-43a2-a7d9-afa82508f89f.kvm-ldn-03 >>>>>>>>>>>>>>> 2017-06-27 17:13:01+0100 527844 [2407]: s3528 add_lockspace >>>>>>>>>>>>>>> fail result -262 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The sanlock service is running. Why would this occur? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> C >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Tue, Jun 27, 2017 at 5:21 PM, cmc <[email protected]> wrote: >>>>>>>>>>>>>>>> Hi Martin, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks for the reply. I have done this, and the deployment >>>>>>>>>>>>>>>> completed >>>>>>>>>>>>>>>> without error. However, it still will not allow the Hosted >>>>>>>>>>>>>>>> Engine >>>>>>>>>>>>>>>> migrate to another host. The >>>>>>>>>>>>>>>> /etc/ovirt-hosted-engine/hosted-engine.conf got created ok on >>>>>>>>>>>>>>>> the host >>>>>>>>>>>>>>>> I re-installed, but the ovirt-ha-broker.service, though it >>>>>>>>>>>>>>>> starts, >>>>>>>>>>>>>>>> reports: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> --------------------8<------------------- >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Jun 27 14:58:26 kvm-ldn-01 systemd[1]: Starting oVirt Hosted >>>>>>>>>>>>>>>> Engine >>>>>>>>>>>>>>>> High Availability Communications Broker... 
>>>>>>>>>>>>>>>> Jun 27 14:58:27 kvm-ldn-01 ovirt-ha-broker[6101]: >>>>>>>>>>>>>>>> ovirt-ha-broker >>>>>>>>>>>>>>>> ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker >>>>>>>>>>>>>>>> ERROR >>>>>>>>>>>>>>>> Failed to read metadata from >>>>>>>>>>>>>>>> /rhev/data-center/mnt/blockSD/207221b2-959b-426b-b945-18e1adfed62f/ha_agent/hosted-engine.metadata >>>>>>>>>>>>>>>> Traceback >>>>>>>>>>>>>>>> (most >>>>>>>>>>>>>>>> recent call last): >>>>>>>>>>>>>>>> File >>>>>>>>>>>>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", >>>>>>>>>>>>>>>> line 129, in get_raw_stats_for_service_type >>>>>>>>>>>>>>>> f = >>>>>>>>>>>>>>>> os.open(path, direct_flag | os.O_RDONLY | os.O_SYNC) >>>>>>>>>>>>>>>> OSError: >>>>>>>>>>>>>>>> [Errno 2] >>>>>>>>>>>>>>>> No such file or directory: >>>>>>>>>>>>>>>> '/rhev/data-center/mnt/blockSD/207221b2-959b-426b-b945-18e1adfed62f/ha_agent/hosted-engine.metadata' >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> --------------------8<------------------- >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I checked the path, and it exists. I can run 'less -f' on it >>>>>>>>>>>>>>>> fine. The >>>>>>>>>>>>>>>> perms are slightly different on the host that is running the >>>>>>>>>>>>>>>> VM vs the >>>>>>>>>>>>>>>> one that is reporting errors (600 vs 660), ownership is >>>>>>>>>>>>>>>> vdsm:qemu. Is >>>>>>>>>>>>>>>> this a san locking issue? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks for any help, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Cam >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Tue, Jun 27, 2017 at 1:41 PM, Martin Sivak >>>>>>>>>>>>>>>> <[email protected]> wrote: >>>>>>>>>>>>>>>>>> Should it be? It was not in the instructions for the >>>>>>>>>>>>>>>>>> migration from >>>>>>>>>>>>>>>>>> bare-metal to Hosted VM >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The hosted engine will only migrate to hosts that have the >>>>>>>>>>>>>>>>> services >>>>>>>>>>>>>>>>> running. Please put one other host to maintenance and select >>>>>>>>>>>>>>>>> Hosted >>>>>>>>>>>>>>>>> engine action: DEPLOY in the reinstall dialog. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Best regards >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Martin Sivak >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Tue, Jun 27, 2017 at 1:23 PM, cmc <[email protected]> >>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>> I changed the 'os.other.devices.display.protocols.value.3.6 = >>>>>>>>>>>>>>>>>> spice/qxl,vnc/cirrus,vnc/qxl' line to have the same display >>>>>>>>>>>>>>>>>> protocols >>>>>>>>>>>>>>>>>> as 4 and the hosted engine now appears in the list of VMs. I >>>>>>>>>>>>>>>>>> am >>>>>>>>>>>>>>>>>> guessing the compatibility version was causing it to use the >>>>>>>>>>>>>>>>>> 3.6 >>>>>>>>>>>>>>>>>> version. However, I am still unable to migrate the engine VM >>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>> another host. When I try putting the host it is currently on >>>>>>>>>>>>>>>>>> into >>>>>>>>>>>>>>>>>> maintenance, it reports: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Error while executing action: Cannot switch the Host(s) to >>>>>>>>>>>>>>>>>> Maintenance mode. >>>>>>>>>>>>>>>>>> There are no available hosts capable of running the engine >>>>>>>>>>>>>>>>>> VM. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Running 'hosted-engine --vm-status' still shows 'Engine >>>>>>>>>>>>>>>>>> status: >>>>>>>>>>>>>>>>>> unknown stale-data'. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The ovirt-ha-broker service is only running on one host. It >>>>>>>>>>>>>>>>>> was set to >>>>>>>>>>>>>>>>>> 'disabled' in systemd. 
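
On the missing hosted-engine.metadata path above: on block storage the entries under ha_agent/ are normally just symlinks into the domain's images tree, created when the broker's storage is connected, so it is worth checking whether the links exist and point at readable volumes (a sketch, under that assumption):

    ls -l /rhev/data-center/mnt/blockSD/207221b2-959b-426b-b945-18e1adfed62f/ha_agent/

    # if the links are missing or dangling, reconnecting the hosted-engine
    # storage and restarting the HA services may recreate them
    hosted-engine --connect-storage
    systemctl restart ovirt-ha-broker ovirt-ha-agent
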
It won't start as there is no >>>>>>>>>>>>>>>>>> /etc/ovirt-hosted-engine/hosted-engine.conf on the other two >>>>>>>>>>>>>>>>>> hosts. >>>>>>>>>>>>>>>>>> Should it be? It was not in the instructions for the >>>>>>>>>>>>>>>>>> migration from >>>>>>>>>>>>>>>>>> bare-metal to Hosted VM >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Cam >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Thu, Jun 22, 2017 at 1:07 PM, cmc <[email protected]> >>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>> Hi Tomas, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> So in my >>>>>>>>>>>>>>>>>>> /usr/share/ovirt-engine/conf/osinfo-defaults.properties on >>>>>>>>>>>>>>>>>>> my >>>>>>>>>>>>>>>>>>> engine VM, I have: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> os.other.devices.display.protocols.value = >>>>>>>>>>>>>>>>>>> spice/qxl,vnc/vga,vnc/qxl,vnc/cirrus >>>>>>>>>>>>>>>>>>> os.other.devices.display.protocols.value.3.6 = >>>>>>>>>>>>>>>>>>> spice/qxl,vnc/cirrus,vnc/qxl >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> That seems to match - I assume since this is 4.1, the 3.6 >>>>>>>>>>>>>>>>>>> should not apply >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Is there somewhere else I should be looking? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Cam >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Thu, Jun 22, 2017 at 11:40 AM, Tomas Jelinek >>>>>>>>>>>>>>>>>>> <[email protected]> wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On Thu, Jun 22, 2017 at 12:38 PM, Michal Skrivanek >>>>>>>>>>>>>>>>>>>> <[email protected]> wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> > On 22 Jun 2017, at 12:31, Martin Sivak >>>>>>>>>>>>>>>>>>>>> > <[email protected]> wrote: >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> > Tomas, what fields are needed in a VM to pass the check >>>>>>>>>>>>>>>>>>>>> > that causes >>>>>>>>>>>>>>>>>>>>> > the following error? >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> >>>>> WARN >>>>>>>>>>>>>>>>>>>>> >>>>> [org.ovirt.engine.core.bll.exportimport.ImportVmCommand] >>>>>>>>>>>>>>>>>>>>> >>>>> (org.ovirt.thread.pool-6-thread-23) [] Validation >>>>>>>>>>>>>>>>>>>>> >>>>> of action >>>>>>>>>>>>>>>>>>>>> >>>>> 'ImportVm' >>>>>>>>>>>>>>>>>>>>> >>>>> failed for user SYSTEM. Reasons: VAR__ACTION__IMPORT >>>>>>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>>>>>> >>>>> ,VAR__TYPE__VM,ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> to match the OS and VM Display type;-) >>>>>>>>>>>>>>>>>>>>> Configuration is in osinfo….e.g. if that is import from >>>>>>>>>>>>>>>>>>>>> older releases on >>>>>>>>>>>>>>>>>>>>> Linux this is typically caused by the cahgen of cirrus to >>>>>>>>>>>>>>>>>>>>> vga for non-SPICE >>>>>>>>>>>>>>>>>>>>> VMs >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> yep, the default supported combinations for 4.0+ is this: >>>>>>>>>>>>>>>>>>>> os.other.devices.display.protocols.value = >>>>>>>>>>>>>>>>>>>> spice/qxl,vnc/vga,vnc/qxl,vnc/cirrus >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> > Thanks. >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> > On Thu, Jun 22, 2017 at 12:19 PM, cmc >>>>>>>>>>>>>>>>>>>>> > <[email protected]> wrote: >>>>>>>>>>>>>>>>>>>>> >> Hi Martin, >>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>>>>>>> >>> just as a random comment, do you still have the >>>>>>>>>>>>>>>>>>>>> >>> database backup from >>>>>>>>>>>>>>>>>>>>> >>> the bare metal -> VM attempt? 
It might be possible to >>>>>>>>>>>>>>>>>>>>> >>> just try again >>>>>>>>>>>>>>>>>>>>> >>> using it. Or in the worst case.. update the offending >>>>>>>>>>>>>>>>>>>>> >>> value there >>>>>>>>>>>>>>>>>>>>> >>> before restoring it to the new engine instance. >>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>> >> I still have the backup. I'd rather do the latter, as >>>>>>>>>>>>>>>>>>>>> >> re-running the >>>>>>>>>>>>>>>>>>>>> >> HE deployment is quite lengthy and involved (I have to >>>>>>>>>>>>>>>>>>>>> >> re-initialise >>>>>>>>>>>>>>>>>>>>> >> the FC storage each time). Do you know what the >>>>>>>>>>>>>>>>>>>>> >> offending value(s) >>>>>>>>>>>>>>>>>>>>> >> would be? Would it be in the Postgres DB or in a >>>>>>>>>>>>>>>>>>>>> >> config file >>>>>>>>>>>>>>>>>>>>> >> somewhere? >>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>> >> Cheers, >>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>> >> Cam >>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>> >>> Regards >>>>>>>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>>>>>>> >>> Martin Sivak >>>>>>>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>>>>>>> >>> On Thu, Jun 22, 2017 at 11:39 AM, cmc >>>>>>>>>>>>>>>>>>>>> >>> <[email protected]> wrote: >>>>>>>>>>>>>>>>>>>>> >>>> Hi Yanir, >>>>>>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>>>>>> >>>> Thanks for the reply. >>>>>>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>>>>>> >>>>> First of all, maybe a chain reaction of : >>>>>>>>>>>>>>>>>>>>> >>>>> WARN >>>>>>>>>>>>>>>>>>>>> >>>>> [org.ovirt.engine.core.bll.exportimport.ImportVmCommand] >>>>>>>>>>>>>>>>>>>>> >>>>> (org.ovirt.thread.pool-6-thread-23) [] Validation >>>>>>>>>>>>>>>>>>>>> >>>>> of action >>>>>>>>>>>>>>>>>>>>> >>>>> 'ImportVm' >>>>>>>>>>>>>>>>>>>>> >>>>> failed for user SYSTEM. Reasons: VAR__ACTION__IMPORT >>>>>>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>>>>>> >>>>> ,VAR__TYPE__VM,ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS >>>>>>>>>>>>>>>>>>>>> >>>>> is causing the hosted engine vm not to be set up >>>>>>>>>>>>>>>>>>>>> >>>>> correctly and >>>>>>>>>>>>>>>>>>>>> >>>>> further >>>>>>>>>>>>>>>>>>>>> >>>>> actions were made when the hosted engine vm wasnt >>>>>>>>>>>>>>>>>>>>> >>>>> in a stable state. >>>>>>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>>>>>> >>>>> As for now, are you trying to revert back to a >>>>>>>>>>>>>>>>>>>>> >>>>> previous/initial >>>>>>>>>>>>>>>>>>>>> >>>>> state ? >>>>>>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>>>>>> >>>> I'm not trying to revert it to a previous state for >>>>>>>>>>>>>>>>>>>>> >>>> now. This was a >>>>>>>>>>>>>>>>>>>>> >>>> migration from a bare metal engine, and it didn't >>>>>>>>>>>>>>>>>>>>> >>>> report any error >>>>>>>>>>>>>>>>>>>>> >>>> during the migration. I'd had some problems on my >>>>>>>>>>>>>>>>>>>>> >>>> first attempts at >>>>>>>>>>>>>>>>>>>>> >>>> this migration, whereby it never completed (due to a >>>>>>>>>>>>>>>>>>>>> >>>> proxy issue) but >>>>>>>>>>>>>>>>>>>>> >>>> I managed to resolve this. Do you know of a way to >>>>>>>>>>>>>>>>>>>>> >>>> get the Hosted >>>>>>>>>>>>>>>>>>>>> >>>> Engine VM into a stable state, without rebuilding >>>>>>>>>>>>>>>>>>>>> >>>> the entire cluster >>>>>>>>>>>>>>>>>>>>> >>>> from scratch (since I have a lot of VMs on it)? >>>>>>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>>>>>> >>>> Thanks for any help. 
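
For the display-protocol mismatch discussed above, rather than editing osinfo-defaults.properties in place, the engine is supposed to pick up overrides dropped into /etc/ovirt-engine/osinfo.conf.d/; a sketch of adding the 4.x protocol list at the 3.6 level as well (the file name is only an example, and the override mechanism itself is an assumption here, not something verified in this thread):

    printf '%s\n' \
      'os.other.devices.display.protocols.value.3.6 = spice/qxl,vnc/vga,vnc/qxl,vnc/cirrus' \
      > /etc/ovirt-engine/osinfo.conf.d/90-display-protocols.properties
    systemctl restart ovirt-engine
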
>>>>>>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>>>>>> >>>> Regards, >>>>>>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>>>>>> >>>> Cam >>>>>>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>>>>>> >>>>> Regards, >>>>>>>>>>>>>>>>>>>>> >>>>> Yanir >>>>>>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>>>>>> >>>>> On Wed, Jun 21, 2017 at 4:32 PM, cmc >>>>>>>>>>>>>>>>>>>>> >>>>> <[email protected]> wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>> Hi Jenny/Martin, >>>>>>>>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>> Any idea what I can do here? The hosted engine VM >>>>>>>>>>>>>>>>>>>>> >>>>>> has no log on any >>>>>>>>>>>>>>>>>>>>> >>>>>> host in /var/log/libvirt/qemu, and I fear that if >>>>>>>>>>>>>>>>>>>>> >>>>>> I need to put the >>>>>>>>>>>>>>>>>>>>> >>>>>> host into maintenance, e.g., to upgrade it that I >>>>>>>>>>>>>>>>>>>>> >>>>>> created it on >>>>>>>>>>>>>>>>>>>>> >>>>>> (which >>>>>>>>>>>>>>>>>>>>> >>>>>> I think is hosting it), or if it fails for any >>>>>>>>>>>>>>>>>>>>> >>>>>> reason, it won't get >>>>>>>>>>>>>>>>>>>>> >>>>>> migrated to another host, and I will not be able >>>>>>>>>>>>>>>>>>>>> >>>>>> to manage the >>>>>>>>>>>>>>>>>>>>> >>>>>> cluster. It seems to be a very dangerous position >>>>>>>>>>>>>>>>>>>>> >>>>>> to be in. >>>>>>>>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>> Cam >>>>>>>>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>> On Wed, Jun 21, 2017 at 11:48 AM, cmc >>>>>>>>>>>>>>>>>>>>> >>>>>> <[email protected]> wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>> Thanks Martin. The hosts are all part of the same >>>>>>>>>>>>>>>>>>>>> >>>>>>> cluster. >>>>>>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>> I get these errors in the engine.log on the >>>>>>>>>>>>>>>>>>>>> >>>>>>> engine: >>>>>>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>> 2017-06-19 03:28:05,030Z WARN >>>>>>>>>>>>>>>>>>>>> >>>>>>> [org.ovirt.engine.core.bll.exportimport.ImportVmCommand] >>>>>>>>>>>>>>>>>>>>> >>>>>>> (org.ovirt.thread.pool-6-thread-23) [] Validation >>>>>>>>>>>>>>>>>>>>> >>>>>>> of action >>>>>>>>>>>>>>>>>>>>> >>>>>>> 'ImportVm' >>>>>>>>>>>>>>>>>>>>> >>>>>>> failed for user SYST >>>>>>>>>>>>>>>>>>>>> >>>>>>> EM. 
Reasons: >>>>>>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>> VAR__ACTION__IMPORT,VAR__TYPE__VM,ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS >>>>>>>>>>>>>>>>>>>>> >>>>>>> 2017-06-19 03:28:05,030Z INFO >>>>>>>>>>>>>>>>>>>>> >>>>>>> [org.ovirt.engine.core.bll.exportimport.ImportVmCommand] >>>>>>>>>>>>>>>>>>>>> >>>>>>> (org.ovirt.thread.pool-6-thread-23) [] Lock freed >>>>>>>>>>>>>>>>>>>>> >>>>>>> to object >>>>>>>>>>>>>>>>>>>>> >>>>>>> 'EngineLock:{exclusiveLocks='[a >>>>>>>>>>>>>>>>>>>>> >>>>>>> 79e6b0e-fff4-4cba-a02c-4c00be151300=<VM, >>>>>>>>>>>>>>>>>>>>> >>>>>>> ACTION_TYPE_FAILED_VM_IS_BEING_IMPORTED$VmName >>>>>>>>>>>>>>>>>>>>> >>>>>>> HostedEngine>, >>>>>>>>>>>>>>>>>>>>> >>>>>>> HostedEngine=<VM_NAME, >>>>>>>>>>>>>>>>>>>>> >>>>>>> ACTION_TYPE_FAILED_NAME_ALREADY_USED>]', >>>>>>>>>>>>>>>>>>>>> >>>>>>> sharedLocks= >>>>>>>>>>>>>>>>>>>>> >>>>>>> '[a79e6b0e-fff4-4cba-a02c-4c00be151300=<REMOTE_VM, >>>>>>>>>>>>>>>>>>>>> >>>>>>> ACTION_TYPE_FAILED_VM_IS_BEING_IMPORTED$VmName >>>>>>>>>>>>>>>>>>>>> >>>>>>> HostedEngine>]'}' >>>>>>>>>>>>>>>>>>>>> >>>>>>> 2017-06-19 03:28:05,030Z ERROR >>>>>>>>>>>>>>>>>>>>> >>>>>>> [org.ovirt.engine.core.bll.HostedEngineImporter] >>>>>>>>>>>>>>>>>>>>> >>>>>>> (org.ovirt.thread.pool-6-thread-23) [] Failed >>>>>>>>>>>>>>>>>>>>> >>>>>>> importing the Hosted >>>>>>>>>>>>>>>>>>>>> >>>>>>> Engine VM >>>>>>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>> The sanlock.log reports conflicts on that same >>>>>>>>>>>>>>>>>>>>> >>>>>>> host, and a >>>>>>>>>>>>>>>>>>>>> >>>>>>> different >>>>>>>>>>>>>>>>>>>>> >>>>>>> error on the other hosts, not sure if they are >>>>>>>>>>>>>>>>>>>>> >>>>>>> related. >>>>>>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>> And this in the >>>>>>>>>>>>>>>>>>>>> >>>>>>> /var/log/ovirt-hosted-engine-ha/agent log on the >>>>>>>>>>>>>>>>>>>>> >>>>>>> host >>>>>>>>>>>>>>>>>>>>> >>>>>>> which I deployed the hosted engine VM on: >>>>>>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>> MainThread::ERROR::2017-06-19 >>>>>>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>> 13:09:49,743::ovf_store::124::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) >>>>>>>>>>>>>>>>>>>>> >>>>>>> Unable to extract HEVM OVF >>>>>>>>>>>>>>>>>>>>> >>>>>>> MainThread::ERROR::2017-06-19 >>>>>>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>> 13:09:49,743::config::445::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) >>>>>>>>>>>>>>>>>>>>> >>>>>>> Failed extracting VM OVF from the OVF_STORE >>>>>>>>>>>>>>>>>>>>> >>>>>>> volume, falling back >>>>>>>>>>>>>>>>>>>>> >>>>>>> to >>>>>>>>>>>>>>>>>>>>> >>>>>>> initial vm.conf >>>>>>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>> I've seen some of these issues reported in >>>>>>>>>>>>>>>>>>>>> >>>>>>> bugzilla, but they were >>>>>>>>>>>>>>>>>>>>> >>>>>>> for >>>>>>>>>>>>>>>>>>>>> >>>>>>> older versions of oVirt (and appear to be >>>>>>>>>>>>>>>>>>>>> >>>>>>> resolved). >>>>>>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>> I will install that package on the other two >>>>>>>>>>>>>>>>>>>>> >>>>>>> hosts, for which I >>>>>>>>>>>>>>>>>>>>> >>>>>>> will >>>>>>>>>>>>>>>>>>>>> >>>>>>> put them in maintenance as vdsm is installed as >>>>>>>>>>>>>>>>>>>>> >>>>>>> an upgrade. I >>>>>>>>>>>>>>>>>>>>> >>>>>>> guess >>>>>>>>>>>>>>>>>>>>> >>>>>>> restarting vdsm is a good idea after that? 
>>>>>>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>> Campbell >>>>>>>>>>>>>>>>>>>>> >>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>> On Wed, Jun 21, 2017 at 10:51 AM, Martin Sivak >>>>>>>>>>>>>>>>>>>>> >>>>>>> <[email protected]> >>>>>>>>>>>>>>>>>>>>> >>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>> Hi, >>>>>>>>>>>>>>>>>>>>> >>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>> you do not have to install it on all hosts. But >>>>>>>>>>>>>>>>>>>>> >>>>>>>> you should have >>>>>>>>>>>>>>>>>>>>> >>>>>>>> more >>>>>>>>>>>>>>>>>>>>> >>>>>>>> than one and ideally all hosted engine enabled >>>>>>>>>>>>>>>>>>>>> >>>>>>>> nodes should >>>>>>>>>>>>>>>>>>>>> >>>>>>>> belong to >>>>>>>>>>>>>>>>>>>>> >>>>>>>> the same engine cluster. >>>>>>>>>>>>>>>>>>>>> >>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>> Best regards >>>>>>>>>>>>>>>>>>>>> >>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>> Martin Sivak >>>>>>>>>>>>>>>>>>>>> >>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>> On Wed, Jun 21, 2017 at 11:29 AM, cmc >>>>>>>>>>>>>>>>>>>>> >>>>>>>> <[email protected]> wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>> Hi Jenny, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>> Does ovirt-hosted-engine-ha need to be >>>>>>>>>>>>>>>>>>>>> >>>>>>>>> installed across all >>>>>>>>>>>>>>>>>>>>> >>>>>>>>> hosts? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>> Could that be the reason it is failing to see >>>>>>>>>>>>>>>>>>>>> >>>>>>>>> it properly? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>> Cam >>>>>>>>>>>>>>>>>>>>> >>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>> On Mon, Jun 19, 2017 at 1:27 PM, cmc >>>>>>>>>>>>>>>>>>>>> >>>>>>>>> <[email protected]> wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> Hi Jenny, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> Logs are attached. I can see errors in there, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> but am unsure how >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> they >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> arose. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> Campbell >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> On Mon, Jun 19, 2017 at 12:29 PM, Evgenia Tokar >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> <[email protected]> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> From the output it looks like the agent is >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> down, try starting >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> it by >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> running: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> systemctl start ovirt-ha-agent. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> The engine is supposed to see the hosted >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> engine storage domain >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> and >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> import it >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> to the system, then it should import the >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> hosted engine vm. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> Can you attach the agent log from the host >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> (/var/log/ovirt-hosted-engine-ha/agent.log) >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> and the engine log from the engine vm >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> (/var/log/ovirt-engine/engine.log)? 
>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> Jenny >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> On Mon, Jun 19, 2017 at 12:41 PM, cmc >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> <[email protected]> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Hi Jenny, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> What version are you running? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> 4.1.2.2-1.el7.centos >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> For the hosted engine vm to be imported and >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> displayed in the >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> engine, you >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> must first create a master storage domain. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> To provide a bit more detail: this was a >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> migration of a >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> bare-metal >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> engine in an existing cluster to a hosted >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> engine VM for that >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> cluster. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> As part of this migration, I built an >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> entirely new host and >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> ran >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> 'hosted-engine --deploy' (followed these >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> instructions: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> http://www.ovirt.org/documentation/self-hosted/chap-Migrating_from_Bare_Metal_to_an_EL-Based_Self-Hosted_Environment/). >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> I restored the backup from the engine and it >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> completed >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> without any >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> errors. I didn't see any instructions >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> regarding a master >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> storage >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> domain in the page above. The cluster has >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> two existing master >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> storage >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> domains, one is fibre channel, which is up, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> and one ISO >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> domain, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> which >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> is currently offline. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> What do you mean the hosted engine commands >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> are failing? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> What >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> happens >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> when >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> you run hosted-engine --vm-status now? 
>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Interestingly, whereas when I ran it before, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> it exited with >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> no >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> output >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> and a return code of '1', it now reports: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> --== Host 1 status ==-- >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> conf_on_shared_storage : True >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Status up-to-date : False >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Hostname : >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> kvm-ldn-03.ldn.fscfc.co.uk >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Host ID : 1 >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Engine status : unknown >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> stale-data >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Score : 0 >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> stopped : True >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Local maintenance : False >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> crc32 : 0217f07b >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> local_conf_timestamp : 2911 >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Host timestamp : 2897 >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Extra metadata (valid at timestamp): >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> metadata_parse_version=1 >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> metadata_feature_version=1 >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> timestamp=2897 (Thu Jun 15 16:22:54 >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> 2017) >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> host-id=1 >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> score=0 >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> vm_conf_refresh_time=2911 (Thu Jun 15 >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> 16:23:08 2017) >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> conf_on_shared_storage=True >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> maintenance=False >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> state=AgentStopped >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> stopped=True >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Yet I can login to the web GUI fine. I guess >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> it is not HA due >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> being >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> in an unknown state currently? Does the >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> hosted-engine-ha rpm >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> need >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> be installed across all nodes in the >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> cluster, btw? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Thanks for the help, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Cam >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> Jenny Tokar >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> On Thu, Jun 15, 2017 at 6:32 PM, cmc >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> <[email protected]> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> I've migrated from a bare-metal engine to >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> a hosted engine. 
>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> There >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> were >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> no errors during the install, however, the >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> hosted engine >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> did not >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> get >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> started. I tried running: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> hosted-engine --status >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> on the host I deployed it on, and it >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> returns nothing (exit >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> code >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> is 1 >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> however). I could not ping it either. So I >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> tried starting >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> it via >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> 'hosted-engine --vm-start' and it returned: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> Virtual machine does not exist >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> But it then became available. I logged >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> into it >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> successfully. It >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> is not >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> in the list of VMs however. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> Any ideas why the hosted-engine commands >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> fail, and why it >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> is not >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> in >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> the list of virtual machines? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks for any help, >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> Cam >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> Users mailing list >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> [email protected] >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> http://lists.ovirt.org/mailman/listinfo/users >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>> _______________________________________________ >>>>>>>>>>>>>>>>>>>>> >>>>>>>>> Users mailing list >>>>>>>>>>>>>>>>>>>>> >>>>>>>>> [email protected] >>>>>>>>>>>>>>>>>>>>> >>>>>>>>> http://lists.ovirt.org/mailman/listinfo/users >>>>>>>>>>>>>>>>>>>>> >>>>>> _______________________________________________ >>>>>>>>>>>>>>>>>>>>> >>>>>> Users mailing list >>>>>>>>>>>>>>>>>>>>> >>>>>> [email protected] >>>>>>>>>>>>>>>>>>>>> >>>>>> http://lists.ovirt.org/mailman/listinfo/users >>>>>>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>>>>>> > _______________________________________________ >>>>>>>>>>>>>>>>>>>>> > Users mailing list >>>>>>>>>>>>>>>>>>>>> > [email protected] >>>>>>>>>>>>>>>>>>>>> > http://lists.ovirt.org/mailman/listinfo/users >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> _______________________________________________ Users mailing list [email protected] http://lists.ovirt.org/mailman/listinfo/users
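
One footnote on the very first message quoted above: as far as I can tell, '--status' is not one of the hosted-engine sub-commands in 4.1, which would explain the empty output and the exit code of 1; the forms seen working elsewhere in this thread are:

    hosted-engine --vm-status
    hosted-engine --vm-start
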

