So, to confirm: I can run hosted-engine --set-maintenance --mode=global from any node. By 'agents', you mean ovirt-ha-agent, right? And this shouldn't affect any running VMs, correct? Sorry for the questions, I just want to do this correctly and not make assumptions :)
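For reference, the id consistency Martin describes below (every hosted-engine host having a distinct host_id in /etc/ovirt-hosted-engine/hosted-engine.conf that matches the SPM id the engine recorded for it) can be sketched in plain Python. This is a rough illustration only, not an oVirt tool: the conf texts and the spm_ids mapping are made up, and the real spm_ids come from the engine (the vds_spm_id_map table).

```python
# Rough sketch (NOT an oVirt tool): check that every hosted-engine host
# has a distinct host_id and that it matches the engine's spm_id for it.
# The conf texts and spm_ids below are illustrative placeholders.

def parse_host_id(conf_text):
    """Pull host_id out of hosted-engine.conf-style key=value text."""
    for line in conf_text.splitlines():
        key, _, value = line.partition("=")
        if key.strip() == "host_id":
            return int(value.strip())
    return None

def find_mismatches(host_confs, spm_ids):
    """Map hostname -> (host_id, spm_id) for every host where they differ."""
    return {host: (parse_host_id(conf), spm_ids.get(host))
            for host, conf in host_confs.items()
            if parse_host_id(conf) != spm_ids.get(host)}

confs = {
    "kvm-ldn-01": "host_id=2\n",
    "kvm-ldn-02": "host_id=1\n",
    "kvm-ldn-03": "host_id=1\n",   # duplicate id -> sanlock conflicts
}
spm_ids = {"kvm-ldn-01": 1, "kvm-ldn-02": 2, "kvm-ldn-03": 3}  # made up

for host, (have, want) in sorted(find_mismatches(confs, spm_ids).items()):
    print("%s: host_id=%s but spm_id=%s" % (host, have, want))
```

Any host this flags would need its hosted-engine.conf edited and the HA services restarted, per the advice quoted below.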
Cheers, C On Fri, Jun 30, 2017 at 12:12 PM, Martin Sivak <[email protected]> wrote: > Hi, > >> Just to clarify: you mean the host_id in >> /etc/ovirt-hosted-engine/hosted-engine.conf should match the spm_id, >> correct? > > Exactly. > > Put the cluster to global maintenance first. Or kill all agents (has > the same effect). > > Martin > > On Fri, Jun 30, 2017 at 12:47 PM, cmc <[email protected]> wrote: >> Just to clarify: you mean the host_id in >> /etc/ovirt-hosted-engine/hosted-engine.conf should match the spm_id, >> correct? >> >> On Fri, Jun 30, 2017 at 9:47 AM, Martin Sivak <[email protected]> wrote: >>> Hi, >>> >>> cleaning metadata won't help in this case. Try transferring the >>> spm_ids you got from the engine to the proper hosted engine hosts so >>> the hosted engine ids match the spm_ids. Then restart all hosted >>> engine services. I would actually recommend restarting all hosts after >>> this change, but I have no idea how many VMs you have running. >>> >>> Martin >>> >>> On Thu, Jun 29, 2017 at 8:27 PM, cmc <[email protected]> wrote: >>>> Tried running a 'hosted-engine --clean-metadata" as per >>>> https://bugzilla.redhat.com/show_bug.cgi?id=1350539, since >>>> ovirt-ha-agent was not running anyway, but it fails with the following >>>> error: >>>> >>>> ERROR:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Failed >>>> to start monitoring domain >>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout >>>> during domain acquisition >>>> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Traceback (most recent >>>> call last): >>>> File >>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", >>>> line 191, in _run_agent >>>> return action(he) >>>> File >>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", >>>> line 67, in action_clean >>>> return he.clean(options.force_cleanup) >>>> File >>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", >>>> line 345, 
in clean >>>> self._initialize_domain_monitor() >>>> File >>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", >>>> line 823, in _initialize_domain_monitor >>>> raise Exception(msg) >>>> Exception: Failed to start monitoring domain >>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout >>>> during domain acquisition >>>> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Trying to restart agent >>>> WARNING:ovirt_hosted_engine_ha.agent.agent.Agent:Restarting agent, attempt >>>> '0' >>>> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Too many errors >>>> occurred, giving up. Please review the log and consider filing a bug. >>>> INFO:ovirt_hosted_engine_ha.agent.agent.Agent:Agent shutting down >>>> >>>> On Thu, Jun 29, 2017 at 6:10 PM, cmc <[email protected]> wrote: >>>>> Actually, it looks like sanlock problems: >>>>> >>>>> "SanlockInitializationError: Failed to initialize sanlock, the >>>>> number of errors has exceeded the limit" >>>>> >>>>> >>>>> >>>>> On Thu, Jun 29, 2017 at 5:10 PM, cmc <[email protected]> wrote: >>>>>> Sorry, I am mistaken, two hosts failed for the agent with the following >>>>>> error: >>>>>> >>>>>> ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine >>>>>> ERROR Failed to start monitoring domain >>>>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout >>>>>> during domain acquisition >>>>>> ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine >>>>>> ERROR Shutting down the agent because of 3 failures in a row! >>>>>> >>>>>> What could cause these timeouts? Some other service not running? >>>>>> >>>>>> On Thu, Jun 29, 2017 at 5:03 PM, cmc <[email protected]> wrote: >>>>>>> Both services are up on all three hosts. 
The broker logs just report: >>>>>>> >>>>>>> Thread-6549::INFO::2017-06-29 >>>>>>> 17:01:51,481::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup) >>>>>>> Connection established >>>>>>> Thread-6549::INFO::2017-06-29 >>>>>>> 17:01:51,483::listener::186::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) >>>>>>> Connection closed >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Cam >>>>>>> >>>>>>> On Thu, Jun 29, 2017 at 4:00 PM, Martin Sivak <[email protected]> wrote: >>>>>>>> Hi, >>>>>>>> >>>>>>>> please make sure that both ovirt-ha-agent and ovirt-ha-broker services >>>>>>>> are restarted and up. The error says the agent can't talk to the >>>>>>>> broker. Is there anything in the broker.log? >>>>>>>> >>>>>>>> Best regards >>>>>>>> >>>>>>>> Martin Sivak >>>>>>>> >>>>>>>> On Thu, Jun 29, 2017 at 4:42 PM, cmc <[email protected]> wrote: >>>>>>>>> I've restarted those two services across all hosts, have taken the >>>>>>>>> Hosted Engine host out of maintenance, and when I try to migrate the >>>>>>>>> Hosted Engine over to another host, it reports that all three hosts >>>>>>>>> 'did not satisfy internal filter HA because it is not a Hosted Engine >>>>>>>>> host'.
>>>>>>>>> >>>>>>>>> On the host that the Hosted Engine is currently on it reports in the >>>>>>>>> agent.log: >>>>>>>>> >>>>>>>>> ovirt-ha-agent ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR >>>>>>>>> Connection closed: Connection closed >>>>>>>>> Jun 29 15:22:25 kvm-ldn-03 ovirt-ha-agent[12653]: ovirt-ha-agent >>>>>>>>> ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR Exception >>>>>>>>> getting service path: Connection closed >>>>>>>>> Jun 29 15:22:25 kvm-ldn-03 ovirt-ha-agent[12653]: ovirt-ha-agent >>>>>>>>> ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent >>>>>>>>> call last): >>>>>>>>> File >>>>>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", >>>>>>>>> line 191, in _run_agent >>>>>>>>> return >>>>>>>>> action(he) >>>>>>>>> File >>>>>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", >>>>>>>>> line 64, in action_proper >>>>>>>>> return >>>>>>>>> he.start_monitoring() >>>>>>>>> File >>>>>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", >>>>>>>>> line 411, in start_monitoring >>>>>>>>> >>>>>>>>> self._initialize_sanlock() >>>>>>>>> File >>>>>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", >>>>>>>>> line 691, in _initialize_sanlock >>>>>>>>> >>>>>>>>> constants.SERVICE_TYPE + constants.LOCKSPACE_EXTENSION) >>>>>>>>> File >>>>>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", >>>>>>>>> line 162, in get_service_path >>>>>>>>> .format(str(e))) >>>>>>>>> RequestError: Failed >>>>>>>>> to get service path: Connection closed >>>>>>>>> Jun 29 15:22:25 kvm-ldn-03 ovirt-ha-agent[12653]: ovirt-ha-agent >>>>>>>>> ovirt_hosted_engine_ha.agent.agent.Agent ERROR Trying to restart agent >>>>>>>>> >>>>>>>>> On Thu, Jun 29, 2017 at 1:25 PM, Martin Sivak <[email protected]> >>>>>>>>> wrote: >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> yep, you have to restart the ovirt-ha-agent and 
ovirt-ha-broker >>>>>>>>>> services. >>>>>>>>>> >>>>>>>>>> The scheduling message just means that the host has score 0 or is not >>>>>>>>>> reporting score at all. >>>>>>>>>> >>>>>>>>>> Martin >>>>>>>>>> >>>>>>>>>> On Thu, Jun 29, 2017 at 1:33 PM, cmc <[email protected]> wrote: >>>>>>>>>>> Thanks Martin, do I have to restart anything? When I try to use the >>>>>>>>>>> 'migrate' operation, it complains that the other two hosts 'did not >>>>>>>>>>> satisfy internal filter HA because it is not a Hosted Engine host..' >>>>>>>>>>> (even though I reinstalled both these hosts with the 'deploy hosted >>>>>>>>>>> engine' option, which suggests that something needs restarting. >>>>>>>>>>> Should >>>>>>>>>>> I worry about the sanlock errors, or will that be resolved by the >>>>>>>>>>> change in host_id? >>>>>>>>>>> >>>>>>>>>>> Kind regards, >>>>>>>>>>> >>>>>>>>>>> Cam >>>>>>>>>>> >>>>>>>>>>> On Thu, Jun 29, 2017 at 12:22 PM, Martin Sivak <[email protected]> >>>>>>>>>>> wrote: >>>>>>>>>>>> Change the ids so they are distinct. I need to check if there is a >>>>>>>>>>>> way >>>>>>>>>>>> to read the SPM ids from the engine as using the same numbers >>>>>>>>>>>> would be >>>>>>>>>>>> the best. >>>>>>>>>>>> >>>>>>>>>>>> Martin >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Thu, Jun 29, 2017 at 12:46 PM, cmc <[email protected]> wrote: >>>>>>>>>>>>> Is there any way of recovering from this situation? I'd prefer to >>>>>>>>>>>>> fix >>>>>>>>>>>>> the issue rather than re-deploy, but if there is no recovery >>>>>>>>>>>>> path, I >>>>>>>>>>>>> could perhaps try re-deploying the hosted engine. In which case, >>>>>>>>>>>>> would >>>>>>>>>>>>> the best option be to take a backup of the Hosted Engine, and then >>>>>>>>>>>>> shut it down, re-initialise the SAN partition (or use another >>>>>>>>>>>>> partition) and retry the deployment? 
Would it be better to use the >>>>>>>>>>>>> older backup from the bare metal engine that I originally used, >>>>>>>>>>>>> or use >>>>>>>>>>>>> a backup from the Hosted Engine? I'm not sure if any VMs have been >>>>>>>>>>>>> added since switching to Hosted Engine. >>>>>>>>>>>>> >>>>>>>>>>>>> Unfortunately I have very little time left to get this working >>>>>>>>>>>>> before >>>>>>>>>>>>> I have to hand it over for eval (by end of Friday). >>>>>>>>>>>>> >>>>>>>>>>>>> Here are some log snippets from the cluster that are current >>>>>>>>>>>>> >>>>>>>>>>>>> In /var/log/vdsm/vdsm.log on the host that has the Hosted Engine: >>>>>>>>>>>>> >>>>>>>>>>>>> 2017-06-29 10:50:15,071+0100 INFO (monitor/207221b) >>>>>>>>>>>>> [storage.SANLock] >>>>>>>>>>>>> Acquiring host id for domain 207221b2-959b-426b-b945-18e1adfed62f >>>>>>>>>>>>> (id: >>>>>>>>>>>>> 3) (clusterlock:282) >>>>>>>>>>>>> 2017-06-29 10:50:15,072+0100 ERROR (monitor/207221b) >>>>>>>>>>>>> [storage.Monitor] >>>>>>>>>>>>> Error acquiring host id 3 for domain >>>>>>>>>>>>> 207221b2-959b-426b-b945-18e1adfed62f (monitor:558) >>>>>>>>>>>>> Traceback (most recent call last): >>>>>>>>>>>>> File "/usr/share/vdsm/storage/monitor.py", line 555, in >>>>>>>>>>>>> _acquireHostId >>>>>>>>>>>>> self.domain.acquireHostId(self.hostId, async=True) >>>>>>>>>>>>> File "/usr/share/vdsm/storage/sd.py", line 790, in acquireHostId >>>>>>>>>>>>> self._manifest.acquireHostId(hostId, async) >>>>>>>>>>>>> File "/usr/share/vdsm/storage/sd.py", line 449, in acquireHostId >>>>>>>>>>>>> self._domainLock.acquireHostId(hostId, async) >>>>>>>>>>>>> File >>>>>>>>>>>>> "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", >>>>>>>>>>>>> line 297, in acquireHostId >>>>>>>>>>>>> raise se.AcquireHostIdFailure(self._sdUUID, e) >>>>>>>>>>>>> AcquireHostIdFailure: Cannot acquire host id: >>>>>>>>>>>>> ('207221b2-959b-426b-b945-18e1adfed62f', SanlockException(22, >>>>>>>>>>>>> 'Sanlock >>>>>>>>>>>>> lockspace add failure', 'Invalid argument')) 
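The AcquireHostIdFailure quoted above is what a duplicate host_id looks like from vdsm's side: sanlock permits only one live holder per (lockspace, host_id) pair, so a second host presenting the same id is rejected. A toy model of that rule in plain Python (not sanlock's real API; hostnames are taken from this thread):

```python
# Toy model (plain Python, NOT sanlock's real API) of why two hosts
# configured with the same host_id cannot both join a storage domain's
# lockspace: only one live holder is allowed per (lockspace, host_id),
# so the second joiner fails -- compare the "Sanlock lockspace add
# failure" and "conflicts with name of list1" messages above.

class Lockspace:
    def __init__(self, sd_uuid):
        self.sd_uuid = sd_uuid
        self.holders = {}                    # host_id -> hostname

    def add_lockspace(self, host_id, hostname):
        holder = self.holders.get(host_id)
        if holder is not None and holder != hostname:
            raise RuntimeError("host_id %d busy: held by %s in %s"
                               % (host_id, holder, self.sd_uuid))
        self.holders[host_id] = hostname

ls = Lockspace("207221b2-959b-426b-b945-18e1adfed62f")
ls.add_lockspace(1, "kvm-ldn-03")            # first joiner with id 1 succeeds
try:
    ls.add_lockspace(1, "kvm-ldn-01")        # same id from another host fails
except RuntimeError as err:
    print(err)
```

Giving each host a distinct host_id (matching its spm_id) removes the collision, which is the fix being discussed in this thread.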
>>>>>>>>>>>>> >>>>>>>>>>>>> From /var/log/ovirt-hosted-engine-ha/agent.log on the same host: >>>>>>>>>>>>> >>>>>>>>>>>>> MainThread::ERROR::2017-06-19 >>>>>>>>>>>>> 13:30:50,592::hosted_engine::822::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_domain_monitor) >>>>>>>>>>>>> Failed to start monitoring domain >>>>>>>>>>>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout >>>>>>>>>>>>> during domain acquisition >>>>>>>>>>>>> MainThread::WARNING::2017-06-19 >>>>>>>>>>>>> 13:30:50,593::hosted_engine::469::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) >>>>>>>>>>>>> Error while monitoring engine: Failed to start monitoring domain >>>>>>>>>>>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout >>>>>>>>>>>>> during domain acquisition >>>>>>>>>>>>> MainThread::WARNING::2017-06-19 >>>>>>>>>>>>> 13:30:50,593::hosted_engine::472::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) >>>>>>>>>>>>> Unexpected error >>>>>>>>>>>>> Traceback (most recent call last): >>>>>>>>>>>>> File >>>>>>>>>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", >>>>>>>>>>>>> line 443, in start_monitoring >>>>>>>>>>>>> self._initialize_domain_monitor() >>>>>>>>>>>>> File >>>>>>>>>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", >>>>>>>>>>>>> line 823, in _initialize_domain_monitor >>>>>>>>>>>>> raise Exception(msg) >>>>>>>>>>>>> Exception: Failed to start monitoring domain >>>>>>>>>>>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout >>>>>>>>>>>>> during domain acquisition >>>>>>>>>>>>> MainThread::ERROR::2017-06-19 >>>>>>>>>>>>> 13:30:50,593::hosted_engine::485::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) >>>>>>>>>>>>> Shutting down the agent because of 3 failures in a row! 
>>>>>>>>>>>>> >>>>>>>>>>>>> From sanlock.log: >>>>>>>>>>>>> >>>>>>>>>>>>> 2017-06-29 11:17:06+0100 1194149 [2530]: add_lockspace >>>>>>>>>>>>> 207221b2-959b-426b-b945-18e1adfed62f:3:/dev/207221b2-959b-426b-b945-18e1adfed62f/ids:0 >>>>>>>>>>>>> conflicts with name of list1 s5 >>>>>>>>>>>>> 207221b2-959b-426b-b945-18e1adfed62f:1:/dev/207221b2-959b-426b-b945-18e1adfed62f/ids:0 >>>>>>>>>>>>> >>>>>>>>>>>>> From the two other hosts: >>>>>>>>>>>>> >>>>>>>>>>>>> host 2: >>>>>>>>>>>>> >>>>>>>>>>>>> vdsm.log >>>>>>>>>>>>> >>>>>>>>>>>>> 2017-06-29 10:53:47,755+0100 ERROR (jsonrpc/4) >>>>>>>>>>>>> [jsonrpc.JsonRpcServer] >>>>>>>>>>>>> Internal server error (__init__:570) >>>>>>>>>>>>> Traceback (most recent call last): >>>>>>>>>>>>> File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", >>>>>>>>>>>>> line >>>>>>>>>>>>> 565, in _handle_request >>>>>>>>>>>>> res = method(**params) >>>>>>>>>>>>> File "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line >>>>>>>>>>>>> 202, in _dynamicMethod >>>>>>>>>>>>> result = fn(*methodArgs) >>>>>>>>>>>>> File "/usr/share/vdsm/API.py", line 1454, in >>>>>>>>>>>>> getAllVmIoTunePolicies >>>>>>>>>>>>> io_tune_policies_dict = self._cif.getAllVmIoTunePolicies() >>>>>>>>>>>>> File "/usr/share/vdsm/clientIF.py", line 448, in >>>>>>>>>>>>> getAllVmIoTunePolicies >>>>>>>>>>>>> 'current_values': v.getIoTune()} >>>>>>>>>>>>> File "/usr/share/vdsm/virt/vm.py", line 2803, in getIoTune >>>>>>>>>>>>> result = self.getIoTuneResponse() >>>>>>>>>>>>> File "/usr/share/vdsm/virt/vm.py", line 2816, in >>>>>>>>>>>>> getIoTuneResponse >>>>>>>>>>>>> res = self._dom.blockIoTune( >>>>>>>>>>>>> File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", >>>>>>>>>>>>> line >>>>>>>>>>>>> 47, in __getattr__ >>>>>>>>>>>>> % self.vmid) >>>>>>>>>>>>> NotConnectedError: VM u'a79e6b0e-fff4-4cba-a02c-4c00be151300' was >>>>>>>>>>>>> not >>>>>>>>>>>>> started yet or was shut down >>>>>>>>>>>>> >>>>>>>>>>>>> /var/log/ovirt-hosted-engine-ha/agent.log >>>>>>>>>>>>> 
>>>>>>>>>>>>> MainThread::INFO::2017-06-29 >>>>>>>>>>>>> 10:56:33,636::ovf_store::103::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) >>>>>>>>>>>>> Found OVF_STORE: imgUUID:222610db-7880-4f4f-8559-a3635fd73555, >>>>>>>>>>>>> volUUID:c6e0d29b-eabf-4a09-a330-df54cfdd73f1 >>>>>>>>>>>>> MainThread::INFO::2017-06-29 >>>>>>>>>>>>> 10:56:33,926::ovf_store::112::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) >>>>>>>>>>>>> Extracting Engine VM OVF from the OVF_STORE >>>>>>>>>>>>> MainThread::INFO::2017-06-29 >>>>>>>>>>>>> 10:56:33,938::ovf_store::119::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) >>>>>>>>>>>>> OVF_STORE volume path: >>>>>>>>>>>>> /rhev/data-center/mnt/blockSD/207221b2-959b-426b-b945-18e1adfed62f/images/222610db-7880-4f4f-8559-a3635fd73555/c6e0d29b-eabf-4a09-a330-df54cfdd73f1 >>>>>>>>>>>>> MainThread::INFO::2017-06-29 >>>>>>>>>>>>> 10:56:33,967::config::431::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) >>>>>>>>>>>>> Found an OVF for HE VM, trying to convert >>>>>>>>>>>>> MainThread::INFO::2017-06-29 >>>>>>>>>>>>> 10:56:33,971::config::436::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) >>>>>>>>>>>>> Got vm.conf from OVF_STORE >>>>>>>>>>>>> MainThread::INFO::2017-06-29 >>>>>>>>>>>>> 10:56:36,736::states::678::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score) >>>>>>>>>>>>> Score is 0 due to unexpected vm shutdown at Thu Jun 29 10:53:59 >>>>>>>>>>>>> 2017 >>>>>>>>>>>>> MainThread::INFO::2017-06-29 >>>>>>>>>>>>> 10:56:36,736::hosted_engine::453::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) >>>>>>>>>>>>> Current state EngineUnexpectedlyDown (score: 0) >>>>>>>>>>>>> MainThread::INFO::2017-06-29 >>>>>>>>>>>>> 10:56:46,772::config::485::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_vm_conf) >>>>>>>>>>>>> Reloading vm.conf from the 
shared storage domain >>>>>>>>>>>>> >>>>>>>>>>>>> /var/log/messages: >>>>>>>>>>>>> >>>>>>>>>>>>> Jun 29 10:53:46 kvm-ldn-02 kernel: dd: sending ioctl 80306d02 to >>>>>>>>>>>>> a partition! >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> host 1: >>>>>>>>>>>>> >>>>>>>>>>>>> /var/log/messages also in sanlock.log >>>>>>>>>>>>> >>>>>>>>>>>>> Jun 29 11:01:02 kvm-ldn-01 sanlock[2400]: 2017-06-29 11:01:02+0100 >>>>>>>>>>>>> 678325 [9132]: s4531 delta_acquire host_id 1 busy1 1 2 1193177 >>>>>>>>>>>>> 3d4ec963-8486-43a2-a7d9-afa82508f89f.kvm-ldn-03 >>>>>>>>>>>>> Jun 29 11:01:03 kvm-ldn-01 sanlock[2400]: 2017-06-29 11:01:03+0100 >>>>>>>>>>>>> 678326 [24159]: s4531 add_lockspace fail result -262 >>>>>>>>>>>>> >>>>>>>>>>>>> /var/log/ovirt-hosted-engine-ha/agent.log: >>>>>>>>>>>>> >>>>>>>>>>>>> MainThread::ERROR::2017-06-27 >>>>>>>>>>>>> 15:21:01,143::hosted_engine::822::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_domain_monitor) >>>>>>>>>>>>> Failed to start monitoring domain >>>>>>>>>>>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout >>>>>>>>>>>>> during domain acquisition >>>>>>>>>>>>> MainThread::WARNING::2017-06-27 >>>>>>>>>>>>> 15:21:01,144::hosted_engine::469::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) >>>>>>>>>>>>> Error while monitoring engine: Failed to start monitoring domain >>>>>>>>>>>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout >>>>>>>>>>>>> during domain acquisition >>>>>>>>>>>>> MainThread::WARNING::2017-06-27 >>>>>>>>>>>>> 15:21:01,144::hosted_engine::472::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) >>>>>>>>>>>>> Unexpected error >>>>>>>>>>>>> Traceback (most recent call last): >>>>>>>>>>>>> File >>>>>>>>>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", >>>>>>>>>>>>> line 443, in start_monitoring >>>>>>>>>>>>> self._initialize_domain_monitor() >>>>>>>>>>>>> File >>>>>>>>>>>>> 
"/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", >>>>>>>>>>>>> line 823, in _initialize_domain_monitor >>>>>>>>>>>>> raise Exception(msg) >>>>>>>>>>>>> Exception: Failed to start monitoring domain >>>>>>>>>>>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout >>>>>>>>>>>>> during domain acquisition >>>>>>>>>>>>> MainThread::ERROR::2017-06-27 >>>>>>>>>>>>> 15:21:01,144::hosted_engine::485::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) >>>>>>>>>>>>> Shutting down the agent because of 3 failures in a row! >>>>>>>>>>>>> MainThread::INFO::2017-06-27 >>>>>>>>>>>>> 15:21:06,717::hosted_engine::848::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) >>>>>>>>>>>>> VDSM domain monitor status: PENDING >>>>>>>>>>>>> MainThread::INFO::2017-06-27 >>>>>>>>>>>>> 15:21:09,335::hosted_engine::776::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_domain_monitor) >>>>>>>>>>>>> Failed to stop monitoring domain >>>>>>>>>>>>> (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f): Storage domain is >>>>>>>>>>>>> member of pool: u'domain=207221b2-959b-426b-b945-18e1adfed62f' >>>>>>>>>>>>> MainThread::INFO::2017-06-27 >>>>>>>>>>>>> 15:21:09,339::agent::144::ovirt_hosted_engine_ha.agent.agent.Agent::(run) >>>>>>>>>>>>> Agent shutting down >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks for any help, >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Cam >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Wed, Jun 28, 2017 at 11:25 AM, cmc <[email protected]> wrote: >>>>>>>>>>>>>> Hi Martin, >>>>>>>>>>>>>> >>>>>>>>>>>>>> yes, on two of the machines they have the same host_id. The >>>>>>>>>>>>>> other has >>>>>>>>>>>>>> a different host_id. >>>>>>>>>>>>>> >>>>>>>>>>>>>> To update since yesterday: I reinstalled and deployed Hosted >>>>>>>>>>>>>> Engine on >>>>>>>>>>>>>> the other host (so all three hosts in the cluster now have it >>>>>>>>>>>>>> installed). 
The second one I deployed said it was able to host >>>>>>>>>>>>>> the >>>>>>>>>>>>>> engine (unlike the first I reinstalled), so I tried putting the >>>>>>>>>>>>>> host >>>>>>>>>>>>>> with the Hosted Engine on it into maintenance to see if it would >>>>>>>>>>>>>> migrate over. It managed to move all hosts but the Hosted >>>>>>>>>>>>>> Engine. And >>>>>>>>>>>>>> now the host that said it was able to host the engine says >>>>>>>>>>>>>> 'unavailable due to HA score'. The host that it was trying to >>>>>>>>>>>>>> move >>>>>>>>>>>>>> from is now in 'preparing for maintenance' for the last 12 hours. >>>>>>>>>>>>>> >>>>>>>>>>>>>> The summary is: >>>>>>>>>>>>>> >>>>>>>>>>>>>> kvm-ldn-01 - one of the original, pre-Hosted Engine hosts, >>>>>>>>>>>>>> reinstalled >>>>>>>>>>>>>> with 'Deploy Hosted Engine'. No icon saying it can host the >>>>>>>>>>>>>> Hosted >>>>>>>>>>>>>> Engine, host_id of '2' in >>>>>>>>>>>>>> /etc/ovirt-hosted-engine/hosted-engine.conf. >>>>>>>>>>>>>> 'add_lockspace' fails in sanlock.log >>>>>>>>>>>>>> >>>>>>>>>>>>>> kvm-ldn-02 - the other host that was pre-existing before Hosted >>>>>>>>>>>>>> Engine >>>>>>>>>>>>>> was created. Reinstalled with 'Deploy Hosted Engine'. Had an icon >>>>>>>>>>>>>> saying that it was able to host the Hosted Engine, but after >>>>>>>>>>>>>> migration >>>>>>>>>>>>>> was attempted when putting kvm-ldn-03 into maintenance, it >>>>>>>>>>>>>> reports: >>>>>>>>>>>>>> 'unavailable due to HA score'. It has a host_id of '1' in >>>>>>>>>>>>>> /etc/ovirt-hosted-engine/hosted-engine.conf. No errors in >>>>>>>>>>>>>> sanlock.log >>>>>>>>>>>>>> >>>>>>>>>>>>>> kvm-ldn-03 - this was the host I deployed Hosted Engine on, >>>>>>>>>>>>>> which was >>>>>>>>>>>>>> not part of the original cluster. I restored the bare-metal >>>>>>>>>>>>>> engine >>>>>>>>>>>>>> backup in the Hosted Engine on this host when deploying it, >>>>>>>>>>>>>> without >>>>>>>>>>>>>> error.
It currently has the Hosted Engine on it (as the only VM >>>>>>>>>>>>>> after >>>>>>>>>>>>>> I put that host into maintenance to test the HA of Hosted >>>>>>>>>>>>>> Engine). >>>>>>>>>>>>>> Sanlock log shows conflicts >>>>>>>>>>>>>> >>>>>>>>>>>>>> I will look through all the logs for any other errors. Please >>>>>>>>>>>>>> let me >>>>>>>>>>>>>> know if you need any logs or other clarification/information. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Campbell >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Wed, Jun 28, 2017 at 9:25 AM, Martin Sivak >>>>>>>>>>>>>> <[email protected]> wrote: >>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> can you please check the contents of >>>>>>>>>>>>>>> /etc/ovirt-hosted-engine/hosted-engine.conf or >>>>>>>>>>>>>>> /etc/ovirt-hosted-engine-ha/agent.conf (I am not sure which one >>>>>>>>>>>>>>> it is >>>>>>>>>>>>>>> right now) and search for host-id? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Make sure the IDs are different. If they are not, then there is >>>>>>>>>>>>>>> a bug somewhere. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Martin >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Tue, Jun 27, 2017 at 6:26 PM, cmc <[email protected]> wrote: >>>>>>>>>>>>>>>> I see this on the host it is trying to migrate in >>>>>>>>>>>>>>>> /var/log/sanlock: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 2017-06-27 17:10:40+0100 527703 [2407]: s3528 lockspace >>>>>>>>>>>>>>>> 207221b2-959b-426b-b945-18e1adfed62f:1:/dev/207221b2-959b-426b-b945-18e1adfed62f/ids:0 >>>>>>>>>>>>>>>> 2017-06-27 17:13:00+0100 527843 [27446]: s3528 delta_acquire >>>>>>>>>>>>>>>> host_id 1 >>>>>>>>>>>>>>>> busy1 1 2 1042692 >>>>>>>>>>>>>>>> 3d4ec963-8486-43a2-a7d9-afa82508f89f.kvm-ldn-03 >>>>>>>>>>>>>>>> 2017-06-27 17:13:01+0100 527844 [2407]: s3528 add_lockspace >>>>>>>>>>>>>>>> fail result -262 >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> The sanlock service is running. Why would this occur? 
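The "delta_acquire host_id 1 busy1" line quoted above answers the "why would this occur?" question: another host (named at the end of the busy line, kvm-ldn-03 here) already holds the delta lease for host_id 1 in that lockspace, so add_lockspace fails with result -262. A throwaway helper, not part of oVirt, for pulling the conflicting id and current holder out of such a log line:

```python
# Hypothetical helper: extract (host_id, holder_host) from a sanlock
# "delta_acquire ... busy" log line like the ones quoted in this thread.

SAMPLE = ("2017-06-27 17:13:00+0100 527843 [27446]: s3528 delta_acquire "
          "host_id 1 busy1 1 2 1042692 "
          "3d4ec963-8486-43a2-a7d9-afa82508f89f.kvm-ldn-03")

def busy_holder(line):
    """Return (host_id, hostname) of the current lease holder, or None."""
    tokens = line.split()
    if "delta_acquire" not in tokens or "host_id" not in tokens:
        return None
    host_id = int(tokens[tokens.index("host_id") + 1])
    # the last token is <generation-uuid>.<hostname> of the holder
    hostname = tokens[-1].rsplit(".", 1)[-1]
    return host_id, hostname

print(busy_holder(SAMPLE))   # -> (1, 'kvm-ldn-03')
```

In other words, the host being added is configured with the same host_id that kvm-ldn-03 is already using, consistent with the duplicate ids found in hosted-engine.conf.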
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> C >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Tue, Jun 27, 2017 at 5:21 PM, cmc <[email protected]> wrote: >>>>>>>>>>>>>>>>> Hi Martin, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks for the reply. I have done this, and the deployment >>>>>>>>>>>>>>>>> completed >>>>>>>>>>>>>>>>> without error. However, it still will not allow the Hosted >>>>>>>>>>>>>>>>> Engine >>>>>>>>>>>>>>>>> migrate to another host. The >>>>>>>>>>>>>>>>> /etc/ovirt-hosted-engine/hosted-engine.conf got created ok on >>>>>>>>>>>>>>>>> the host >>>>>>>>>>>>>>>>> I re-installed, but the ovirt-ha-broker.service, though it >>>>>>>>>>>>>>>>> starts, >>>>>>>>>>>>>>>>> reports: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> --------------------8<------------------- >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Jun 27 14:58:26 kvm-ldn-01 systemd[1]: Starting oVirt Hosted >>>>>>>>>>>>>>>>> Engine >>>>>>>>>>>>>>>>> High Availability Communications Broker... >>>>>>>>>>>>>>>>> Jun 27 14:58:27 kvm-ldn-01 ovirt-ha-broker[6101]: >>>>>>>>>>>>>>>>> ovirt-ha-broker >>>>>>>>>>>>>>>>> ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker >>>>>>>>>>>>>>>>> ERROR >>>>>>>>>>>>>>>>> Failed to read metadata from >>>>>>>>>>>>>>>>> /rhev/data-center/mnt/blockSD/207221b2-959b-426b-b945-18e1adfed62f/ha_agent/hosted-engine.metadata >>>>>>>>>>>>>>>>> Traceback >>>>>>>>>>>>>>>>> (most >>>>>>>>>>>>>>>>> recent call last): >>>>>>>>>>>>>>>>> File >>>>>>>>>>>>>>>>> "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", >>>>>>>>>>>>>>>>> line 129, in get_raw_stats_for_service_type >>>>>>>>>>>>>>>>> f = >>>>>>>>>>>>>>>>> os.open(path, direct_flag | os.O_RDONLY | os.O_SYNC) >>>>>>>>>>>>>>>>> OSError: >>>>>>>>>>>>>>>>> [Errno 2] >>>>>>>>>>>>>>>>> No such file or directory: >>>>>>>>>>>>>>>>> '/rhev/data-center/mnt/blockSD/207221b2-959b-426b-b945-18e1adfed62f/ha_agent/hosted-engine.metadata' >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> 
--------------------8<------------------- >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I checked the path, and it exists. I can run 'less -f' on it >>>>>>>>>>>>>>>>> fine. The >>>>>>>>>>>>>>>>> perms are slightly different on the host that is running the >>>>>>>>>>>>>>>>> VM vs the >>>>>>>>>>>>>>>>> one that is reporting errors (600 vs 660), ownership is >>>>>>>>>>>>>>>>> vdsm:qemu. Is >>>>>>>>>>>>>>>>> this a san locking issue? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks for any help, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Cam >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Tue, Jun 27, 2017 at 1:41 PM, Martin Sivak >>>>>>>>>>>>>>>>> <[email protected]> wrote: >>>>>>>>>>>>>>>>>>> Should it be? It was not in the instructions for the >>>>>>>>>>>>>>>>>>> migration from >>>>>>>>>>>>>>>>>>> bare-metal to Hosted VM >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The hosted engine will only migrate to hosts that have the >>>>>>>>>>>>>>>>>> services >>>>>>>>>>>>>>>>>> running. Please put one other host to maintenance and select >>>>>>>>>>>>>>>>>> Hosted >>>>>>>>>>>>>>>>>> engine action: DEPLOY in the reinstall dialog. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Best regards >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Martin Sivak >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Tue, Jun 27, 2017 at 1:23 PM, cmc <[email protected]> >>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>> I changed the 'os.other.devices.display.protocols.value.3.6 >>>>>>>>>>>>>>>>>>> = >>>>>>>>>>>>>>>>>>> spice/qxl,vnc/cirrus,vnc/qxl' line to have the same display >>>>>>>>>>>>>>>>>>> protocols >>>>>>>>>>>>>>>>>>> as 4 and the hosted engine now appears in the list of VMs. >>>>>>>>>>>>>>>>>>> I am >>>>>>>>>>>>>>>>>>> guessing the compatibility version was causing it to use >>>>>>>>>>>>>>>>>>> the 3.6 >>>>>>>>>>>>>>>>>>> version. However, I am still unable to migrate the engine >>>>>>>>>>>>>>>>>>> VM to >>>>>>>>>>>>>>>>>>> another host. 
When I try putting the host it is currently >>>>>>>>>>>>>>>>>>> on into >>>>>>>>>>>>>>>>>>> maintenance, it reports: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Error while executing action: Cannot switch the Host(s) to >>>>>>>>>>>>>>>>>>> Maintenance mode. >>>>>>>>>>>>>>>>>>> There are no available hosts capable of running the engine >>>>>>>>>>>>>>>>>>> VM. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Running 'hosted-engine --vm-status' still shows 'Engine >>>>>>>>>>>>>>>>>>> status: >>>>>>>>>>>>>>>>>>> unknown stale-data'. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> The ovirt-ha-broker service is only running on one host. It >>>>>>>>>>>>>>>>>>> was set to >>>>>>>>>>>>>>>>>>> 'disabled' in systemd. It won't start as there is no >>>>>>>>>>>>>>>>>>> /etc/ovirt-hosted-engine/hosted-engine.conf on the other >>>>>>>>>>>>>>>>>>> two hosts. >>>>>>>>>>>>>>>>>>> Should it be? It was not in the instructions for the >>>>>>>>>>>>>>>>>>> migration from >>>>>>>>>>>>>>>>>>> bare-metal to Hosted VM >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Cam >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Thu, Jun 22, 2017 at 1:07 PM, cmc <[email protected]> >>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>> Hi Tomas, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> So in my >>>>>>>>>>>>>>>>>>>> /usr/share/ovirt-engine/conf/osinfo-defaults.properties on >>>>>>>>>>>>>>>>>>>> my >>>>>>>>>>>>>>>>>>>> engine VM, I have: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> os.other.devices.display.protocols.value = >>>>>>>>>>>>>>>>>>>> spice/qxl,vnc/vga,vnc/qxl,vnc/cirrus >>>>>>>>>>>>>>>>>>>> os.other.devices.display.protocols.value.3.6 = >>>>>>>>>>>>>>>>>>>> spice/qxl,vnc/cirrus,vnc/qxl >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> That seems to match - I assume since this is 4.1, the 3.6 >>>>>>>>>>>>>>>>>>>> should not apply >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Is there somewhere else I should be looking? 
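The validation that produces ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS boils down to a membership test: the imported VM's (graphics, video) pair must appear in the osinfo display-protocols list that applies to its compatibility level. A minimal sketch of that check (not actual engine code), using the two value lines quoted in this thread:

```python
# Minimal sketch (not engine code) of the osinfo display-type check:
# a VM import passes only if its (graphics, video) pair appears in the
# configured os.other.devices.display.protocols value for its level.

def allowed_pairs(osinfo_value):
    """Parse 'spice/qxl,vnc/cirrus,...' into a set of (graphics, video)."""
    return {tuple(item.split("/")) for item in osinfo_value.split(",")}

pairs_36 = allowed_pairs("spice/qxl,vnc/cirrus,vnc/qxl")          # 3.6 line
pairs_40 = allowed_pairs("spice/qxl,vnc/vga,vnc/qxl,vnc/cirrus")  # 4.0+ line

vm = ("vnc", "vga")              # e.g. a VM whose video changed cirrus -> vga
print(vm in pairs_36)            # False -> import rejected under the 3.6 list
print(vm in pairs_40)            # True  -> accepted once the lists match
```

This matches the fix described above: extending the 3.6 value line with the missing pairs made the hosted engine VM importable.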
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Cam >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On Thu, Jun 22, 2017 at 11:40 AM, Tomas Jelinek >>>>>>>>>>>>>>>>>>>> <[email protected]> wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On Thu, Jun 22, 2017 at 12:38 PM, Michal Skrivanek >>>>>>>>>>>>>>>>>>>>> <[email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> > On 22 Jun 2017, at 12:31, Martin Sivak >>>>>>>>>>>>>>>>>>>>>> > <[email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> > Tomas, what fields are needed in a VM to pass the >>>>>>>>>>>>>>>>>>>>>> > check that causes >>>>>>>>>>>>>>>>>>>>>> > the following error? >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> >>>>> WARN >>>>>>>>>>>>>>>>>>>>>> >>>>> [org.ovirt.engine.core.bll.exportimport.ImportVmCommand] >>>>>>>>>>>>>>>>>>>>>> >>>>> (org.ovirt.thread.pool-6-thread-23) [] Validation >>>>>>>>>>>>>>>>>>>>>> >>>>> of action >>>>>>>>>>>>>>>>>>>>>> >>>>> 'ImportVm' >>>>>>>>>>>>>>>>>>>>>> >>>>> failed for user SYSTEM. Reasons: >>>>>>>>>>>>>>>>>>>>>> >>>>> VAR__ACTION__IMPORT >>>>>>>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>> ,VAR__TYPE__VM,ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> to match the OS and VM Display type;-) >>>>>>>>>>>>>>>>>>>>>> Configuration is in osinfo….e.g. 
if that is import from >>>>>>>>>>>>>>>>>>>>>> older releases on >>>>>>>>>>>>>>>>>>>>>> Linux this is typically caused by the cahgen of cirrus >>>>>>>>>>>>>>>>>>>>>> to vga for non-SPICE >>>>>>>>>>>>>>>>>>>>>> VMs >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> yep, the default supported combinations for 4.0+ is this: >>>>>>>>>>>>>>>>>>>>> os.other.devices.display.protocols.value = >>>>>>>>>>>>>>>>>>>>> spice/qxl,vnc/vga,vnc/qxl,vnc/cirrus >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> > Thanks. >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> > On Thu, Jun 22, 2017 at 12:19 PM, cmc >>>>>>>>>>>>>>>>>>>>>> > <[email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>> >> Hi Martin, >>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>>>>>>>> >>> just as a random comment, do you still have the >>>>>>>>>>>>>>>>>>>>>> >>> database backup from >>>>>>>>>>>>>>>>>>>>>> >>> the bare metal -> VM attempt? It might be possible >>>>>>>>>>>>>>>>>>>>>> >>> to just try again >>>>>>>>>>>>>>>>>>>>>> >>> using it. Or in the worst case.. update the >>>>>>>>>>>>>>>>>>>>>> >>> offending value there >>>>>>>>>>>>>>>>>>>>>> >>> before restoring it to the new engine instance. >>>>>>>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>>>>>>> >> I still have the backup. I'd rather do the latter, as >>>>>>>>>>>>>>>>>>>>>> >> re-running the >>>>>>>>>>>>>>>>>>>>>> >> HE deployment is quite lengthy and involved (I have >>>>>>>>>>>>>>>>>>>>>> >> to re-initialise >>>>>>>>>>>>>>>>>>>>>> >> the FC storage each time). Do you know what the >>>>>>>>>>>>>>>>>>>>>> >> offending value(s) >>>>>>>>>>>>>>>>>>>>>> >> would be? Would it be in the Postgres DB or in a >>>>>>>>>>>>>>>>>>>>>> >> config file >>>>>>>>>>>>>>>>>>>>>> >> somewhere? 
> Cheers,
>
> Cam
>
>> Regards,
>>
>> Martin Sivak

On Thu, Jun 22, 2017 at 11:39 AM, cmc <[email protected]> wrote:
> Hi Yanir,
>
> Thanks for the reply.
>
>> First of all, maybe a chain reaction of:
>>
>>   WARN [org.ovirt.engine.core.bll.exportimport.ImportVmCommand]
>>   (org.ovirt.thread.pool-6-thread-23) [] Validation of action 'ImportVm'
>>   failed for user SYSTEM. Reasons: VAR__ACTION__IMPORT
>>   ,VAR__TYPE__VM,ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS
>>
>> is causing the hosted engine vm not to be set up correctly, and further
>> actions were made when the hosted engine vm wasn't in a stable state.
>>
>> As for now, are you trying to revert back to a previous/initial state?
>
> I'm not trying to revert it to a previous state for now. This was a
> migration from a bare metal engine, and it didn't report any error
> during the migration. I'd had some problems on my first attempts at
> this migration, whereby it never completed (due to a proxy issue), but
> I managed to resolve this. Do you know of a way to get the Hosted
> Engine VM into a stable state, without rebuilding the entire cluster
> from scratch (since I have a lot of VMs on it)?
>
> Thanks for any help.
>
> Regards,
>
> Cam
>
>> Regards,
>> Yanir

On Wed, Jun 21, 2017 at 4:32 PM, cmc <[email protected]> wrote:
> Hi Jenny/Martin,
>
> Any idea what I can do here? The hosted engine VM has no log on any
> host in /var/log/libvirt/qemu, and I fear that if I need to put the
> host I created it on (which I think is hosting it) into maintenance,
> e.g. to upgrade it, or if it fails for any reason, the engine VM won't
> get migrated to another host and I will not be able to manage the
> cluster. It seems to be a very dangerous position to be in.
> Thanks,
>
> Cam

On Wed, Jun 21, 2017 at 11:48 AM, cmc <[email protected]> wrote:
> Thanks Martin. The hosts are all part of the same cluster.
>
> I get these errors in the engine.log on the engine:
>
>   2017-06-19 03:28:05,030Z WARN  [org.ovirt.engine.core.bll.exportimport.ImportVmCommand]
>   (org.ovirt.thread.pool-6-thread-23) [] Validation of action 'ImportVm'
>   failed for user SYSTEM. Reasons:
>   VAR__ACTION__IMPORT,VAR__TYPE__VM,ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS
>   2017-06-19 03:28:05,030Z INFO  [org.ovirt.engine.core.bll.exportimport.ImportVmCommand]
>   (org.ovirt.thread.pool-6-thread-23) [] Lock freed to object
>   'EngineLock:{exclusiveLocks='[a79e6b0e-fff4-4cba-a02c-4c00be151300=<VM,
>   ACTION_TYPE_FAILED_VM_IS_BEING_IMPORTED$VmName HostedEngine>,
>   HostedEngine=<VM_NAME, ACTION_TYPE_FAILED_NAME_ALREADY_USED>]',
>   sharedLocks='[a79e6b0e-fff4-4cba-a02c-4c00be151300=<REMOTE_VM,
>   ACTION_TYPE_FAILED_VM_IS_BEING_IMPORTED$VmName HostedEngine>]'}'
>   2017-06-19 03:28:05,030Z ERROR [org.ovirt.engine.core.bll.HostedEngineImporter]
>   (org.ovirt.thread.pool-6-thread-23) [] Failed importing the Hosted Engine VM
>
> The sanlock.log reports conflicts on that same host, and a different
> error on the other hosts; not sure if they are related.
>
> And this in the /var/log/ovirt-hosted-engine-ha/agent log on the host
> which I deployed the hosted engine VM on:
>
>   MainThread::ERROR::2017-06-19 13:09:49,743::ovf_store::124::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF)
>   Unable to extract HEVM OVF
>   MainThread::ERROR::2017-06-19 13:09:49,743::config::445::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store)
>   Failed extracting VM OVF from the OVF_STORE volume, falling back to initial vm.conf
>
> I've seen some of these issues reported in bugzilla, but they were for
> older versions of oVirt (and appear to be resolved).
>
> I will install that package on the other two hosts, which I will put
> into maintenance, as vdsm is installed as an upgrade. I guess
> restarting vdsm is a good idea after that?
>
> Thanks,
>
> Campbell

On Wed, Jun 21, 2017 at 10:51 AM, Martin Sivak <[email protected]> wrote:
> Hi,
>
> you do not have to install it on all hosts. But you should have more
> than one, and ideally all hosted engine enabled nodes should belong to
> the same engine cluster.
>
> Best regards
>
> Martin Sivak

On Wed, Jun 21, 2017 at 11:29 AM, cmc <[email protected]> wrote:
> Hi Jenny,
>
> Does ovirt-hosted-engine-ha need to be installed across all hosts?
> Could that be the reason it is failing to see it properly?
>
> Thanks,
>
> Cam

On Mon, Jun 19, 2017 at 1:27 PM, cmc <[email protected]> wrote:
> Hi Jenny,
>
> Logs are attached. I can see errors in there, but am unsure how they
> arose.
>
> Thanks,
>
> Campbell

On Mon, Jun 19, 2017 at 12:29 PM, Evgenia Tokar <[email protected]> wrote:
> From the output it looks like the agent is down; try starting it by
> running:
>
>   systemctl start ovirt-ha-agent
>
> The engine is supposed to see the hosted engine storage domain and
> import it to the system, then it should import the hosted engine vm.
>
> Can you attach the agent log from the host
> (/var/log/ovirt-hosted-engine-ha/agent.log) and the engine log from
> the engine vm (/var/log/ovirt-engine/engine.log)?
>
> Thanks,
> Jenny

On Mon, Jun 19, 2017 at 12:41 PM, cmc <[email protected]> wrote:
> Hi Jenny,
>
>> What version are you running?
>
> 4.1.2.2-1.el7.centos
>
>> For the hosted engine vm to be imported and displayed in the engine,
>> you must first create a master storage domain.
>
> To provide a bit more detail: this was a migration of a bare-metal
> engine in an existing cluster to a hosted engine VM for that cluster.
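As an aside, when gathering the logs Jenny asks for, a quick pre-filter helps spot the relevant failures before attaching anything. A sketch follows; on a real host the path would be /var/log/ovirt-hosted-engine-ha/agent.log, but the commands below work on a small captured sample (lines taken from this thread) so they are reproducible anywhere:

```shell
# Pre-filter the HA agent log for ERROR lines before attaching it.
# We demonstrate on a captured sample; the same grep works on the
# live /var/log/ovirt-hosted-engine-ha/agent.log.
sample=$(mktemp)
cat > "$sample" <<'EOF'
MainThread::INFO::2017-06-19 13:09:40,100::agent::57::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Agent starting
MainThread::ERROR::2017-06-19 13:09:49,743::ovf_store::124::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) Unable to extract HEVM OVF
MainThread::ERROR::2017-06-19 13:09:49,743::config::445::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Failed extracting VM OVF from the OVF_STORE volume, falling back to initial vm.conf
EOF

# Count the errors, then show them; attach the full log regardless.
error_count=$(grep -c '::ERROR::' "$sample")
grep '::ERROR::' "$sample"
```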
> As part of this migration, I built an entirely new host and ran
> 'hosted-engine --deploy' (followed these instructions:
> http://www.ovirt.org/documentation/self-hosted/chap-Migrating_from_Bare_Metal_to_an_EL-Based_Self-Hosted_Environment/).
> I restored the backup from the engine and it completed without any
> errors. I didn't see any instructions regarding a master storage
> domain in the page above. The cluster has two existing master storage
> domains: one is fibre channel, which is up, and one ISO domain, which
> is currently offline.
>
>> What do you mean the hosted engine commands are failing? What happens
>> when you run hosted-engine --vm-status now?
> Interestingly, whereas when I ran it before it exited with no output
> and a return code of '1', it now reports:
>
>   --== Host 1 status ==--
>
>   conf_on_shared_storage             : True
>   Status up-to-date                  : False
>   Hostname                           : kvm-ldn-03.ldn.fscfc.co.uk
>   Host ID                            : 1
>   Engine status                      : unknown stale-data
>   Score                              : 0
>   stopped                            : True
>   Local maintenance                  : False
>   crc32                              : 0217f07b
>   local_conf_timestamp               : 2911
>   Host timestamp                     : 2897
>   Extra metadata (valid at timestamp):
>       metadata_parse_version=1
>       metadata_feature_version=1
>       timestamp=2897 (Thu Jun 15 16:22:54 2017)
>       host-id=1
>       score=0
>       vm_conf_refresh_time=2911 (Thu Jun 15 16:23:08 2017)
>       conf_on_shared_storage=True
>       maintenance=False
>       state=AgentStopped
>       stopped=True
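For what it's worth, the "Extra metadata" section of that report is plain key=value text, so a monitoring script can pick out the agent state without reading the whole thing. A sketch against a saved copy of the output above (running `hosted-engine --vm-status` itself obviously requires a hosted-engine host):

```shell
# Extract state/score from a saved `hosted-engine --vm-status` report.
# The sample reproduces the relevant lines from the output above.
status=$(mktemp)
cat > "$status" <<'EOF'
--== Host 1 status ==--
Status up-to-date                  : False
Engine status                      : unknown stale-data
Score                              : 0
Extra metadata (valid at timestamp):
    state=AgentStopped
    stopped=True
EOF

# "state=..." lives in the indented metadata block; "Score" is a
# colon-separated report field.
state=$(sed -n 's/^[[:space:]]*state=//p' "$status")
score=$(awk -F': *' '/^Score/ {print $2}' "$status")
echo "agent state: $state, score: $score"
# -> agent state: AgentStopped, score: 0
```

A score of 0 with state=AgentStopped matches Jenny's diagnosis further down the thread: the agent simply isn't running on that host.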
> Yet I can login to the web GUI fine. I guess it is not HA due to being
> in an unknown state currently? Does the hosted-engine-ha rpm need to
> be installed across all nodes in the cluster, btw?
>
> Thanks for the help,
>
> Cam

On Thu, Jun 15, 2017 at 6:32 PM, cmc <[email protected]> wrote:
> Hi,
>
> I've migrated from a bare-metal engine to a hosted engine. There were
> no errors during the install; however, the hosted engine did not get
> started. I tried running:
>
>   hosted-engine --status
>
> on the host I deployed it on, and it returns nothing (exit code is 1,
> however). I could not ping it either. So I tried starting it via
> 'hosted-engine --vm-start' and it returned:
>
>   Virtual machine does not exist
>
> But it then became available. I logged into it successfully. It is not
> in the list of VMs, however.
>
> Any ideas why the hosted-engine commands fail, and why it is not in
> the list of virtual machines?
>
> Thanks for any help,
>
> Cam

_______________________________________________
Users mailing list
[email protected]
http://lists.ovirt.org/mailman/listinfo/users

