On Fri, Jun 29, 2018 at 12:32 PM fsoyer <[email protected]> wrote:
> Hi,
>
Hi Frank, thanks for the report.

> I must say it : I'm -totally- lost.
> To try to find a reason for this error, I've re-installed the first host
> from scratch - CentOS 7.5-1804, ovirt 4.2.3-1, gluster 3.12.9.
> The first attempt was made with only em1 declared. Result = SUCCESS, the
> install passed "Get local VM IP", then went through "Wait for the host to be up"
> without difficulty and waited at "Please specify the storage...".
> At this point I even noticed that I had forgotten to stop/disable
> NetworkManager, and that had no impact !
> So : I re-installed the host from scratch (yes, sometimes I'm a fool) to be
> absolutely sure that there is no problem coming from the preceding install.
> Now I declare em1 (10.0.0.230) and em2 (10.0.0.229, without gateway or
> DNS, for a future vmnetwork). NetworkManager off and disabled. Result =
> SUCCESS... Oo
> OK : Re-install host !! Now I declare, as I did some days ago, em1, em2
> and bond0 (em3+em4 with IP 192.168.0.30). Result : SUCCESS !!! Oo
>
> So I'm unable to say what happened Tuesday. Actually I see only two
> differences :
> - gluster is not active (I don't configure it, to go faster)

Did you try with gdeploy from the cockpit web UI?

> - the version of ovirt (ovirt-release, ovirt-host, appliance...) has
> slightly changed.

AFAIK we didn't have any specific fix for that kind of issue in recent weeks.

> I've no more time for another attempt at re-installing the host(s) with
> gluster activated, I must now move on as I need an operational system for
> other tasks with VMs this afternoon. So I leave the first host waiting for
> the end of the install in a screen, I re-install the two other hosts and
> activate gluster and the volumes on the 3 nodes. Then I'll finish the
> install on the gluster volume.
> I'll tell you if this finally works, but I hope so !
> However, I'm still in doubt about this problem. I have no explanation of
> what happened Tuesday, and this is really annoying...

Yes, the same for me.

> Maybe you have the ability to test on the same configuration (3 hosts with
> 2 NICs on the same network for ovirtmgmt and a future vmnetwork, and gluster
> on a separate network) to try to understand ?

In the past months we had a lot of successful tests also on complex network
environments; I'll try to reproduce on something really close to your env.

> Thank you for the time spent.
> Frank
>
> PS : to answer your question : yes, on Tuesday I
> ran ovirt-hosted-engine-cleanup between each attempt.
>
>
> On Thursday, June 28, 2018 16:26 CEST, Simone Tiraboschi <[email protected]>
> wrote:
>
> On Wed, Jun 27, 2018 at 5:48 PM [email protected] <[email protected]> wrote:
>
>> Hi again,
>> In fact, the hour in the file is exactly 2 hours earlier; I guess a timezone
>> problem (in the install process ?), as the file itself is correctly
>> timed at 11:17am (the correct hour here in France). So the messages are
>> in sync.
>>
>
> Yes, sorry, my fault.
> From the logs I don't see anything strange.
>
> Can you please try again on your environment and connect to the bootstrap
> VM via virsh console or VNC to check what's happening there?
>
> Did you also run ovirt-hosted-engine-cleanup between one attempt and the
> next?
>
>
>> -------- Original message --------
>> Subject : Re: [ovirt-users] Re: Install hosted-engine - Task Get local VM
>> IP failed
>> From : Simone Tiraboschi
>> To : [email protected]
>> Cc : users
>>
>> Hi,
>> HostedEngineLocal was started at 2018-06-26 09:17:26 but
>> /var/log/messages starts only at Jun 26 11:02:32.
>> Can you please reattach it for the relevant time frame?
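For what it's worth, a minimal sketch of grabbing just that window to attach (assuming the systemd journal still covers it - it is not persistent by default on CentOS 7; the timestamps are the ones from this deploy and the output file name is only an example):

  # everything logged between the local VM start and the failure
  journalctl --since "2018-06-26 09:00:00" --until "2018-06-26 12:00:00" > /tmp/messages-deploy.txt
  # or, from the flat file, keep the lines from 09:xx up to the first 12:xx entry
  awk '/^Jun 26 09:/,/^Jun 26 12:/' /var/log/messages >> /tmp/messages-deploy.txt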
>> On Wed, Jun 27, 2018 at 10:54 AM fsoyer <[email protected]> wrote:
>>
>>> Hi Simone,
>>> here are the relevant part of messages and the engine install log (that
>>> was the only file in /var/log/libvirt/qemu).
>>>
>>> Thanks for your time.
>>>
>>> Frank
>>>
>>> On Tuesday, June 26, 2018 11:43 CEST, Simone Tiraboschi <
>>> [email protected]> wrote:
>>>
>>> On Tue, Jun 26, 2018 at 11:39 AM fsoyer <[email protected]> wrote:
>>>
>>>> Well,
>>>> unfortunately, it was a "false positive". This morning I tried again,
>>>> with the idea that at some point the deploy would ask for the final
>>>> destination for the engine, and I would restart bond0 + gluster + the
>>>> engine volume at that moment.
>>>> Re-launching the deploy on the second "fresh" host (the first one, after
>>>> all the errors yesterday, was left in a doubtful state) with em2 and gluster+bond0
>>>> off :
>>>>
>>>> # ip a
>>>> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
>>>> group default qlen 1000
>>>> link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>>>> inet 127.0.0.1/8 scope host lo
>>>> valid_lft forever preferred_lft forever
>>>> inet6 ::1/128 scope host
>>>> valid_lft forever preferred_lft forever
>>>> 2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP
>>>> group default qlen 1000
>>>> link/ether e0:db:55:15:f0:f0 brd ff:ff:ff:ff:ff:ff
>>>> inet 10.0.0.227/8 brd 10.255.255.255 scope global em1
>>>> valid_lft forever preferred_lft forever
>>>> inet6 fe80::e2db:55ff:fe15:f0f0/64 scope link
>>>> valid_lft forever preferred_lft forever
>>>> 3: em2: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group
>>>> default qlen 1000
>>>> link/ether e0:db:55:15:f0:f1 brd ff:ff:ff:ff:ff:ff
>>>> 4: em3: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group
>>>> default qlen 1000
>>>> link/ether e0:db:55:15:f0:f2 brd ff:ff:ff:ff:ff:ff
>>>> 5: em4: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group
>>>> default qlen 1000
>>>> link/ether e0:db:55:15:f0:f3 brd ff:ff:ff:ff:ff:ff
>>>> 6: bond0: <BROADCAST,MULTICAST,MASTER> mtu 9000 qdisc noqueue state
>>>> DOWN group default qlen 1000
>>>> link/ether 3a:ab:a2:f2:38:5c brd ff:ff:ff:ff:ff:ff
>>>>
>>>> # ip r
>>>> default via 10.0.1.254 dev em1
>>>> 10.0.0.0/8 dev em1 proto kernel scope link src 10.0.0.227
>>>> 169.254.0.0/16 dev em1 scope link metric 1002
>>>>
>>>> ... does NOT work this morning
>>>>
>>>> [ INFO ] TASK [Get local VM IP]
>>>> [ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 50, "changed":
>>>> true, "cmd": "virsh -r net-dhcp-leases default | grep -i 00:16:3e:01:c6:32
>>>> | awk '{ print $5 }' | cut -f1 -d'/'", "delta": "0:00:00.083587", "end":
>>>> "2018-06-26 11:26:07.581706", "rc": 0, "start": "2018-06-26
>>>> 11:26:07.498119", "stderr": "", "stderr_lines": [], "stdout": "",
>>>> "stdout_lines": []}
>>>>
>>>> I'm sure that the network was the same yesterday when my attempt
>>>> finally passed the "get local vm ip". Why not today ?
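For reference, the check that this task loops on can be reproduced by hand; a minimal sketch (00:16:3e:01:c6:32 is the MAC taken from the error above, so substitute the one reported in your own run):

  # make sure the libvirt 'default' NAT network exists and is active
  virsh -r net-list --all
  # its definition shows the bridge (virbr0) and the dhcp range served by dnsmasq
  virsh -r net-dumpxml default
  # poll the leases the same way the ansible task does
  MAC=00:16:3e:01:c6:32
  for i in $(seq 1 50); do
      virsh -r net-dhcp-leases default | grep -i "$MAC" | awk '{ print $5 }' | cut -f1 -d'/'
      sleep 1
  done

If this never prints anything while the local VM is running, the guest simply never obtained a lease from dnsmasq on virbr0, which is exactly what the task timeout means.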
>>>> After the error, the network was :
>>>>
>>>> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
>>>> group default qlen 1000
>>>> link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>>>> inet 127.0.0.1/8 scope host lo
>>>> valid_lft forever preferred_lft forever
>>>> inet6 ::1/128 scope host
>>>> valid_lft forever preferred_lft forever
>>>> 2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP
>>>> group default qlen 1000
>>>> link/ether e0:db:55:15:f0:f0 brd ff:ff:ff:ff:ff:ff
>>>> inet 10.0.0.227/8 brd 10.255.255.255 scope global em1
>>>> valid_lft forever preferred_lft forever
>>>> inet6 fe80::e2db:55ff:fe15:f0f0/64 scope link
>>>> valid_lft forever preferred_lft forever
>>>> 3: em2: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group
>>>> default qlen 1000
>>>> link/ether e0:db:55:15:f0:f1 brd ff:ff:ff:ff:ff:ff
>>>> 4: em3: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group
>>>> default qlen 1000
>>>> link/ether e0:db:55:15:f0:f2 brd ff:ff:ff:ff:ff:ff
>>>> 5: em4: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group
>>>> default qlen 1000
>>>> link/ether e0:db:55:15:f0:f3 brd ff:ff:ff:ff:ff:ff
>>>> 6: bond0: <BROADCAST,MULTICAST,MASTER> mtu 9000 qdisc noqueue state
>>>> DOWN group default qlen 1000
>>>> link/ether 3a:ab:a2:f2:38:5c brd ff:ff:ff:ff:ff:ff
>>>> 7: virbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
>>>> state UP group default qlen 1000
>>>> link/ether 52:54:00:ae:8d:93 brd ff:ff:ff:ff:ff:ff
>>>> inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
>>>> valid_lft forever preferred_lft forever
>>>> 8: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master
>>>> virbr0 state DOWN group default qlen 1000
>>>> link/ether 52:54:00:ae:8d:93 brd ff:ff:ff:ff:ff:ff
>>>> 9: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
>>>> master virbr0 state UNKNOWN group default qlen 1000
>>>> link/ether fe:16:3e:01:c6:32 brd ff:ff:ff:ff:ff:ff
>>>> inet6 fe80::fc16:3eff:fe01:c632/64 scope link
>>>> valid_lft forever preferred_lft forever
>>>>
>>>> # ip r
>>>> default via 10.0.1.254 dev em1
>>>> 10.0.0.0/8 dev em1 proto kernel scope link src 10.0.0.227
>>>> 169.254.0.0/16 dev em1 scope link metric 1002
>>>> 192.168.122.0/24 dev virbr0 proto kernel scope link src 192.168.122.1
>>>>
>>>> So, finally, I have no idea why this happens :(((
>>>
>>> Can you please attach /var/log/messages and /var/log/libvirt/qemu/* ?
>>>
>>>> On Tuesday, June 26, 2018 09:21 CEST, Simone Tiraboschi <
>>>> [email protected]> wrote:
>>>>
>>>> On Mon, Jun 25, 2018 at 6:32 PM fsoyer <[email protected]> wrote:
>>>>
>>>>> Well, answering myself with more information.
>>>>> Thinking that the network was part of the problem, I tried to stop
>>>>> the gluster volumes, stop gluster on the host, and stop bond0.
>>>>> So, the host now had just em1 with one IP.
>>>>> And... The winner is... Yes : the install passed the "[Get local VM
>>>>> IP]" and continued !!
>>>>>
>>>>> I hit ctrl-c, restarted bond0, restarted the deploy : it crashed. So it
>>>>> seems that more than one network is the problem. But ! How do I install
>>>>> the engine on gluster on a separate - bonded - jumbo network in this case ???
>>>>>
>>>>> Can you reproduce this on your side ?
>>>>
>>>> Can you please attach the output of 'ip a' in both cases?
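To make that comparison easy, something as simple as this, run before the deploy and again right after the failure, would be enough (a trivial sketch; the file names are only examples):

  # with bond0 and gluster up, before starting the deploy
  ip a > /tmp/ip-a.before.txt
  ip r > /tmp/ip-r.before.txt
  # right after "Get local VM IP" fails
  ip a > /tmp/ip-a.after.txt
  ip r > /tmp/ip-r.after.txt
  # what actually changed
  diff -u /tmp/ip-a.before.txt /tmp/ip-a.after.txt
  diff -u /tmp/ip-r.before.txt /tmp/ip-r.after.txt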
>>>>> Frank
>>>>>
>>>>> On Monday, June 25, 2018 16:50 CEST, "fsoyer" <[email protected]> wrote:
>>>>>
>>>>> Hi staff,
>>>>> Installing a fresh ovirt - CentOS 7.5.1804 up to date, ovirt version :
>>>>> # rpm -qa | grep ovirt
>>>>> ovirt-hosted-engine-ha-2.2.11-1.el7.centos.noarch
>>>>> ovirt-imageio-common-1.3.1.2-0.el7.centos.noarch
>>>>> ovirt-host-dependencies-4.2.2-2.el7.centos.x86_64
>>>>> ovirt-vmconsole-1.0.5-4.el7.centos.noarch
>>>>> ovirt-provider-ovn-driver-1.2.10-1.el7.centos.noarch
>>>>> ovirt-hosted-engine-setup-2.2.20-1.el7.centos.noarch
>>>>> ovirt-engine-appliance-4.2-20180504.1.el7.centos.noarch
>>>>> python-ovirt-engine-sdk4-4.2.6-2.el7.centos.x86_64
>>>>> ovirt-host-deploy-1.7.3-1.el7.centos.noarch
>>>>> ovirt-release42-4.2.3.1-1.el7.noarch
>>>>> ovirt-vmconsole-host-1.0.5-4.el7.centos.noarch
>>>>> cockpit-ovirt-dashboard-0.11.24-1.el7.centos.noarch
>>>>> ovirt-setup-lib-1.1.4-1.el7.centos.noarch
>>>>> ovirt-imageio-daemon-1.3.1.2-0.el7.centos.noarch
>>>>> ovirt-host-4.2.2-2.el7.centos.x86_64
>>>>> ovirt-engine-sdk-python-3.6.9.1-1.el7.noarch
>>>>>
>>>>> ON PHYSICAL SERVERS (not on VMware, why should I be ?? ;) I got
>>>>> exactly the same error :
>>>>> [ INFO ] TASK [Get local VM IP]
>>>>> [ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 50, "changed":
>>>>> true, "cmd": "virsh -r net-dhcp-leases default | grep -i 00:16:3e:69:3a:c6
>>>>> | awk '{ print $5 }' | cut -f1 -d'/'", "delta": "0:00:00.073313", "end":
>>>>> "2018-06-25 16:11:36.025277", "rc": 0, "start": "2018-06-25
>>>>> 16:11:35.951964", "stderr": "", "stderr_lines": [], "stdout": "",
>>>>> "stdout_lines": []}
>>>>> [ INFO ] TASK [include_tasks]
>>>>> [ INFO ] ok: [localhost]
>>>>> [ INFO ] TASK [Remove local vm dir]
>>>>> [ INFO ] changed: [localhost]
>>>>> [ INFO ] TASK [Notify the user about a failure]
>>>>> [ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg":
>>>>> "The system may not be provisioned according to the playbook results:
>>>>> please check the logs for the issue, fix accordingly or re-deploy from
>>>>> scratch.\n"}
>>>>> [ ERROR ] Failed to execute stage 'Closing up': Failed executing
>>>>> ansible-playbook
>>>>> [ INFO ] Stage: Clean up
>>>>>
>>>>> I have 4 NICs :
>>>>> em1 10.0.0.230/8 is for ovirtmgmt, it has the gateway
>>>>> em2 10.0.0.229/8 is for a vmnetwork
>>>>> em3+em4 in bond0 192.168.0.30 are for gluster with jumbo frames;
>>>>> the volumes (ENGINE, ISO, EXPORT, DATA) are up and operational.
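For completeness, this is the kind of ifcfg configuration that usually sits behind a layout like that on CentOS 7 (only a sketch: the bonding mode and miimon value below are assumptions, they are not stated anywhere in this thread):

  # /etc/sysconfig/network-scripts/ifcfg-bond0
  DEVICE=bond0
  TYPE=Bond
  BONDING_MASTER=yes
  BONDING_OPTS="mode=active-backup miimon=100"   # assumed, could as well be 802.3ad
  BOOTPROTO=none
  IPADDR=192.168.0.30
  PREFIX=24
  MTU=9000
  ONBOOT=yes
  NM_CONTROLLED=no
  # no GATEWAY/DNS here: the default route stays on em1

  # /etc/sysconfig/network-scripts/ifcfg-em3 (and the same for em4)
  DEVICE=em3
  TYPE=Ethernet
  MASTER=bond0
  SLAVE=yes
  BOOTPROTO=none
  MTU=9000
  ONBOOT=yes
  NM_CONTROLLED=no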
>>>>> I tried to stop em2 (ONBOOT=No and restarted the network), so the network is
>>>>> currently :
>>>>> # ip a
>>>>> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
>>>>> group default qlen 1000
>>>>> link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>>>>> inet 127.0.0.1/8 scope host lo
>>>>> valid_lft forever preferred_lft forever
>>>>> inet6 ::1/128 scope host
>>>>> valid_lft forever preferred_lft forever
>>>>> 2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP
>>>>> group default qlen 1000
>>>>> link/ether e0:db:55:15:eb:70 brd ff:ff:ff:ff:ff:ff
>>>>> inet 10.0.0.230/8 brd 10.255.255.255 scope global em1
>>>>> valid_lft forever preferred_lft forever
>>>>> inet6 fe80::e2db:55ff:fe15:eb70/64 scope link
>>>>> valid_lft forever preferred_lft forever
>>>>> 3: em2: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group
>>>>> default qlen 1000
>>>>> link/ether e0:db:55:15:eb:71 brd ff:ff:ff:ff:ff:ff
>>>>> 4: em3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq
>>>>> master bond0 state UP group default qlen 1000
>>>>> link/ether e0:db:55:15:eb:72 brd ff:ff:ff:ff:ff:ff
>>>>> 5: em4: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 9000 qdisc mq
>>>>> master bond0 state UP group default qlen 1000
>>>>> link/ether e0:db:55:15:eb:72 brd ff:ff:ff:ff:ff:ff
>>>>> 6: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 9000 qdisc
>>>>> noqueue state UP group default qlen 1000
>>>>> link/ether e0:db:55:15:eb:72 brd ff:ff:ff:ff:ff:ff
>>>>> inet 192.168.0.30/24 brd 192.168.0.255 scope global bond0
>>>>> valid_lft forever preferred_lft forever
>>>>> inet6 fe80::e2db:55ff:fe15:eb72/64 scope link
>>>>> valid_lft forever preferred_lft forever
>>>>>
>>>>> # ip r
>>>>> default via 10.0.1.254 dev em1
>>>>> 10.0.0.0/8 dev em1 proto kernel scope link src 10.0.0.230
>>>>> 169.254.0.0/16 dev em1 scope link metric 1002
>>>>> 169.254.0.0/16 dev bond0 scope link metric 1006
>>>>> 192.168.0.0/24 dev bond0 proto kernel scope link src 192.168.0.30
>>>>>
>>>>> but same issue, after "/usr/sbin/ovirt-hosted-engine-cleanup" and
>>>>> restarting the deployment.
>>>>>
>>>>> NetworkManager was stopped and disabled at the node install, and it is
>>>>> still stopped.
>>>>> After the error, the network shows this after device 6 (bond0) :
>>>>> 7: virbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
>>>>> state UP group default qlen 1000
>>>>> link/ether 52:54:00:38:e0:5a brd ff:ff:ff:ff:ff:ff
>>>>> inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
>>>>> valid_lft forever preferred_lft forever
>>>>> 8: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master
>>>>> virbr0 state DOWN group default qlen 1000
>>>>> link/ether 52:54:00:38:e0:5a brd ff:ff:ff:ff:ff:ff
>>>>> 11: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast
>>>>> master virbr0 state UNKNOWN group default qlen 1000
>>>>> link/ether fe:16:3e:69:3a:c6 brd ff:ff:ff:ff:ff:ff
>>>>> inet6 fe80::fc16:3eff:fe69:3ac6/64 scope link
>>>>> valid_lft forever preferred_lft forever
>>>>>
>>>>> I do not see ovirtmgmt... And I don't know if I can access the engine
>>>>> VM as I don't have its IP :(
>>>>> I tried to ping the addresses after 192.168.122.1, but none of them is
>>>>> reachable, so I stopped at 122.10. The VM seems up (kvm process), the
>>>>> qemu-kvm process taking 150% of CPU in "top"...
>>>>>
>>>>> I pasted the log here : https://pastebin.com/Ebzh1uEh
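You don't actually need the IP to look at the local VM; a minimal sketch (HostedEngineLocal is the domain name mentioned earlier in this thread; depending on how libvirt authentication is configured on the host, the non read-only commands may ask for credentials):

  # list the domains and check that the bootstrap VM is really running
  virsh -r list --all
  # its MAC address, to match against the dhcp leases
  virsh -r domiflist HostedEngineLocal
  # any lease handed out on the default libvirt network
  virsh -r net-dhcp-leases default
  # attach to the serial console (exit with ^])
  virsh console HostedEngineLocal
  # or get the VNC display to point a viewer at
  virsh domdisplay HostedEngineLocal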
>>>>> PLEASE ! This issue seems to be recurrent since the beginning of 2018
>>>>> (see the messages here on the list : Jamie Lawrence in February,
>>>>> [email protected] in April,
>>>>> [email protected] and Yaniv Kaul in May,
>>>>> florentl on June 01...). Can anyone give us a way to solve this ?
>>>>> --
>>>>> Regards,
>>>>> Frank Soyer
>>>>>
>>>>> On Monday, June 04, 2018 16:07 CEST, Simone Tiraboschi <
>>>>> [email protected]> wrote:
>>>>>
>>>>> On Mon, Jun 4, 2018 at 2:20 PM, Phillip Bailey <[email protected]>
>>>>> wrote:
>>>>>>
>>>>>> Hi Florent,
>>>>>>
>>>>>> Could you please provide the log for the stage in which the wizard is
>>>>>> failing? Logs can be found in /var/log/ovirt-hosted-engine-setup.
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> -Phillip Bailey
>>>>>>
>>>>>> On Fri, Jun 1, 2018 at 7:57 AM, florentl <[email protected]>
>>>>>> wrote:
>>>>>>>
>>>>>>> Hi all,
>>>>>>> I'm trying to install hosted-engine on node :
>>>>>>> ovirt-node-ng-4.2.3-0.20180518.
>>>>>>> Every time I get stuck on :
>>>>>>> [ ERROR ] fatal: [localhost]: FAILED! => {"attempts": 50, "changed":
>>>>>>> true, "cmd": "virsh -r net-dhcp-leases default | grep -i
>>>>>>> 00:16:3e:6c:5a:91
>>>>>>> | awk '{ print $5 }' | cut -f1 -d'/'", "delta": "0:00:00.108872", "end":
>>>>>>> "2018-06-01 11:17:34.421769", "rc": 0, "start": "2018-06-01
>>>>>>> 11:17:34.312897", "stderr": "", "stderr_lines": [], "stdout": "",
>>>>>>> "stdout_lines": []}
>>>>>>> I tried with a static IP address and with DHCP but both failed.
>>>>>>>
>>>>>>> To be more specific, I installed three nodes and deployed glusterfs
>>>>>>> with the wizard. I'm in a nested virtualization environment for this lab
>>>>>>> (VMware ESXi hypervisor).
>>>>>>
>>>>> Unfortunately I think that the issue is trying to run a nested env
>>>>> over ESXi.
>>>>> AFAIK nesting KVM VMs over ESX is still problematic.
>>>>>
>>>>> I'd suggest repeating the experiment nesting over KVM on L0.
>>>>>
>>>>>>> My node IP is : 192.168.176.40 / and I want the hosted-engine VM to have
>>>>>>> 192.168.176.43.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Florent
>>>>>>> _______________________________________________
>>>>>>> Users mailing list -- [email protected]
>>>>>>> To unsubscribe send an email to [email protected]
>>>>>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>>>>>> oVirt Code of Conduct:
>>>>>>> https://www.ovirt.org/community/about/community-guidelines/
>>>>>>> List Archives:
>>>>>>> https://lists.ovirt.org/archives/list/[email protected]/message/F3BNUQ2T434EASIX56F7KQQJVF7OCDUM/
>>>>>>
>>>>>> _______________________________________________
>>>>>> Users mailing list -- [email protected]
>>>>>> To unsubscribe send an email to [email protected]
>>>>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>>>>> oVirt Code of Conduct:
>>>>>> https://www.ovirt.org/community/about/community-guidelines/
>>>>>> List Archives:
>>>>>> https://lists.ovirt.org/archives/list/[email protected]/message/RU34XDM2W6GPDCRRGWORBTPH2BUN3CJR/
>>>>>
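As a side note for anyone hitting this in a nested lab like the last report above: before blaming the deploy, it is worth checking that the L1 host really exposes hardware virtualization. A quick sketch (nothing oVirt specific):

  # should print vmx (Intel) or svm (AMD) if virtualization is exposed to this host
  egrep -o 'vmx|svm' /proc/cpuinfo | sort -u
  # libvirt's own checklist, including the hardware virtualization test
  virt-host-validate qemu
  # on a KVM L0, nested support can be confirmed with (kvm_amd for AMD CPUs)
  cat /sys/module/kvm_intel/parameters/nested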
_______________________________________________ Users mailing list -- [email protected] To unsubscribe send an email to [email protected] Privacy Statement: https://www.ovirt.org/site/privacy-policy/ oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ List Archives: https://lists.ovirt.org/archives/list/[email protected]/message/CAJTR572X5CN5NGWPGHIR2EI23Z2BHWJ/

