> Date: Tue, 11 Mar 2014 15:16:36 +0100
> From: [email protected]
> To: [email protected]; [email protected]; [email protected]
> CC: [email protected]; [email protected]; [email protected]
> Subject: Re: [Users] hosted engine help
>
> On 10/03/2014 22:32, Giuseppe Ragusa wrote:
> > Hi all,
> >
> >> Date: Mon, 10 Mar 2014 12:56:19 -0400
> >> From: [email protected]
> >> To: [email protected]
> >> CC: [email protected]
> >> Subject: Re: [Users] hosted engine help
> >>
> >> ----- Original Message -----
> >> > From: "Martin Sivak" <[email protected]>
> >> > To: "Dan Kenigsberg" <[email protected]>
> >> > Cc: [email protected]
> >> > Sent: Saturday, March 8, 2014 11:52:59 PM
> >> > Subject: Re: [Users] hosted engine help
> >> >
> >> > Hi Jason,
> >> >
> >> > can you please attach the full logs? We had a very similar issue
> >> > before; we need to see whether it is the same one or not.
> >>
> >> I may have to recreate it -- I switched back to an all-in-one engine
> >> after my setup started refusing to run the engine at all. It's no fun
> >> losing your engine!
> >>
> >> This was a migrated-from-standalone setup; maybe that caused additional
> >> wrinkles...
> >>
> >> Jason
> >>
> >> > Thanks
> >
> > I experienced the exact same symptoms as Jason on a from-scratch
> > installation on two physical nodes with CentOS 6.5 (fully up-to-date),
> > using oVirt 3.4.0_pre (latest test-day release) and GlusterFS 3.5.0beta3
> > (with Gluster-provided NFS as storage for the self-hosted engine VM
> > only).
>
> Using GlusterFS as hosted-engine storage is not supported and not
> recommended.
> The HA daemon may not work properly there.
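
Just to clarify what "interposed NFS" means below: the engine storage
domain is not mounted through the GlusterFS native client but through
Gluster's built-in NFSv3 server. This is roughly how I verified the export
on the nodes (a minimal sketch; the volume name "engine" is only a
placeholder, not necessarily the real one):

  # nfs.disable is listed only if it was explicitly set; in GlusterFS 3.5
  # the default is off, i.e. the built-in NFSv3 export is enabled
  gluster volume info engine | grep nfs.disable

  # check that the Gluster NFS server process is online for this volume
  gluster volume status engine nfs

  # the volume should show up in the NFS export list
  showmount -e localhost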

If it is unsupported (and particularly "not recommended") even with the
interposed NFS (the native Gluster-provided NFSv3 export of a volume), then
what is the recommended way to set up a fault-tolerant, load-balanced
2-node oVirt cluster (without an external dedicated SAN/NAS)?

> > I roughly followed the guide from Andrew Lau:
> >
> > http://www.andrewklau.com/ovirt-hosted-engine-with-3-4-0-nightly/
> >
> > with some variations due to newer packages (resolved bugs) and a
> > different hardware setup (no VLANs in my setup: physically separated
> > networks; a custom second NIC added to the Engine VM template before
> > deploying, etc.)
> >
> > The self-hosted installation on the first node + Engine VM (configured
> > for managing both oVirt and the storage; Datacenter default set to NFS
> > because no GlusterFS was offered) apparently went smoothly, but the HA
> > agent failed to start at the very end (same errors in the logs as Jason:
> > the storage domain seems "missing") and I was only able to start it all
> > manually with:
> >
> > hosted-engine --connect-storage
> > hosted-engine --start-pool
>
> The above commands are used for development and shouldn't be used for
> starting the engine.

Directly starting the engine (with the command below) failed because of
storage unavailability, so I used the above "trick" as a "last resort" to
at least prove that the engine was able to start and had not been somehow
"destroyed" or "lost" in the process (but I do understand that it is an
extreme debug-only action).

> > hosted-engine --vm-start
> >
> > then the Engine came up and I could use it; I even registered the second
> > node (same final error in the HA agent) and tried to add GlusterFS
> > storage domains for further VMs and ISOs (by the way: the original
> > NFS-GlusterFS domain for the Engine VM only is not present inside the
> > Engine web UI), but activating the domains always failed (they remain
> > "Inactive").
> >
> > Furthermore, the engine gets killed some time after starting (from 3 up
> > to 11 hours later) and the only way to get it back is repeating the
> > above commands.
>
> Need logs for this.

I will try to reproduce it all, but I can recall that in the libvirt logs
(HostedEngine.log) there was always a clear indication of the PID that
killed the Engine VM, and each time it belonged to an instance of sanlock.

> > I always managed GlusterFS "natively" (not through oVirt) from the
> > command line and verified that the NFS-exported Engine-VM-only volume
> > gets replicated, but I was obviously unable to try migration because the
> > HA part stays inactive and oVirt refuses to migrate the Engine.
> >
> > Since I tried many times, with variations and further manual actions in
> > between (like trying to manually mount the NFS Engine domain, restarting
> > only the HA agent, etc.), my logs are "cluttered", so I should start
> > from scratch again and pack up all the logs in one sweep.
>
> +1 ;>
>
> > Tell me what I should capture and at which points in the whole process
> > and I will try to follow up as soon as possible.
>
> What:
> hosted-engine-setup, hosted-engine-ha, vdsm, libvirt and sanlock logs from
> the physical hosts, and engine and server logs from the hosted engine VM.
>
> When:
> As soon as you see an error.

If the setup design (wholly GlusterFS-based) is somewhat flawed, please
point me to some hint/docs/guide for the right way of setting it up on 2
standalone physical nodes, so as not to waste your time chasing "defects"
in something that is not supposed to be working anyway. I will follow your
advice and try accordingly.
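
In the meantime, this is the one-shot collection I plan to run on each
physical host as soon as the error shows up, to gather everything listed
above (a minimal sketch, assuming the default log locations on CentOS 6.5;
the archive name is just my own choice):

  # the qemu domain log names the PID that signalled the Engine VM, e.g.
  # "terminating on signal 15 from pid ..."
  grep "terminating on signal" /var/log/libvirt/qemu/HostedEngine.log

  # pack the host-side logs in one sweep
  tar czf /tmp/$(hostname)-hosted-engine-logs.tar.gz \
      /var/log/ovirt-hosted-engine-setup \
      /var/log/ovirt-hosted-engine-ha \
      /var/log/vdsm \
      /var/log/libvirt \
      /var/log/sanlock.log

plus engine.log and server.log from /var/log/ovirt-engine inside the
Engine VM.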

Many thanks again,
Giuseppe

> > Many thanks,
> > Giuseppe
> >
> >> > --
> >> > Martin Sivák
> >> > [email protected]
> >> > Red Hat Czech
> >> > RHEV-M SLA / Brno, CZ
> >> >
> >> > ----- Original Message -----
> >> > > On Fri, Mar 07, 2014 at 10:17:43AM +0100, Sandro Bonazzola wrote:
> >> > > > On 07/03/2014 01:10, Jason Brooks wrote:
> >> > > > > Hey everyone, I've been testing out oVirt 3.4 w/ hosted engine,
> >> > > > > and while I've managed to bring the engine up, I've only been
> >> > > > > able to do it manually, using "hosted-engine --vm-start".
> >> > > > >
> >> > > > > The ovirt-ha-agent service fails reliably for me, erroring out
> >> > > > > with "RequestError: Request failed: success."
> >> > > > >
> >> > > > > I've pasted error passages from the ha agent and vdsm logs
> >> > > > > below.
> >> > > > >
> >> > > > > Any pointers?
> >> > > >
> >> > > > Looks like a VDSM bug, Dan?
> >> > >
> >> > > Why? The exception is raised from deep inside the
> >> > > ovirt_hosted_engine_ha code.
>
> --
> Sandro Bonazzola
> Better technology. Faster innovation. Powered by community collaboration.
> See how it works at redhat.com
_______________________________________________
Users mailing list
[email protected]
http://lists.ovirt.org/mailman/listinfo/users

