On Thu, Nov 19, 2020 at 4:37 PM Alex K <rightkickt...@gmail.com> wrote:
> Hi all,
>
> I have a corrupt self-hosted engine (with several file system errors,
> postgres not able to start) and thus it does not give access to the web UI.
> This happened following an unlucky split brain resolution (I am running 2
> nodes). The two hosts are running VMs also which I would like to keep
> running as they are needed.
>
> When trying to boot into rescue mode (using systemd.unit=emergency.target
> boot parameter) I get a cursor and nothing else.
> This means that more than just the DB is corrupt...
>
> I have backups of engine files with scope all (using the engine-backup
> tool).
> What is the best approach to try and fix the engine or redeploy.
>

If you are careful, and know what you are doing, you can try something like the following. I am not giving many details; hopefully you can find tutorials on the net about how to use the things I suggest:

1. Move to global maintenance.
2. Stop the current dead VM (if needed).
3. Find the current VM conf, edit it to boot from a rescue ISO image of your preference or from net/PXE etc., and start the VM with '--vm-conf' pointing to your edited file.
4. Connect a console (hosted-engine --console, or 'virsh console', or use '--add-console-password' and remote viewer, if needed).
5. Clean the disk and install the OS, oVirt, etc.
6. Copy your backup into the VM and restore with engine-backup.
7. Then cleanly stop the machine, exit global maintenance, and let HA start it (or start it yourself with --vm-start).

At the time, we had a bug [1] to document this. The result is [2]. It does not detail how to boot, reinstall the OS, etc., only how to restore (e.g. if the DB is dead but the FS is OK).

For something somewhat similar to what you want, see also [3], which uses guestfish. It might be useful, depending on how badly your disk is corrupted.

How did you run into a split brain? There is a lock on the shared storage that should prevent this.
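For orientation, the numbered steps above can be sketched as a command checklist. This is only a sketch under assumptions, not a tested procedure: the script prints commands and executes nothing. The vm.conf path is the location I would expect on a hosted-engine host, the copy and backup file names are hypothetical, and the engine-backup restore options plus the follow-up engine-setup are assumed from the usual restore procedure rather than taken from this mail. Steps 3 (editing the conf) and 5 (reinstalling the OS and oVirt) stay manual.

```shell
#!/bin/sh
# Prints a command checklist for the recovery steps; it executes nothing.
# Assumed, not from the mail: all three paths and the restore options.
VM_CONF=/var/run/ovirt-hosted-engine-ha/vm.conf   # assumed location of the current vm conf
RESCUE_CONF=/root/vm-rescue.conf                  # hypothetical edited copy
BACKUP=/root/engine-backup.tar.gz                 # hypothetical backup file name

print_host_prepare() {
    # Steps 1-4, to be run on one hosted-engine host
    echo "hosted-engine --set-maintenance --mode=global"
    echo "hosted-engine --vm-poweroff"
    echo "cp $VM_CONF $RESCUE_CONF   # then edit the copy to boot from a rescue ISO or PXE"
    echo "hosted-engine --vm-start --vm-conf=$RESCUE_CONF"
    echo "hosted-engine --console"
}

print_vm_restore() {
    # Step 6, to be run inside the freshly installed engine VM
    echo "engine-backup --mode=restore --file=$BACKUP --log=/root/restore.log --provision-db --restore-permissions"
    echo "engine-setup"
}

print_host_finish() {
    # Step 7, back on the host
    echo "hosted-engine --vm-shutdown"
    echo "hosted-engine --set-maintenance --mode=none"
}

print_host_prepare
print_vm_restore
print_host_finish
```

Check each printed command against the documentation for your oVirt version before running it by hand; the restore options in particular may differ between releases.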

Good luck and best regards,

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1482710
[2] https://www.ovirt.org/documentation/administration_guide/#Overwriting_a_Self-Hosted_Engine
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1569827#c4
--
Didi