On February 6, 2020 6:06:18 PM GMT+02:00, Christian Reiss 
<[email protected]> wrote:
>Hey folks,
>
>Running a 3-way HCI (again (sigh)) on gluster. Now the _inside_ of the
>VMs is backed up separately using bareos on an hourly basis, so files
>are present with worst-case 59 minutes of data loss.
>
>Now, on the outside I thought of doing gluster snapshots and then
>syncing those .snap dirs away to a remote 10gig-connected machine on a
>weekly-or-so basis. As the contents of the snaps are the oVirt images
>(entire DC), I could re-set up gluster, copy those files back into
>gluster, and be done with it.
>
>Now some questions, if I may:
>
>- If the hosts remain intact but gluster dies, I simply set up gluster,
>stop the oVirt engine (separate standalone hardware), copy everything
>back and start the oVirt engine again. All disks are accessible again
>(tested). The bricks are marked as down (new bricks, same name). There
>is a "reset brick" button that made the bricks come back online again.
>What _exactly_ does it do? Does it reset the brick info in oVirt, or
>copy all the data over from another node and really, really reset the
>brick?
>
>- If the hosts remain intact, but the engine dies: Can I re-attach the
>engine to the running cluster?
>
>- If hosts and engine die and everything needs to be set up again:
>Would it be possible to do the setup wizard(s) again up to a running
>point, then copy the disk images to the new gluster-dc-data-dir? Would
>oVirt rescan the dir for newly found VMs?
>
>- If _one_ host dies, but 2 and the engine remain online: What's the
>oVirt way of setting the failed one up again? Reinstall the node and
>then what? Of all the cases above, this is the most likely one.
>
>Having had to reinstall the entire cluster three times already scares
>me. Always gluster-related.
>
>Again thank you community for your great efforts!

Gluster's reset-brick actually wipes the brick and starts a heal process from 
another brick.
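For reference, on the gluster CLI a reset-brick is a two-step operation; a rough sketch (the volume name "data" and the brick path are placeholders, not your actual names):

```shell
# Step 1: tell gluster this brick is about to be rebuilt
gluster volume reset-brick data host1:/gluster_bricks/data/data start

# Step 2: once the (empty) brick directory exists again, commit.
# "commit force" is needed when reusing the same brick path; it
# re-adds the brick and triggers a heal from the surviving replicas.
gluster volume reset-brick data host1:/gluster_bricks/data/data \
    host1:/gluster_bricks/data/data commit force

# Watch the heal progress
gluster volume heal data info summary
```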

If your node dies, oVirt won't allow you to remove it until you restore 
the 'replica 3' status of gluster.
I think that the fastest way to restore a node is:
1. Reinstall the node with the same hostname and network settings
2. Restore the gluster config directory /var/lib/glusterd/ from backup
3. Restart the node and initiate a reset brick
4. Go to the UI and remove the node that was defective
5. Add the node again

Voila.
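The config-restore part of steps 2-3 might look roughly like this (the backup location /backup/glusterd/ is an assumption; /var/lib/glusterd/ holds the peer and volume definitions, so restoring it lets the node rejoin the trusted pool):

```shell
# On the freshly reinstalled node (same hostname/IP as before):
systemctl stop glusterd

# Restore the gluster configuration (peers, volume definitions)
# from your backup; /backup/glusterd/ is a placeholder path
rsync -a /backup/glusterd/ /var/lib/glusterd/

systemctl start glusterd

# The node should now show up as a connected peer again
gluster peer status
```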

About the gluster issues - you are not testing your upgrades enough, and if 
you use the cluster in production, that will be quite disruptive. For example, 
the ACL issue I ran into (and actually you did too) was discussed on the 
mailing list for 2 weeks before I managed to resolve it.

I'm using the latest oVirt with Gluster v7 - but this is my lab and I can 
afford downtime of a week (or even more). The more tested an oVirt/Gluster 
release is, the more reliable it will be.

Best Regards,
Strahil Nikolov
_______________________________________________
Users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/[email protected]/message/GUAT3VEJ4BAJN7PN4VCT4PGDXSL4OE4M/
