So far it does not appear to be helping much. I'm still getting VMs
locking up and all kinds of notices from the oVirt engine about
non-responsive hosts.  I'm still seeing load averages in the 20-30 range.

Jim

On Fri, Jul 6, 2018, 3:13 PM Jim Kusznir <[email protected]> wrote:

> Thank you for the advice and help
>
> I do plan on going 10Gbps networking; haven't quite jumped off that cliff
> yet, though.
>
> I did put my data-hdd (main VM storage volume) onto a dedicated 1Gbps
> network, and I've watched throughput on that and never seen more than
> 60MB/s achieved (as reported by bwm-ng).  I have a separate 1Gbps network
> for communication and ovirt migration, but I wanted to break that up
> further (separating VM traffic from migration/mgmt traffic).  My three
> SSD-backed gluster volumes run on the main network too, as I haven't been
> able to get them to move to the new network (which I was trying to use for
> all gluster traffic).  I tried bonding, but that seemed to reduce
> performance rather than improve it.
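>
> For reference, a quick way to confirm which interface the gluster traffic
> is actually riding on (the commands below are generic stock ones; data-hdd
> is just my volume name):
>
>   gluster volume info data-hdd     # brick host:path pairs for the volume
>   gluster peer status              # the address each peer was probed with
>   ss -tnp | grep gluster           # which local IPs the brick/client connections use
>   bwm-ng                           # per-interface throughput, as above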
>
> --Jim
>
> On Fri, Jul 6, 2018 at 2:52 PM, Jamie Lawrence <[email protected]>
> wrote:
>
>> Hi Jim,
>>
>> I don't have any targeted suggestions, because there isn't much to latch
>> on to. I can say that Gluster replica-three volumes (no arbiters) on
>> dedicated servers serving a couple of oVirt VM clusters here have not had
>> these sorts of issues.
>>
>> I suspect your long heal times (and the resultant long periods of high
>> load) are at least partly related to 1G networking. That is just a matter
>> of IO - heals of VMs involve moving a lot of bits. My cluster uses 10G
>> bonded NICs on the gluster and ovirt boxes for storage traffic and separate
>> bonded 1G for ovirtmgmt and communication with other machines/people, and
>> we're occasionally hitting the bandwidth ceiling on the storage network.
>> I'm starting to think about 40/100G, different ways of splitting up
>> intensive systems, and iSCSI for specific volumes, although I really
>> don't want to go there.
>>
>> I don't run FreeNAS[1], but I do run FreeBSD on storage servers for its
>> excellent ZFS implementation, mostly for backups. ZFS will make your `heal`
>> problem go away, but not your bandwidth problems, which become worse
>> (because of fewer NICs pushing traffic). 10G hardware is not exactly in
>> impulse-buy territory, but if you can, I'd recommend doing some testing
>> with it. I think at least some of your problems are related.
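>>
>> If you want to rule the network in or out before buying anything, a quick
>> iperf3 run between two of the gluster nodes will show whether you're
>> already pinned at the 1G ceiling (the hostname below is a placeholder):
>>
>>   # on one node
>>   iperf3 -s
>>   # on another node: 4 parallel streams for 30 seconds
>>   iperf3 -c gluster-node1 -P 4 -t 30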
>>
>> If that's not possible, my next stops would be optimizing everything I
>> could about sharding and healing, and tuning the shard size, to squeeze as
>> much performance out of 1G as I could, but that will only go so far.
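>>
>> Concretely, the sort of knobs I mean are below; treat the values as
>> illustrative rather than recommendations, check what you're currently
>> running first, and keep in mind that a shard-size change only applies to
>> newly written files:
>>
>>   gluster volume get data-hdd all | grep -E 'shard|heal'
>>   gluster volume set data-hdd cluster.shd-max-threads 4
>>   gluster volume set data-hdd cluster.data-self-heal-algorithm full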
>>
>> -j
>>
>> [1] FreeNAS is just a storage-tuned FreeBSD with a GUI.
>>
>> > On Jul 6, 2018, at 1:19 PM, Jim Kusznir <[email protected]> wrote:
>> >
>> > Hi all:
>> >
>> > Once again my production ovirt cluster is collapsing in on itself.  My
>> servers are intermittently unavailable or degrading, customers are noticing
>> and calling in.  This seems to be yet another gluster failure that I
>> haven't been able to pin down.
>> >
>> > I posted about this a while ago, but didn't get anywhere (no replies
>> that I found).  The problem started out as a glusterfsd process consuming
>> large amounts of RAM (up to the point where RAM and swap were exhausted and
>> the kernel OOM killer killed off the glusterfsd process).  For reasons not
>> clear to me at this time, that resulted in any VMs running on that host and
>> that gluster volume being paused with an I/O error (the glusterfs process is
>> usually unharmed; why it didn't continue I/O with the other servers is
>> confusing to me).
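>> >
>> > For anyone watching for the same thing, the brick-process memory and any
>> > OOM kills can be checked with stock commands along these lines (nothing
>> > here is specific to my setup):
>> >
>> >   ps -eo rss,vsz,cmd | grep '[g]lusterfsd'   # resident/virtual size of each brick process
>> >   dmesg | grep -i oom                        # evidence of earlier OOM kills
>> >   gluster volume statedump data-hdd          # writes memory accounting under /var/run/gluster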
>> >
>> > I have 3 servers and a total of 4 gluster volumes (engine, iso, data,
>> and data-hdd).  The first 3 are replica 2 + arbiter; the 4th (data-hdd) is
>> replica 3.  The first 3 are backed by an LVM partition (some thin
>> provisioned) on an SSD; the 4th is on a Seagate hybrid disk (HDD plus some
>> internal flash for acceleration).  data-hdd is the only thing on that disk.
>> The servers are Dell R610s with the PERC 6/i RAID card, with the disks
>> individually passed through to the OS (no RAID enabled).
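>> >
>> > If more detail on the layout would help, it can all be pulled with the
>> > usual commands on any of the nodes:
>> >
>> >   gluster volume info      # replica layout for all four volumes
>> >   gluster volume status    # brick processes, ports, online state
>> >   lvs -a                   # the thin-provisioned LVs backing the SSD volumes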
>> >
>> > The above RAM usage issue came from the data-hdd volume.  Yesterday, I
>> caught one of the glusterfsd processes at high RAM usage before the OOM
>> killer had to run.  I was able to migrate the VMs off the machine and, for
>> good measure, reboot the entire machine (after taking the opportunity to
>> run the software updates that ovirt said were pending).  Upon booting back
>> up, the necessary volume healing began.  However, this time the healing
>> caused all three servers to go to very, very high load averages (I saw just
>> under 200 on one server; typically they've been 40-70), with top reporting
>> I/O wait at 7-20%.  The network for this volume is a dedicated gigabit
>> network.  According to bwm-ng, the network bandwidth would initially hit
>> 50MB/s (yes, bytes), but then tailed off to mostly kB/s for a while.  All
>> machines' load averages were still 40+, and "gluster volume heal data-hdd
>> info" reported 5 items needing healing.  Servers were intermittently
>> experiencing I/O issues, even on the 3 gluster volumes that appeared
>> largely unaffected.  Even OS activities on the hosts themselves (logging
>> in, running commands) would often be very delayed.  The ovirt engine was
>> seemingly randomly throwing engine down / engine up / engine failed
>> notifications.  Responsiveness on ANY VM was horrific most of the time,
>> with random VMs being inaccessible.
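>> >
>> > For anyone following along, heal progress and load can be watched
>> > together with a simple loop like this (the interval is arbitrary):
>> >
>> >   while true; do
>> >     date; uptime                                      # timestamp plus load averages
>> >     gluster volume heal data-hdd statistics heal-count
>> >     sleep 60
>> >   done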
>> >
>> > I let the gluster heal run overnight.  By morning, there were still 5
>> items needing healing, all three servers were still experiencing high load,
>> and servers were still largely unstable.
>> >
>> > I've noticed that all of my ovirt outages (and I've had a lot, way more
>> than is acceptable for a production cluster) have come from gluster.  I
>> still have 3 VMs whose hard disk images were corrupted by my last gluster
>> crash and that I haven't had time to repair / rebuild yet (I believe that
>> crash was caused by the OOM issue previously mentioned, but I didn't know
>> it at the time).
>> >
>> > Is gluster really ready for production yet?  It seems so unstable to
>> me....  I'm looking at replacing gluster with a dedicated NFS server,
>> likely FreeNAS.  Any suggestions?  What is the "right" way to do production
>> storage on this 3-node cluster?  Can I get this gluster volume stable
>> enough to get my VMs to run reliably again until I can deploy another
>> storage solution?
>> >
>> > --Jim
>> > _______________________________________________
>> > Users mailing list -- [email protected]
>> > To unsubscribe send an email to [email protected]
>> > Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>> > oVirt Code of Conduct:
>> https://www.ovirt.org/community/about/community-guidelines/
>> > List Archives:
>> https://lists.ovirt.org/archives/list/[email protected]/message/YQX3LQFQQPW4JTCB7B6FY2LLR6NA2CB3/
>>
>>
>
_______________________________________________
Users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/[email protected]/message/O2HIECLFMYGKH3KSZHHSMDUVGOEBI7GQ/
