On March 28, 2020 3:21:45 AM GMT+02:00, Gianluca Cecchi <[email protected]> wrote:
>Hello,
>having deployed oVirt 4.3.9 single-host HCI with Gluster, I sometimes see a
>VM going into a paused state with the error above, needing to be manually
>resumed (sometimes this resume operation fails).
>So far it has only happened with an empty (thin-provisioned) disk under
>sudden high I/O during the initial phase of the OS install; it didn't
>happen during normal operation (even with 600 MB/s of throughput).
>I suspect something related to metadata extension not being able to keep
>pace with the growth of the physical disk... similar to what happens for
>block-based storage domains, where the LVM layer has to extend the logical
>volume representing the virtual disk.
>
>My real-world reproduction of the error is during the install of an OCP
>4.3.8 master node, when Red Hat CoreOS boots from the network, wipes the
>disk, and I think then transfers an image, so doing high immediate I/O.
>The VM used as the master node was created with a 120 GB thin-provisioned
>disk (virtio-scsi type) and starts with the disk just initialized and
>empty, going through a PXE install.
>I get this line in the events for the VM:
>
>Mar 27, 2020, 12:35:23 AM VM master01 has been paused due to unknown
>storage error.
>
>Here are logs around the time frame above:
>
>- engine.log
>https://drive.google.com/file/d/1zpNo5IgFVTAlKXHiAMTL-uvaoXSNMVRO/view?usp=sharing
>
>- vdsm.log
>https://drive.google.com/file/d/1v8kR0N6PdHBJ5hYzEYKl4-m7v1Lb_cYX/view?usp=sharing
>
>Any suggestions?
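[The thin-provisioning behavior described above can be illustrated outside oVirt with a sparse file: the apparent size is the full virtual size, while blocks are only allocated as data is written, which is why a sudden write burst forces rapid on-the-fly allocation. A minimal sketch; the path and sizes are arbitrary, not taken from the thread:]

```shell
# Create a sparse ("thin") 1 GiB file: the apparent size is 1 GiB,
# but almost no blocks are allocated until data is actually written.
truncate -s 1G /tmp/thin.img

# Apparent size: the full 1 GiB the guest would see.
du --apparent-size --block-size=1 /tmp/thin.img

# Allocated size: close to 0 before any writes land.
du --block-size=1 /tmp/thin.img

# Write 100 MiB of real data; the allocated size grows accordingly,
# analogous to the disk image growing under install-time I/O.
dd if=/dev/zero of=/tmp/thin.img bs=1M count=100 conv=notrunc
du --block-size=1 /tmp/thin.img
```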
>
>The disk of the VM is on the vmstore storage domain, and its Gluster
>volume settings are:
>
>[root@ovirt tmp]# gluster volume info vmstore
>
>Volume Name: vmstore
>Type: Distribute
>Volume ID: a6203d77-3b9d-49f9-94c5-9e30562959c4
>Status: Started
>Snapshot Count: 0
>Number of Bricks: 1
>Transport-type: tcp
>Bricks:
>Brick1: ovirtst.mydomain.storage:/gluster_bricks/vmstore/vmstore
>Options Reconfigured:
>performance.low-prio-threads: 32
>storage.owner-gid: 36
>performance.read-ahead: off
>user.cifs: off
>storage.owner-uid: 36
>performance.io-cache: off
>performance.quick-read: off
>network.ping-timeout: 30
>features.shard: on
>network.remote-dio: off
>cluster.eager-lock: enable
>performance.strict-o-direct: on
>transport.address-family: inet
>nfs.disable: on
>[root@ovirt tmp]#
>
>What about the config above: are there any optimizations to be made,
>given that this is a single host?
>And comparing with the virt group of options:
>
>[root@ovirt tmp]# cat /var/lib/glusterd/groups/virt
>performance.quick-read=off
>performance.read-ahead=off
>performance.io-cache=off
>performance.low-prio-threads=32
>network.remote-dio=enable
>cluster.eager-lock=enable
>cluster.quorum-type=auto
>cluster.server-quorum-type=server
>cluster.data-self-heal-algorithm=full
>cluster.locking-scheme=granular
>cluster.shd-max-threads=8
>cluster.shd-wait-qlength=10000
>features.shard=on
>user.cifs=off
>cluster.choose-local=off
>client.event-threads=4
>server.event-threads=4
>performance.client-io-threads=on
>[root@ovirt tmp]#
>
>?
>
>Thanks,
>Gianluca
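[For aligning a volume with the virt profile, Gluster supports applying a whole option group in one step. A hedged sketch using the volume name from the thread; note that it would change options that currently differ between the volume and the group file (e.g. network.remote-dio), so each delta should be reviewed before applying. Not testable here without a running glusterd:]

```shell
# Show the options currently set on the volume (as in the output above).
gluster volume info vmstore

# Apply the whole "virt" option group defined in
# /var/lib/glusterd/groups/virt in a single command; among other
# changes this would flip network.remote-dio from off to enable.
gluster volume set vmstore group virt

# Verify the options that differed from the virt group file.
gluster volume get vmstore all | grep -E 'remote-dio|strict-o-direct|choose-local'
```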
Hi Gianluca,

Is it happening to machines with preallocated disks or to machines with thin disks?

Best Regards,
Strahil Nikolov
_______________________________________________
Users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/[email protected]/message/LHLQUMBQSY6NPC6LBWJ2TLRNP3M7LZWZ/

