On Sat, Mar 28, 2020 at 2:21 AM Gianluca Cecchi <[email protected]> wrote:
> Hello,
> having deployed oVirt 4.3.9 single host HCI with Gluster, I sometimes see
> VMs going into a paused state with the error above and needing to be
> manually resumed (sometimes this resume operation fails).
> Actually it only happened with an empty disk (thin provisioned) and sudden
> high I/O during the initial phase of OS install; it didn't happen
> during normal operation afterwards (even with 600MB/s of throughput).
> I suspect something related to metadata extension not being able to keep
> pace with the speed at which the physical disk grows... similar to what
> happens for block-based storage domains, where the LVM layer has to extend
> the logical volume representing the virtual disk.
>
> My real-world reproduction of the error is during install of an OCP 4.3.8
> master node, when Red Hat CoreOS boots from network, wipes the disk,
> and I think then transfers an image, so doing high immediate I/O.
> The VM used as master node was created with a 120Gb thin provisioned
> disk (virtio-scsi type) and starts with the disk just initialized and
> empty, going through PXE install.
> I get this line inside the events for the VM:
>
> Mar 27, 2020, 12:35:23 AM VM master01 has been paused due to unknown
> storage error.
>
> Here are logs around the time frame above:
>
> - engine.log
> https://drive.google.com/file/d/1zpNo5IgFVTAlKXHiAMTL-uvaoXSNMVRO/view?usp=sharing
>
> - vdsm.log
> https://drive.google.com/file/d/1v8kR0N6PdHBJ5hYzEYKl4-m7v1Lb_cYX/view?usp=sharing
>
> Any suggestions?
>
> The disk of the VM is on the vmstore storage domain, and its gluster
> volume settings are:
>
> [root@ovirt tmp]# gluster volume info vmstore
>
> Volume Name: vmstore
> Type: Distribute
> Volume ID: a6203d77-3b9d-49f9-94c5-9e30562959c4
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1
> Transport-type: tcp
> Bricks:
> Brick1: ovirtst.mydomain.storage:/gluster_bricks/vmstore/vmstore
> Options Reconfigured:
> performance.low-prio-threads: 32
> storage.owner-gid: 36
> performance.read-ahead: off
> user.cifs: off
> storage.owner-uid: 36
> performance.io-cache: off
> performance.quick-read: off
> network.ping-timeout: 30
> features.shard: on
> network.remote-dio: off
> cluster.eager-lock: enable
> performance.strict-o-direct: on
> transport.address-family: inet
> nfs.disable: on
> [root@ovirt tmp]#
>
> Regarding the config above, are there any optimizations to be made, given
> the single-host setup?
> And comparing with the virt group of options:
>
> [root@ovirt tmp]# cat /var/lib/glusterd/groups/virt
> performance.quick-read=off
> performance.read-ahead=off
> performance.io-cache=off
> performance.low-prio-threads=32
> network.remote-dio=enable
> cluster.eager-lock=enable
> cluster.quorum-type=auto
> cluster.server-quorum-type=server
> cluster.data-self-heal-algorithm=full
> cluster.locking-scheme=granular
> cluster.shd-max-threads=8
> cluster.shd-wait-qlength=10000
> features.shard=on
> user.cifs=off
> cluster.choose-local=off
> client.event-threads=4
> server.event-threads=4
> performance.client-io-threads=on
> [root@ovirt tmp]#
>
> Thanks,
> Gianluca

Further information.
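As a side note on the question about the virt group: one way to spot where the volume diverges from the group defaults is simply to diff the two listings quoted above. A minimal sketch (the entries are hand-copied from the `Options Reconfigured` section and the group file, and only a few are shown; it parses the two formats as printed, `key: value` and `key=value`):

```python
def parse(text, sep):
    """Parse 'key<sep>value' lines into a dict, skipping surrounding whitespace."""
    options = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(sep)
        options[key.strip()] = value.strip()
    return options

# A few of the volume's reconfigured options ("key: value" format):
volume = parse("""
performance.low-prio-threads: 32
network.remote-dio: off
performance.strict-o-direct: on
features.shard: on
""", ":")

# The corresponding virt group entries ("key=value" format):
virt_group = parse("""
performance.low-prio-threads=32
network.remote-dio=enable
features.shard=on
""", "=")

# Options set on both sides but with different values:
diverging = {key: (volume[key], virt_group[key])
             for key in volume.keys() & virt_group.keys()
             if volume[key] != virt_group[key]}
print(diverging)
```

Run against the full listings, this flags network.remote-dio (off on the volume, enable in the virt group); whether restoring the group default has any bearing on the pauses would need testing.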
What I see around the time frame above in the gluster brick log file
gluster_bricks-vmstore-vmstore.log (timestamps in the log file are 1 hour
behind):

[2020-03-27 23:30:38.575808] I [MSGID: 101055] [client_t.c:436:gf_client_unref] 0-vmstore-server: Shutting down connection CTX_ID:6e8f70b8-1946-4505-860f-be90e5807cb3-GRAPH_ID:0-PID:223418-HOST:ovirt.mydomain.local-PC_NAME:vmstore-client-0-RECON_NO:-0
[2020-03-27 23:35:15.281449] E [MSGID: 113072] [posix-inode-fd-ops.c:1886:posix_writev] 0-vmstore-posix: write failed: offset 0, [Invalid argument]
[2020-03-27 23:35:15.281545] E [MSGID: 115067] [server-rpc-fops_v2.c:1373:server4_writev_cbk] 0-vmstore-server: 34139378: WRITEV 10 (00d9fe81-8a31-498e-8401-7b9d1477378e), client: CTX_ID:d04437ba-ef98-43df-864f-5e9d3738620a-GRAPH_ID:0-PID:27687-HOST:ovirt.mydomain.local-PC_NAME:vmstore-client-0-RECON_NO:-0, error-xlator: vmstore-posix [Invalid argument]
[2020-03-27 23:40:15.415794] E [MSGID: 113072] [posix-inode-fd-ops.c:1886:posix_writev] 0-vmstore-posix: write failed: offset 0, [Invalid argument]

My gluster components' versions:

gluster-ansible-cluster-1.0.0-1.el7.noarch
gluster-ansible-features-1.0.5-3.el7.noarch
gluster-ansible-infra-1.0.4-3.el7.noarch
gluster-ansible-maintenance-1.0.1-1.el7.noarch
gluster-ansible-repositories-1.0.1-1.el7.noarch
gluster-ansible-roles-1.0.5-7.el7.noarch
glusterfs-6.8-1.el7.x86_64
glusterfs-api-6.8-1.el7.x86_64
glusterfs-cli-6.8-1.el7.x86_64
glusterfs-client-xlators-6.8-1.el7.x86_64
glusterfs-events-6.8-1.el7.x86_64
glusterfs-fuse-6.8-1.el7.x86_64
glusterfs-geo-replication-6.8-1.el7.x86_64
glusterfs-libs-6.8-1.el7.x86_64
glusterfs-rdma-6.8-1.el7.x86_64
glusterfs-server-6.8-1.el7.x86_64
libvirt-daemon-driver-storage-gluster-4.5.0-23.el7_7.6.x86_64
python2-gluster-6.8-1.el7.x86_64
vdsm-gluster-4.30.43-1.el7.x86_64

And for completeness, the whole set of parameters for the volume, to ask
for eventual further performance-related suggestions considering the
single-node environment:

[root@ovirt tmp]# gluster volume get vmstore all
Option  Value
------  -----
cluster.lookup-unhashed  on
cluster.lookup-optimize  on
cluster.min-free-disk  10%
cluster.min-free-inodes  5%
cluster.rebalance-stats  off
cluster.subvols-per-directory  (null)
cluster.readdir-optimize  off
cluster.rsync-hash-regex  (null)
cluster.extra-hash-regex  (null)
cluster.dht-xattr-name  trusted.glusterfs.dht
cluster.randomize-hash-range-by-gfid  off
cluster.rebal-throttle  normal
cluster.lock-migration  off
cluster.force-migration  off
cluster.local-volume-name  (null)
cluster.weighted-rebalance  on
cluster.switch-pattern  (null)
cluster.entry-change-log  on
cluster.read-subvolume  (null)
cluster.read-subvolume-index  -1
cluster.read-hash-mode  1
cluster.background-self-heal-count  8
cluster.metadata-self-heal  off
cluster.data-self-heal  off
cluster.entry-self-heal  off
cluster.self-heal-daemon  on
cluster.heal-timeout  600
cluster.self-heal-window-size  1
cluster.data-change-log  on
cluster.metadata-change-log  on
cluster.data-self-heal-algorithm  (null)
cluster.eager-lock  enable
disperse.eager-lock  on
disperse.other-eager-lock  on
disperse.eager-lock-timeout  1
disperse.other-eager-lock-timeout  1
cluster.quorum-type  none
cluster.quorum-count  (null)
cluster.choose-local  true
cluster.self-heal-readdir-size  1KB
cluster.post-op-delay-secs  1
cluster.ensure-durability  on
cluster.consistent-metadata  no
cluster.heal-wait-queue-length  128
cluster.favorite-child-policy  none
cluster.full-lock  yes
diagnostics.latency-measurement  off
diagnostics.dump-fd-stats  off
diagnostics.count-fop-hits  off
diagnostics.brick-log-level  INFO
diagnostics.client-log-level  INFO
diagnostics.brick-sys-log-level  CRITICAL
diagnostics.client-sys-log-level  CRITICAL
diagnostics.brick-logger  (null)
diagnostics.client-logger  (null)
diagnostics.brick-log-format  (null)
diagnostics.client-log-format  (null)
diagnostics.brick-log-buf-size  5
diagnostics.client-log-buf-size  5
diagnostics.brick-log-flush-timeout  120
diagnostics.client-log-flush-timeout  120
diagnostics.stats-dump-interval  0
diagnostics.fop-sample-interval  0
diagnostics.stats-dump-format  json
diagnostics.fop-sample-buf-size  65535
diagnostics.stats-dnscache-ttl-sec  86400
performance.cache-max-file-size  0
performance.cache-min-file-size  0
performance.cache-refresh-timeout  1
performance.cache-priority
performance.cache-size  32MB
performance.io-thread-count  16
performance.high-prio-threads  16
performance.normal-prio-threads  16
performance.low-prio-threads  32
performance.least-prio-threads  1
performance.enable-least-priority  on
performance.iot-watchdog-secs  (null)
performance.iot-cleanup-disconnected-reqs  off
performance.iot-pass-through  false
performance.io-cache-pass-through  false
performance.cache-size  128MB
performance.qr-cache-timeout  1
performance.cache-invalidation  false
performance.ctime-invalidation  false
performance.flush-behind  on
performance.nfs.flush-behind  on
performance.write-behind-window-size  1MB
performance.resync-failed-syncs-after-fsync  off
performance.nfs.write-behind-window-size  1MB
performance.strict-o-direct  on
performance.nfs.strict-o-direct  off
performance.strict-write-ordering  off
performance.nfs.strict-write-ordering  off
performance.write-behind-trickling-writes  on
performance.aggregate-size  128KB
performance.nfs.write-behind-trickling-writes  on
performance.lazy-open  yes
performance.read-after-open  yes
performance.open-behind-pass-through  false
performance.read-ahead-page-count  4
performance.read-ahead-pass-through  false
performance.readdir-ahead-pass-through  false
performance.md-cache-pass-through  false
performance.md-cache-timeout  1
performance.cache-swift-metadata  true
performance.cache-samba-metadata  false
performance.cache-capability-xattrs  true
performance.cache-ima-xattrs  true
performance.md-cache-statfs  off
performance.xattr-cache-list
performance.nl-cache-pass-through  false
features.encryption  off
network.frame-timeout  1800
network.ping-timeout  30
network.tcp-window-size  (null)
client.ssl  off
network.remote-dio  off
client.event-threads  2
client.tcp-user-timeout  0
client.keepalive-time  20
client.keepalive-interval  2
client.keepalive-count  9
network.tcp-window-size  (null)
network.inode-lru-limit  16384
auth.allow  *
auth.reject  (null)
transport.keepalive  1
server.allow-insecure  on
server.root-squash  off
server.all-squash  off
server.anonuid  65534
server.anongid  65534
server.statedump-path  /var/run/gluster
server.outstanding-rpc-limit  64
server.ssl  off
auth.ssl-allow  *
server.manage-gids  off
server.dynamic-auth  on
client.send-gids  on
server.gid-timeout  300
server.own-thread  (null)
server.event-threads  2
server.tcp-user-timeout  42
server.keepalive-time  20
server.keepalive-interval  2
server.keepalive-count  9
transport.listen-backlog  1024
transport.address-family  inet
performance.write-behind  on
performance.read-ahead  off
performance.readdir-ahead  on
performance.io-cache  off
performance.open-behind  on
performance.quick-read  off
performance.nl-cache  off
performance.stat-prefetch  on
performance.client-io-threads  on
performance.nfs.write-behind  on
performance.nfs.read-ahead  off
performance.nfs.io-cache  off
performance.nfs.quick-read  off
performance.nfs.stat-prefetch  off
performance.nfs.io-threads  off
performance.force-readdirp  true
performance.cache-invalidation  false
performance.global-cache-invalidation  true
features.uss  off
features.snapshot-directory  .snaps
features.show-snapshot-directory  off
features.tag-namespaces  off
network.compression  off
network.compression.window-size  -15
network.compression.mem-level  8
network.compression.min-size  0
network.compression.compression-level  -1
network.compression.debug  false
features.default-soft-limit  80%
features.soft-timeout  60
features.hard-timeout  5
features.alert-time  86400
features.quota-deem-statfs  off
geo-replication.indexing  off
geo-replication.indexing  off
geo-replication.ignore-pid-check  off
geo-replication.ignore-pid-check  off
features.quota  off
features.inode-quota  off
features.bitrot  disable
debug.trace  off
debug.log-history  no
debug.log-file  no
debug.exclude-ops  (null)
debug.include-ops  (null)
debug.error-gen  off
debug.error-failure  (null)
debug.error-number  (null)
debug.random-failure  off
debug.error-fops  (null)
nfs.disable  on
features.read-only  off
features.worm  off
features.worm-file-level  off
features.worm-files-deletable  on
features.default-retention-period  120
features.retention-mode  relax
features.auto-commit-period  180
storage.linux-aio  off
storage.batch-fsync-mode  reverse-fsync
storage.batch-fsync-delay-usec  0
storage.owner-uid  36
storage.owner-gid  36
storage.node-uuid-pathinfo  off
storage.health-check-interval  30
storage.build-pgfid  off
storage.gfid2path  on
storage.gfid2path-separator  :
storage.reserve  1
storage.health-check-timeout  10
storage.fips-mode-rchecksum  off
storage.force-create-mode  0000
storage.force-directory-mode  0000
storage.create-mask  0777
storage.create-directory-mask  0777
storage.max-hardlinks  100
features.ctime  on
config.gfproxyd  off
cluster.server-quorum-type  off
cluster.server-quorum-ratio  0
changelog.changelog  off
changelog.changelog-dir  {{ brick.path }}/.glusterfs/changelogs
changelog.encoding  ascii
changelog.rollover-time  15
changelog.fsync-interval  5
changelog.changelog-barrier-timeout  120
changelog.capture-del-path  off
features.barrier  disable
features.barrier-timeout  120
features.trash  off
features.trash-dir  .trashcan
features.trash-eliminate-path  (null)
features.trash-max-filesize  5MB
features.trash-internal-op  off
cluster.enable-shared-storage  disable
locks.trace  off
locks.mandatory-locking  off
cluster.disperse-self-heal-daemon  enable
cluster.quorum-reads  no
client.bind-insecure  (null)
features.shard  on
features.shard-block-size  64MB
features.shard-lru-limit  16384
features.shard-deletion-rate  100
features.scrub-throttle  lazy
features.scrub-freq  biweekly
features.scrub  false
features.expiry-time  120
features.cache-invalidation  off
features.cache-invalidation-timeout  60
features.leases  off
features.lease-lock-recall-timeout  60
disperse.background-heals  8
disperse.heal-wait-qlength  128
cluster.heal-timeout  600
dht.force-readdirp  on
disperse.read-policy  gfid-hash
cluster.shd-max-threads  1
cluster.shd-wait-qlength  1024
cluster.locking-scheme  full
cluster.granular-entry-heal  no
features.locks-revocation-secs  0
features.locks-revocation-clear-all  false
features.locks-revocation-max-blocked  0
features.locks-monkey-unlocking  false
features.locks-notify-contention  no
features.locks-notify-contention-delay  5
disperse.shd-max-threads  1
disperse.shd-wait-qlength  1024
disperse.cpu-extensions  auto
disperse.self-heal-window-size  1
cluster.use-compound-fops  off
performance.parallel-readdir  off
performance.rda-request-size  131072
performance.rda-low-wmark  4096
performance.rda-high-wmark  128KB
performance.rda-cache-limit  10MB
performance.nl-cache-positive-entry  false
performance.nl-cache-limit  10MB
performance.nl-cache-timeout  60
cluster.brick-multiplex  off
cluster.max-bricks-per-process  250
disperse.optimistic-change-log  on
disperse.stripe-cache  4
cluster.halo-enabled  False
cluster.halo-shd-max-latency  99999
cluster.halo-nfsd-max-latency  5
cluster.halo-max-latency  5
cluster.halo-max-replicas  99999
cluster.halo-min-replicas  2
features.selinux  on
cluster.daemon-log-level  INFO
debug.delay-gen  off
delay-gen.delay-percentage  10%
delay-gen.delay-duration  100000
delay-gen.enable
disperse.parallel-writes  on
features.sdfs  off
features.cloudsync  off
features.ctime  on
ctime.noatime  on
feature.cloudsync-storetype  (null)
features.enforce-mandatory-lock  off
[root@ovirt tmp]#

Gianluca
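A note on the posix_writev "[Invalid argument]" errors in the brick log: with performance.strict-o-direct on (visible in the dump above), the brick performs direct I/O, and O_DIRECT requests must normally have their offset and length aligned to the device's logical block size; unaligned requests are rejected with EINVAL. Whether that is what is happening here is only a guess, but the alignment arithmetic is easy to sketch (a hypothetical checker for illustration, not Gluster code):

```python
def odirect_aligned(offset, length, block_size=512):
    """True if a write at this offset/length satisfies the usual O_DIRECT
    alignment rule: offset and length both multiples of the device's
    logical block size (commonly 512 or 4096 bytes)."""
    return offset % block_size == 0 and length % block_size == 0

print(odirect_aligned(0, 65536))  # a 64KiB write at offset 0 is aligned
print(odirect_aligned(0, 1000))   # unaligned length -> would fail with EINVAL under O_DIRECT
```

The failing write in the log is at offset 0 (which is itself aligned), so if alignment is the culprit it would be on the length or buffer side; comparing behaviour with network.remote-dio set back to the virt group default could help narrow it down.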
_______________________________________________
Users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/[email protected]/message/NB57D3ZK5Y5CJ267JQWYDPQEI5HXOFVV/

