I have attempted to remove myself from the list, but I continue to receive an "illegal email address" reply with each attempt. Thank you.
On Mon, Oct 5, 2020, 4:37 PM Felix Kölzow <[email protected]> wrote:

Dear Matthew,

From my experience with gluster geo-replication, and since this zpool is a key piece of your backup procedure (as it appears to me), I would create a new one, just to be sure.

Regards,
Felix

On 05/10/2020 22:28, Matthew Benstead wrote:

Hmm... Looks like I forgot to set xattr to sa - I left it at the default.

[root@pcic-backup01 ~]# zfs get xattr pcic-backup01-zpool
NAME                 PROPERTY  VALUE  SOURCE
pcic-backup01-zpool  xattr     on     default

[root@pcic-backup02 ~]# zfs get xattr pcic-backup02-zpool
NAME                 PROPERTY  VALUE  SOURCE
pcic-backup02-zpool  xattr     on     default

I wonder if I can change them and continue, or if I need to blow away the zpool and start over?

Thanks,
-Matthew

--
Matthew Benstead
System Administrator
Pacific Climate Impacts Consortium <https://pacificclimate.org/>
University of Victoria, UH1
PO Box 1800, STN CSC
Victoria, BC, V8W 2Y2
Phone: +1-250-721-8432
Email: [email protected]

On 10/5/20 12:53 PM, Felix Kölzow wrote:

Dear Matthew,

This is our configuration:

zfs get all mypool

mypool  xattr    sa        local
mypool  acltype  posixacl  local

Anything else to consider?

Regards,
Felix
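For reference, both properties can be checked and changed in place; a minimal sketch using the pool names from this thread. Note that xattr=sa only governs xattrs written after the change - xattrs already stored in the old directory-based format are not converted until the files are rewritten, which is why recreating the pool, as Felix suggests above, is the safer route.

# On each slave node: check both properties at once
zfs get xattr,acltype pcic-backup01-zpool

# Switch to SA-based xattrs and POSIX ACLs; this affects newly
# written xattrs only, existing ones keep the old on-disk format
zfs set xattr=sa pcic-backup01-zpool
zfs set acltype=posixacl pcic-backup01-zpool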
On 05/10/2020 21:11, Matthew Benstead wrote:

Thanks Felix - looking through some more of the logs I may have found the reason...

From /var/log/glusterfs/geo-replication/storage_10.0.231.81_pcic-backup/mnt-data-storage_a-storage.log:

[2020-10-05 18:13:35.736838] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
[2020-10-05 18:18:53.885591] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
[2020-10-05 18:22:14.405234] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
[2020-10-05 18:25:53.971679] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
[2020-10-05 18:31:44.571557] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
[2020-10-05 18:36:36.508772] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
[2020-10-05 18:40:10.401055] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
[2020-10-05 18:42:57.833536] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
[2020-10-05 18:45:19.691953] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
[2020-10-05 18:48:26.478532] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage
[2020-10-05 18:52:24.466914] E [fuse-bridge.c:4288:fuse_xattr_cbk] 0-glusterfs-fuse: extended attribute not supported by the backend storage

The slave nodes are running gluster on top of ZFS, but I had configured ACLs - is there something else missing to make this work with ZFS?

[root@pcic-backup01 ~]# gluster volume info

Volume Name: pcic-backup
Type: Distribute
Volume ID: 7af8a424-f4b6-4405-bba1-0dbafb0fa231
Status: Started
Snapshot Count: 0
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: 10.0.231.81:/pcic-backup01-zpool/brick
Brick2: 10.0.231.82:/pcic-backup02-zpool/brick
Options Reconfigured:
network.ping-timeout: 10
performance.cache-size: 256MB
server.event-threads: 4
client.event-threads: 4
cluster.lookup-optimize: on
performance.parallel-readdir: on
performance.readdir-ahead: on
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
transport.address-family: inet
nfs.disable: on
features.read-only: off
performance.open-behind: off

[root@pcic-backup01 ~]# zfs get acltype pcic-backup01-zpool
NAME                 PROPERTY  VALUE     SOURCE
pcic-backup01-zpool  acltype   posixacl  local

[root@pcic-backup01 ~]# grep "pcic-backup0" /proc/mounts
pcic-backup01-zpool /pcic-backup01-zpool zfs rw,seclabel,xattr,posixacl 0 0

[root@pcic-backup02 ~]# zfs get acltype pcic-backup02-zpool
NAME                 PROPERTY  VALUE     SOURCE
pcic-backup02-zpool  acltype   posixacl  local

[root@pcic-backup02 ~]# grep "pcic-backup0" /proc/mounts
pcic-backup02-zpool /pcic-backup02-zpool zfs rw,seclabel,xattr,posixacl 0 0

Thanks,
-Matthew

--
Matthew Benstead
System Administrator
Pacific Climate Impacts Consortium <https://pacificclimate.org/>
University of Victoria, UH1
PO Box 1800, STN CSC
Victoria, BC, V8W 2Y2
Phone: +1-250-721-8432
Email: [email protected]
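Since the fuse_xattr_cbk errors point at missing xattr support on the backend, a quick sanity check on a slave node is to write and read a trusted-namespace xattr directly on the brick path, which is where gluster keeps its own metadata (a sketch; the attribute name trusted.glusterfs.test is purely illustrative):

# Run as root on a slave node, against the brick directory
setfattr -n trusted.glusterfs.test -v ok /pcic-backup01-zpool/brick
getfattr -n trusted.glusterfs.test /pcic-backup01-zpool/brick
setfattr -x trusted.glusterfs.test /pcic-backup01-zpool/brick

If the setfattr fails, the brick cannot hold gluster's trusted.* metadata and geo-replication entry operations will keep failing.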
On 10/5/20 1:39 AM, Felix Kölzow wrote:

Dear Matthew,

Can you provide more information regarding the geo-replication brick logs? These files are also located in:

/var/log/glusterfs/geo-replication/storage_10.0.231.81_pcic-backup/

Usually, these log files pinpoint the root cause of the error more precisely. Additionally, it is also worth looking at the log files on the slave side.

Regards,
Felix

On 01/10/2020 23:08, Matthew Benstead wrote:

Hello,

I'm looking for some help with a geo-replication error in my Gluster 7 / CentOS 7 setup. Replication progress has basically stopped, and the status of the replication keeps switching.

The gsyncd log has errors like "Operation not permitted", "incomplete sync", etc... help? I'm not sure how to proceed in troubleshooting this.

The log is here, and it basically just repeats - from /var/log/glusterfs/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.log:

[2020-10-01 20:52:15.291923] I [master(worker /data/storage_a/storage):1991:syncjob] Syncer: Sync Time Taken  duration=32.8466 num_files=1749 job=3 return_code=23
[2020-10-01 20:52:18.700062] I [master(worker /data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken  duration=43.1210 num_files=3167 job=6 return_code=23
[2020-10-01 20:52:23.383234] W [master(worker /data/storage_c/storage):1393:process] _GMaster: incomplete sync, retrying changelogs  files=['XSYNC-CHANGELOG.1601585397']
[2020-10-01 20:52:28.537657] E [repce(worker /data/storage_b/storage):213:__call__] RepceClient: call failed  call=258187:140538843596608:1601585515.63 method=entry_ops error=OSError
[2020-10-01 20:52:28.538064] E [syncdutils(worker /data/storage_b/storage):339:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 332, in main
    func(args)
  File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 86, in subcmd_worker
    local.service_loop(remote)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1308, in service_loop
    g1.crawlwrap(oneshot=True, register_time=register_time)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 602, in crawlwrap
    self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1682, in crawl
    self.process([item[1]], 0)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1327, in process
    self.process_change(change, done, retry)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1221, in process_change
    failures = self.slave.server.entry_ops(entries)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 232, in __call__
    return self.ins(self.meth, *a)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 214, in __call__
    raise res
OSError: [Errno 1] Operation not permitted
[2020-10-01 20:52:28.570316] I [repce(agent /data/storage_b/storage):96:service_loop] RepceServer: terminating on reaching EOF.
[2020-10-01 20:52:28.613603] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change  status=Faulty
[2020-10-01 20:52:29.619797] I [master(worker /data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken  duration=5.6458 num_files=455 job=3 return_code=23
[2020-10-01 20:52:38.286245] I [master(worker /data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken  duration=14.1824 num_files=1333 job=2 return_code=23
[2020-10-01 20:52:38.628156] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change  status=Initializing...
[2020-10-01 20:52:38.628325] I [monitor(monitor):159:monitor] Monitor: starting gsyncd worker  brick=/data/storage_b/storage slave_node=10.0.231.82
[2020-10-01 20:52:38.684736] I [gsyncd(agent /data/storage_b/storage):318:main] <top>: Using session config file  path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
[2020-10-01 20:52:38.687213] I [gsyncd(worker /data/storage_b/storage):318:main] <top>: Using session config file  path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
[2020-10-01 20:52:38.687401] I [changelogagent(agent /data/storage_b/storage):72:__init__] ChangelogAgent: Agent listining...
[2020-10-01 20:52:38.703295] I [resource(worker /data/storage_b/storage):1386:connect_remote] SSH: Initializing SSH connection between master and slave...
[2020-10-01 20:52:40.388372] I [resource(worker /data/storage_b/storage):1435:connect_remote] SSH: SSH connection between master and slave established.  duration=1.6849
[2020-10-01 20:52:40.388582] I [resource(worker /data/storage_b/storage):1105:connect] GLUSTER: Mounting gluster volume locally...
[2020-10-01 20:52:41.501105] I [resource(worker /data/storage_b/storage):1128:connect] GLUSTER: Mounted gluster volume  duration=1.1123
[2020-10-01 20:52:41.501405] I [subcmds(worker /data/storage_b/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor
[2020-10-01 20:52:43.531146] I [master(worker /data/storage_b/storage):1640:register] _GMaster: Working dir  path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage
[2020-10-01 20:52:43.533953] I [resource(worker /data/storage_b/storage):1291:service_loop] GLUSTER: Register time  time=1601585563
[2020-10-01 20:52:43.547092] I [gsyncdstatus(worker /data/storage_b/storage):281:set_active] GeorepStatus: Worker Status Change  status=Active
[2020-10-01 20:52:43.561920] I [gsyncdstatus(worker /data/storage_b/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change  status=History Crawl
[2020-10-01 20:52:43.562184] I [master(worker /data/storage_b/storage):1554:crawl] _GMaster: starting history crawl  turns=1 stime=None entry_stime=None etime=1601585563
[2020-10-01 20:52:43.562269] I [resource(worker /data/storage_b/storage):1307:service_loop] GLUSTER: No stime available, using xsync crawl
[2020-10-01 20:52:43.569799] I [master(worker /data/storage_b/storage):1670:crawl] _GMaster: starting hybrid crawl  stime=None
[2020-10-01 20:52:43.573528] I [gsyncdstatus(worker /data/storage_b/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change  status=Hybrid Crawl
[2020-10-01 20:52:44.370985] I [master(worker /data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken  duration=20.4307 num_files=2609 job=5 return_code=23
[2020-10-01 20:52:49.431854] W [master(worker /data/storage_c/storage):1393:process] _GMaster: incomplete sync, retrying changelogs  files=['XSYNC-CHANGELOG.1601585397']
[2020-10-01 20:52:54.801500] I [master(worker /data/storage_a/storage):1991:syncjob] Syncer: Sync Time Taken  duration=72.7492 num_files=4227 job=2 return_code=23
[2020-10-01 20:52:56.766547] I [master(worker /data/storage_a/storage):1991:syncjob] Syncer: Sync Time Taken  duration=74.3569 num_files=4674 job=5 return_code=23
[2020-10-01 20:53:18.853333] I [master(worker /data/storage_c/storage):1991:syncjob] Syncer: Sync Time Taken  duration=28.7125 num_files=4397 job=3 return_code=23
[2020-10-01 20:53:21.224921] W [master(worker /data/storage_a/storage):1393:process] _GMaster: incomplete sync, retrying changelogs  files=['CHANGELOG.1601044033', 'CHANGELOG.1601044048', 'CHANGELOG.1601044063', 'CHANGELOG.1601044078', 'CHANGELOG.1601044093', 'CHANGELOG.1601044108', 'CHANGELOG.1601044123']
[2020-10-01 20:53:22.134536] I [master(worker /data/storage_a/storage):1991:syncjob] Syncer: Sync Time Taken  duration=0.2159 num_files=3 job=3 return_code=23
[2020-10-01 20:53:25.615712] I [master(worker /data/storage_b/storage):1681:crawl] _GMaster: processing xsync changelog  path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage/xsync/XSYNC-CHANGELOG.1601585563
[2020-10-01 20:53:25.634970] W [master(worker /data/storage_c/storage):1393:process] _GMaster: incomplete sync, retrying changelogs  files=['XSYNC-CHANGELOG.1601585397']

Geo-replication status - see it change from Active to Faulty:

[root@storage01 ~]# gluster volume geo-replication status

MASTER NODE    MASTER VOL    MASTER BRICK               SLAVE USER    SLAVE                                        SLAVE NODE     STATUS    CRAWL STATUS       LAST_SYNCED
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
10.0.231.91    storage       /data/storage_a/storage    geoaccount    ssh://[email protected]::pcic-backup    10.0.231.81    Active    Changelog Crawl    2020-09-25 07:26:57
10.0.231.91    storage       /data/storage_c/storage    geoaccount    ssh://[email protected]::pcic-backup    10.0.231.82    Active    Hybrid Crawl       N/A
10.0.231.91    storage       /data/storage_b/storage    geoaccount    ssh://[email protected]::pcic-backup    10.0.231.82    Active    Hybrid Crawl       N/A
10.0.231.92    storage       /data/storage_b/storage    geoaccount    ssh://[email protected]::pcic-backup    10.0.231.82    Active    History Crawl      2020-09-23 01:56:05
10.0.231.92    storage       /data/storage_a/storage    geoaccount    ssh://[email protected]::pcic-backup    10.0.231.82    Active    Hybrid Crawl       N/A
10.0.231.92    storage       /data/storage_c/storage    geoaccount    ssh://[email protected]::pcic-backup    10.0.231.81    Active    Hybrid Crawl       N/A
10.0.231.93    storage       /data/storage_c/storage    geoaccount    ssh://[email protected]::pcic-backup    10.0.231.81    Active    Changelog Crawl    2020-09-25 06:55:57
10.0.231.93    storage       /data/storage_b/storage    geoaccount    ssh://[email protected]::pcic-backup    10.0.231.81    Active    Hybrid Crawl       N/A
10.0.231.93    storage       /data/storage_a/storage    geoaccount    ssh://[email protected]::pcic-backup    10.0.231.81    Active    Hybrid Crawl       N/A

[root@storage01 ~]# gluster volume geo-replication status

MASTER NODE    MASTER VOL    MASTER BRICK               SLAVE USER    SLAVE                                        SLAVE NODE     STATUS    CRAWL STATUS       LAST_SYNCED
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
10.0.231.91    storage       /data/storage_a/storage    geoaccount    ssh://[email protected]::pcic-backup    10.0.231.81    Active    Changelog Crawl    2020-09-25 07:26:57
10.0.231.91    storage       /data/storage_c/storage    geoaccount    ssh://[email protected]::pcic-backup    10.0.231.82    Active    Hybrid Crawl       N/A
10.0.231.91    storage       /data/storage_b/storage    geoaccount    ssh://[email protected]::pcic-backup    N/A            Faulty    N/A                N/A
10.0.231.92    storage       /data/storage_b/storage    geoaccount    ssh://[email protected]::pcic-backup    10.0.231.82    Active    History Crawl      2020-09-23 01:58:05
10.0.231.92    storage       /data/storage_a/storage    geoaccount    ssh://[email protected]::pcic-backup    10.0.231.82    Active    Hybrid Crawl       N/A
10.0.231.92    storage       /data/storage_c/storage    geoaccount    ssh://[email protected]::pcic-backup    N/A            Faulty    N/A                N/A
10.0.231.93    storage       /data/storage_c/storage    geoaccount    ssh://[email protected]::pcic-backup    10.0.231.81    Active    Changelog Crawl    2020-09-25 06:58:56
10.0.231.93    storage       /data/storage_b/storage    geoaccount    ssh://[email protected]::pcic-backup    10.0.231.81    Active    Hybrid Crawl       N/A
10.0.231.93    storage       /data/storage_a/storage    geoaccount    ssh://[email protected]::pcic-backup    N/A            Faulty    N/A                N/A
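The I/W/E flags in these lines mark info, warnings, and errors, so the repeating failures can be pulled out of the master-side session logs with a plain grep; the slave side keeps its own session logs too (a sketch - the geo-replication-slaves directory is the usual location, but the exact path may vary by version):

# Master side: all errors for this session
grep -F "] E [" /var/log/glusterfs/geo-replication/storage_10.0.231.81_pcic-backup/*.log

# Slave side: matching errors around the same timestamps
grep -F "] E [" /var/log/glusterfs/geo-replication-slaves/*/*.log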
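A worker that goes Faulty is respawned automatically by the monitor (the Faulty -> Initializing... -> Active cycle in the gsyncd log above), but the session can also be inspected and bounced by hand; a sketch, with <user>@<slave-host> standing in for the redacted session endpoint:

[root@storage01 ~]# gluster volume geo-replication storage <user>@<slave-host>::pcic-backup config
[root@storage01 ~]# gluster volume geo-replication storage <user>@<slave-host>::pcic-backup stop
[root@storage01 ~]# gluster volume geo-replication storage <user>@<slave-host>::pcic-backup start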
Cluster information: (Note - disabled performance.open-behind to work around https://github.com/gluster/glusterfs/issues/1440)

[root@storage01 ~]# gluster --version | head -1; cat /etc/centos-release; uname -r
glusterfs 7.7
CentOS Linux release 7.8.2003 (Core)
3.10.0-1127.10.1.el7.x86_64

[root@storage01 ~]# df -h /storage2/
Filesystem            Size  Used  Avail  Use%  Mounted on
10.0.231.91:/storage  328T  228T  100T   70%   /storage2

[root@storage01 ~]# gluster volume info

Volume Name: storage
Type: Distributed-Replicate
Volume ID: cf94a8f2-324b-40b3-bf72-c3766100ea99
Status: Started
Snapshot Count: 0
Number of Bricks: 3 x (2 + 1) = 9
Transport-type: tcp
Bricks:
Brick1: 10.0.231.91:/data/storage_a/storage
Brick2: 10.0.231.92:/data/storage_b/storage
Brick3: 10.0.231.93:/data/storage_c/storage (arbiter)
Brick4: 10.0.231.92:/data/storage_a/storage
Brick5: 10.0.231.93:/data/storage_b/storage
Brick6: 10.0.231.91:/data/storage_c/storage (arbiter)
Brick7: 10.0.231.93:/data/storage_a/storage
Brick8: 10.0.231.91:/data/storage_b/storage
Brick9: 10.0.231.92:/data/storage_c/storage (arbiter)
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
network.ping-timeout: 10
features.inode-quota: on
features.quota: on
nfs.disable: on
features.quota-deem-statfs: on
storage.fips-mode-rchecksum: on
performance.readdir-ahead: on
performance.parallel-readdir: on
cluster.lookup-optimize: on
client.event-threads: 4
server.event-threads: 4
performance.cache-size: 256MB
performance.open-behind: off

Thanks,
-Matthew
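One more detail from the gsyncd log: return_code=23 in the Syncer lines is rsync's exit status for "partial transfer due to error", which fits the xattr failures found later in the thread, since geo-replication's rsync is normally configured to carry ACLs and xattrs (the sync-acls and sync-xattrs session options). A rough way to reproduce it outside gluster (a sketch; /mnt/slave-test is a hypothetical local mount of the slave volume):

# rsync preserving ACLs (-A) and xattrs (-X); it exits 23 when some
# attributes cannot be applied on the destination (see `man rsync`)
rsync -aAX /tmp/testfile /mnt/slave-test/
echo $?    # expect 23 while the slave backend rejects xattrs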
________

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
[email protected]
https://lists.gluster.org/mailman/listinfo/gluster-users
