Hi Strahil,

Thanks for the response, appreciate it.
There were two sets of shard files (each set being the two replicas plus the
arbiter data) showing up for the problem data: one 0-byte set and one of the
correct size. The correct data looked fine. The two sets of shard files is
what was causing Gluster to have a wobble. I have no idea how Gluster came to
have the two sets of files; maybe I'm missing something in my understanding
of how Gluster works here.

In the end, I resorted to verifying the data by copying the iSCSI backing
store files to /dev/null from the mounted Gluster volume, and then removing
any "bad" 0-byte shards that had been logged by Gluster with the "Stale file
handle" error. This resolved the I/O errors that were being seen within the
iSCSI-mounted filesystem.

The problem I initially had in trying to track this down was the tgtd logs
from libgfapi: I had no clue how to determine which shards were problematic
from those logs.

Anyway, disaster over, but it does leave me a little nervous. Recovery from
backup is quite tedious.

Ronny

Strahil Nikolov wrote on 10/11/2022 17:28:
> I skimmed over, so take everything I say with a grain of salt.
>
> Based on the logs, the gfid for one of the cases is clear ->
> b42dc8f9-755e-46be-8418-4882a9f765e1 and shard 5613.
>
> As there is a linkto, most probably the shard's location was on another
> subvolume, and in such a case I would just "walk" over all bricks and get
> the extended file attributes of the real ones.
>
> I can't imagine why it happened, but I do suspect a gfid split-brain.
>
> If I were in your shoes, I would check the gfids and assume that those with
> the same gfid value are the good ones (usually the one that differs has an
> older timestamp), and I would remove the copy from the last brick and check
> if it fixes things.
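[Editor's note: the brick walk suggested above can be sketched as below. This is a minimal, hypothetical helper, not anything from the thread: the brick path in the comment is an assumption, and it only flags the *shape* of a DHT linkto file (0 bytes, mode ---------T); on a real brick you would still confirm the trusted.glusterfs.dht.linkto extended attribute with getfattr before touching anything.]

```python
import os
import stat

def find_linkto_candidates(brick_shard_dir):
    """Return files that look like DHT linkto entries on a brick:
    0 bytes, sticky bit set, and no permission bits (ls shows ---------T).
    Confirm each candidate on the brick with:
      getfattr -m . -d -e hex <path>
    and look for trusted.glusterfs.dht.linkto before moving anything aside.
    """
    candidates = []
    for root, _dirs, files in os.walk(brick_shard_dir):
        for name in files:
            path = os.path.join(root, name)
            st = os.lstat(path)
            sticky_only = (st.st_mode & 0o7777) == 0o1000
            if stat.S_ISREG(st.st_mode) and st.st_size == 0 and sticky_only:
                candidates.append(path)
    return sorted(candidates)

# Example (the brick path is an assumption; substitute your own bricks):
# print(find_linkto_candidates("/srv/brick1/iscsi/.shard"))
```

Run per brick against its .shard directory; a candidate that has a healthy same-named copy of the correct size on another subvolume is the pattern described in this thread.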
>
> Best Regards,
> Strahil Nikolov
>
> On Thu, Nov 3, 2022 at 17:24, Ronny Adsetts
> <[email protected]> wrote:
> Hi,
>
> We have a 4 x (2 + 1) distribute-replicate volume with sharding enabled.
> We use the volume for storing backing files for iSCSI devices. The iSCSI
> devices are provided to our file server using tgtd, using the glfs
> backing store type via libgfapi.
>
> So we had a problem the other day where one of the filesystems wouldn't
> re-mount following a rolling tgtd restart (we have 4 servers providing
> tgtd). I think this rolling restart was done too quickly, which meant
> there was a disconnect at the file server end (speculating). After some
> investigation, and manually trying to copy the fs image file to a
> temporary location, I found 0-byte shards.
>
> Because I mounted the file directly, I got errors in the gluster logs
> (/var/log/glusterfs/srv-iscsi.log) for the volume. I get no errors in
> gluster logs when this happens via libgfapi, though I did see tgtd errors.
>
> The tgtd errors look like this:
>
> tgtd[24080]: tgtd: bs_glfs_request(279) Error on read ffffffff 1000tgtd:
> bs_glfs_request(370) io error 0x55da8d9820b0 2 28 -1 4096 376698519552,
> Stale file handle
>
> Not sure how to figure out which shard is the issue out of that log
> entry. :-)
>
> The gluster logs look like this:
>
> [2022-11-01 16:51:28.496911] E [MSGID: 133010]
> [shard.c:2342:shard_common_lookup_shards_cbk] 0-iscsi-shard: Lookup on
> shard 5613 failed. Base file gfid = b42dc8f9-755e-46be-8418-4882a9f765e1
> [Stale file handle]
>
> [2022-11-01 19:17:09.060376] E [MSGID: 133010]
> [shard.c:2342:shard_common_lookup_shards_cbk] 0-iscsi-shard: Lookup on
> shard 5418 failed. Base file gfid = b42dc8f9-755e-46be-8418-4882a9f765e1
> [Stale file handle]
>
> So there were the two shards showing up as problematic. Checking the
> shard files showed that they were 0-byte with a
> trusted.glusterfs.dht.linkto value in the file attributes.
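[Editor's note: the tgtd line and the gluster lines quoted above do line up, under two assumptions: that the trailing number in the bs_glfs_request message (376698519552) is the byte offset of the failed I/O, and that the volume uses the default features.shard-block-size of 64MiB. The arithmetic lands exactly on shard 5613, which supports both guesses.]

```python
# Sketch: map a byte offset from a tgtd/libgfapi I/O error to a shard index.
# Assumes the default features.shard-block-size of 64MiB; check yours with:
#   gluster volume get <volname> features.shard-block-size
SHARD_SIZE = 64 * 1024 * 1024  # 64MiB, the sharding default

def shard_index(byte_offset, shard_size=SHARD_SIZE):
    """Shard N covers bytes [N * shard_size, (N + 1) * shard_size)."""
    return byte_offset // shard_size

# The offset from the tgtd error maps to the shard gluster complained about:
print(shard_index(376698519552))  # -> 5613, matching the gluster log entry
```

The division also works in reverse: shard N of a base file covers the 64MiB of the image starting at byte N * 64MiB.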
> There were other shard files of the same name with the correct size, so I
> guess the shard had been moved at some point, resulting in the 0-byte
> linkto copies. Anyway, moving the offending .shard and associated
> .glusterfs files out of the way meant I was able to, first, copy the file
> without error, and then run an "xfs_repair -L" on the filesystem and get
> it remounted. There was some data loss, but minor as far as I can tell.
>
> So the two shards I removed (replica 2 + arbiter) look like so:
>
> ronny@cogline:~$ ls -al /tmp/publichomes-backup-stale-shards/.shard/
> total 0
> drwxr-xr-x 2 root root 104 Nov  2 00:13 .
> drwxr-xr-x 4 root root  38 Nov  2 00:05 ..
> ---------T 1 root root   0 Aug 14 04:26 b42dc8f9-755e-46be-8418-4882a9f765e1.5418
> ---------T 1 root root   0 Oct 25 10:49 b42dc8f9-755e-46be-8418-4882a9f765e1.5613
>
> ronny@keratrix:~$ ls -al /tmp/publichomes-backup-stale-shards/.shard/
> total 0
> drwxr-xr-x 2 root root 104 Nov  2 00:13 .
> drwxr-xr-x 4 root root  38 Nov  2 00:07 ..
> ---------T 1 root root   0 Aug 14 04:26 b42dc8f9-755e-46be-8418-4882a9f765e1.5418
> ---------T 1 root root   0 Oct 25 10:49 b42dc8f9-755e-46be-8418-4882a9f765e1.5613
>
> ronny@bellizen:~$ ls -al /tmp/publichomes-backup-stale-shards/.shard/
> total 0
> drwxr-xr-x 2 root root  55 Nov  2 00:07 .
> drwxr-xr-x 4 root root  38 Nov  2 00:07 ..
> ---------T 1 root root   0 Oct 25 10:49 b42dc8f9-755e-46be-8418-4882a9f765e1.5613
>
> ronny@risca:~$ ls -al /tmp/publichomes-backup-stale-shards/.shard/
> total 0
> drwxr-xr-x 2 root root  55 Nov  2 00:13 .
> drwxr-xr-x 4 root root  38 Nov  2 00:13 ..
> ---------T 1 root root   0 Aug 14 04:26 b42dc8f9-755e-46be-8418-4882a9f765e1.5418
>
> So the first question is: did I do the right thing to get this resolved?
>
> The other, and more important, question now relates to "Stale file handle"
> errors we are now seeing on a different filesystem.
>
> I only have tgtd log entries for this and wondered if anyone could help
> with taking a log entry and somehow figuring out which shard is the
> problematic one:
>
> tgtd[3052]: tgtd: bs_glfs_request(370) io error 0x56404e0dc510 2 2a -1
> 1310720 428680884224, Stale file handle
>
> Thanks for any help anyone can provide.
>
> Ronny
> --
> Ronny Adsetts
> Technical Director
> Amazing Internet Ltd, London
> t: +44 20 8977 8943
> w: www.amazinginternet.com
>
> Registered office: 85 Waldegrave Park, Twickenham, TW1 4TJ
> Registered in England. Company No. 4042957

--
Ronny Adsetts
Technical Director
Amazing Internet Ltd, London
t: +44 20 8977 8943
w: www.amazinginternet.com

Registered office: 85 Waldegrave Park, Twickenham, TW1 4TJ
Registered in England. Company No. 4042957
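[Editor's note: applying the same guess to this newer log entry: if the trailing 428680884224 is again the byte offset and the volume uses the default 64MiB features.shard-block-size (worth verifying with `gluster volume get <volname> features.shard-block-size` before acting on it), the suspect shard would be:]

```python
# Hypothetical mapping of the new tgtd error's offset to a shard index,
# assuming the default 64MiB shard-block-size (verify on the volume first).
SHARD_SIZE = 64 * 1024 * 1024

offset = 428680884224  # trailing number from the tgtd log entry above
print(offset // SHARD_SIZE)  # -> 6387
```

Shard files are named <base-gfid>.<index> under .shard on the bricks (as in the listings earlier in the thread), so this index together with the affected image's base gfid identifies the candidate file to inspect.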
________

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk

Gluster-users mailing list
[email protected]
https://lists.gluster.org/mailman/listinfo/gluster-users
