Hi Strahil,

Thanks for the response, appreciate it.
There were two sets of shard files (each set being the two replicas plus the
arbiter data) showing up for the problem data: one 0-byte set and one of the
correct size. The correct data looked fine. The two sets of shard files is
what was causing Gluster to have a wobble. I have no idea how Gluster came to
have the two sets of files; maybe I'm missing something in my understanding
of how Gluster works here.

In the end, I resorted to verifying the data by copying the iSCSI backing
store files to /dev/null from the mounted Gluster volume, and then removing
any "bad" 0-byte shards that had been logged by Gluster with the "Stale file
handle" error. This resolved the I/O errors that were being seen within the
iSCSI-mounted filesystem.

The problem I initially had in trying to track this down was the tgtd logs
from libgfapi: I had no clue how to determine which shards were problematic
from those logs.

Anyway, disaster over, but it does leave me a little nervous. Recovery from
backup is quite tedious.

Ronny

Strahil Nikolov wrote on 10/11/2022 17:28:
> I skimmed over, so take everything I say with a grain of salt.
>
> Based on the logs, the gfid for one of the cases is clear ->
> b42dc8f9-755e-46be-8418-4882a9f765e1 and shard 5613.
>
> As there is a linkto, most probably the shard's location was on another
> subvolume, and in such a case I would just "walk" over all bricks and get
> the extended file attributes of the real ones.
>
> I can't imagine why it happened, but I do suspect a gfid split-brain.
>
> If I were in your shoes, I would check the gfids and assume that those with
> the same gfid value are the good ones (usually the one that differs has an
> older timestamp), and I would remove the copy from the last brick and check
> if it fixes things.
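[Editor's note: the brick walk suggested above can be sketched as below. This is a minimal, hypothetical helper, not anything from the thread: the brick path in the comment is an assumption, and it only flags the *shape* of a DHT linkto file (0 bytes, mode ---------T); on a real brick you would still confirm the trusted.glusterfs.dht.linkto extended attribute with getfattr before touching anything.]

```python
import os
import stat

def find_linkto_candidates(brick_shard_dir):
    """Return files that look like DHT linkto entries on a brick:
    0 bytes, sticky bit set, and no permission bits (ls shows ---------T).
    Confirm each candidate on the brick with:
      getfattr -m . -d -e hex <path>
    and look for trusted.glusterfs.dht.linkto before moving anything aside.
    """
    candidates = []
    for root, _dirs, files in os.walk(brick_shard_dir):
        for name in files:
            path = os.path.join(root, name)
            st = os.lstat(path)
            sticky_only = (st.st_mode & 0o7777) == 0o1000
            if stat.S_ISREG(st.st_mode) and st.st_size == 0 and sticky_only:
                candidates.append(path)
    return sorted(candidates)

# Example (the brick path is an assumption; substitute your own bricks):
# print(find_linkto_candidates("/srv/brick1/iscsi/.shard"))
```

Run per brick against its .shard directory; a candidate that has a healthy same-named copy of the correct size on another subvolume is the pattern described in this thread.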
>
> Best Regards,
> Strahil Nikolov
>
> On Thu, Nov 3, 2022 at 17:24, Ronny Adsetts
> <[email protected]> wrote:
> Hi,
>
> We have a 4 x (2 + 1) distribute-replicate volume with sharding enabled.
> We use the volume for storing backing files for iSCSI devices. The iSCSI
> devices are provided to our file server using tgtd, using the glfs
> backing store type via libgfapi.
>
> So we had a problem the other day where one of the filesystems wouldn't
> re-mount following a rolling tgtd restart (we have 4 servers providing
> tgtd). I think this rolling restart was done too quickly, which meant
> there was a disconnect at the file server end (speculating). After some
> investigation, and manually trying to copy the fs image file to a
> temporary location, I found 0-byte shards.
>
> Because I mounted the file directly, I got errors in the gluster logs
> (/var/log/glusterfs/srv-iscsi.log) for the volume. I get no errors in
> gluster logs when this happens via libgfapi, though I did see tgtd errors.
>
> The tgtd errors look like this:
>
> tgtd[24080]: tgtd: bs_glfs_request(279) Error on read ffffffff 1000tgtd:
> bs_glfs_request(370) io error 0x55da8d9820b0 2 28 -1 4096 376698519552,
> Stale file handle
>
> Not sure how to figure out which shard is the issue out of that log
> entry. :-)
>
> The gluster logs look like this:
>
> [2022-11-01 16:51:28.496911] E [MSGID: 133010]
> [shard.c:2342:shard_common_lookup_shards_cbk] 0-iscsi-shard: Lookup on
> shard 5613 failed. Base file gfid = b42dc8f9-755e-46be-8418-4882a9f765e1
> [Stale file handle]
>
> [2022-11-01 19:17:09.060376] E [MSGID: 133010]
> [shard.c:2342:shard_common_lookup_shards_cbk] 0-iscsi-shard: Lookup on
> shard 5418 failed. Base file gfid = b42dc8f9-755e-46be-8418-4882a9f765e1
> [Stale file handle]
>
> So there were the two shards showing up as problematic. Checking the
> shard files showed that they were 0-byte with a
> trusted.glusterfs.dht.linkto value in the file attributes.
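[Editor's note: the tgtd line and the gluster lines quoted above do line up, under two assumptions: that the trailing number in the bs_glfs_request message (376698519552) is the byte offset of the failed I/O, and that the volume uses the default features.shard-block-size of 64MiB. The arithmetic lands exactly on shard 5613, which supports both guesses.]

```python
# Sketch: map a byte offset from a tgtd/libgfapi I/O error to a shard index.
# Assumes the default features.shard-block-size of 64MiB; check yours with:
#   gluster volume get <volname> features.shard-block-size
SHARD_SIZE = 64 * 1024 * 1024  # 64MiB, the sharding default

def shard_index(byte_offset, shard_size=SHARD_SIZE):
    """Shard N covers bytes [N * shard_size, (N + 1) * shard_size)."""
    return byte_offset // shard_size

# The offset from the tgtd error maps to the shard gluster complained about:
print(shard_index(376698519552))  # -> 5613, matching the gluster log entry
```

The division also works in reverse: shard N of a base file covers the 64MiB of the image starting at byte N * 64MiB.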
> There were other shard files of the same name with the correct size, so I
> guess the shard had been moved at some point, resulting in the 0-byte
> linkto copies. Anyway, moving the offending .shard and associated
> .glusterfs files out of the way meant I was able to, first, copy the file
> without error, and then run an "xfs_repair -L" on the filesystem and get
> it remounted. There was some data loss, but minor as far as I can tell.
>
> So the two shards I removed (replica 2 + arbiter) look like so:
>
> ronny@cogline:~$ ls -al /tmp/publichomes-backup-stale-shards/.shard/
> total 0
> drwxr-xr-x 2 root root 104 Nov  2 00:13 .
> drwxr-xr-x 4 root root  38 Nov  2 00:05 ..
> ---------T 1 root root   0 Aug 14 04:26 b42dc8f9-755e-46be-8418-4882a9f765e1.5418
> ---------T 1 root root   0 Oct 25 10:49 b42dc8f9-755e-46be-8418-4882a9f765e1.5613
>
> ronny@keratrix:~$ ls -al /tmp/publichomes-backup-stale-shards/.shard/
> total 0
> drwxr-xr-x 2 root root 104 Nov  2 00:13 .
> drwxr-xr-x 4 root root  38 Nov  2 00:07 ..
> ---------T 1 root root   0 Aug 14 04:26 b42dc8f9-755e-46be-8418-4882a9f765e1.5418
> ---------T 1 root root   0 Oct 25 10:49 b42dc8f9-755e-46be-8418-4882a9f765e1.5613
>
> ronny@bellizen:~$ ls -al /tmp/publichomes-backup-stale-shards/.shard/
> total 0
> drwxr-xr-x 2 root root  55 Nov  2 00:07 .
> drwxr-xr-x 4 root root  38 Nov  2 00:07 ..
> ---------T 1 root root   0 Oct 25 10:49 b42dc8f9-755e-46be-8418-4882a9f765e1.5613
>
> ronny@risca:~$ ls -al /tmp/publichomes-backup-stale-shards/.shard/
> total 0
> drwxr-xr-x 2 root root  55 Nov  2 00:13 .
> drwxr-xr-x 4 root root  38 Nov  2 00:13 ..
> ---------T 1 root root   0 Aug 14 04:26 b42dc8f9-755e-46be-8418-4882a9f765e1.5418
>
> So the first question is: did I do the right thing to get this resolved?
>
> The other, and more important, question now relates to "Stale file handle"
> errors we are now seeing on a different filesystem.
>
> I only have tgtd log entries for this and wondered if anyone could help
> with taking a log entry and somehow figuring out which shard is the
> problematic one:
>
> tgtd[3052]: tgtd: bs_glfs_request(370) io error 0x56404e0dc510 2 2a -1
> 1310720 428680884224, Stale file handle
>
> Thanks for any help anyone can provide.
>
> Ronny
> --
> Ronny Adsetts
> Technical Director
> Amazing Internet Ltd, London
> t: +44 20 8977 8943
> w: www.amazinginternet.com
>
> Registered office: 85 Waldegrave Park, Twickenham, TW1 4TJ
> Registered in England. Company No. 4042957

--
Ronny Adsetts
Technical Director
Amazing Internet Ltd, London
t: +44 20 8977 8943
w: www.amazinginternet.com

Registered office: 85 Waldegrave Park, Twickenham, TW1 4TJ
Registered in England. Company No. 4042957
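[Editor's note: applying the same guess to this newer log entry: if the trailing 428680884224 is again the byte offset and the volume uses the default 64MiB features.shard-block-size (worth verifying with `gluster volume get <volname> features.shard-block-size` before acting on it), the suspect shard would be:]

```python
# Hypothetical mapping of the new tgtd error's offset to a shard index,
# assuming the default 64MiB shard-block-size (verify on the volume first).
SHARD_SIZE = 64 * 1024 * 1024

offset = 428680884224  # trailing number from the tgtd log entry above
print(offset // SHARD_SIZE)  # -> 6387
```

Shard files are named <base-gfid>.<index> under .shard on the bricks (as in the listings earlier in the thread), so this index together with the affected image's base gfid identifies the candidate file to inspect.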
________

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk

Gluster-users mailing list
[email protected]
https://lists.gluster.org/mailman/listinfo/gluster-users
