Yeah... so the right procedure would be to set up a new volume without sharding and copy everything over.
On Thu, 28 Nov 2019, 06:45 Strahil, <[email protected]> wrote:

> I have already tried disabling sharding on a test oVirt volume... The
> results were devastating for the app, so please do not disable sharding.
>
> Best Regards,
> Strahil Nikolov
>
> On Nov 27, 2019 20:55, Olaf Buitelaar <[email protected]> wrote:
>
> Hi Tim,
>
> That issue also seems to point to a stale file. The best first step, I
> suppose, is to determine whether you indeed have the same shard on
> different sub-volumes, where on one of the sub-volumes the file size is
> 0 KB and the sticky bit is set. If so, we suffer from the same issue, and
> you can clean those files up so the `rm` command starts working again.
> Essentially you should consider the volume unhealthy until you have
> resolved the stale files, before you continue file operations.
> Remounting the client shouldn't make a difference, since the issue is at
> the brick/sub-volume level.
>
> The last comment I received from Krutika:
> "I haven't had the chance to look into the attachments yet. I got another
> customer case on me.
> But from the description, it seems like the linkto file (the one with a
> 'T') and the original file don't have the same gfid?
> It's not wrong for those 'T' files to exist. But they're supposed to have
> the same gfid.
> This is something that needs the DHT team's attention.
> Do you mind raising a bug in bugzilla.redhat.com against glusterfs and
> component 'distribute' or 'DHT'?"
>
> For me, replicating it was easiest by running xfs_fsr (which is very
> write-intensive on fragmented volumes) from within a VM, but it could
> happen with a simple yum install, a docker run (with a new image), a
> general test with dd or mkfs.xfs, or just at random, which was the normal
> case. But I have to say my workload is mostly write-intensive, like yours.
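The symptom described above (a 0 KB copy of the shard with only the sticky bit set on one sub-volume) can be spotted directly on a brick's backing directory. A minimal sketch, assuming you substitute your own brick path; `find_stale_candidates` is an illustrative helper, not part of gluster:

```shell
# List zero-byte files whose mode is exactly 01000 (no rwx bits, only the
# sticky bit) -- shown by `ls -l` as ---------T, the signature of a DHT
# linkto file. A healthy linkto shares its gfid with the data file; per
# Krutika's comment above, the stale ones apparently do not.
find_stale_candidates() {
  local brick="$1"
  # -perm 1000 (no +/- prefix) matches the mode exactly; -size 0c means
  # exactly zero bytes.
  find "$brick" -type f -size 0c -perm 1000
}

# Example (run on the brick server, e.g. against the .shard directory):
# find_stale_candidates /bricks/brick1/ovirt-data/.shard
```

To confirm a suspect, compare gfids on both sub-volumes with `getfattr -n trusted.gfid -e hex <file>` (as root on the bricks) before removing anything.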
> Sharding in general is a nice feature: it allows your files to be broken
> up into pieces, which is also its biggest danger. If anything goes
> haywire, it's currently practically impossible to stitch all those pieces
> together again, since no tool for this seems to exist; that's the nice
> thing about non-sharded volumes, they are just files. If you really
> wanted to, I suppose it could be done, but it would be very painful.
> With the files being in shards, it allows for a much more equal
> distribution. Heals also seem to resolve much quicker.
> I'm also running non-sharded volumes with files of 100 GB+, and those
> heals can take significantly longer. I also sometimes have issues with
> those non-sharded volumes, though I don't remember any stale files.
> But if you don't need it, you might be better off disabling it. However,
> I believe you're never allowed to turn off sharding on a sharded volume,
> since that will corrupt your data.
>
> Best, Olaf
>
> On Wed, 27 Nov 2019 at 19:19, Timothy Orme <[email protected]> wrote:
>
> Hi Olaf,
>
> Thanks so much for sharing this; it's hugely helpful, if only to make me
> feel less like I'm going crazy. I'll see if there's anything I can add to
> the bug report. I'm trying to develop a test to reproduce the issue now.
>
> We're running this in a sort of interactive HPC environment, so these
> errors are a bit hard for us to handle systematically, and they have a
> tendency to be quite disruptive to folks' work.
>
> I've run into other issues with sharding as well, such as this:
> https://lists.gluster.org/pipermail/gluster-users/2019-October/037241.html
>
> I'm wondering, then, if maybe sharding isn't quite stable yet and it's
> more sensible for me to just disable this feature for now?
> I'm not quite sure what other implications that might have, but so far
> all the issues I've run into as a new gluster user seem like they're
> related to shards.
>
> Thanks,
> Tim
> ------------------------------
> *From:* Olaf Buitelaar <[email protected]>
> *Sent:* Wednesday, November 27, 2019 9:50 AM
> *To:* Timothy Orme <[email protected]>
> *Cc:* gluster-users <[email protected]>
> *Subject:* [EXTERNAL] Re: [Gluster-users] Stale File Handle Errors During
> Heavy Writes
>
> Hi Tim,
>
> I've been suffering from this for a long time as well. I'm not sure if
> it's exactly the same situation, since your setup is different, but it
> seems similar. I've filed this bug report:
> https://bugzilla.redhat.com/show_bug.cgi?id=1732961
> which you might be able to enrich.
> To resolve the stale files I've made this bash script:
> https://gist.github.com/olafbuitelaar/ff6fe9d4ab39696d9ad6ca689cc89986
> (it's slightly outdated), which you could use as inspiration. It
> basically removes the stale files as suggested here:
> https://lists.gluster.org/pipermail/gluster-users/2018-March/033785.html
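The cleanup described in that 2018-March thread removes both the visible stale file and its hardlink under the brick's hidden `.glusterfs` directory, whose path is derived from the file's gfid (the first two pairs of hex digits become subdirectories). A sketch of that path derivation; `gfid_path` is an illustrative helper, not the script from the gist:

```shell
# Given a brick path and a gfid as 32 hex characters (as returned by
# `getfattr -n trusted.gfid -e hex --only-values <file>`, minus the 0x
# prefix), build the .glusterfs hardlink path: .glusterfs/<aa>/<bb>/<uuid>.
gfid_path() {
  local brick="$1" hex="$2"
  # Re-insert the UUID dashes: 8-4-4-4-12 groups.
  local uuid
  uuid=$(printf '%s\n' "$hex" | sed -E 's/^(.{8})(.{4})(.{4})(.{4})(.{12})$/\1-\2-\3-\4-\5/')
  printf '%s/.glusterfs/%s/%s/%s\n' "$brick" \
    "$(printf '%s' "$hex" | cut -c1-2)" \
    "$(printf '%s' "$hex" | cut -c3-4)" \
    "$uuid"
}

# Removing a stale shard then means deleting BOTH paths on that brick:
#   rm -f "$(gfid_path /bricks/brick1/ovirt-data <gfid-hex>)" \
#         /bricks/brick1/ovirt-data/.shard/<shard-file>
```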
> Please be aware the script won't work if you have 2 (or more) bricks of
> the same volume on the same server (since it always takes the first path
> found).
> I invoke the script via ansible like this (since the script needs to run
> on all bricks):
>
> - hosts: host1,host2,host3
>   tasks:
>     - shell: 'bash /root/clean-stale-gluster-fh.sh --host="{{ intif.ip | first }}" --volume=ovirt-data --backup="/backup/stale/gfs/ovirt-data" --shard="{{ item }}" --force'
>       with_items:
>         - 1b0ba5c2-dd2b-45d0-9c4b-a39b2123cc13.14451
>
> Fortunately for me the issue seems to have disappeared: it's now about a
> month since I last received one, while before it was about every other
> day. The biggest thing that seemed to resolve it was more disk space.
> Before, there was also plenty: the gluster volume was at about 85% full,
> and the individual disks had about 20-30% free of an 8 TB disk array
> (though there were servers in the mix with smaller disk arrays but
> similar available space in percent). I'm now at a much lower percentage.
> So my latest running theory is that it has something to do with how
> gluster allocates the shards. Since placement is based on the shard's
> hash, it might want to place it in a certain sub-volume, but then
> concludes it doesn't have enough space there and writes a marker to
> redirect it to another sub-volume (I think this is the stale file).
> However, rebalances don't fix this issue. Also, this still doesn't seem
> to explain why most stale files always end up in the first sub-volume.
> Unfortunately I've no proof this is actually the root cause, besides that
> the symptom "disappeared" once gluster had more space to work with.
>
> Best, Olaf
>
> On Wed, 27 Nov 2019 at 02:38, Timothy Orme <[email protected]> wrote:
>
> Hi All,
>
> I'm running a 3x2 cluster, v6.5. Not sure if it's relevant, but I also
> have sharding enabled.
>
> I've found that when under heavy write load, clients start erroring out
> with "stale file handle" errors on files not related to the writes.
> For instance, when a user is running a simple wc against a file, it will
> bail during that operation with "stale file handle".
>
> When I check the client logs, I see errors like:
>
> [2019-11-26 22:41:33.565776] E [MSGID: 109040]
> [dht-helper.c:1336:dht_migration_complete_check_task] 3-scratch-dht:
> 24d53a0e-c28d-41e0-9dbc-a75e823a3c7d: failed to lookup the file on
> scratch-dht [Stale file handle]
> [2019-11-26 22:41:33.565853] W [fuse-bridge.c:2827:fuse_readv_cbk]
> 0-glusterfs-fuse: 33112038: READ => -1
> gfid=147040e2-a6b8-4f54-8490-f0f3df29ee50 fd=0x7f95d8d0b3f8 (Stale file
> handle)
>
> I've seen some bugs and other threads referencing similar issues, but
> couldn't really discern a solution from them.
>
> Is this caused by some consistency issue with metadata while under load,
> or something else? I don't see the issue when heavy reads are occurring.
>
> Any help is greatly appreciated!
>
> Thanks!
> Tim
> ________
>
> Community Meeting Calendar:
>
> APAC Schedule -
> Every 2nd and 4th Tuesday at 11:30 AM IST
> Bridge: https://bluejeans.com/441850968
>
> NA/EMEA Schedule -
> Every 1st and 3rd Tuesday at 01:00 PM EDT
> Bridge: https://bluejeans.com/441850968
>
> Gluster-users mailing list
> [email protected]
> https://lists.gluster.org/mailman/listinfo/gluster-users
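Olaf's running theory earlier in the thread, that DHT redirects a shard when its hashed sub-volume is short on space and leaves a linkto marker behind, can be checked against DHT's `cluster.min-free-disk` volume option (10% by default, if I have the upstream default right). The two gluster commands below are the standard CLI; `below_min_free` is just an illustrative helper for the arithmetic DHT is effectively doing per brick:

```shell
# Inspect the threshold and per-brick free space:
#   gluster volume get ovirt-data cluster.min-free-disk
#   gluster volume status ovirt-data detail
#
# The decision is essentially this comparison for each brick:
below_min_free() {
  # args: free_kb total_kb min_free_percent
  # exits 0 (true) if the brick is below the threshold, i.e. DHT would
  # place new files on another sub-volume instead.
  local free=$1 total=$2 min=$3
  [ $(( free * 100 / total )) -lt "$min" ]
}
```

If several bricks sit just under that line while the volume as a whole still shows free space, that would be consistent with the markers appearing under heavy writes and vanishing once capacity was added.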
