Hi João,

I'd recommend going with the disable/enable of the quota, as that would eventually do the same thing. That is a better option than manually changing the parameters of the said command.
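Since disabling quota wipes the configured limits, it may help to snapshot them first so they can be re-applied after re-enabling. Below is a minimal sketch, not a supported procedure: the volume name "tank", the column layout of `gluster volume quota <vol> list`, and the re-apply via `limit-usage` are assumptions to verify against your gluster version.

```python
# Sketch: preserve quota limits across a disable/enable cycle.
# Assumptions: volume name "tank" and the `quota list` column order
# (Path, Hard-limit, Soft-limit, ...) -- verify against your own output.

def backup_quota_limits(quota_list_output, volume="tank"):
    """Parse `gluster volume quota <vol> list` output and emit the
    `limit-usage` commands needed to re-apply each limit after quota
    has been disabled and re-enabled."""
    commands = []
    for line in quota_list_output.splitlines():
        fields = line.split()
        # Data rows start with the directory path; skip headers/rules.
        if not fields or not fields[0].startswith("/"):
            continue
        path, hard_limit, soft_limit = fields[0], fields[1], fields[2]
        soft_pct = soft_limit.split("(")[0]  # "80%(4.0TB)" -> "80%"
        commands.append(
            f"gluster volume quota {volume} limit-usage {path} {hard_limit} {soft_pct}"
        )
    return commands


if __name__ == "__main__":
    sample = """\
                  Path                   Hard-limit  Soft-limit      Used  Available
------------------------------------------------------------------------------------
/projectA                                  5.0TB     80%(4.0TB)    3.1TB      1.9TB
/projectC                                 70.0TB    80%(56.0TB)   50.0TB     20.0TB
"""
    for cmd in backup_quota_limits(sample):
        print(cmd)
```

Running the emitted commands after `quota enable` should restore the limits; do test on a throwaway volume first.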
--
Thanks and Regards,
SRIJAN SIVAKUMAR
Associate Software Engineer
Red Hat <https://www.redhat.com>
T: +91-9727532362
TRIED. TESTED. TRUSTED. <https://redhat.com/trusted>

On Wed, Aug 19, 2020 at 8:12 PM João Baúto <[email protected]> wrote:

> Hi Srijan,
>
> Before I do the disable/enable, I just want to check something with you. On the
> other cluster, where the crawl is running, I can see the find command and
> this one, which seems to be the one triggering the crawler (4 processes, one
> per brick on all nodes):
>
> /usr/sbin/glusterfs -s localhost --volfile-id
> client_per_brick/tank.client.hostname.tank-volume1-brick.vol
> --use-readdirp=yes --client-pid -100 -l
> /var/log/glusterfs/quota_crawl/tank-volume1-brick.log
> /var/run/gluster/tmp/mntYbIVwT
>
> Can I manually trigger this command?
>
> Thanks!
> *João Baúto*
> ---------------
> *Scientific Computing and Software Platform*
> Champalimaud Research
> Champalimaud Center for the Unknown
> Av. Brasília, Doca de Pedrouços
> 1400-038 Lisbon, Portugal
> fchampalimaud.org <https://www.fchampalimaud.org/>
>
> Srijan Sivakumar <[email protected]> wrote on Wednesday, 19/08/2020 at 07:25:
>
>> Hi João,
>>
>> If the crawl is not going on and the values are still not reflecting
>> properly, then it means the crawl process has ended abruptly.
>>
>> Yes, technically disabling and enabling the quota will trigger a crawl, but
>> it'd do a complete crawl of the filesystem, hence it would take time and be
>> resource-consuming. Usually disabling/enabling is the last thing to do if
>> the accounting isn't reflecting properly, but if you're going to merge these
>> two clusters then you can probably go ahead with the merge and then
>> enable quota.
>>
>> Regards,
>> Srijan Sivakumar
>>
>> On Wed, Aug 19, 2020 at 3:53 AM João Baúto <[email protected]> wrote:
>>
>>> Hi Srijan,
>>>
>>> I didn't get any result with that command, so I went to our other cluster
>>> (we are merging two clusters; the data is replicated) and activated the quota
>>> feature on the same directory. Running the same command on each node, I get
>>> output similar to yours -- one process per brick, I'm assuming.
>>>
>>> root 1746822 1.4 0.0 230324 2992 ? S 23:06 0:04 /usr/bin/find . -exec /usr/bin/stat {} \;
>>> root 1746858 5.3 0.0 233924 6644 ? S 23:06 0:15 /usr/bin/find . -exec /usr/bin/stat {} \;
>>> root 1746889 3.3 0.0 233592 6452 ? S 23:06 0:10 /usr/bin/find . -exec /usr/bin/stat {} \;
>>> root 1746930 3.1 0.0 230476 3232 ? S 23:06 0:09 /usr/bin/find . -exec /usr/bin/stat {} \;
>>>
>>> At this point, is it easier to just disable and enable the feature and
>>> force a new crawl? We don't mind a temporary increase in CPU and I/O usage.
>>>
>>> Thank you again!
>>> *João Baúto*
>>>
>>> Srijan Sivakumar <[email protected]> wrote on Tuesday, 18/08/2020 at 21:42:
>>>
>>>> Hi João,
>>>>
>>>> There isn't a straightforward way of tracking the crawl, but as gluster
>>>> uses find and stat during the crawl, one can run the following command:
>>>>
>>>> # ps aux | grep find
>>>>
>>>> If the output is of the form
>>>> "root 1513 0.0 0.1 127224 2636 ? S 12:24 0:00 /usr/bin/find . -exec /usr/bin/stat {} \;"
>>>> then it means that the crawl is still going on.
>>>>
>>>> Regards,
>>>> Srijan Sivakumar
>>>>
>>>> On Wed, Aug 19, 2020 at 1:46 AM João Baúto <[email protected]> wrote:
>>>>
>>>>> Hi Srijan,
>>>>>
>>>>> Is there a way of getting the status of the crawl process?
>>>>> We are going to expand this cluster, adding 12 new bricks (around
>>>>> 500TB), and we rely heavily on the quota feature to control the space
>>>>> usage of each project. The crawl has been running since Saturday (nothing
>>>>> changed) and I'm unsure whether it's going to finish tomorrow or in weeks.
>>>>>
>>>>> Thank you!
>>>>> *João Baúto*
>>>>>
>>>>> Srijan Sivakumar <[email protected]> wrote on Sunday, 16/08/2020 at 06:11:
>>>>>
>>>>>> Hi João,
>>>>>>
>>>>>> Yes, it'll take some time given the file system size, as it has to
>>>>>> change the xattrs at each level and then crawl upwards.
>>>>>>
>>>>>> stat is done by the script itself, so the crawl is initiated.
>>>>>>
>>>>>> Regards,
>>>>>> Srijan Sivakumar
>>>>>>
>>>>>> On Sun, 16 Aug 2020, 04:58 João Baúto <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Srijan & Strahil,
>>>>>>>
>>>>>>> I ran the quota_fsck script mentioned in Hari's blog post on all
>>>>>>> bricks and it detected a lot of size mismatches.
>>>>>>>
>>>>>>> The script was executed as:
>>>>>>>
>>>>>>> python quota_fsck.py --sub-dir projectB --fix-issues /mnt/tank /tank/volume2/brick
>>>>>>>
>>>>>>> (on all nodes and bricks). Here is a snippet of its output:
>>>>>>>
>>>>>>> Size Mismatch /tank/volume2/brick/projectB {'parents':
>>>>>>> {'00000000-0000-0000-0000-000000000001': {'contri_file_count':
>>>>>>> 18446744073035296610L, 'contri_size': 18446645297413872640L,
>>>>>>> 'contri_dir_count': 18446744073709527653L}}, 'version': '1',
>>>>>>> 'file_count': 18446744073035296610L, 'dirty': False,
>>>>>>> 'dir_count': 18446744073709527653L,
>>>>>>> 'size': 18446645297413872640L} 15204281691754
>>>>>>> MARKING DIRTY: /tank/volume2/brick/projectB
>>>>>>> stat on /mnt/tank/projectB
>>>>>>> Files verified : 683223
>>>>>>> Directories verified : 46823
>>>>>>> Objects Fixed : 705230
>>>>>>>
>>>>>>> Checking the xattrs on the bricks, I can see the directory in question
>>>>>>> marked as dirty:
>>>>>>>
>>>>>>> # getfattr -d -m. -e hex /tank/volume2/brick/projectB
>>>>>>> getfattr: Removing leading '/' from absolute path names
>>>>>>> # file: tank/volume2/brick/projectB
>>>>>>> trusted.gfid=0x3ca2bce0455945efa6662813ce20fc0c
>>>>>>> trusted.glusterfs.9582685f-07fa-41fd-b9fc-ebab3a6989cf.xtime=0x5f372478000a7705
>>>>>>> trusted.glusterfs.dht=0xe1a4060c000000003ffffffe5ffffffc
>>>>>>> trusted.glusterfs.mdata=0x010000000000000000000000005f3724750000000013ddf679000000005ce2aff90000000007fdacb0000000005ce2aff90000000007fdacb0
>>>>>>> trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri.1=0x00000ca6ccf7a80000000000000790a1000000000000b6ea
>>>>>>> trusted.glusterfs.quota.dirty=0x3100
>>>>>>> trusted.glusterfs.quota.limit-set.1=0x0000640000000000ffffffffffffffff
>>>>>>> trusted.glusterfs.quota.size.1=0x00000ca6ccf7a80000000000000790a1000000000000b6ea
>>>>>>>
>>>>>>> Now, my question is: how do I trigger Gluster to recalculate the
>>>>>>> quota for this directory? Is it automatic but takes a while? Because the
>>>>>>> quota list did change, but not to a good "result".
>>>>>>>
>>>>>>> Path       Hard-limit  Soft-limit   Used      Available  Soft-limit exceeded?  Hard-limit exceeded?
>>>>>>> /projectB  100.0TB     80%(80.0TB)  16383.9PB 190.1TB    No                    No
>>>>>>>
>>>>>>> I would like to avoid a disable/enable of quota on the volume, as it
>>>>>>> removes the configs.
>>>>>>>
>>>>>>> Thank you for all the help!
>>>>>>> *João Baúto*
>>>>>>>
>>>>>>> Srijan Sivakumar <[email protected]> wrote on Saturday, 15/08/2020 at 11:57:
>>>>>>>
>>>>>>>> Hi João,
>>>>>>>>
>>>>>>>> The quota accounting error is what we're looking at here. I think
>>>>>>>> you've already looked into the blog post by Hari and are using the
>>>>>>>> script to fix the accounting issue. That should help you fix it.
>>>>>>>>
>>>>>>>> Let me know if you face any issues while using it.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Srijan Sivakumar
>>>>>>>>
>>>>>>>> On Fri, 14 Aug 2020, 17:10 João Baúto <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi Strahil,
>>>>>>>>>
>>>>>>>>> I have tried removing the quota for that specific directory and
>>>>>>>>> setting it again, but it didn't work (maybe it has to be a quota
>>>>>>>>> disable and enable in the volume options). I'm currently testing a
>>>>>>>>> solution by Hari with the quota_fsck.py script
>>>>>>>>> (https://medium.com/@harigowtham/glusterfs-quota-fix-accounting-840df33fcd3a)
>>>>>>>>> and it's detecting a lot of size mismatches in files.
>>>>>>>>>
>>>>>>>>> Thank you,
>>>>>>>>> *João Baúto*
>>>>>>>>>
>>>>>>>>> Strahil Nikolov <[email protected]> wrote on Friday, 14/08/2020 at 10:16:
>>>>>>>>>
>>>>>>>>>> Hi João,
>>>>>>>>>>
>>>>>>>>>> Based on your output, it seems that the quota size is different on
>>>>>>>>>> the 2 bricks.
>>>>>>>>>>
>>>>>>>>>> Have you tried removing the quota and then recreating it? Maybe
>>>>>>>>>> that would be the easiest way to fix it.
>>>>>>>>>>
>>>>>>>>>> Best Regards,
>>>>>>>>>> Strahil Nikolov
>>>>>>>>>>
>>>>>>>>>> On 14 August 2020 at 4:35:14 GMT+03:00, João Baúto <[email protected]> wrote:
>>>>>>>>>> >Hi all,
>>>>>>>>>> >
>>>>>>>>>> >We have a 4-node distributed cluster with 2 bricks per node running
>>>>>>>>>> >Gluster 7.7 + ZFS. We use directory quota to limit the space used by
>>>>>>>>>> >our members on each project. Two days ago we noticed inconsistent
>>>>>>>>>> >used space reported by Gluster in the quota list.
>>>>>>>>>> >
>>>>>>>>>> >A small snippet of gluster volume quota vol list:
>>>>>>>>>> >
>>>>>>>>>> > Path        Hard-limit  Soft-limit   Used      Available  Soft-limit exceeded?  Hard-limit exceeded?
>>>>>>>>>> > /projectA   5.0TB       80%(4.0TB)   3.1TB     1.9TB      No                    No
>>>>>>>>>> >*/projectB   100.0TB     80%(80.0TB)  16383.4PB 740.9TB    No                    No*
>>>>>>>>>> > /projectC   70.0TB      80%(56.0TB)  50.0TB    20.0TB     No                    No
>>>>>>>>>> >
>>>>>>>>>> >The total space available in the cluster is 360TB; the quota for
>>>>>>>>>> >projectB is 100TB and, as you can see, it's reporting 16383.4PB used
>>>>>>>>>> >and 740TB available (already decreased from 750TB).
>>>>>>>>>> >
>>>>>>>>>> >There was an issue in Gluster 3.x related to wrong directory quota (
>>>>>>>>>> >https://lists.gluster.org/pipermail/gluster-users/2016-February/025305.html
>>>>>>>>>> >and
>>>>>>>>>> >https://lists.gluster.org/pipermail/gluster-users/2018-November/035374.html)
>>>>>>>>>> >but it's marked as solved (not sure if the solution still applies).
>>>>>>>>>> >
>>>>>>>>>> >*On projectB*
>>>>>>>>>> ># getfattr -d -m . -e hex projectB
>>>>>>>>>> ># file: projectB
>>>>>>>>>> >trusted.gfid=0x3ca2bce0455945efa6662813ce20fc0c
>>>>>>>>>> >trusted.glusterfs.9582685f-07fa-41fd-b9fc-ebab3a6989cf.xtime=0x5f35e69800098ed9
>>>>>>>>>> >trusted.glusterfs.dht=0xe1a4060c000000003ffffffe5ffffffc
>>>>>>>>>> >trusted.glusterfs.mdata=0x010000000000000000000000005f355c59000000000939079f000000005ce2aff90000000007fdacb0000000005ce2aff90000000007fdacb0
>>>>>>>>>> >trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri.1=0x0000ab0f227a860000000000478e33acffffffffffffc112
>>>>>>>>>> >trusted.glusterfs.quota.dirty=0x3000
>>>>>>>>>> >trusted.glusterfs.quota.limit-set.1=0x0000640000000000ffffffffffffffff
>>>>>>>>>> >trusted.glusterfs.quota.size.1=0x0000ab0f227a860000000000478e33acffffffffffffc112
>>>>>>>>>> >
>>>>>>>>>> >*On projectA*
>>>>>>>>>> ># getfattr -d -m . -e hex projectA
>>>>>>>>>> ># file: projectA
>>>>>>>>>> >trusted.gfid=0x05b09ded19354c0eb544d22d4659582e
>>>>>>>>>> >trusted.glusterfs.9582685f-07fa-41fd-b9fc-ebab3a6989cf.xtime=0x5f1aeb9f00044c64
>>>>>>>>>> >trusted.glusterfs.dht=0xe1a4060c000000001fffffff3ffffffd
>>>>>>>>>> >trusted.glusterfs.mdata=0x010000000000000000000000005f1ac6a10000000018f30a4e000000005c338fab0000000017a3135a000000005b0694fb000000001584a21b
>>>>>>>>>> >trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri.1=0x0000067de3bbe20000000000000128610000000000033498
>>>>>>>>>> >trusted.glusterfs.quota.dirty=0x3000
>>>>>>>>>> >trusted.glusterfs.quota.limit-set.1=0x0000460000000000ffffffffffffffff
>>>>>>>>>> >trusted.glusterfs.quota.size.1=0x0000067de3bbe20000000000000128610000000000033498
>>>>>>>>>> >
>>>>>>>>>> >Any idea what's happening and how to fix it?
>>>>>>>>>> >
>>>>>>>>>> >Thanks!
>>>>>>>>>> >*João Baúto*
>>>>>>>>>> >---------------
>>>>>>>>>> >*Scientific Computing and Software Platform*
>>>>>>>>>> >Champalimaud Research
>>>>>>>>>> >Champalimaud Center for the Unknown
>>>>>>>>>> >Av. Brasília, Doca de Pedrouços
>>>>>>>>>> >1400-038 Lisbon, Portugal
>>>>>>>>>> >fchampalimaud.org <https://www.fchampalimaud.org/>
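A footnote on the "16383.4PB used" figure quoted above: 16384PB is 2^64 bytes, so the value is an underflowed unsigned counter. The quota size xattrs in the thread can be inspected directly; below is a minimal decoding sketch, assuming the newer three-field layout (size in bytes, file count, dir count, each a big-endian 64-bit integer) that quota_fsck.py also appears to read. Treat the field order as an assumption for your gluster version.

```python
# Sketch: decode a trusted.glusterfs.quota.size.1 xattr value.
# Assumed layout: three big-endian 64-bit fields -- size, file_count,
# dir_count. Reading them as *signed* integers makes underflow visible.
import struct

def decode_quota_size(xattr_hex):
    """Decode the hex xattr value (with or without a 0x prefix) into
    signed (size_bytes, file_count, dir_count)."""
    raw = bytes.fromhex(xattr_hex.removeprefix("0x"))
    return struct.unpack(">3q", raw)

if __name__ == "__main__":
    # The projectB brick value quoted in the thread:
    size, files, dirs = decode_quota_size(
        "0x0000ab0f227a860000000000478e33acffffffffffffc112"
    )
    # The directory-count contribution has underflowed below zero.
    print(size, files, dirs)
```

Negative fields like these are exactly the accounting corruption the quota_fsck.py run flags as "Size Mismatch".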
________

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
[email protected]
https://lists.gluster.org/mailman/listinfo/gluster-users
