10.2.2

-Mykola
On 7 October 2016 at 15:43, Yan, Zheng <uker...@gmail.com> wrote:
> On Thu, Oct 6, 2016 at 4:11 PM, <mykola.dvor...@gmail.com> wrote:
> > Is there any way to repair pgs/cephfs gracefully?
> >
> So far no. We need to write a tool to repair this type of corruption.
>
> Which version of ceph did you use before upgrading to 10.2.3?
>
> Regards
> Yan, Zheng
>
> >
> > -Mykola
> >
> > From: Yan, Zheng
> > Sent: Thursday, 6 October 2016 04:48
> > To: Mykola Dvornik
> > Cc: John Spray; ceph-users
> > Subject: Re: [ceph-users] CephFS: No space left on device
> >
> > On Wed, Oct 5, 2016 at 2:27 PM, Mykola Dvornik <mykola.dvor...@gmail.com> wrote:
> >
> >> Hi Zheng,
> >>
> >> Many thanks for your reply.
> >>
> >> This indicates the MDS metadata is corrupted. Did you do any unusual
> >> operation on the cephfs? (e.g. reset journal, create new fs using
> >> existing metadata pool)
> >>
> >> No, nothing has been explicitly done to the MDS. I had a few inconsistent
> >> PGs that belonged to the (3 replica) metadata pool. The symptoms were
> >> similar to http://tracker.ceph.com/issues/17177 . The PGs were eventually
> >> repaired and no data corruption was expected as explained in the ticket.
> >
> > I'm afraid that issue does cause corruption.
> >
> >> BTW, when I posted this issue on the ML the amount of ground state stray
> >> objects was around 7.5K. Now it went up to 23K. No inconsistent PGs or any
> >> other problems happened to the cluster within this time scale.
> >>
> >> -Mykola
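For reference, a minimal sketch of how the stray backlog mentioned above can be inspected from the MDS admin socket. It assumes a single active MDS named 000-s-ragnarok (as in the status output quoted further down) and that the command is run on the MDS host; the exact counter names (num_strays, strays_created, strays_purged, ...) may differ between releases:

    # dump the MDS perf counters and pick out the stray/purge related ones
    ceph daemon mds.000-s-ragnarok perf dump | grep -i stray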
> >>
> >> On 5 October 2016 at 05:49, Yan, Zheng <uker...@gmail.com> wrote:
> >>>
> >>> On Mon, Oct 3, 2016 at 5:48 AM, Mykola Dvornik <mykola.dvor...@gmail.com> wrote:
> >>> > Hi John,
> >>> >
> >>> > Many thanks for your reply. I will try to play with the mds tunables
> >>> > and report back to you ASAP.
> >>> >
> >>> > So far I see that the mds log contains a lot of errors of the following
> >>> > kind:
> >>> >
> >>> > 2016-10-02 11:58:03.002769 7f8372d54700  0 mds.0.cache.dir(100056ddecd)
> >>> > _fetched badness: got (but i already had) [inode 10005729a77 [2,head]
> >>> > ~mds0/stray1/10005729a77 auth v67464942 s=196728 nl=0 n(v0 b196728 1=1+0)
> >>> > (iversion lock) 0x7f84acae82a0] mode 33204 mtime 2016-08-07 23:06:29.776298
> >>> >
> >>> > 2016-10-02 11:58:03.002789 7f8372d54700 -1 log_channel(cluster) log [ERR] :
> >>> > loaded dup inode 10005729a77 [2,head] v68621 at
> >>> > /users/mykola/mms/NCSHNO/final/120nm-uniform-h8200/j002654.out/m_xrange192-320_yrange192-320_016232.dump,
> >>> > but inode 10005729a77.head v67464942 already exists at
> >>> > ~mds0/stray1/10005729a77
> >>>
> >>> This indicates the MDS metadata is corrupted. Did you do any unusual
> >>> operation on the cephfs? (e.g. reset journal, create new fs using
> >>> existing metadata pool)
> >>>
> >>> > Those folders within mds.0.cache.dir that got badness report a size of
> >>> > 16EB on the clients. rm on them fails with 'Directory not empty'.
> >>> >
> >>> > As for the "Client failing to respond to cache pressure", I have 2 kernel
> >>> > clients on 4.4.21, 1 on 4.7.5 and 16 fuse clients always running the most
> >>> > recent release version of ceph-fuse. The funny thing is that every single
> >>> > client misbehaves from time to time. I am aware of quite some discussion
> >>> > about this issue on the ML, but cannot really follow how to debug it.
> >>> >
> >>> > Regards,
> >>> >
> >>> > -Mykola
> >>> >
> >>> > On 2 October 2016 at 22:27, John Spray <jsp...@redhat.com> wrote:
> >>> >>
> >>> >> On Sun, Oct 2, 2016 at 11:09 AM, Mykola Dvornik
> >>> >> <mykola.dvor...@gmail.com> wrote:
> >>> >> > After upgrading to 10.2.3 we frequently see messages like
> >>> >>
> >>> >> From which version did you upgrade?
> >>> >>
> >>> >> > 'rm: cannot remove '...': No space left on device
> >>> >> >
> >>> >> > The folders we are trying to delete contain approx. 50K files, 193 KB
> >>> >> > each.
> >>> >>
> >>> >> My guess would be that you are hitting the new
> >>> >> mds_bal_fragment_size_max check. This limits the number of entries
> >>> >> that the MDS will create in a single directory fragment, to avoid
> >>> >> overwhelming the OSD with oversized objects. It is 100000 by default.
> >>> >> This limit also applies to "stray" directories where unlinked files
> >>> >> are put while they wait to be purged, so you could get into this state
> >>> >> while doing lots of deletions. There are ten stray directories that
> >>> >> get a roughly even share of files, so if you have more than about one
> >>> >> million files waiting to be purged, you could see this condition.
> >>> >>
> >>> >> The "Client failing to respond to cache pressure" messages may play a
> >>> >> part here -- if you have misbehaving clients then they may cause the
> >>> >> MDS to delay purging stray files, leading to a backlog. If your
> >>> >> clients are by any chance older kernel clients, you should upgrade
> >>> >> them. You can also unmount/remount them to clear this state, although
> >>> >> it will reoccur until the clients are updated (or until the bug is
> >>> >> fixed, if you're running latest clients already).
> >>> >>
> >>> >> The high level counters for strays are part of the default output of
> >>> >> "ceph daemonperf mds.<id>" when run on the MDS server (the "stry" and
> >>> >> "purg" columns). You can look at these to watch how fast the MDS is
> >>> >> clearing out strays. If your backlog is just because it's not doing
> >>> >> it fast enough, then you can look at tuning mds_max_purge_files and
> >>> >> mds_max_purge_ops to adjust the throttles on purging. Those settings
> >>> >> can be adjusted without restarting the MDS using the "injectargs"
> >>> >> command
> >>> >> (http://docs.ceph.com/docs/master/rados/operations/control/#mds-subsystem).
> >>> >>
> >>> >> Let us know how you get on.
> >>> >>
> >>> >> John
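A minimal sketch of the checks and tunables John refers to, assuming Jewel-era option names, a single active MDS (rank 0, named 000-s-ragnarok in the status output below) and purely illustrative values; option names and defaults are worth verifying against your own release with "config show":

    # current per-fragment entry limit (100000 by default, as explained above)
    ceph daemon mds.000-s-ragnarok config get mds_bal_fragment_size_max

    # watch the stray ("stry") and purge ("purg") columns on the MDS host
    ceph daemonperf mds.000-s-ragnarok

    # raise the purge throttles without restarting the MDS (example values only)
    ceph tell mds.0 injectargs '--mds_max_purge_files 128 --mds_max_purge_ops 16384'

Note that injectargs changes are runtime-only; anything that should survive an MDS restart also needs to go into ceph.conf.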
> >>> >>
> >>> >> > The cluster state and storage available are both OK:
> >>> >> >
> >>> >> >     cluster 98d72518-6619-4b5c-b148-9a781ef13bcb
> >>> >> >      health HEALTH_WARN
> >>> >> >             mds0: Client XXX.XXX.XXX.XXX failing to respond to cache pressure
> >>> >> >             mds0: Client XXX.XXX.XXX.XXX failing to respond to cache pressure
> >>> >> >             mds0: Client XXX.XXX.XXX.XXX failing to respond to cache pressure
> >>> >> >             mds0: Client XXX.XXX.XXX.XXX failing to respond to cache pressure
> >>> >> >             mds0: Client XXX.XXX.XXX.XXX failing to respond to cache pressure
> >>> >> >      monmap e1: 1 mons at {000-s-ragnarok=XXX.XXX.XXX.XXX:6789/0}
> >>> >> >             election epoch 11, quorum 0 000-s-ragnarok
> >>> >> >       fsmap e62643: 1/1/1 up {0=000-s-ragnarok=up:active}
> >>> >> >      osdmap e20203: 16 osds: 16 up, 16 in
> >>> >> >             flags sortbitwise
> >>> >> >       pgmap v15284654: 1088 pgs, 2 pools, 11263 GB data, 40801 kobjects
> >>> >> >             23048 GB used, 6745 GB / 29793 GB avail
> >>> >> >                 1085 active+clean
> >>> >> >                    2 active+clean+scrubbing
> >>> >> >                    1 active+clean+scrubbing+deep
> >>> >> >
> >>> >> > Has anybody experienced this issue so far?
> >>> >> >
> >>> >> > Regards,
> >>> >> > --
> >>> >> > Mykola
> >>> >
> >>> > --
> >>> > Mykola
> >>
> >> --
> >> Mykola

--
Mykola
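Also for reference, a minimal sketch for identifying the clients behind the "failing to respond to cache pressure" warnings shown above, again assuming the MDS name from the status output; the session listing includes per-client metadata (kernel version for kernel clients, ceph-fuse version for fuse clients), though the exact fields may vary by release:

    # list client sessions with their metadata to spot the older kernel clients
    ceph daemon mds.000-s-ragnarok session ls

    # the health detail output repeats the warnings with the offending client addresses
    ceph health detail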
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com