10.2.2

-Mykola

On 7 October 2016 at 15:43, Yan, Zheng <uker...@gmail.com> wrote:

> On Thu, Oct 6, 2016 at 4:11 PM,  <mykola.dvor...@gmail.com> wrote:
> > Is there any way to repair pgs/cephfs gracefully?
> >
>
> So far no.  We need to write a tool to repair this type of corruption.
>
> Which version of ceph did you use before upgrading to 10.2.3 ?
>
> Regards
> Yan, Zheng
>
> >
> > -Mykola
> >
> > From: Yan, Zheng
> > Sent: Thursday, 6 October 2016 04:48
> > To: Mykola Dvornik
> > Cc: John Spray; ceph-users
> > Subject: Re: [ceph-users] CephFS: No space left on device
> >
> > On Wed, Oct 5, 2016 at 2:27 PM, Mykola Dvornik <mykola.dvor...@gmail.com> wrote:
> >
> >> Hi Zheng,
> >>
> >> Many thanks for your reply.
> >>
> >> This indicates the MDS metadata is corrupted. Did you do any unusual
> >> operation on the cephfs? (e.g. reset journal, create new fs using
> >> existing metadata pool)
> >>
> >> No, nothing has been explicitly done to the MDS. I had a few inconsistent
> >> PGs that belonged to the (3-replica) metadata pool. The symptoms were
> >> similar to http://tracker.ceph.com/issues/17177 . The PGs were eventually
> >> repaired and no data corruption was expected, as explained in the ticket.
> >
> > I'm afraid that issue does cause corruption.
> >
> >> BTW, when I posted this issue on the ML the amount of ground-state stray
> >> objects was around 7.5K. Now it has gone up to 23K. No inconsistent PGs or any
> >> other problems happened to the cluster within this time scale.
> >>
> >> -Mykola
> >>
> >> On 5 October 2016 at 05:49, Yan, Zheng <uker...@gmail.com> wrote:
> >>>
> >>> On Mon, Oct 3, 2016 at 5:48 AM, Mykola Dvornik <mykola.dvor...@gmail.com> wrote:
> >>> > Hi John,
> >>> >
> >>> > Many thanks for your reply. I will try to play with the mds tunables and
> >>> > report back to you ASAP.
> >>> >
> >>> > So far I see that the mds log contains a lot of errors of the following
> >>> > kind:
> >>> >
> >>> > 2016-10-02 11:58:03.002769 7f8372d54700  0 mds.0.cache.dir(100056ddecd)
> >>> > _fetched  badness: got (but i already had) [inode 10005729a77 [2,head]
> >>> > ~mds0/stray1/10005729a77 auth v67464942 s=196728 nl=0 n(v0 b196728 1=1+0)
> >>> > (iversion lock) 0x7f84acae82a0] mode 33204 mtime 2016-08-07 23:06:29.776298
> >>> >
> >>> > 2016-10-02 11:58:03.002789 7f8372d54700 -1 log_channel(cluster) log [ERR] :
> >>> > loaded dup inode 10005729a77 [2,head] v68621 at
> >>> > /users/mykola/mms/NCSHNO/final/120nm-uniform-h8200/j002654.out/m_xrange192-320_yrange192-320_016232.dump,
> >>> > but inode 10005729a77.head v67464942 already exists at
> >>> > ~mds0/stray1/10005729a77
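> >>> >
> >>> > A rough way to gauge how many of these there are is to count them in the
> >>> > MDS log (assuming the default log location; the exact file name depends
> >>> > on the mds name):
> >>> >
> >>> >     grep -c 'loaded dup inode' /var/log/ceph/ceph-mds.*.log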
> >>>
> >>> This indicates the MDS metadata is corrupted. Did you do any unusual
> >>> operation on the cephfs? (e.g. reset journal, create new fs using
> >>> existing metadata pool)
> >>>
> >>> > Those folders within mds.0.cache.dir that got badness report a size of
> >>> > 16EB on the clients. rm on them fails with 'Directory not empty'.
> >>> >
> >>> > As for the "Client failing to respond to cache pressure", I have 2 kernel
> >>> > clients on 4.4.21, 1 on 4.7.5 and 16 fuse clients always running the most
> >>> > recent release version of ceph-fuse. The funny thing is that every single
> >>> > client misbehaves from time to time. I am aware of quite a discussion about
> >>> > this issue on the ML, but cannot really follow how to debug it.
> >>> >
> >>> > Regards,
> >>> >
> >>> > -Mykola
> >>> >
> >>> > On 2 October 2016 at 22:27, John Spray <jsp...@redhat.com> wrote:
> >>> >>
> >>> >> On Sun, Oct 2, 2016 at 11:09 AM, Mykola Dvornik
> >>> >> <mykola.dvor...@gmail.com> wrote:
> >>> >> > After upgrading to 10.2.3 we frequently see messages like
> >>> >>
> >>> >> From which version did you upgrade?
> >>> >>
> >>> >> > 'rm: cannot remove '...': No space left on device'
> >>> >> >
> >>> >> > The folders we are trying to delete contain approx. 50K files of 193 KB
> >>> >> > each.
> >>> >>
> >>> >> My guess would be that you are hitting the new
> >>> >> mds_bal_fragment_size_max check.  This limits the number of entries
> >>> >> that the MDS will create in a single directory fragment, to avoid
> >>> >> overwhelming the OSD with oversized objects.  It is 100000 by default.
> >>> >> This limit also applies to "stray" directories where unlinked files
> >>> >> are put while they wait to be purged, so you could get into this state
> >>> >> while doing lots of deletions.  There are ten stray directories that
> >>> >> get a roughly even share of files, so if you have more than about one
> >>> >> million files waiting to be purged, you could see this condition.
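> >>> >>
> >>> >> In other words, with the default of 100000 entries per fragment and ten
> >>> >> stray directories, the ceiling is roughly 10 x 100000 = 1 million
> >>> >> unpurged files. As a sketch (the mds id and the new value below are just
> >>> >> placeholders), checking and raising the limit at runtime would look
> >>> >> something like:
> >>> >>
> >>> >>     # read the current value via the admin socket on the MDS host
> >>> >>     ceph daemon mds.0 config get mds_bal_fragment_size_max
> >>> >>     # raise it without restarting the daemon
> >>> >>     ceph tell mds.0 injectargs '--mds_bal_fragment_size_max 200000'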
> >>> >>
> >>> >> The "Client failing to respond to cache pressure" messages may play a
> >>> >> part here -- if you have misbehaving clients then they may cause the
> >>> >> MDS to delay purging stray files, leading to a backlog.  If your
> >>> >> clients are by any chance older kernel clients, you should upgrade
> >>> >> them.  You can also unmount/remount them to clear this state, although
> >>> >> it will reoccur until the clients are updated (or until the bug is
> >>> >> fixed, if you're running latest clients already).
> >>> >>
> >>> >> The high level counters for strays are part of the default output of
> >>> >> "ceph daemonperf mds.<id>" when run on the MDS server (the "stry" and
> >>> >> "purg" columns).  You can look at these to watch how fast the MDS is
> >>> >> clearing out strays.  If your backlog is just because it's not doing
> >>> >> it fast enough, then you can look at tuning mds_max_purge_files and
> >>> >> mds_max_purge_ops to adjust the throttles on purging.  Those settings
> >>> >> can be adjusted without restarting the MDS using the "injectargs" command
> >>> >> (http://docs.ceph.com/docs/master/rados/operations/control/#mds-subsystem).
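> >>> >>
> >>> >> For example (the mds id and the values here are placeholders, not
> >>> >> recommendations), something along these lines should work:
> >>> >>
> >>> >>     # watch the stry and purg columns on the MDS host
> >>> >>     ceph daemonperf mds.0
> >>> >>     # loosen the purge throttles at runtime
> >>> >>     ceph tell mds.0 injectargs '--mds_max_purge_files 256 --mds_max_purge_ops 32768'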
> >>> >>
> >>> >> Let us know how you get on.
> >>> >>
> >>> >> John
> >>> >>
> >>> >> > The cluster state and storage available are both OK:
> >>> >> >
> >>> >> >     cluster 98d72518-6619-4b5c-b148-9a781ef13bcb
> >>> >> >      health HEALTH_WARN
> >>> >> >             mds0: Client XXX.XXX.XXX.XXX failing to respond to cache pressure
> >>> >> >             mds0: Client XXX.XXX.XXX.XXX failing to respond to cache pressure
> >>> >> >             mds0: Client XXX.XXX.XXX.XXX failing to respond to cache pressure
> >>> >> >             mds0: Client XXX.XXX.XXX.XXX failing to respond to cache pressure
> >>> >> >             mds0: Client XXX.XXX.XXX.XXX failing to respond to cache pressure
> >>> >> >      monmap e1: 1 mons at {000-s-ragnarok=XXX.XXX.XXX.XXX:6789/0}
> >>> >> >             election epoch 11, quorum 0 000-s-ragnarok
> >>> >> >       fsmap e62643: 1/1/1 up {0=000-s-ragnarok=up:active}
> >>> >> >      osdmap e20203: 16 osds: 16 up, 16 in
> >>> >> >             flags sortbitwise
> >>> >> >       pgmap v15284654: 1088 pgs, 2 pools, 11263 GB data, 40801 kobjects
> >>> >> >             23048 GB used, 6745 GB / 29793 GB avail
> >>> >> >                 1085 active+clean
> >>> >> >                    2 active+clean+scrubbing
> >>> >> >                    1 active+clean+scrubbing+deep
> >>> >> >
> >>> >> > Has anybody experienced this issue so far?
> >>> >> >
> >>> >> > Regards,
> >>> >> > --
> >>> >> >  Mykola
> >>> >
> >>> > --
> >>> >  Mykola
> >>
> >> --
> >>  Mykola
>



-- 
 Mykola
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
