purged_snaps persists indefinitely. If the list gets too large it gets abbreviated a bit, but it can make your osdmap a fair bit larger because the map keeps track of them.
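
If you want to see how big that bookkeeping has grown on your cluster, something along these lines should do it. This is an untested sketch: it assumes your CephFS data pool is named cephfs_data (as in the ceph df output below) and that it runs on a node with admin access to the ceph CLI. It prints the pool's line from the osdmap, which if I remember right includes the removed_snaps interval set, and then the snap_trimq/purged_snaps of every PG in the pool:

#!/bin/bash
# Untested sketch: adjust POOL to your CephFS data pool name.
POOL=cephfs_data

# Pool entry from the osdmap; the removed_snaps interval set should be on this line.
ceph osd dump | grep "'$POOL'"

# snap_trimq and purged_snaps for every PG in the pool.
for pg in $(ceph pg ls-by-pool "$POOL" | awk '$1 ~ /^[0-9]+\.[0-9a-f]+$/ {print $1}'); do
    echo "== $pg =="
    ceph pg "$pg" query | grep -E '"snap_trimq"|"purged_snaps"'
done

On a cluster that has finished trimming you should see snap_trimq as "[]" everywhere, while purged_snaps keeps the full history, which matches the output you pasted.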
On Sun, Oct 22, 2017 at 10:39 PM Eric Eastman <eric.east...@keepertech.com> wrote:
> On Sun, Oct 22, 2017 at 8:05 PM, Yan, Zheng <uker...@gmail.com> wrote:
>> On Mon, Oct 23, 2017 at 9:35 AM, Eric Eastman
>> <eric.east...@keepertech.com> wrote:
>> > With help from the list we recently recovered one of our Jewel based
>> > clusters that started failing when we got to about 4800 cephfs snapshots.
>> > We understand that cephfs snapshots are still marked experimental. We are
>> > running a single active MDS with 2 standby MDS. We only have a single file
>> > system, we are only taking snapshots from the top level directory, and we
>> > are now planning on limiting snapshots to a few hundred. Currently we have
>> > removed all snapshots from this system, using rmdir on each snapshot
>> > directory, and the system is reporting that it is healthy:
>> >
>> > ceph -s
>> >     cluster ba0c94fc-1168-11e6-aaea-000c290cc2d4
>> >      health HEALTH_OK
>> >      monmap e1: 3 mons at {mon01=10.16.51.21:6789/0,mon02=10.16.51.22:6789/0,mon03=10.16.51.23:6789/0}
>> >             election epoch 202, quorum 0,1,2 mon01,mon02,mon03
>> >       fsmap e18283: 1/1/1 up {0=mds01=up:active}, 2 up:standby
>> >      osdmap e342543: 93 osds: 93 up, 93 in
>> >             flags sortbitwise,require_jewel_osds
>> >       pgmap v38759308: 11336 pgs, 9 pools, 23107 GB data, 12086 kobjects
>> >             73956 GB used, 209 TB / 281 TB avail
>> >                11336 active+clean
>> >   client io 509 kB/s rd, 2548 B/s wr, 0 op/s rd, 1 op/s wr
>> >
>> > The snapshots were removed several days ago, but just as an experiment I
>> > decided to query a few PGs in the cephfs data storage pool, and I am seeing
>> > they are all listing:
>> >
>> > "purged_snaps": "[2~12cd,12d0~12c9]",
>>
>> purged_snaps IDs of snapshots whose data have been completely purged.
>> Currently purged_snap set is append only, osd never remove ID from it.
>
> Thank you for the quick reply.
> So it is normal to have "purged_snaps" listed on a system that all
> snapshots have been deleted.
> Eric
>
>> > Here is an example:
>> >
>> > ceph pg 1.72 query
>> > {
>> >     "state": "active+clean",
>> >     "snap_trimq": "[]",
>> >     "epoch": 342540,
>> >     "up": [
>> >         75,
>> >         77,
>> >         82
>> >     ],
>> >     "acting": [
>> >         75,
>> >         77,
>> >         82
>> >     ],
>> >     "actingbackfill": [
>> >         "75",
>> >         "77",
>> >         "82"
>> >     ],
>> >     "info": {
>> >         "pgid": "1.72",
>> >         "last_update": "342540'261039",
>> >         "last_complete": "342540'261039",
>> >         "log_tail": "341080'260697",
>> >         "last_user_version": 261039,
>> >         "last_backfill": "MAX",
>> >         "last_backfill_bitwise": 1,
>> >         "purged_snaps": "[2~12cd,12d0~12c9]",
>> >         …
>> >
>> > Is this an issue?
>> > I am not seeing any recent trim activity.
>> > Are there any procedures documented for looking at snapshots to see if there
>> > are any issues?
>> >
>> > Before posting this, I have reread the cephfs and snapshot pages in at:
>> > http://docs.ceph.com/docs/master/cephfs/
>> > http://docs.ceph.com/docs/master/dev/cephfs-snapshots/
>> >
>> > Looked at the slides:
>> > http://events.linuxfoundation.org/sites/events/files/slides/2017-03-23%20Vault%20Snapshots.pdf
>> >
>> > Watched the video “Ceph Snapshots for Fun and Profit” given at the last
>> > OpenStack conference.
>> >
>> > And I still can’t find much on info on debugging snapshots.
>> >
>> > Here is some addition information on the cluster:
>> >
>> > ceph df
>> > GLOBAL:
>> >     SIZE     AVAIL     RAW USED     %RAW USED
>> >     281T     209T      73955G       25.62
>> > POOLS:
>> >     NAME                ID     USED       %USED     MAX AVAIL     OBJECTS
>> >     rbd                 0      16         0         56326G        3
>> >     cephfs_data         1      22922G     28.92     56326G        12279871
>> >     cephfs_metadata     2      89260k     0         56326G        45232
>> >     cinder              9      147G       0.26      56326G        41420
>> >     glance              10     0          0         56326G        0
>> >     cinder-backup       11     0          0         56326G        0
>> >     cinder-ssltest      23     1362M      0         56326G        431
>> >     IDMT-dfgw02         27     2552M      0         56326G        758
>> >     dfbackup            28     33987M     0.06      56326G        8670
>> >
>> > Recent tickets and posts on problems with this cluster
>> > http://tracker.ceph.com/issues/21761
>> > http://tracker.ceph.com/issues/21412
>> > https://www.spinics.net/lists/ceph-devel/msg38203.html
>> >
>> > ceph -v
>> > ceph version 10.2.10 (5dc1e4c05cb68dbf62ae6fce3f0700e4654fdbbe)
>> >
>> > Kernel is 4.13.1
>> > uname -a
>> > Linux ss001 4.13.1-041301-generic #201709100232 SMP Sun Sep 10 06:33:36 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
>> >
>> > OS is Ubuntu 16.04
>> >
>> > Thanks
>> > Eric
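
For what it's worth, a quick client-side sanity check that every snapshot really is gone (this assumes the filesystem is mounted at /mnt/cephfs, so substitute your actual mount point, and that snapshots were only ever taken at the top level, as you describe) is just to list the hidden .snap directory, which should come back empty:

# Hypothetical mount point; adjust to wherever the filesystem is mounted.
ls /mnt/cephfs/.snap
ls /mnt/cephfs/.snap | wc -l   # 0 once every snapshot has been rmdir'd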
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com