Two weeks later and things are still deleting, but getting really close to being done. I tried to use ceph-objectstore-tool to remove one of the PGs. I only tested on 1 PG on 1 OSD, but it's doing something really weird. While it was running, my connection to the DC reset and the command died. Now when I try to run the tool it segfaults, and when I just run the OSD it doesn't try to delete the data. The data in this PG does not matter, and I figure the worst-case scenario is that it just sits there taking up 200GB until I redeploy the OSD.
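For anyone curious, the stray PG can be sized up with something like the following (this assumes the usual FileStore directory layout under current/ and is run with the OSD stopped; the list-pgs op should show whether the objectstore still knows about the half-removed PG, though it may hit the same crash):

systemctl stop ceph-osd@0
# check whether the half-removed PG is still registered in the objectstore
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --journal-path /var/lib/ceph/osd/ceph-0/journal --op list-pgs | grep 97.314
# see how much space the stray PG directory is still holding on disk
du -sh /var/lib/ceph/osd/ceph-0/current/97.314s0_head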
However, I like to learn things about Ceph. Does anyone have any insight into what is happening with this PG?

[root@osd1 ~]# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --journal-path /var/lib/ceph/osd/ceph-0/journal --pgid 97.314s0 --op remove
SG_IO: questionable sense data, results may be incorrect
SG_IO: questionable sense data, results may be incorrect
marking collection for removal
mark_pg_for_removal warning: peek_map_epoch reported error
terminate called after throwing an instance of 'ceph::buffer::end_of_buffer'
  what():  buffer::end_of_buffer
*** Caught signal (Aborted) **
 in thread 7f98ab2dc980 thread_name:ceph-objectstor
 ceph version 10.2.7 (50e863e0f4bc8f4b9e31156de690d765af245185)
 1: (()+0x95209a) [0x7f98abc4b09a]
 2: (()+0xf100) [0x7f98a91d7100]
 3: (gsignal()+0x37) [0x7f98a7d825f7]
 4: (abort()+0x148) [0x7f98a7d83ce8]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f98a86879d5]
 6: (()+0x5e946) [0x7f98a8685946]
 7: (()+0x5e973) [0x7f98a8685973]
 8: (()+0x5eb93) [0x7f98a8685b93]
 9: (ceph::buffer::list::iterator_impl<false>::copy(unsigned int, char*)+0xa5) [0x7f98abd498a5]
 10: (PG::read_info(ObjectStore*, spg_t, coll_t const&, ceph::buffer::list&, pg_info_t&, std::map<unsigned int, pg_interval_t, std::less<unsigned int>, std::allocator<std::pair<unsigned int const, pg_interval_t> > >&, unsigned char&)+0x324) [0x7f98ab6d3094]
 11: (mark_pg_for_removal(ObjectStore*, spg_t, ObjectStore::Transaction*)+0x87c) [0x7f98ab66615c]
 12: (initiate_new_remove_pg(ObjectStore*, spg_t, ObjectStore::Sequencer&)+0x131) [0x7f98ab666a51]
 13: (main()+0x39b7) [0x7f98ab610437]
 14: (__libc_start_main()+0xf5) [0x7f98a7d6eb15]
 15: (()+0x363a57) [0x7f98ab65ca57]
Aborted

On Thu, Nov 2, 2017 at 12:45 PM Gregory Farnum <gfar...@redhat.com> wrote:

> Deletion is throttled, though I don't know the configs to change it; you
> could poke around if you want stuff to go faster.
>
> Don't just remove the directory in the filesystem; you need to clean up
> the leveldb metadata as well. ;)
> Removing the PG via ceph-objectstore-tool would work fine, but I've seen
> too many people kill the wrong thing to recommend it.
> -Greg
>
> On Thu, Nov 2, 2017 at 9:40 AM David Turner <drakonst...@gmail.com> wrote:
>
>> Jewel 10.2.7; XFS-formatted OSDs; no dmcrypt or LVM. I have a pool that
>> I deleted 16 hours ago that accounted for about 70% of the available space
>> on each OSD (averaging 84% full), 370M objects in 8k PGs, ec 4+2 profile.
>> Based on the rate that the OSDs are freeing up space after deleting the
>> pool, it will take about a week to finish deleting the PGs from the OSDs.
>>
>> Is there anything I can do to speed this process up? I feel like there
>> may be a way for me to go through the OSDs and delete the PG folders, either
>> with the objectstore tool or while the OSD is offline. I'm not sure what
>> Ceph is doing to delete the pool, but I don't think that an `rm -Rf` of the
>> PG folder would take nearly this long.
>>
>> Thank you all for your help.
>>
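P.S. On the "configs to change" for the deletion throttle: my understanding (a best guess, please verify against the 10.2.x docs rather than taking it as confirmed) is that FileStore removes a PG in small chunks sized by osd_target_transaction_size (default 30), so something along these lines can be used to inspect and raise it at runtime:

# show the current value via the admin socket on one OSD
ceph daemon osd.0 config show | grep osd_target_transaction_size
# raise it on all OSDs; larger transactions should delete faster, at the
# cost of bigger latency spikes on the filestore (option name is my guess)
ceph tell osd.* injectargs '--osd_target_transaction_size 100'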
_______________________________________________ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com