On Fri, Jul 21, 2017 at 10:23 PM Daniel K <satha...@gmail.com> wrote:
> Luminous 12.1.0 (RC)
>
> I replaced two OSD drives (the old ones were still good, just too small), using:
>
>     ceph osd out osd.12
>     ceph osd crush remove osd.12
>     ceph auth del osd.12
>     systemctl stop ceph-osd@osd.12
>     ceph osd rm osd.12
>
> I later found that I also should have unmounted it from /var/lib/ceph/osd-12.
>
> (remove old disk, insert new disk)
>
> I added the new disk/OSD with:
>
>     ceph-deploy osd prepare stor-vm3:sdg --bluestore
>
> This automatically activated the OSD (not sure why, I thought it needed a
> ceph-deploy osd activate as well).
>
> Then, working on an unrelated issue, I upgraded one node (out of 4 total)
> to 12.1.1 using apt and rebooted.
>
> The mon daemon would not form a quorum with the others on 12.1.0, so,
> instead of troubleshooting that, I just went ahead and upgraded the other
> 3 nodes and rebooted.
>
> Lots of recovery IO went on afterwards, but now things have stopped at:
>
>     pools:   10 pools, 6804 pgs
>     objects: 1784k objects, 7132 GB
>     usage:   11915 GB used, 19754 GB / 31669 GB avail
>     pgs:     0.353% pgs not active
>              70894/2988573 objects degraded (2.372%)
>              422090/2988573 objects misplaced (14.123%)
>              6626 active+clean
>              129  active+remapped+backfill_wait
>              23   incomplete
>              14   active+undersized+degraded+remapped+backfill_wait
>              4    active+undersized+degraded+remapped+backfilling
>              4    active+remapped+backfilling
>              2    active+clean+scrubbing+deep
>              1    peering
>              1    active+recovery_wait+degraded+remapped
>
> When I run ceph pg query on the incomplete PGs, they all list at least one
> of the two removed OSDs (12, 17) in "down_osds_we_would_probe".
>
> Most pools are size 2 / min_size 1 (trusting BlueStore to tell me which
> copy is valid). One pool is size 1 / min_size 1 and I'm okay with losing
> it, except I had it mounted in a directory on CephFS; I rm'd the directory,
> but I can't delete the pool because it's "in use by CephFS".
>
> I still have the old drives, can I stick them into another host and re-add
> them somehow?

Yes, that'll probably be your easiest solution. You may have some trouble
because you already deleted them, but I'm not sure.

Alternatively, you ought to be able to remove the pool from CephFS using
some of the monitor commands and then delete it (rough sketch at the end
of this mail).

> This data isn't super important, but I'd like to learn a bit about how to
> recover when bad things happen, as we are planning a production deployment
> in a couple of weeks.
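To expand on that last suggestion: something along these lines should work,
assuming the pool in question is a secondary CephFS data pool rather than the
filesystem's default data pool or its metadata pool. The filesystem and pool
names below are placeholders, so substitute your own; this is an untested
sketch, not a recipe:

    # see which filesystem the pool is attached to
    ceph fs ls

    # detach the data pool from the filesystem
    ceph fs rm_data_pool <fs_name> <pool_name>

    # then delete the pool; on Luminous the mons may refuse pool deletion
    # unless mon_allow_pool_delete is set to true first
    ceph osd pool delete <pool_name> <pool_name> --yes-i-really-really-mean-it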