Yes, this did turn out to be our main issue. We also had a smaller
issue, but this was the one that caused parts of our pools to go offline
for a short time. Or rather, the 'cause' was us adding some new NVMe
drives that were much larger than the ones we already had, so too many
PGs got mapped to them; we just didn't realize at first that this was
the problem. Taking those OSDs down again allowed us to recover quickly.
It was a little hard to figure out, mostly because we had two separate
problems at the same time. Some kind of dedicated warning message would
have been nice (we couldn't find anything in the logs), and perhaps the
PGs could be allowed to activate anyway, with the cluster put into
HEALTH_WARN?
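
For anyone who runs into the same thing, the checks below are roughly
what would have shown it (just a sketch from memory, assuming Luminous;
I believe the limit involved is mon_max_pg_per_osd together with
osd_max_pg_per_osd_hard_ratio):

  # PG count per OSD is the PGS column; new, much larger drives attract
  # proportionally more PGs because of their higher crush weight
  ceph osd df tree

  # show the configured per-OSD PG limit (run on a mon host)
  ceph daemon mon.$(hostname -s) config get mon_max_pg_per_osd

  # a possible temporary workaround instead of taking the new OSDs down
  # again: raise the limit (example value only, raise with care)
  ceph tell 'mon.*' injectargs '--mon_max_pg_per_osd 400'
  ceph tell 'osd.*' injectargs '--mon_max_pg_per_osd 400'
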
My colleague built a virtualized lab copy of our environment, and we
used that to reproduce and then fix our issues.
We are also working on installing more OSDs, as was our original plan,
so the number of PGs per OSD will decrease over time. At the time we
thought we would aim for 300 PGs per OSD, which I now realize was
probably not a great idea; something like 150 would have been better.
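
As a rough rule of thumb (example numbers only, not our actual
cluster), the usual pgcalc-style estimate is:

  total PGs across all pools ~= (num OSDs * target PGs per OSD) / replica size
                              = (24 * 150) / 3 = 1200 -> round to 1024

and that total then gets split over the pools according to their
expected share of the data.
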
/Peter
On 2018-01-31 at 13:42, Thomas Bennett wrote:
Hi Peter,
Relooking at your problem, you might want to keep track of this issue:
http://tracker.ceph.com/issues/22440
Regards,
Tom
On Wed, Jan 31, 2018 at 11:37 AM, Thomas Bennett <tho...@ska.ac.za> wrote:
Hi Peter,
From your reply, I see that:
1. pg 3.12c is part of pool 3.
2. The OSDs in the "up" set for pg 3.12c are: 6, 0, 12.
To check on this 'activating' issue, I suggest you do the following:
1. What is the rule that pool 3 should follow, 'hybrid', 'nvme'
or 'hdd'? (Use the *ceph osd pool ls detail* command and look
at pool 3's crush rule)
2. Then check whether osds 6, 0 and 12 are backed by nvme's or
hdd's. (Use the *ceph osd tree | grep nvme* command to find your
nvme-backed osds.)
If your problem is similar to mine, you will have nvme-backed osds
in a pool that should only be backed by hdds, which in my case
caused a pg to go into the 'activating' state and stay there.
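
Roughly, those checks could look like this (just a sketch; substitute
the actual rule name that pool 3 uses):

  # which crush rule does pool 3 use?
  ceph osd pool ls detail | grep "pool 3 "

  # dump that rule and look at which device class it takes osds from
  ceph osd crush rule dump <rule-name>

  # check which class osds 0, 6 and 12 actually have (CLASS column)
  ceph osd tree | egrep "osd\.(0|6|12) "
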
Cheers,
Tom