Yes, this did turn out to be our main issue. We also had a smaller
issue, but this was the one that caused parts of our pools to go offline
for a short time. Or rather, the 'cause' was us adding some new NVMe
drives that were much larger than the ones we already had, so too many
PGs got mapped to them; we just didn't realize at first that this was
the problem. Taking those OSDs down again allowed us to recover quickly.
It was a little hard to figure out, mostly because we had two separate
problems at the same time. Some kind of dedicated warning message would
have been nice (we couldn't find anything in the logs), and perhaps the
PGs could be allowed to activate anyway, with the cluster put into
HEALTH_WARN?
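
For anyone who runs into the same thing, the checks below are roughly
what would have shown it (just a sketch from memory, assuming Luminous;
I believe the limit involved is mon_max_pg_per_osd together with
osd_max_pg_per_osd_hard_ratio):

  # PG count per OSD is the PGS column; new, much larger drives attract
  # proportionally more PGs because of their higher crush weight
  ceph osd df tree

  # show the configured per-OSD PG limit (run on a mon host)
  ceph daemon mon.$(hostname -s) config get mon_max_pg_per_osd

  # a possible temporary workaround instead of taking the new OSDs down
  # again: raise the limit (example value only, raise with care)
  ceph tell 'mon.*' injectargs '--mon_max_pg_per_osd 400'
  ceph tell 'osd.*' injectargs '--mon_max_pg_per_osd 400'
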
My colleague built a virtualized lab copy of our environment, and we
used that to reproduce and then fix our issues.
We are also working on installing more OSDs, as was our original plan,
so the number of PGs per OSD will decrease over time. At the time we
thought we would aim for 300 PGs per OSD, which I now realize was
probably not a great idea; something like 150 would have been better.
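
As a rough rule of thumb (example numbers only, not our actual
cluster), the usual pgcalc-style estimate is:

  total PGs across all pools ~= (num OSDs * target PGs per OSD) / replica size
                              = (24 * 150) / 3 = 1200 -> round to 1024

and that total then gets split over the pools according to their
expected share of the data.
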
/Peter
On 2018-01-31 at 13:42, Thomas Bennett wrote:
Hi Peter,
Relooking at your problem, you might want to keep track of this issue:
http://tracker.ceph.com/issues/22440
Regards,
Tom
On Wed, Jan 31, 2018 at 11:37 AM, Thomas Bennett <tho...@ska.ac.za> wrote:
Hi Peter,
From your reply, I see that:
1. pg 3.12c is part of pool 3.
2. The OSDs in the "up" set for pg 3.12c are: 6, 0, 12.
To check on this 'activating' issue, I suggest you do the following:
1. What is the rule that pool 3 should follow, 'hybrid', 'nvme'
or 'hdd'? (Use the *ceph osd pool ls detail* command and look
at pool 3's crush rule)
2. Then check whether osds 6, 0 and 12 are backed by nvme's or
hdd's. (Use the *ceph osd tree | grep nvme* command to find your
nvme-backed osds.)
If your problem is similar to mine, you will have nvme-backed osds
in a pool that should only be backed by hdds, which in my case
caused a pg to go into the 'activating' state and stay there.
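
Roughly, those checks could look like this (just a sketch; substitute
the actual rule name that pool 3 uses):

  # which crush rule does pool 3 use?
  ceph osd pool ls detail | grep "pool 3 "

  # dump that rule and look at which device class it takes osds from
  ceph osd crush rule dump <rule-name>

  # check which class osds 0, 6 and 12 actually have (CLASS column)
  ceph osd tree | egrep "osd\.(0|6|12) "
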
Cheers,
Tom