6 hosts with 2 x 10G NICs each, data in a 2+2 EC pool. Running Ceph 17.2.0 (Quincy), upgraded from Pacific.

    health: HEALTH_WARN
            2 host(s) running different kernel versions
            2071 pgs not deep-scrubbed in time
            837 pgs not scrubbed in time

    mon:        5 daemons, quorum test-ceph-03,test-ceph-04,dcn-ceph-03,dcn-ceph-02,dcn-ceph-01 (age 116s)
    mgr:        dcn-ceph-01.dzercj(active, since 6h), standbys: dcn-ceph-03.lrhaxo
    mds:        1/1 daemons up, 2 standby
    osd:        118 osds: 118 up (since 6d), 118 in (since 6d); 66 remapped pgs
    rbd-mirror: 2 daemons active (2 hosts)

    volumes: 1/1 healthy
    pools:   9 pools, 2737 pgs
    objects: 246.02M objects, 337 TiB
    usage:   665 TiB used, 688 TiB / 1.3 PiB avail
    pgs:     42128281/978408875 objects misplaced (4.306%)
             2332 active+clean
             281  active+clean+snaptrim_wait
             66   active+remapped+backfilling
             36   active+clean+snaptrim
             11   active+clean+scrubbing+deep
             8    active+clean+scrubbing
             1    active+clean+scrubbing+deep+snaptrim_wait
             1    active+clean+scrubbing+deep+snaptrim
             1    active+clean+scrubbing+snaptrim

    client:   159 MiB/s rd, 86 MiB/s wr, 17.14k op/s rd, 326 op/s wr
    recovery: 2.0 MiB/s, 3 objects/s

The cluster shows low load, low latency and low network traffic. I tried setting osd_mclock_profile=high_recovery_ops, which made no difference. Disabling scrubs and snaptrim made no difference either.
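
For scale, here is a rough back-of-the-envelope estimate of how long the backfill would take at the rate reported above. It is only a sketch: it assumes the 3 objects/s stays constant and that the misplaced counter and the recovery rate count the same unit, which may not hold exactly for an EC pool. The function name is made up for illustration.

```python
# Figures copied from the status output above.
MISPLACED_OBJECTS = 42_128_281
RECOVERY_OBJECTS_PER_SEC = 3

SECONDS_PER_DAY = 86_400


def backfill_eta_days(misplaced: int, objects_per_sec: float) -> float:
    """Days needed to move `misplaced` objects at a constant rate.

    Hypothetical helper, not part of any Ceph tooling.
    """
    if objects_per_sec <= 0:
        raise ValueError("rate must be positive")
    return misplaced / objects_per_sec / SECONDS_PER_DAY


if __name__ == "__main__":
    days = backfill_eta_days(MISPLACED_OBJECTS, RECOVERY_OBJECTS_PER_SEC)
    print(f"Estimated time to finish backfill: {days:.1f} days")
```

Under those assumptions the estimate comes out at more than 160 days, which is why the current rate looks wrong for an otherwise idle cluster.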

Am I missing something obvious I should have done after the upgrade?



Torkil Svensgaard
MR-Forskningssektionen, afs. 714
DRCMR, Danish Research Centre for Magnetic Resonance
Hvidovre Hospital
KettegÄrd Allé 30
DK-2650 Hvidovre
Tel: +45 386 22828
E-mail: tor...@drcmr.dk