What distribution and kernel are you running? I recently found my cluster running the stock CentOS 3.10 kernel when I thought it was running the ELRepo kernel. After forcing it to boot the correct kernel, my flapping OSD issue went away.
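In case it is the same thing here, this is roughly how I checked and fixed the default kernel on my CentOS 7 nodes with the ELRepo kernel-ml package. Treat it as a sketch: the entry index and the grub config path depend on your setup (EFI systems use /boot/efi/EFI/centos/grub.cfg instead of /boot/grub2/grub.cfg).

# uname -r                                            # kernel actually running right now
# rpm -q kernel-ml                                    # ELRepo mainline kernel, if installed
# awk -F\' '/^menuentry /{print $2}' /etc/grub2.cfg   # list boot entries (index starts at 0)
# grub2-editenv list                                  # currently saved default entry
# grub2-set-default 0                                 # assuming entry 0 is the ELRepo kernel
# grub2-mkconfig -o /boot/grub2/grub.cfg              # BIOS path; adjust for EFI

Then reboot and check uname -r again to confirm.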
On Tue, Apr 10, 2018, 2:18 AM Jan Marquardt <j...@artfiles.de> wrote:
> Hi,
>
> we are experiencing massive problems with our Ceph setup. After starting
> a "repair pg" because of scrub errors OSDs started to crash, which we
> could not stop so far. We are running Ceph 12.2.4. Crashed OSDs are both
> bluestore and filestore.
>
> Our cluster currently looks like this:
>
> # ceph -s
>   cluster:
>     id:     c59e56df-2043-4c92-9492-25f05f268d9f
>     health: HEALTH_ERR
>             1 osds down
>             73005/17149710 objects misplaced (0.426%)
>             5 scrub errors
>             Reduced data availability: 2 pgs inactive, 2 pgs down
>             Possible data damage: 1 pg inconsistent
>             Degraded data redundancy: 611518/17149710 objects degraded
>             (3.566%), 86 pgs degraded, 86 pgs undersized
>
>   services:
>     mon: 3 daemons, quorum head1,head2,head3
>     mgr: head3(active), standbys: head2, head1
>     osd: 34 osds: 24 up, 25 in; 18 remapped pgs
>
>   data:
>     pools:   1 pools, 768 pgs
>     objects: 5582k objects, 19500 GB
>     usage:   62030 GB used, 31426 GB / 93456 GB avail
>     pgs:     0.260% pgs not active
>              611518/17149710 objects degraded (3.566%)
>              73005/17149710 objects misplaced (0.426%)
>              670 active+clean
>              75  active+undersized+degraded
>              8   active+undersized+degraded+remapped+backfill_wait
>              8   active+clean+remapped
>              2   down
>              2   active+undersized+degraded+remapped+backfilling
>              2   active+clean+scrubbing+deep
>              1   active+undersized+degraded+inconsistent
>
>   io:
>     client:   10911 B/s rd, 118 kB/s wr, 0 op/s rd, 54 op/s wr
>     recovery: 31575 kB/s, 8 objects/s
>
> # ceph osd tree
> ID  CLASS WEIGHT    TYPE NAME      STATUS REWEIGHT PRI-AFF
>  -1       124.07297 root default
>  -2        29.08960     host ceph1
>   0   hdd   3.63620         osd.0      up  1.00000 1.00000
>   1   hdd   3.63620         osd.1    down        0 1.00000
>   2   hdd   3.63620         osd.2      up  1.00000 1.00000
>   3   hdd   3.63620         osd.3      up  1.00000 1.00000
>   4   hdd   3.63620         osd.4    down        0 1.00000
>   5   hdd   3.63620         osd.5    down        0 1.00000
>   6   hdd   3.63620         osd.6      up  1.00000 1.00000
>   7   hdd   3.63620         osd.7      up  1.00000 1.00000
>  -3         7.27240     host ceph2
>  14   hdd   3.63620         osd.14     up  1.00000 1.00000
>  15   hdd   3.63620         osd.15     up  1.00000 1.00000
>  -4        29.11258     host ceph3
>  16   hdd   3.63620         osd.16     up  1.00000 1.00000
>  18   hdd   3.63620         osd.18   down        0 1.00000
>  19   hdd   3.63620         osd.19   down        0 1.00000
>  20   hdd   3.65749         osd.20     up  1.00000 1.00000
>  21   hdd   3.63620         osd.21     up  1.00000 1.00000
>  22   hdd   3.63620         osd.22     up  1.00000 1.00000
>  23   hdd   3.63620         osd.23     up  1.00000 1.00000
>  24   hdd   3.63789         osd.24   down        0 1.00000
>  -9        29.29919     host ceph4
>  17   hdd   3.66240         osd.17     up  1.00000 1.00000
>  25   hdd   3.66240         osd.25     up  1.00000 1.00000
>  26   hdd   3.66240         osd.26   down        0 1.00000
>  27   hdd   3.66240         osd.27     up  1.00000 1.00000
>  28   hdd   3.66240         osd.28   down        0 1.00000
>  29   hdd   3.66240         osd.29     up  1.00000 1.00000
>  30   hdd   3.66240         osd.30     up  1.00000 1.00000
>  31   hdd   3.66240         osd.31   down        0 1.00000
> -11        29.29919     host ceph5
>  32   hdd   3.66240         osd.32     up  1.00000 1.00000
>  33   hdd   3.66240         osd.33     up  1.00000 1.00000
>  34   hdd   3.66240         osd.34     up  1.00000 1.00000
>  35   hdd   3.66240         osd.35     up  1.00000 1.00000
>  36   hdd   3.66240         osd.36   down  1.00000 1.00000
>  37   hdd   3.66240         osd.37     up  1.00000 1.00000
>  38   hdd   3.66240         osd.38     up  1.00000 1.00000
>  39   hdd   3.66240         osd.39     up  1.00000 1.00000
>
> The last OSDs that crashed are #28 and #36. Please find the
> corresponding log files here:
>
> http://af.janno.io/ceph/ceph-osd.28.log.1.gz
> http://af.janno.io/ceph/ceph-osd.36.log.1.gz
>
> The backtraces look almost the same for all crashed OSDs.
>
> Any help, hint or advice would really be appreciated. Please let me know
> if you need any further information.
>
> Best Regards
>
> Jan
>
> --
> Artfiles New Media GmbH | Zirkusweg 1 | 20359 Hamburg
> Tel: 040 - 32 02 72 90 | Fax: 040 - 32 02 72 95
> E-Mail: supp...@artfiles.de | Web: http://www.artfiles.de
> Managing Directors: Harald Oltmanns | Tim Evers
> Registered in the Commercial Register of Hamburg - HRB 81478
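For the scrub errors and the pg repair mentioned above, it may also be worth capturing what the cluster reports as inconsistent before triggering further repairs. A rough sketch; the pg id below is a placeholder for the one reported by ceph health detail:

# ceph health detail | grep -Ei 'inconsist|scrub'
# rados list-inconsistent-obj <pgid> --format=json-pretty
# ceph pg repair <pgid>     # only once it is clear which copy is bad

That output, together with the OSD backtraces, might help narrow down whether the crashes are triggered by the repair itself.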