On Tue, Oct 1, 2019 at 10:43 PM Del Monaco, Andrea <andrea.delmon...@atos.net> wrote:
> Hi list,
>
> After the nodes ran OOM and were rebooted, we are no longer able to
> restart the ceph-osd@x services. (Details about the setup at the end.)
>
> I am trying to start one manually so that we can see the error, but all
> I get is several crash dumps - this is just one of the OSDs which is not
> starting. Any idea how to get past this?
>
> [root@ceph001 ~]# /usr/bin/ceph-osd --debug_osd 10 -f --cluster ceph --id 83 --setuser ceph --setgroup ceph > /tmp/dump 2>&1
> starting osd.83 at - osd_data /var/lib/ceph/osd/ceph-83 /var/lib/ceph/osd/ceph-83/journal
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.6/rpm/el7/BUILD/ceph-13.2.6/src/osd/ECUtil.h: In function 'ECUtil::stripe_info_t::stripe_info_t(uint64_t, uint64_t)' thread 2aaaaaaf5540 time 2019-10-01 14:19:49.494368
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.6/rpm/el7/BUILD/ceph-13.2.6/src/osd/ECUtil.h: 34: FAILED assert(stripe_width % stripe_size == 0)
>
>  ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic (stable)
>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14b) [0x2aaaaaf3d36b]
>  2: (()+0x26e4f7) [0x2aaaaaf3d4f7]
>  3: (ECBackend::ECBackend(PGBackend::Listener*, coll_t const&, boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore*, CephContext*, std::shared_ptr<ceph::ErasureCodeInterface>, unsigned long)+0x46d) [0x555555c0bd3d]
>  4: (PGBackend::build_pg_backend(pg_pool_t const&, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, PGBackend::Listener*, coll_t, boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore*, CephContext*)+0x30a) [0x555555b0ba8a]
>  5: (PrimaryLogPG::PrimaryLogPG(OSDService*, std::shared_ptr<OSDMap const>, PGPool const&, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, spg_t)+0x140) [0x555555abd100]
>  6: (OSD::_make_pg(std::shared_ptr<OSDMap const>, spg_t)+0x10cb) [0x555555914ecb]
>  7: (OSD::load_pgs()+0x4a9) [0x555555917e39]
>  8: (OSD::init()+0xc99) [0x5555559238e9]
>  9: (main()+0x23a3) [0x5555558017a3]
>  10: (__libc_start_main()+0xf5) [0x2aaab77de495]
>  11: (()+0x385900) [0x5555558d9900]

https://tracker.ceph.com/issues/41336 may be relevant here. Can you post
details of the pool involved as well as the erasure code profile in use
for that pool?
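For reference, both can be pulled with the stock CLI - a minimal sketch,
where <pool> and <profile-name> are placeholders to be read off the
first two commands' output:

    # Pool details - for an EC pool this prints, among other things,
    # its stripe_width:
    ceph osd pool ls detail

    # Name of the profile the pool was created with:
    ceph osd pool get <pool> erasure_code_profile

    # The profile itself (plugin, k, m, and stripe_unit if it was set):
    ceph osd erasure-code-profile get <profile-name>

Given the assert text, the thing to compare is presumably the pool's
stripe_width against the profile's k and stripe_unit, since the
stripe_info_t constructor aborts when stripe_width is not an exact
multiple of the stripe size.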
> 2019-10-01 14:19:49.509 2aaaaaaf5540 -1 *** Caught signal (Aborted) **
>  in thread 2aaaaaaf5540 thread_name:ceph-osd
>
>  ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic (stable)
>  1: (()+0xf5d0) [0x2aaab69765d0]
>  2: (gsignal()+0x37) [0x2aaab77f22c7]
>  3: (abort()+0x148) [0x2aaab77f39b8]
>  4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x248) [0x2aaaaaf3d468]
>  5: (()+0x26e4f7) [0x2aaaaaf3d4f7]
>  6: (ECBackend::ECBackend(PGBackend::Listener*, coll_t const&, boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore*, CephContext*, std::shared_ptr<ceph::ErasureCodeInterface>, unsigned long)+0x46d) [0x555555c0bd3d]
>  7: (PGBackend::build_pg_backend(pg_pool_t const&, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, PGBackend::Listener*, coll_t, boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore*, CephContext*)+0x30a) [0x555555b0ba8a]
>  8: (PrimaryLogPG::PrimaryLogPG(OSDService*, std::shared_ptr<OSDMap const>, PGPool const&, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, spg_t)+0x140) [0x555555abd100]
>  9: (OSD::_make_pg(std::shared_ptr<OSDMap const>, spg_t)+0x10cb) [0x555555914ecb]
>  10: (OSD::load_pgs()+0x4a9) [0x555555917e39]
>  11: (OSD::init()+0xc99) [0x5555559238e9]
>  12: (main()+0x23a3) [0x5555558017a3]
>  13: (__libc_start_main()+0xf5) [0x2aaab77de495]
>  14: (()+0x385900) [0x5555558d9900]
> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>
> [snip - the same assert and abort backtraces are then repeated verbatim
> several more times, including in the "-693>" recent-events section of
> the crash dump]
> Environment:
>
> [root@ceph001 ~]# uname -r
> 3.10.0-957.27.2.el7.x86_64
> [root@ceph001 ~]# cat /etc/redhat-release
> CentOS Linux release 7.6.1810 (Core)
> [root@ceph001 ~]# rpm -qa | grep -i ceph
> cm-config-ceph-release-mimic-8.2-73_cm8.2.noarch
> ceph-13.2.6-0.el7.x86_64
> ceph-selinux-13.2.6-0.el7.x86_64
> ceph-base-13.2.6-0.el7.x86_64
> ceph-osd-13.2.6-0.el7.x86_64
> cm-config-ceph-radosgw-systemd-8.2-6_cm8.2.noarch
> libcephfs2-13.2.6-0.el7.x86_64
> ceph-common-13.2.6-0.el7.x86_64
> ceph-mgr-13.2.6-0.el7.x86_64
> cm-config-ceph-systemd-8.2-12_cm8.2.noarch
> ceph-mon-13.2.6-0.el7.x86_64
> python-cephfs-13.2.6-0.el7.x86_64
> ceph-mds-13.2.6-0.el7.x86_64
>
> ceph osd tree:
> ID  CLASS WEIGHT    TYPE NAME        STATUS REWEIGHT PRI-AFF
> -1        785.95801 root default
> -5        261.98599     host ceph001
>   1   hdd   7.27699         osd.1        up  1.00000 1.00000
>   3   hdd   7.27699         osd.3      down  1.00000 1.00000
>   6   hdd   7.27699         osd.6      down  1.00000 1.00000
>   9   hdd   7.27699         osd.9      down        0 1.00000
>  12   hdd   7.27699         osd.12     down  1.00000 1.00000
>  15   hdd   7.27699         osd.15       up  1.00000 1.00000
>  18   hdd   7.27699         osd.18     down  1.00000 1.00000
>  21   hdd   7.27699         osd.21     down  1.00000 1.00000
>  24   hdd   7.27699         osd.24       up  1.00000 1.00000
>  27   hdd   7.27699         osd.27     down  1.00000 1.00000
>  30   hdd   7.27699         osd.30     down  1.00000 1.00000
>  35   hdd   7.27699         osd.35     down  1.00000 1.00000
>  37   hdd   7.27699         osd.37     down  1.00000 1.00000
>  40   hdd   7.27699         osd.40     down  1.00000 1.00000
>  44   hdd   7.27699         osd.44     down  1.00000 1.00000
>  47   hdd   7.27699         osd.47       up  1.00000 1.00000
>  50   hdd   7.27699         osd.50       up  1.00000 1.00000
>  53   hdd   7.27699         osd.53     down  1.00000 1.00000
>  56   hdd   7.27699         osd.56     down  1.00000 1.00000
>  59   hdd   7.27699         osd.59       up  1.00000 1.00000
>  62   hdd   7.27699         osd.62     down        0 1.00000
>  65   hdd   7.27699         osd.65     down  1.00000 1.00000
>  68   hdd   7.27699         osd.68     down  1.00000 1.00000
>  71   hdd   7.27699         osd.71     down  1.00000 1.00000
>  74   hdd   7.27699         osd.74     down  1.00000 1.00000
>  77   hdd   7.27699         osd.77       up  1.00000 1.00000
>  80   hdd   7.27699         osd.80     down  1.00000 1.00000
>  83   hdd   7.27699         osd.83       up  1.00000 1.00000
>  86   hdd   7.27699         osd.86     down  1.00000 1.00000
>  88   hdd   7.27699         osd.88     down  1.00000 1.00000
>  91   hdd   7.27699         osd.91     down  1.00000 1.00000
>  94   hdd   7.27699         osd.94     down  1.00000 1.00000
>  97   hdd   7.27699         osd.97     down  1.00000 1.00000
> 100   hdd   7.27699         osd.100    down        0 1.00000
> 103   hdd   7.27699         osd.103    down  1.00000 1.00000
> 106   hdd   7.27699         osd.106      up  1.00000 1.00000
> -3        261.98599     host ceph002
>   0   hdd   7.27699         osd.0      down        0 1.00000
>   4   hdd   7.27699         osd.4        up  1.00000 1.00000
>   7   hdd   7.27699         osd.7        up  1.00000 1.00000
>  11   hdd   7.27699         osd.11     down  1.00000 1.00000
>  13   hdd   7.27699         osd.13       up  1.00000 1.00000
>  16   hdd   7.27699         osd.16     down  1.00000 1.00000
>  19   hdd   7.27699         osd.19     down        0 1.00000
>  23   hdd   7.27699         osd.23       up  1.00000 1.00000
>  26   hdd   7.27699         osd.26     down        0 1.00000
>  29   hdd   7.27699         osd.29     down        0 1.00000
>  32   hdd   7.27699         osd.32     down        0 1.00000
>  33   hdd   7.27699         osd.33     down        0 1.00000
>  36   hdd   7.27699         osd.36     down        0 1.00000
>  39   hdd   7.27699         osd.39     down  1.00000 1.00000
>  43   hdd   7.27699         osd.43       up  1.00000 1.00000
>  46   hdd   7.27699         osd.46       up  1.00000 1.00000
>  49   hdd   7.27699         osd.49     down  1.00000 1.00000
>  52   hdd   7.27699         osd.52     down  1.00000 1.00000
>  55   hdd   7.27699         osd.55     down        0 1.00000
>  58   hdd   7.27699         osd.58       up  1.00000 1.00000
>  61   hdd   7.27699         osd.61     down  1.00000 1.00000
>  64   hdd   7.27699         osd.64     down  1.00000 1.00000
>  67   hdd   7.27699         osd.67       up  1.00000 1.00000
>  70   hdd   7.27699         osd.70     down  1.00000 1.00000
>  73   hdd   7.27699         osd.73     down  1.00000 1.00000
>  76   hdd   7.27699         osd.76       up  1.00000 1.00000
>  78   hdd   7.27699         osd.78     down  1.00000 1.00000
>  81   hdd   7.27699         osd.81     down  1.00000 1.00000
>  84   hdd   7.27699         osd.84     down        0 1.00000
>  87   hdd   7.27699         osd.87     down  1.00000 1.00000
>  90   hdd   7.27699         osd.90     down        0 1.00000
>  93   hdd   7.27699         osd.93     down  1.00000 1.00000
>  96   hdd   7.27699         osd.96     down        0 1.00000
>  99   hdd   7.27699         osd.99     down        0 1.00000
> 102   hdd   7.27699         osd.102    down        0 1.00000
> 105   hdd   7.27699         osd.105      up  1.00000 1.00000
> -7        261.98599     host ceph003
>   2   hdd   7.27699         osd.2        up  1.00000 1.00000
>   5   hdd   7.27699         osd.5      down  1.00000 1.00000
>   8   hdd   7.27699         osd.8        up  1.00000 1.00000
>  10   hdd   7.27699         osd.10     down        0 1.00000
>  14   hdd   7.27699         osd.14     down        0 1.00000
>  17   hdd   7.27699         osd.17       up  1.00000 1.00000
>  20   hdd   7.27699         osd.20     down        0 1.00000
>  22   hdd   7.27699         osd.22     down        0 1.00000
>  25   hdd   7.27699         osd.25       up  1.00000 1.00000
>  28   hdd   7.27699         osd.28       up  1.00000 1.00000
>  31   hdd   7.27699         osd.31     down        0 1.00000
>  34   hdd   7.27699         osd.34     down        0 1.00000
>  38   hdd   7.27699         osd.38     down        0 1.00000
>  41   hdd   7.27699         osd.41     down  1.00000 1.00000
>  42   hdd   7.27699         osd.42     down        0 1.00000
>  45   hdd   7.27699         osd.45       up  1.00000 1.00000
>  48   hdd   7.27699         osd.48       up  1.00000 1.00000
>  51   hdd   7.27699         osd.51     down  1.00000 1.00000
>  54   hdd   7.27699         osd.54       up  1.00000 1.00000
>  57   hdd   7.27699         osd.57     down  1.00000 1.00000
>  60   hdd   7.27699         osd.60     down  1.00000 1.00000
>  63   hdd   7.27699         osd.63       up  1.00000 1.00000
>  66   hdd   7.27699         osd.66     down  1.00000 1.00000
>  69   hdd   7.27699         osd.69       up  1.00000 1.00000
>  72   hdd   7.27699         osd.72       up  1.00000 1.00000
>  75   hdd   7.27699         osd.75     down  1.00000 1.00000
>  79   hdd   7.27699         osd.79       up  1.00000 1.00000
>  82   hdd   7.27699         osd.82     down  1.00000 1.00000
>  85   hdd   7.27699         osd.85     down  1.00000 1.00000
>  89   hdd   7.27699         osd.89     down        0 1.00000
>  92   hdd   7.27699         osd.92     down  1.00000 1.00000
>  95   hdd   7.27699         osd.95     down        0 1.00000
>  98   hdd   7.27699         osd.98     down        0 1.00000
> 101   hdd   7.27699         osd.101    down  1.00000 1.00000
> 104   hdd   7.27699         osd.104    down        0 1.00000
> 107   hdd   7.27699         osd.107      up  1.00000 1.00000
>
> Ceph status:
> [root@ceph001 ~]# ceph status
>   cluster:
>     id:     54052e72-6835-410e-88a9-af4ac17a8113
>     health: HEALTH_WARN
>             1 filesystem is degraded
>             1 MDSs report slow metadata IOs
>             48 osds down
>             Reduced data availability: 2053 pgs inactive, 2043 pgs down, 7 pgs peering, 3 pgs incomplete, 126 pgs stale
>             Degraded data redundancy: 18473/27200783 objects degraded (0.068%), 106 pgs degraded, 103 pgs undersized
>             too many PGs per OSD (258 > max 250)
>
>   services:
>     mon: 3 daemons, quorum filler001,filler002,bezavrdat-master01
>     mgr: bezavrdat-master01(active), standbys: filler002, filler001
>     mds: cephfs-1/1/1 up {0=filler002=up:replay}, 1 up:standby
>     osd: 108 osds: 32 up, 80 in; 16 remapped pgs
>
>   data:
>     pools:   2 pools, 2176 pgs
>     objects: 2.73 M objects, 1.7 TiB
>     usage:   2.3 TiB used, 580 TiB / 582 TiB avail
>     pgs:     94.347% pgs not active
>              18473/27200783 objects degraded (0.068%)
>              1951 down
>                79 active+undersized+degraded
>                76 stale+down
>                23 stale+active+undersized+degraded
>                14 down+remapped
>                14 stale+active+clean
>                 6 stale+peering
>                 3 active+clean
>                 3 stale+active+recovery_wait+degraded
>                 2 incomplete
>                 2 stale+down+remapped
>                 1 stale+incomplete
>                 1 stale+remapped+peering
>                 1 active+recovering+undersized+degraded+remapped
>
> Thank you in advance!
>
> Regards,
>
> Andrea Del Monaco
> HPC Consultant - Big Data & Security
> Atos - Burgemeester Rijnderslaan 30 - 1185 MC Amstelveen - The Netherlands

--
Cheers,
Brad
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com