Hi David,

Yes, what I want is to add the OSD back with its data, while avoiding any trouble that could arise from the time it was out.

Is it possible? I suppose some PGs have been updated since then. Will Ceph handle that gracefully?

Ceph status is getting worse every day.

ceph status
    cluster 9028f4da-0d77-462b-be9b-dbdf7fa57771
     health HEALTH_ERR
            6 pgs inconsistent
            31 scrub errors
            too many PGs per OSD (305 > max 300)
     monmap e12: 2 mons at {blue-compute=172.16.0.119:6789/0,red-compute=172.16.0.100:6789/0}
            election epoch 4328, quorum 0,1 red-compute,blue-compute
      fsmap e881: 1/1/1 up {0=blue-compute=up:active}
     osdmap e7120: 5 osds: 5 up, 5 in
            flags require_jewel_osds
      pgmap v66976120: 764 pgs, 6 pools, 555 GB data, 140 kobjects
            1111 GB used, 3068 GB / 4179 GB avail
                 758 active+clean
                   6 active+clean+inconsistent
  client io 384 kB/s wr, 0 op/s rd, 83 op/s wr
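
In case it matters, this is roughly what I was planning to try against the inconsistent PGs, unless you advise otherwise (the <pgid> below is just a placeholder for the PGs listed by `ceph health detail`):

    ceph health detail | grep inconsistent
    # inspect what scrub actually found for one of the reported PGs
    rados list-inconsistent-obj <pgid> --format=json-pretty
    # if the errors look repairable, ask Ceph to repair that PG
    ceph pg repair <pgid>

I also believe the "too many PGs per OSD" warning threshold can be raised with mon_pg_warn_max_per_osd in ceph.conf, but that only hides the warning.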


I want to add the old OSD back, let the copies rebalance across more hosts/OSDs, and then take it out again.
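
Roughly, the sequence I have in mind is something like this (only a sketch, not something I have run yet; osd.1 is the old OSD in my case, and this assumes it still exists in the osdmap/crush map, otherwise it would have to be re-created first):

    # prevent OSDs from being auto-marked out while the old one comes and goes
    ceph osd set noout
    # start the old OSD with its existing data and journal
    systemctl start ceph-osd@1
    # once recovery/backfill has finished, drain it again
    ceph osd crush reweight osd.1 0
    # when it no longer holds any PGs, take it out and remove it for good
    ceph osd out 1
    ceph osd crush remove osd.1
    ceph auth del osd.1
    ceph osd rm 1
    ceph osd unset noout

Does that sound reasonable, or is there a safer way to do it?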


Best regards,


On 19/09/17 14:47, David Turner wrote:

Are you asking to add the OSD back with its data, or to add it back in as a fresh OSD? What is your `ceph status`?


On Tue, Sep 19, 2017, 5:23 AM Gonzalo Aguilar Delgado <gagui...@aguilardelgado.com> wrote:

    Hi David,

    Thank you for the great explanation of the weights. I thought
    Ceph was adjusting them based on disk size, but it seems it's not.

    But I don't think that was the problem; the node seems to have
    been failing because of a software bug, since the disk was not
    full by any means:

    /dev/sdb1   976284608  172396756  803887852  18%  /var/lib/ceph/osd/ceph-1

    Now the question is whether I can safely add this OSD back again.
    Is it possible?

    Best regards,



    On 14/09/17 23:29, David Turner wrote:
    Your weights should more closely represent the size of the OSDs.
    OSD3 and OSD6 are weighted properly, but your other 3 OSDs have
    the same weight even though OSD0 is twice the size of OSD2 and OSD4.

    Your OSD weights are what I thought you were referring to when
    you said you set the crush map to 1.  At some point it does look
    like you set all of your OSD weights to 1, which would apply to
    OSD1.  If the OSD was too small for that much data, it would have
    filled up and be too full to start.  Can you mount that disk and
    see how much free space is on it?
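
    Something like this should be enough to check (assuming the OSD's
    data partition is /dev/sdb1 and it isn't currently mounted; adjust
    the device to whatever it actually is):

        # mount the old OSD's data partition read-only and check usage
        mkdir -p /mnt/osd1
        mount -o ro /dev/sdb1 /mnt/osd1
        df -h /mnt/osd1
        umount /mnt/osd1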

    Just so you understand what that weight is, it is how much data
    the cluster is going to put on it.  The default is for the weight
    to be the size of the OSD in TiB (1024-based, as opposed to TB,
    which is 1000-based).  If you set the weight of a 1TB disk and a
    4TB disk both to 1, then the cluster will try to give them the
    same amount of data.  If you set the 4TB disk to a weight of 4,
    then the cluster will try to give it 4x more data than the 1TB
    drive (usually what you want).
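
    To put numbers on that example: a nominal 4TB drive is about
    4 x 10^12 bytes and a TiB is 2^40 bytes, so its default weight
    would come out to roughly 4 x 10^12 / 2^40 ≈ 3.64, while the 1TB
    drive would default to about 0.91.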

    In your case, your 926G OSD0 has a weight of 1 and your 460G OSD2
    has a weight of 1 so the cluster thinks they should each receive
    the same amount of data (which it did, they each have ~275GB of
    data).  OSD3 has a weight of 1.36380 (its size in TiB) and OSD6
    has a weight of 0.90919 and they have basically the same %used
    space (17%) as opposed to the same amount of data because the
    weight is based on their size.
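
    If you wanted those three weighted by size as well, it would look
    roughly like this (weights computed from the sizes in your
    `ceph osd df` output; double-check before running, since each
    reweight will move data around):

        ceph osd crush reweight osd.0 0.90   # 926G / 1024 ~= 0.90 TiB
        ceph osd crush reweight osd.2 0.45   # 460G / 1024 ~= 0.45 TiB
        ceph osd crush reweight osd.4 0.45   # 465G / 1024 ~= 0.45 TiB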

    As long as you had enough replicas of your data in the cluster
    for it to recover from you removing OSD1, such that your cluster
    is health_ok without any missing objects, then there is nothing
    you need off of OSD1 and Ceph recovered from the lost disk
    successfully.

    On Thu, Sep 14, 2017 at 4:39 PM Gonzalo Aguilar Delgado
    <gagui...@aguilardelgado.com> wrote:

        Hello,

        I was on an old version of Ceph, and it showed a warning saying:

        crush map has straw_calc_version=0

        I read that adjusting it would only trigger a full rebalance,
        so the admin should choose when to do it. So I went straight
        ahead and ran:


        ceph osd crush tunables optimal
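
        (As far as I know, the tunables that ended up active can be
        checked with `ceph osd crush show-tunables`; I haven't pasted
        that output here.)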

        It rebalanced as it said, but then I started to see lots of
        PGs in a bad state. I discovered that it was because of my
        OSD1. I thought it was a disk failure, so I added a new OSD6
        and the system started to rebalance. Still, the OSD was not
        starting.

        I thought about wiping it all, but I preferred to leave the
        disk as it was, with the journal intact, in case I can recover
        data from it. (See mail: [ceph-users] Scrub failing all the
        time, new inconsistencies keep appearing.)


        So here's the information. But it has OSD1 replaced by OSD3,
        sorry.

        ID WEIGHT  REWEIGHT SIZE  USE   AVAIL %USE  VAR PGS
         0 1.00000  1.00000  926G  271G  654G 29.34 1.10 369
         2 1.00000  1.00000  460G  284G  176G 61.67 2.32 395
         4 1.00000  1.00000  465G  151G  313G 32.64 1.23 214
         3 1.36380  1.00000 1396G  239G 1157G 17.13 0.64 340
         6 0.90919  1.00000  931G  164G  766G 17.70 0.67 210
                      TOTAL 4179G 1111G 3067G 26.60
        MIN/MAX VAR: 0.64/2.32  STDDEV: 16.99

        As I said, I still have OSD1 intact, so I can do whatever you
        need with it except re-adding it to the cluster, since I don't
        know what that would do; it might cause havoc.
        Best regards,


        On 14/09/17 17:12, David Turner wrote:
        What do you mean by "updated crush map to 1"?  Can you
        please provide a copy of your crush map and `ceph osd df`?

        On Wed, Sep 13, 2017 at 6:39 AM Gonzalo Aguilar Delgado
        <gagui...@aguilardelgado.com> wrote:

            Hi,

            I recently updated the crush map to 1 and did all the
            relocation of the PGs. At the end I found that one of the
            OSDs is not starting.

            This is what it shows:


            2017-09-13 10:37:34.287248 7f49cbe12700 -1 *** Caught
            signal (Aborted) **
             in thread 7f49cbe12700 thread_name:filestore_sync

             ceph version 10.2.7
            (50e863e0f4bc8f4b9e31156de690d765af245185)
             1: (()+0x9616ee) [0xa93c6ef6ee]
             2: (()+0x11390) [0x7f49d9937390]
             3: (gsignal()+0x38) [0x7f49d78d3428]
             4: (abort()+0x16a) [0x7f49d78d502a]
             5: (ceph::__ceph_assert_fail(char const*, char const*,
            int, char const*)+0x26b) [0xa93c7ef43b]
             6: (FileStore::sync_entry()+0x2bbb) [0xa93c47fcbb]
             7: (FileStore::SyncThread::entry()+0xd) [0xa93c4adcdd]
             8: (()+0x76ba) [0x7f49d992d6ba]
             9: (clone()+0x6d) [0x7f49d79a53dd]
             NOTE: a copy of the executable, or `objdump -rdS
            <executable>` is needed to interpret this.

            --- begin dump of recent events ---
                -3> 2017-09-13 10:37:34.253808 7f49dac6e8c0  5 osd.1
            pg_epoch: 6293 pg[10.8c( v 6220'575937
            (4942'572901,6220'575937] local-les=6235 n=282 ec=419
            les/c/f 6235/6235/0 6293/6293/6290) [1,2]/[2] r=-1 lpr=0
            pi=6234-6292/24 crt=6220'575937 lcod 0'0 inactive NOTIFY
            NIBBLEWISE] exit Initial 0.029683 0 0.000000
                -2> 2017-09-13 10:37:34.253848 7f49dac6e8c0  5 osd.1
            pg_epoch: 6293 pg[10.8c( v 6220'575937
            (4942'572901,6220'575937] local-les=6235 n=282 ec=419
            les/c/f 6235/6235/0 6293/6293/6290) [1,2]/[2] r=-1 lpr=0
            pi=6234-6292/24 crt=6220'575937 lcod 0'0 inactive NOTIFY
            NIBBLEWISE] enter Reset
                -1> 2017-09-13 10:37:34.255018 7f49dac6e8c0  5 osd.1
            pg_epoch: 6293 pg[10.90(unlocked)] enter Initial
                 0> 2017-09-13 10:37:34.287248 7f49cbe12700 -1 ***
            Caught signal (Aborted) **
             in thread 7f49cbe12700 thread_name:filestore_sync

             ceph version 10.2.7
            (50e863e0f4bc8f4b9e31156de690d765af245185)
             1: (()+0x9616ee) [0xa93c6ef6ee]
             2: (()+0x11390) [0x7f49d9937390]
             3: (gsignal()+0x38) [0x7f49d78d3428]
             4: (abort()+0x16a) [0x7f49d78d502a]
             5: (ceph::__ceph_assert_fail(char const*, char const*,
            int, char const*)+0x26b) [0xa93c7ef43b]
             6: (FileStore::sync_entry()+0x2bbb) [0xa93c47fcbb]
             7: (FileStore::SyncThread::entry()+0xd) [0xa93c4adcdd]
             8: (()+0x76ba) [0x7f49d992d6ba]
             9: (clone()+0x6d) [0x7f49d79a53dd]
             NOTE: a copy of the executable, or `objdump -rdS
            <executable>` is needed to interpret this.

            --- logging levels ---
               0/ 5 none
               0/ 1 lockdep
               0/ 1 context
               1/ 1 crush
               1/ 5 mds
               1/ 5 mds_balancer
               1/ 5 mds_locker
               1/ 5 mds_log
               1/ 5 mds_log_expire
               1/ 5 mds_migrator
               0/ 1 buffer
               0/ 1 timer
               0/ 1 filer
               0/ 1 striper
               0/ 1 objecter
               0/ 5 rados
               0/ 5 rbd
               0/ 5 rbd_mirror
               0/ 5 rbd_replay
               0/ 5 journaler
               0/ 5 objectcacher
               0/ 5 client
               0/ 5 osd
               0/ 5 optracker
               0/ 5 objclass
               1/ 3 filestore
               1/ 3 journal
               0/ 5 ms
               1/ 5 mon
               0/10 monc
               1/ 5 paxos
               0/ 5 tp
               1/ 5 auth
               1/ 5 crypto
               1/ 1 finisher
               1/ 5 heartbeatmap
               1/ 5 perfcounter
               1/ 5 rgw
               1/10 civetweb
               1/ 5 javaclient
               1/ 5 asok
               1/ 1 throttle
               0/ 0 refs
               1/ 5 xio
               1/ 5 compressor
               1/ 5 newstore
               1/ 5 bluestore
               1/ 5 bluefs
               1/ 3 bdev
               1/ 5 kstore
               4/ 5 rocksdb
               4/ 5 leveldb
               1/ 5 kinetic
               1/ 5 fuse
              -2/-2 (syslog threshold)
              -1/-1 (stderr threshold)
              max_recent     10000
              max_new         1000
              log_file /var/log/ceph/ceph-osd.1.log
            --- end dump of recent events ---



            Is there any way to recover it or should I open a bug?


            Best regards





_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
