After leaving it for 12 hours the cluster status is now healthy, but why did
the backfill take such a long time?

How do I fine-tune this in case the same kind of error pops up again?
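
(Note added for reference, assuming only the mimic defaults are in play: the
"ceph -s" output further down showed roughly 1.14 M degraded objects
recovering at about 10 objects/s at that moment, so with osd_max_backfills at
its default of 1 a backfill lasting many hours is expected. The commands
below are only a sketch of the knobs usually involved; the values are
illustrative, not recommendations, and raising them on a 3-OSD test cluster
also raises the load that probably caused the OSD crash in the first place.)

  # check the current value via the admin socket (run on the host of osd.0)
  ceph daemon osd.0 config get osd_max_backfills

  # raise backfill/recovery concurrency at runtime on all OSDs
  ceph tell osd.* injectargs '--osd_max_backfills 2 --osd_recovery_max_active 4'

  # or persist the setting in the mimic config database
  ceph config set osd osd_max_backfills 2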


On Thu, Aug 29, 2019 at 6:52 PM Caspar Smit <caspars...@supernas.eu> wrote:

> Hi,
>
> This output doesn't show anything 'wrong' with the cluster. It's just
> still recovering (backfilling) from what looks like one of your OSDs
> having crashed and restarted.
> The backfilling is taking a while because max_backfills = 1 and you only
> have 3 OSDs in total, so each PG's backfill has to wait for the previous
> PG's backfill to complete.
>
> The real concern is not the current state of the cluster but how you ended
> up in this state. The script probably overloaded the OSDs.
>
> I also advise you to add a monitor to each of your other 2 nodes (running
> 3 mons in total). Running a single mon is not advised.
>
> Furthermore, just let the backfilling complete and HEALTH_OK will return
> eventually if nothing goes wrong in between.
>
> Met vriendelijke groet,
>
> Caspar Smit
> Systemengineer
> SuperNAS
> Dorsvlegelstraat 13
> 1445 PA Purmerend
>
> t: (+31) 299 410 414
> e: caspars...@supernas.eu
> w: www.supernas.eu
>
>
> Op do 29 aug. 2019 om 14:35 schreef Amudhan P <amudha...@gmail.com>:
>
>> output from "ceph -s "
>>
>>   cluster:
>>     id:     7c138e13-7b98-4309-b591-d4091a1742b4
>>     health: HEALTH_WARN
>>             Degraded data redundancy: 1141587/7723191 objects degraded
>> (14.781%), 15 pgs degraded, 16 pgs undersized
>>
>>   services:
>>     mon: 1 daemons, quorum mon01
>>     mgr: mon01(active)
>>     mds: cephfs-tst-1/1/1 up  {0=mon01=up:active}
>>     osd: 3 osds: 3 up, 3 in; 16 remapped pgs
>>
>>   data:
>>     pools:   2 pools, 64 pgs
>>     objects: 2.57 M objects, 59 GiB
>>     usage:   190 GiB used, 5.3 TiB / 5.5 TiB avail
>>     pgs:     1141587/7723191 objects degraded (14.781%)
>>              48 active+clean
>>              15 active+undersized+degraded+remapped+backfill_wait
>>              1  active+undersized+remapped+backfilling
>>
>>   io:
>>     recovery: 0 B/s, 10 objects/s
>>
>> output from  "ceph osd tree"
>> ID CLASS WEIGHT  TYPE NAME           STATUS REWEIGHT PRI-AFF
>> -1       5.45819 root default
>> -3       1.81940     host test-node1
>>  0   hdd 1.81940         osd.0           up  1.00000 1.00000
>> -5       1.81940     host test-node2
>>  1   hdd 1.81940         osd.1           up  1.00000 1.00000
>> -7       1.81940     host test-node3
>>  2   hdd 1.81940         osd.2           up  1.00000 1.00000
>>
>> The failure domain is not configured yet; the setup is 3 OSD nodes, each
>> with a single disk, and 1 node running mon, mds and mgr.
>> The cluster was healthy until I ran a script that creates multiple
>> folders.
>>
>> regards
>> Amudhan
>>
>> On Thu, Aug 29, 2019 at 5:33 PM Heðin Ejdesgaard Møller <h...@synack.fo>
>> wrote:
>>
>>> In adition to ceph -s, could you provide the output of
>>> ceph osd tree
>>> and specify what your failure domain is ?
>>>
>>> /Heðin
>>>
>>>
>>> On hós, 2019-08-29 at 13:55 +0200, Janne Johansson wrote:
>>> >
>>> >
>>> > Den tors 29 aug. 2019 kl 13:50 skrev Amudhan P <amudha...@gmail.com>:
>>> > > Hi,
>>> > >
>>> > > I am using ceph version 13.2.6 (mimic) on a test setup, trying out
>>> > > cephfs.
>>> > > My ceph health status is showing a warning.
>>> > >
>>> > > "ceph health"
>>> > > HEALTH_WARN Degraded data redundancy: 1197023/7723191 objects
>>> > > degraded (15.499%)
>>> > >
>>> > > "ceph health detail"
>>> > > HEALTH_WARN Degraded data redundancy: 1197128/7723191 objects
>>> > > degraded (15.500%)
>>> > > PG_DEGRADED Degraded data redundancy: 1197128/7723191 objects
>>> > > degraded (15.500%)
>>> > >     pg 2.0 is stuck undersized for 1076.454929, current state
>>> > > active+undersized+
>>> > >     pg 2.2 is stuck undersized for 1076.456639, current state
>>> > > active+undersized+
>>> > >
>>> >
>>> > How does "ceph -s" look?
>>> > It should have more info on what else is wrong.
>>> >
>>> > --
>>> > May the most significant bit of your life be positive.
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
