Ceph doesn't delete a copy if it can't find a new place to store it; that's
a good thing.
Add one more server and you'll see the data actually move elsewhere (without
a health warning in Nautilus, with a health warning in older versions).
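
A quick way to watch this, assuming you just want to see the remapped PGs
drain once the extra server is in:

    # PGs whose acting set differs from their up set
    ceph pg ls remapped

    # the misplaced object count in the status output should shrink
    # as backfill makes progress
    ceph -s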


It's a little unfortunate that "ceph osd df" lies about the usage of out
OSDs: their reported usage drops to 0 immediately; this used to work
differently in pre-Luminous (or was it pre-BlueStore?).
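
If you need the real utilization of an out-but-up OSD, you can still ask the
daemon itself; this assumes you are on the OSD's host with access to its
admin socket and that the stat_bytes counters are still exported under these
names:

    # prints stat_bytes, stat_bytes_used and stat_bytes_avail from the
    # "osd" section of the perf counters
    ceph daemon osd.4 perf dump | grep stat_bytes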


Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90


On Sun, Jun 9, 2019 at 2:38 PM Tarek Zegar <tze...@us.ibm.com> wrote:

> Hi Huang,
>
> So you are suggesting that even though osd.4 in this case has weight 0,
> it's still getting new data written to it? I find that counter to what
> weight 0 means.
>
> Thanks
> Tarek
>
>
>
>
> From: huang jun <hjwsm1...@gmail.com>
> To: Tarek Zegar <tze...@us.ibm.com>
> Cc: Paul Emmerich <paul.emmer...@croit.io>, Ceph Users <
> ceph-users@lists.ceph.com>
> Date: 06/08/2019 05:27 AM
> Subject: [EXTERNAL] Re: [ceph-users] Reweight OSD to 0, why doesn't
> report degraded if UP set under Pool Size
> ------------------------------
>
>
>
> I think the written data will also go to osd.4 in this case.
> Because your osd.4 is not down, Ceph doesn't consider the PG to have an OSD
> down, and it will replicate the data to all OSDs in the acting/backfill set.
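>
> You can check this with pg map (1.4d is the example PG from your mail; the
> output below is only roughly what it looks like):
>
>     ceph pg map 1.4d
>     # roughly: osdmap eNN pg 1.4d (1.4d) -> up [6,5] acting [6,5,4]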
>
> Tarek Zegar <tze...@us.ibm.com> wrote on Fri, Jun 7, 2019 at 10:37 PM:
>
>    Paul / All
>
>    I'm not sure which warning you are referring to; I'm on Nautilus. The
>    point I'm getting at is: if you weight out all OSDs on one host in a
>    cluster of 3 OSD hosts with 3 OSDs each, CRUSH rule = host, and then
>    write to the cluster, it *should* IMO not just say remapped but
>    undersized / degraded.
>
>    See below: 1 out of the 3 OSD hosts has ALL of its OSDs marked out with
>    weight = 0. When you write (say, using FIO), the PGs *only* have 2 OSDs
>    in them (UP set), which is the pool's min size. I don't understand why
>    it's not reporting undersized/degraded; this seems like a bug. Who cares
>    that the acting set still has the 3 original OSDs in it, the actual data
>    is only on 2 OSDs, which is a degraded state.
>
>    root@hostadmin:~# ceph -s
>    cluster:
>    id: 33d41932-9df2-40ba-8e16-8dedaa4b3ef6
>    health: HEALTH_WARN
>    application not enabled on 1 pool(s)
>
>    services:
>    mon: 1 daemons, quorum hostmonitor1 (age 29m)
>    mgr: hostmonitor1(active, since 31m)
>    osd: 9 osds: 9 up, 6 in; 100 remapped pgs
>
>    data:
>    pools: 1 pools, 100 pgs
>    objects: 520 objects, 2.0 GiB
>    usage: 15 GiB used, 75 GiB / 90 GiB avail
>    pgs: 520/1560 objects misplaced (33.333%)
>         100 active+clean+remapped
>
>    root@hostadmin:~# ceph osd tree
>    ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
>    -1 0.08817 root default
>    -3 0.02939 host hostosd1
>    0 hdd 0.00980 osd.0 up 1.00000 1.00000
>    3 hdd 0.00980 osd.3 up 1.00000 1.00000
>    6 hdd 0.00980 osd.6 up 1.00000 1.00000
>    -5 0.02939 host hostosd2
>    1 hdd 0.00980 osd.1 up 0 1.00000
>    4 hdd 0.00980 osd.4 up 0 1.00000
>    7 hdd 0.00980 osd.7 up 0 1.00000
>    -7 0.02939 host hostosd3
>    2 hdd 0.00980 osd.2 up 1.00000 1.00000
>    5 hdd 0.00980 osd.5 up 1.00000 1.00000
>    8 hdd 0.00980 osd.8 up 1.00000 1.00000
>
>
>    root@hostadmin:~# ceph osd df
>    ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR
>    PGS STATUS
>    0 hdd 0.00980 1.00000 10 GiB 1.7 GiB 765 MiB 12 KiB 1024 MiB 8.2 GiB
>    17.48 1.03 34 up
>    3 hdd 0.00980 1.00000 10 GiB 1.7 GiB 765 MiB 12 KiB 1024 MiB 8.2 GiB
>    17.48 1.03 36 up
>    6 hdd 0.00980 1.00000 10 GiB 1.6 GiB 593 MiB 4 KiB 1024 MiB 8.4 GiB
>    15.80 0.93 30 up
>    1 hdd 0.00980 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 0 up
>    4 hdd 0.00980 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 0 up
>    7 hdd 0.00980 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 100 up
>    2 hdd 0.00980 1.00000 10 GiB 1.5 GiB 525 MiB 8 KiB 1024 MiB 8.5 GiB
>    15.13 0.89 20 up
>    5 hdd 0.00980 1.00000 10 GiB 1.9 GiB 941 MiB 4 KiB 1024 MiB 8.1 GiB
>    19.20 1.13 43 up
>    8 hdd 0.00980 1.00000 10 GiB 1.6 GiB 657 MiB 8 KiB 1024 MiB 8.4 GiB
>    16.42 0.97 37 up
>    TOTAL 90 GiB 15 GiB 6.2 GiB 61 KiB 9.0 GiB 75 GiB 16.92
>    MIN/MAX VAR: 0.89/1.13 STDDEV: 1.32
>    Tarek Zegar
>    Senior SDS Engineer
>    Email tze...@us.ibm.com
>    Mobile 630.974.7172
>
>
>
>
>
>    From: Paul Emmerich <paul.emmer...@croit.io>
>    To: Tarek Zegar <tze...@us.ibm.com>
>    Cc: Ceph Users <ceph-users@lists.ceph.com>
>    Date: 06/07/2019 05:25 AM
>    Subject: [EXTERNAL] Re: [ceph-users] Reweight OSD to 0, why doesn't
>    report degraded if UP set under Pool Size
>    ------------------------------
>
>
>
>    Remapped no longer triggers a health warning in Nautilus.
>
>    Your data is still there, it's just on the wrong OSD if that OSD is
>    still up and running.
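>
>    If you do want a warning for misplaced objects, there should be a config
>    option for that in Nautilus (mon_warn_on_misplaced, off by default, if I
>    remember the name correctly):
>
>       ceph config set global mon_warn_on_misplaced true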
>
>
>    Paul
>
>    --
>    Paul Emmerich
>
>    Looking for help with your Ceph cluster? Contact us at https://croit.io
>
>    croit GmbH
>    Freseniusstr. 31h
>    81247 München
>    www.croit.io
>    Tel: +49 89 1896585 90
>
>
>    On Thu, Jun 6, 2019 at 10:48 PM Tarek Zegar <tze...@us.ibm.com> wrote:
>          For testing purposes I set a bunch of OSDs to 0 weight, which
>          correctly forces Ceph to not use those OSDs. I took enough out
>          that the UP set only had the pool's min size # of OSDs (i.e.,
>          2 OSDs).
>
>          Two questions:
>          1. Why doesn't the acting set eventually match the UP set and
>          simply point to [6,5] only?
>          2. Why are none of the PGs marked as undersized and degraded?
>          The data is only hosted on 2 OSDs rather than pool size (3); I
>          would expect an undersized warning and a degraded state for PGs
>          with data.
>
>          Example PG:
>          PG 1.4d active+clean+remapped UP= [6,5] Acting = [6,5,4]
>
>          OSD Tree:
>          ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
>          -1 0.08817 root default
>          -3 0.02939 host hostosd1
>          0 hdd 0.00980 osd.0 up 1.00000 1.00000
>          3 hdd 0.00980 osd.3 up 1.00000 1.00000
>          6 hdd 0.00980 osd.6 up 1.00000 1.00000
>          -5 0.02939 host hostosd2
>          1 hdd 0.00980 osd.1 up 0 1.00000
>          4 hdd 0.00980 osd.4 up 0 1.00000
>          7 hdd 0.00980 osd.7 up 0 1.00000
>          -7 0.02939 host hostosd3
>          2 hdd 0.00980 osd.2 up 1.00000 1.00000
>          5 hdd 0.00980 osd.5 up 1.00000 1.00000
>          8 hdd 0.00980 osd.8 up 0 1.00000
>
>
>
>
>
>
>
>
> --
> Thank you!
> HuangJun
>
>
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
