Ceph doesn't delete a copy if it can't find a new place to store it; this is a good thing. Add one more server to see the data actually move elsewhere (without a health warning in Nautilus, with a health warning in older versions).
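If you want to watch that rebalance happen once the extra capacity is in, something along these lines should do it (a rough sketch; the exact output format and the "remapped" state filter for "ceph pg ls" can vary a bit by release):

    # misplaced object count/percentage shows up in the status summary
    ceph -s | grep misplaced

    # list the PGs that are currently remapped (up set != acting set)
    ceph pg ls remapped

    # per-OSD utilisation, including the new OSDs as they fill up
    ceph osd df tree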
It's a little bit unfortunate that "ceph osd df" lies about the usage of out OSDs: they go to 0 immediately; this used to work differently in pre-Luminous (or was it pre-BlueStore?)

Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90


On Sun, Jun 9, 2019 at 2:38 PM Tarek Zegar <tze...@us.ibm.com> wrote:
> Hi Huang,
>
> So you are suggesting that even though osd.4 in this case has weight 0,
> it's still getting new data written to it? I find that counter to what
> weight 0 means.
>
> Thanks
> Tarek
>
>
> From: huang jun <hjwsm1...@gmail.com>
> To: Tarek Zegar <tze...@us.ibm.com>
> Cc: Paul Emmerich <paul.emmer...@croit.io>, Ceph Users <ceph-users@lists.ceph.com>
> Date: 06/08/2019 05:27 AM
> Subject: [EXTERNAL] Re: [ceph-users] Reweight OSD to 0, why doesn't report degraded if UP set under Pool Size
> ------------------------------
>
> I think the written data will also go to osd.4 in this case, because your
> osd.4 is not down: Ceph doesn't consider the PG to have any OSD down, so
> it replicates the data to all OSDs in the actingbackfill set.
>
> On Fri, Jun 7, 2019 at 10:37 PM Tarek Zegar <tze...@us.ibm.com> wrote:
>
> Paul / All
>
> I'm not sure what warning you are referring to; I'm on Nautilus. The
> point I'm getting at is that if you weight out all OSDs on one host in a
> cluster of 3 OSD hosts with 3 OSDs each (crush rule = host) and then
> write to the cluster, it *should* IMO not just say remapped but
> undersized/degraded.
>
> See below: 1 out of the 3 OSD hosts has ALL its OSDs marked out with
> weight = 0. When you write (say using FIO), the PGs *only* have 2 OSDs
> in them (UP set), which is pool min size. I don't understand why it's
> not saying undersized/degraded; this seems like a bug. Who cares that
> the acting set has the 3 original OSDs in it; the actual data is only on
> 2 OSDs, which is a degraded state.
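The up/acting mismatch for an individual PG can be inspected directly; a minimal sketch, assuming the PG ID (1.4d) and OSD ID (4) from the example further down:

    # show the up set and the acting set for one PG
    # roughly: osdmap eNN pg 1.4d (1.4d) -> up [6,5] acting [6,5,4]
    ceph pg map 1.4d

    # the same information for every PG, in brief form
    ceph pg dump pgs_brief

    # the PGs that still reference the weighted-out OSD
    ceph pg ls-by-osd 4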
> root@hostadmin:~# ceph -s
>   cluster:
>     id:     33d41932-9df2-40ba-8e16-8dedaa4b3ef6
>     health: HEALTH_WARN
>             application not enabled on 1 pool(s)
>
>   services:
>     mon: 1 daemons, quorum hostmonitor1 (age 29m)
>     mgr: hostmonitor1(active, since 31m)
>     osd: 9 osds: 9 up, 6 in; 100 remapped pgs
>
>   data:
>     pools:   1 pools, 100 pgs
>     objects: 520 objects, 2.0 GiB
>     usage:   15 GiB used, 75 GiB / 90 GiB avail
>     pgs:     520/1560 objects misplaced (33.333%)
>              100 active+clean+remapped
>
> root@hostadmin:~# ceph osd tree
> ID CLASS WEIGHT  TYPE NAME          STATUS REWEIGHT PRI-AFF
> -1       0.08817 root default
> -3       0.02939     host hostosd1
>  0   hdd 0.00980         osd.0          up  1.00000 1.00000
>  3   hdd 0.00980         osd.3          up  1.00000 1.00000
>  6   hdd 0.00980         osd.6          up  1.00000 1.00000
> -5       0.02939     host hostosd2
>  1   hdd 0.00980         osd.1          up        0 1.00000
>  4   hdd 0.00980         osd.4          up        0 1.00000
>  7   hdd 0.00980         osd.7          up        0 1.00000
> -7       0.02939     host hostosd3
>  2   hdd 0.00980         osd.2          up  1.00000 1.00000
>  5   hdd 0.00980         osd.5          up  1.00000 1.00000
>  8   hdd 0.00980         osd.8          up  1.00000 1.00000
>
> root@hostadmin:~# ceph osd df
> ID CLASS WEIGHT  REWEIGHT SIZE   RAW USE DATA    OMAP   META     AVAIL   %USE  VAR  PGS STATUS
>  0   hdd 0.00980  1.00000 10 GiB 1.7 GiB 765 MiB 12 KiB 1024 MiB 8.2 GiB 17.48 1.03  34     up
>  3   hdd 0.00980  1.00000 10 GiB 1.7 GiB 765 MiB 12 KiB 1024 MiB 8.2 GiB 17.48 1.03  36     up
>  6   hdd 0.00980  1.00000 10 GiB 1.6 GiB 593 MiB  4 KiB 1024 MiB 8.4 GiB 15.80 0.93  30     up
>  1   hdd 0.00980        0    0 B     0 B     0 B    0 B      0 B     0 B     0    0   0     up
>  4   hdd 0.00980        0    0 B     0 B     0 B    0 B      0 B     0 B     0    0   0     up
>  7   hdd 0.00980        0    0 B     0 B     0 B    0 B      0 B     0 B     0    0 100     up
>  2   hdd 0.00980  1.00000 10 GiB 1.5 GiB 525 MiB  8 KiB 1024 MiB 8.5 GiB 15.13 0.89  20     up
>  5   hdd 0.00980  1.00000 10 GiB 1.9 GiB 941 MiB  4 KiB 1024 MiB 8.1 GiB 19.20 1.13  43     up
>  8   hdd 0.00980  1.00000 10 GiB 1.6 GiB 657 MiB  8 KiB 1024 MiB 8.4 GiB 16.42 0.97  37     up
>                    TOTAL  90 GiB  15 GiB 6.2 GiB 61 KiB  9.0 GiB  75 GiB 16.92
> MIN/MAX VAR: 0.89/1.13  STDDEV: 1.32
>
> Tarek Zegar
> Senior SDS Engineer
> Email: tze...@us.ibm.com
> Mobile: 630.974.7172
>
>
> From: Paul Emmerich <paul.emmer...@croit.io>
> To: Tarek Zegar <tze...@us.ibm.com>
> Cc: Ceph Users <ceph-users@lists.ceph.com>
> Date: 06/07/2019 05:25 AM
> Subject: [EXTERNAL] Re: [ceph-users] Reweight OSD to 0, why doesn't report degraded if UP set under Pool Size
> ------------------------------
>
> Remapped no longer triggers a health warning in Nautilus.
>
> Your data is still there; it's just on the wrong OSD, if that OSD is
> still up and running.
>
>
> Paul
>
> On Thu, Jun 6, 2019 at 10:48 PM Tarek Zegar <tze...@us.ibm.com> wrote:
>
> For testing purposes I set a bunch of OSDs to 0 weight; this correctly
> forces Ceph to not use said OSDs. I took enough out that the UP set only
> had the pool's min size number of OSDs (i.e. 2 OSDs).
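For reference, the weighting-out described here boils down to something like the commands below (OSD IDs taken from the tree that follows; "ceph osd reweight" sets the 0-1 override weight shown in the REWEIGHT column, as opposed to "ceph osd crush reweight", which changes the CRUSH weight itself):

    # set the override weight of every OSD on one host to 0
    ceph osd reweight 1 0
    ceph osd reweight 4 0
    ceph osd reweight 7 0

    # confirm: REWEIGHT shows 0 for osd.1, osd.4 and osd.7, but they stay "up"
    ceph osd tree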
> Two questions:
> 1. Why doesn't the acting set eventually match the UP set and simply
>    point to [6,5] only?
> 2. Why are none of the PGs marked as undersized and degraded? The data
>    is only hosted on 2 OSDs rather than pool size (3); I would expect an
>    undersized and degraded warning for PGs with data.
>
> Example PG:
> PG 1.4d  active+clean+remapped  UP = [6,5]  Acting = [6,5,4]
>
> OSD tree:
> ID CLASS WEIGHT  TYPE NAME          STATUS REWEIGHT PRI-AFF
> -1       0.08817 root default
> -3       0.02939     host hostosd1
>  0   hdd 0.00980         osd.0          up  1.00000 1.00000
>  3   hdd 0.00980         osd.3          up  1.00000 1.00000
>  6   hdd 0.00980         osd.6          up  1.00000 1.00000
> -5       0.02939     host hostosd2
>  1   hdd 0.00980         osd.1          up        0 1.00000
>  4   hdd 0.00980         osd.4          up        0 1.00000
>  7   hdd 0.00980         osd.7          up        0 1.00000
> -7       0.02939     host hostosd3
>  2   hdd 0.00980         osd.2          up  1.00000 1.00000
>  5   hdd 0.00980         osd.5          up  1.00000 1.00000
>  8   hdd 0.00980         osd.8          up        0 1.00000
>
> --
> Thank you!
> HuangJun
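To dig into questions like the two above, the PG's own view can be dumped and the pool's replication settings double-checked; a rough sketch, with the pool name left as a placeholder since it isn't given in the thread:

    # full peering state of the example PG, including up, acting and state
    ceph pg 1.4d query

    # the pool's replication settings (size=3, min_size=2 in this thread)
    ceph osd pool get <pool> size
    ceph osd pool get <pool> min_size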
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com