Hi David, and thanks!

That was indeed the magic trick: no more peering, stale, or down PGs.

I upgraded the ceph packages on the hosts, restarted the OSDs, and then ran
"ceph osd require-osd-release luminous".
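
For the record, the rough per-host sequence was something like the
following (exact package and service names depend on the distribution, so
adjust as needed):

    apt-get update && apt-get install ceph ceph-osd   # upgrade the ceph packages (or the yum/dnf equivalent)
    systemctl restart ceph-osd.target                 # restart the OSDs on that host
    ceph versions                                     # confirm everything reports 12.2.x

and only after all OSDs were running luminous:

    ceph osd require-osd-release luminous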

/Magnus

2018-07-12 12:05 GMT+02:00 David Majchrzak <da...@oderland.se>:

> Hi/Hej Magnus,
>
> We had a similar issue going from latest hammer to jewel (so it might not
> be applicable for you), with PGs stuck peering / data misplaced, right
> after updating all mons to the latest jewel at that time, 10.2.10.
>
> Finally setting the require_jewel_osds flag put everything back in place
> (we were going to do this after restarting all OSDs, following the
> docs/changelogs).
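>
> In case it helps, that was just the one command (run from a node with the
> admin keyring), something like:
>
>     ceph osd set require_jewel_osds
>
> and the luminous counterpart should be "ceph osd require-osd-release
> luminous".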
>
> What does your ceph health detail look like?
>
> Did you run any other commands after starting your mon upgrade? Any
> command that changes the crush map might cause issues AFAIK (correct me
> if I'm wrong, but I think we ran into this once) if your mons and OSDs
> are on different versions.
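>
> If you want to double-check what is actually running, something like
>
>     ceph tell osd.* version
>
> should show what each OSD reports, which you can then compare with the
> packages on the mon hosts.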
>
> // david
>
> On jul 12 2018, at 11:45 am, Magnus Grönlund <mag...@gronlund.se> wrote:
>
>
> Hi list,
>
> Things went from bad to worse; I tried to upgrade some OSDs to Luminous to
> see if that could help, but that didn’t appear to make any difference.
> But for each restarted OSD there were a few PGs that the OSD seemed to
> “forget”, and the number of undersized PGs grew until some PGs had been
> “forgotten” by all 3 acting OSDs and became stale, even though all OSDs
> (and their disks) were available.
> Then the OSD processes grew so large that the servers ran out of memory
> (48 GB per server, with 10 2 TB disks per server) and started killing the
> OSDs…
> All OSDs were then shut down to try to preserve at least some data on the
> disks, but maybe it is too late?
>
> /Magnus
>
> 2018-07-11 21:10 GMT+02:00 Magnus Grönlund <mag...@gronlund.se>:
>
> Hi Paul,
>
> No, all OSDs are still on jewel; the issue started before I had even
> started to upgrade the first OSD, and they don't appear to be flapping.
> ceph -w shows a lot of slow requests etc., but nothing unexpected as far
> as I can tell considering the state the cluster is in.
>
> 2018-07-11 20:40:09.396642 osd.37 [WRN] 100 slow requests, 2 included
> below; oldest blocked for > 25402.278824 secs
> 2018-07-11 20:40:09.396652 osd.37 [WRN] slow request 1920.957326 seconds
> old, received at 2018-07-11 20:08:08.439214: osd_op(client.73540057.0:8289463
> 2.e57b3e32 (undecoded) ack+ondisk+retry+write+known_if_redirected
> e160294) currently waiting for peered
> 2018-07-11 20:40:09.396660 osd.37 [WRN] slow request 1920.048094 seconds
> old, received at 2018-07-11 20:08:09.348446: osd_op(client.671628641.0:998704
> 2.42f88232 (undecoded) ack+ondisk+retry+write+known_if_redirected
> e160475) currently waiting for peered
> 2018-07-11 20:40:10.397008 osd.37 [WRN] 100 slow requests, 2 included
> below; oldest blocked for > 25403.279204 secs
> 2018-07-11 20:40:10.397017 osd.37 [WRN] slow request 1920.043860 seconds
> old, received at 2018-07-11 20:08:10.353060: osd_op(client.231731103.0:1007729
> 3.e0ff5786 (undecoded) ondisk+write+known_if_redirected e137428)
> currently waiting for peered
> 2018-07-11 20:40:10.397023 osd.37 [WRN] slow request 1920.034101 seconds
> old, received at 2018-07-11 20:08:10.362819: osd_op(client.207458703.0:2000292
> 3.a8143b86 (undecoded) ondisk+write+known_if_redirected e137428)
> currently waiting for peered
> 2018-07-11 20:40:10.790573 mon.0 [INF] pgmap 4104 pgs: 5 down+peering,
> 1142 peering, 210 remapped+peering, 5 active+recovery_wait+degraded, 1551
> active+clean, 2 activating+undersized+degraded+remapped, 15
> active+remapped+backfilling, 178 unknown, 1 active+remapped, 3
> activating+remapped, 78 active+undersized+degraded+remapped+backfill_wait,
> 6 active+recovery_wait+degraded+remapped, 3 
> undersized+degraded+remapped+backfill_wait+peered,
> 5 active+undersized+degraded+remapped+backfilling, 295
> active+remapped+backfill_wait, 3 active+recovery_wait+undersized+degraded,
> 21 activating+undersized+degraded, 559 active+undersized+degraded, 4
> remapped, 17 undersized+degraded+peered, 1 
> active+recovery_wait+undersized+degraded+remapped;
> 13439 GB data, 42395 GB used, 160 TB / 201 TB avail; 4069 B/s rd, 746 kB/s
> wr, 5 op/s; 534753/10756032 objects degraded (4.972%); 779027/10756032
> objects misplaced (7.243%); 256 MB/s, 65 objects/s recovering
>
>
>
> There are a lot of things in the OSD log files that I'm unfamiliar with,
> but so far I haven't found anything that has given me a clue on how to
> fix the issue.
> BTW, restarting an OSD doesn't seem to help; on the contrary, it sometimes
> results in PGs being stuck undersized!
> I have attached an OSD log from the startup of one of the OSDs I restarted.
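>
> (For anyone who wants the exact list: the PGs that get stuck undersized
> after a restart should be the same ones listed by
>
>     ceph pg dump_stuck undersized
>
> on the cluster.)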
>
> Best regards
> /Magnus
>
>
> 2018-07-11 20:39 GMT+02:00 Paul Emmerich <paul.emmer...@croit.io>:
>
> Did you finish the upgrade of the OSDs? Are the OSDs flapping (ceph -w)?
> Is there anything weird in the OSDs' log files?
>
>
> Paul
>
> 2018-07-11 20:30 GMT+02:00 Magnus Grönlund <mag...@gronlund.se>:
>
> Hi,
>
> I started to upgrade a Ceph cluster from Jewel (10.2.10) to Luminous (12.2.6).
>
> After upgrading and restarting the mons everything looked OK: the mons had
> quorum, all OSDs were up and in, and all the PGs were active+clean.
> But before I had time to start upgrading the OSDs it became obvious that
> something had gone terribly wrong.
> All of a sudden, 1600 out of 4100 PGs were inactive and 40% of the data
> was misplaced!
>
> The mons appear OK and all OSDs are still up and in, but a few hours
> later there were still 1483 PGs stuck inactive, essentially all of them
> peering!
> Investigating one of the stuck PGs, it appears to be looping between
> “inactive”, “remapped+peering” and “peering”, and the epoch number is
> rising fast; see the attached pg query outputs.
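>
> (The attached files are plain "ceph pg <pgid> query" output for one of
> the PGs listed as stuck inactive, e.g. by "ceph pg dump_stuck inactive".)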
>
> We really can’t afford to lose the cluster or the data, so any help or
> suggestions on how to debug or fix this issue would be very much
> appreciated!
>
>
>     health: HEALTH_ERR
>             1483 pgs are stuck inactive for more than 60 seconds
>             542 pgs backfill_wait
>             14 pgs backfilling
>             11 pgs degraded
>             1402 pgs peering
>             3 pgs recovery_wait
>             11 pgs stuck degraded
>             1483 pgs stuck inactive
>             2042 pgs stuck unclean
>             7 pgs stuck undersized
>             7 pgs undersized
>             111 requests are blocked > 32 sec
>             10586 requests are blocked > 4096 sec
>             recovery 9472/11120724 objects degraded (0.085%)
>             recovery 1181567/11120724 objects misplaced (10.625%)
>             noout flag(s) set
>             mon.eselde02u32 low disk space
>
>   services:
>     mon: 3 daemons, quorum eselde02u32,eselde02u33,eselde02u34
>     mgr: eselde02u32(active), standbys: eselde02u33, eselde02u34
>     osd: 111 osds: 111 up, 111 in; 800 remapped pgs
>          flags noout
>
>   data:
>     pools:   18 pools, 4104 pgs
>     objects: 3620k objects, 13875 GB
>     usage:   42254 GB used, 160 TB / 201 TB avail
>     pgs:     1.876% pgs unknown
>              34.259% pgs not active
>              9472/11120724 objects degraded (0.085%)
>              1181567/11120724 objects misplaced (10.625%)
>              2062 active+clean
>             1221 peering
>              535  active+remapped+backfill_wait
>              181  remapped+peering
>              77   unknown
>              13   active+remapped+backfilling
>              7    active+undersized+degraded+remapped+backfill_wait
>              4    remapped
>              3    active+recovery_wait+degraded+remapped
>              1    active+degraded+remapped+backfilling
>
>   io:
>     recovery: 298 MB/s, 77 objects/s
>
>
>
>
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90
>
>
>
>
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
