Hi list,

Things went from bad to worse. I tried to upgrade some OSDs to Luminous to
see if that would help, but it didn’t appear to make any difference.
But for each restarted OSD there were a few PGs that the OSD seemed to
“forget”, and the number of undersized PGs grew until some PGs had been
“forgotten” by all 3 acting OSDs and became stale, even though all OSDs
(and their disks) were available.
Then the OSD processes grew so large that the servers ran out of memory
(48 GB RAM per server, with ten 2 TB disks each) and started killing the
OSDs…
All OSDs were then shut down to try to preserve at least some of the data on
the disks, but maybe it is too late?
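
Before trying to start the OSDs again I was thinking of throttling recovery
as hard as possible to keep the memory usage down, roughly along these lines
(the exact flags and values are just my guess, so please correct me if this
is the wrong approach):

# keep the cluster from kicking off backfill/recovery the moment OSDs come up
ceph osd set noout
ceph osd set nobackfill
ceph osd set norecover
# once the OSDs are up and peered, let recovery proceed one PG at a time
ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'

Does that sound reasonable, or is there something else I should try first?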

/Magnus

2018-07-11 21:10 GMT+02:00 Magnus Grönlund <mag...@gronlund.se>:

> Hi Paul,
>
> No, all OSDs are still on Jewel; the issue started before I had even begun
> to upgrade the first OSD, and they don't appear to be flapping.
> ceph -w shows a lot of slow requests etc., but nothing unexpected as far as
> I can tell, considering the state the cluster is in.
>
> 2018-07-11 20:40:09.396642 osd.37 [WRN] 100 slow requests, 2 included
> below; oldest blocked for > 25402.278824 secs
> 2018-07-11 20:40:09.396652 osd.37 [WRN] slow request 1920.957326 seconds
> old, received at 2018-07-11 20:08:08.439214: osd_op(client.73540057.0:8289463
> 2.e57b3e32 (undecoded) ack+ondisk+retry+write+known_if_redirected
> e160294) currently waiting for peered
> 2018-07-11 20:40:09.396660 osd.37 [WRN] slow request 1920.048094 seconds
> old, received at 2018-07-11 20:08:09.348446: osd_op(client.671628641.0:998704
> 2.42f88232 (undecoded) ack+ondisk+retry+write+known_if_redirected
> e160475) currently waiting for peered
> 2018-07-11 20:40:10.397008 osd.37 [WRN] 100 slow requests, 2 included
> below; oldest blocked for > 25403.279204 secs
> 2018-07-11 20:40:10.397017 osd.37 [WRN] slow request 1920.043860 seconds
> old, received at 2018-07-11 20:08:10.353060: osd_op(client.231731103.0:1007729
> 3.e0ff5786 (undecoded) ondisk+write+known_if_redirected e137428)
> currently waiting for peered
> 2018-07-11 20:40:10.397023 osd.37 [WRN] slow request 1920.034101 seconds
> old, received at 2018-07-11 20:08:10.362819: osd_op(client.207458703.0:2000292
> 3.a8143b86 (undecoded) ondisk+write+known_if_redirected e137428)
> currently waiting for peered
> 2018-07-11 20:40:10.790573 mon.0 [INF] pgmap 4104 pgs: 5 down+peering,
> 1142 peering, 210 remapped+peering, 5 active+recovery_wait+degraded, 1551
> active+clean, 2 activating+undersized+degraded+remapped, 15
> active+remapped+backfilling, 178 unknown, 1 active+remapped, 3
> activating+remapped, 78 active+undersized+degraded+remapped+backfill_wait,
> 6 active+recovery_wait+degraded+remapped, 3 
> undersized+degraded+remapped+backfill_wait+peered,
> 5 active+undersized+degraded+remapped+backfilling, 295
> active+remapped+backfill_wait, 3 active+recovery_wait+undersized+degraded,
> 21 activating+undersized+degraded, 559 active+undersized+degraded, 4
> remapped, 17 undersized+degraded+peered, 1 
> active+recovery_wait+undersized+degraded+remapped;
> 13439 GB data, 42395 GB used, 160 TB / 201 TB avail; 4069 B/s rd, 746 kB/s
> wr, 5 op/s; 534753/10756032 objects degraded (4.972%); 779027/10756032
> objects misplaced (7.243%); 256 MB/s, 65 objects/s recovering
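>
> (If it would help, I can also dump the blocked requests on a given OSD via
> the admin socket, e.g. "ceph daemon osd.37 dump_ops_in_flight", and attach
> that output as well.)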
>
>
>
> There is a lot in the OSD log files that I'm unfamiliar with, but so far I
> haven't found anything that gives me a clue on how to fix the issue.
> BTW, restarting an OSD doesn't seem to help; on the contrary, it sometimes
> results in PGs being stuck undersized!
> I have attached the startup log from one of the OSDs I restarted.
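>
> Is the output of e.g. "ceph pg dump_stuck undersized" and "ceph pg
> dump_stuck inactive" (together with the attached log) the right thing to
> look at here, or is there something more useful I should collect?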
>
> Best regards
> /Magnus
>
>
> 2018-07-11 20:39 GMT+02:00 Paul Emmerich <paul.emmer...@croit.io>:
>
>> Did you finish the upgrade of the OSDs? Are OSDs flapping? (ceph -w) Is
>> there anything weird in the OSDs' log files?
>>
>>
>> Paul
>>
>> 2018-07-11 20:30 GMT+02:00 Magnus Grönlund <mag...@gronlund.se>:
>>
>>> Hi,
>>>
>>> I started to upgrade a Ceph cluster from Jewel (10.2.10) to Luminous
>>> (12.2.6).
>>>
>>> After upgrading and restarting the mons everything looked OK: the mons
>>> had quorum, all OSDs were up and in, and all the PGs were active+clean.
>>> But before I had time to start upgrading the OSDs it became obvious that
>>> something had gone terribly wrong.
>>> All of a sudden 1600 out of 4100 PGs were inactive and 40% of the data
>>> was misplaced!
>>>
>>> The mons appear OK and all OSDs are still up and in, but a few hours
>>> later there were still 1483 PGs stuck inactive, essentially all of them
>>> in peering!
>>> Investigating one of the stuck PGs, it appears to be looping between
>>> “inactive”, “remapped+peering” and “peering”, and the epoch number is
>>> rising fast; see the attached pg query outputs.
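>>>
>>> (The attached outputs were collected with "ceph pg <pgid> query", e.g.
>>> "ceph pg 2.e57 query" for one of the stuck PGs; the PG id here is just an
>>> illustration.)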
>>>
>>> We really can’t afford to lose the cluster or the data, so any help or
>>> suggestions on how to debug or fix this issue would be very, very much
>>> appreciated!
>>>
>>>
>>>     health: HEALTH_ERR
>>>             1483 pgs are stuck inactive for more than 60 seconds
>>>             542 pgs backfill_wait
>>>             14 pgs backfilling
>>>             11 pgs degraded
>>>             1402 pgs peering
>>>             3 pgs recovery_wait
>>>             11 pgs stuck degraded
>>>             1483 pgs stuck inactive
>>>             2042 pgs stuck unclean
>>>             7 pgs stuck undersized
>>>             7 pgs undersized
>>>             111 requests are blocked > 32 sec
>>>             10586 requests are blocked > 4096 sec
>>>             recovery 9472/11120724 objects degraded (0.085%)
>>>             recovery 1181567/11120724 objects misplaced (10.625%)
>>>             noout flag(s) set
>>>             mon.eselde02u32 low disk space
>>>
>>>   services:
>>>     mon: 3 daemons, quorum eselde02u32,eselde02u33,eselde02u34
>>>     mgr: eselde02u32(active), standbys: eselde02u33, eselde02u34
>>>     osd: 111 osds: 111 up, 111 in; 800 remapped pgs
>>>          flags noout
>>>
>>>   data:
>>>     pools:   18 pools, 4104 pgs
>>>     objects: 3620k objects, 13875 GB
>>>     usage:   42254 GB used, 160 TB / 201 TB avail
>>>     pgs:     1.876% pgs unknown
>>>              34.259% pgs not active
>>>              9472/11120724 objects degraded (0.085%)
>>>              1181567/11120724 objects misplaced (10.625%)
>>>              2062 active+clean
>>>              1221 peering
>>>              535  active+remapped+backfill_wait
>>>              181  remapped+peering
>>>              77   unknown
>>>              13   active+remapped+backfilling
>>>              7    active+undersized+degraded+remapped+backfill_wait
>>>              4    remapped
>>>              3    active+recovery_wait+degraded+remapped
>>>              1    active+degraded+remapped+backfilling
>>>
>>>   io:
>>>     recovery: 298 MB/s, 77 objects/s
>>>
>>>
>>>
>>
>>
>> --
>> Paul Emmerich
>>
>> Looking for help with your Ceph cluster? Contact us at https://croit.io
>>
>> croit GmbH
>> Freseniusstr. 31h
>> 81247 München
>> www.croit.io
>> Tel: +49 89 1896585 90
>>
>
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
