What is the status of the cluster with this OSD down and out?

On Thu, Aug 2, 2018 at 5:42 AM, J David <j.david.li...@gmail.com> wrote:
> Hello all,
>
> On Luminous 12.2.7, during the course of recovering from a failed OSD,
> one of the other OSDs started repeatedly crashing every few seconds
> with an assertion failure:
>
> 2018-08-01 12:17:20.584350 7fb50eded700 -1 log_channel(cluster) log
> [ERR] : 2.621 past_interal bound [19300,21449) end does not match
> required [21374,21447) end
> /build/ceph-12.2.7/src/osd/PG.cc: In function 'void
> PG::check_past_interval_bounds() const' thread 7fb50eded700 time
> 2018-08-01 12:17:20.584367
> /build/ceph-12.2.7/src/osd/PG.cc: 847: FAILED assert(0 ==
> "past_interval end mismatch")
>
> The console output of a run of this OSD is here:
>
> https://pastebin.com/WSjsVwVu
>
> The last 512k worth of the log file for this OSD is here:
>
> https://pastebin.com/rYQkMatA
>
> Currently I have "debug osd = 5/5" in ceph.conf, but if other values
> would shed useful light, this problem is easy to reproduce.
>
> There are no disk errors or problems that I can see with the OSD that
> won’t stay running.
>
> Does anyone know what happened here, and whether it's recoverable?
>
> Thanks for any advice!
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad