I did have a problem in my secondary cluster that sounds similar to yours.
I was using XFS, and traced my problem back to 64 kB inodes (osd mkfs
options xfs = -i size=64k).   This showed up with a lot of "XFS: possible
memory allocation deadlock in kmem_alloc" in the kernel logs.  I was able
to keep things limping along by flushing the cache frequently, but I
eventually re-formatted every OSD to get rid of the 64k inodes.
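
In case it helps, here is roughly what the bad setting looked like in my
ceph.conf, and how I checked which OSDs still had the big inodes (the mount
path is just an example, adjust it for your layout):

    [osd]
      osd mkfs options xfs = -i size=64k    # the setting I had to get rid of

    # check the inode size on a mounted OSD (path is an example):
    xfs_info /var/lib/ceph/osd/ceph-0 | grep isize
    # my affected OSDs showed isize=65536; the reformatted ones show a much
    # smaller value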

After I finished the reformat, I had problems because of deep-scrubbing.
While reformatting, I disabled deep-scrubbing.  Once I re-enabled it, Ceph
wanted to deep-scrub the whole cluster, and sometimes 90% of my OSDs would
be doing a deep-scrub.  I'm manually deep-scrubbing now, trying to spread
out the schedule a bit.  Once this finishes in a few days, I should be able
to re-enable deep-scrubbing and keep my HEALTH_OK.
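
For reference, this is roughly how I'm handling the scrubbing.  The loop is
only a sketch (the awk filter and the sleep interval are arbitrary choices,
pace it however your cluster can tolerate):

    # stop new deep-scrubs cluster-wide while reformatting/recovering:
    ceph osd set nodeep-scrub

    # deep-scrub individual PGs by hand to spread the work out:
    ceph pg deep-scrub <pgid>

    # rough loop to walk through the PGs slowly:
    for pg in $(ceph pg dump pgs_brief 2>/dev/null | awk '$2 ~ /active/ {print $1}'); do
        ceph pg deep-scrub "$pg"
        sleep 600
    done

    # once caught up, let the scheduler take over again:
    ceph osd unset nodeep-scrub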


My primary cluster has always been well behaved.  It completed the
re-format without any problems.  The clusters are nearly identical,
the biggest difference being that the secondary had a higher sustained load
due to a replication backlog.




On Sat, Nov 15, 2014 at 12:38 PM, Erik Logtenberg <e...@logtenberg.eu>
wrote:

> Hi,
>
> Thanks for the tip. I applied these configuration settings, and they do
> lower the load during rebuilding a bit. Are there settings like these
> that also tune Ceph down a bit during regular operations? The slow
> requests, timeouts and OSD suicides are killing me.
>
> If I allow the cluster to regain consciousness and stay idle a bit, it
> all seems to settle down nicely, but as soon as I apply some load it
> immediately starts to overstress and complain like crazy.
>
> I'm also seeing this behaviour: http://tracker.ceph.com/issues/9844
> This was reported by Dmitry Smirnov 26 days ago, but the report has no
> response yet. Any ideas?
>
> In my experience, OSDs are quite unstable in Giant and very easily
> stressed, causing chain effects that further worsen the issues. It would
> be nice to know whether other users have noticed this as well.
>
> Thanks,
>
> Erik.
>
>
> On 11/10/2014 08:40 PM, Craig Lewis wrote:
> > Have you tuned any of the recovery or backfill parameters?  My ceph.conf
> > has:
> > [osd]
> >   osd max backfills = 1
> >   osd recovery max active = 1
> >   osd recovery op priority = 1
> >
> > Still, if it's running for a few hours, then failing, it sounds like
> > there might be something else at play.  OSDs use a lot of RAM during
> > recovery.  How much RAM and how many OSDs do you have in these nodes?
> > What does memory usage look like after a fresh restart, and what does it
> > look like when the problems start?  Even better if you know what it
> > looks like 5 minutes before the problems start.
> >
> > Is there anything interesting in the kernel logs?  OOM killers, or
> > memory deadlocks?
> >
> >
> >
> > On Sat, Nov 8, 2014 at 11:19 AM, Erik Logtenberg <e...@logtenberg.eu> wrote:
> >
> >     Hi,
> >
> >     I have some OSDs that keep committing suicide. My cluster has ~1.3M
> >     misplaced objects, and it can't really recover, because OSDs keep
> >     failing before recovery finishes. The load on the hosts is quite high,
> >     but the cluster currently has no other tasks than just the
> >     backfilling/recovering.
> >
> >     I attached the logfile from a failed OSD. It shows the suicide, the
> >     recent events and also me starting the OSD again after some time.
> >
> >     It'll keep running for a couple of hours and then fail again, for the
> >     same reason.
> >
> >     I noticed a lot of timeouts. Apparently Ceph stresses the hosts to the
> >     limit with the recovery tasks, so much that they time out and can't
> >     finish those tasks. I don't understand why. Can I somehow throttle Ceph
> >     a bit so that it doesn't keep overrunning itself? I kinda feel like it
> >     should chill out a bit and simply recover one step at a time instead of
> >     going full force and then failing.
> >
> >     Thanks,
> >
> >     Erik.
> >
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
