I did have a problem in my secondary cluster that sounds similar to yours. I was using XFS, and traced my problem back to 64 kB inodes (osd mkfs options xfs = -i size=64k). This showed up as a lot of "XFS: possible memory allocation deadlock in kmem_alloc" messages in the kernel logs. I was able to keep things limping along by flushing the cache frequently, but I eventually re-formatted every OSD to get rid of the 64 kB inodes.
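For reference, this is roughly what that looked like and how to check for it. It's only a sketch: the ceph.conf line is from memory, the OSD mount point is just an example (adjust the path/ID for your own nodes), and the drop_caches line is simply one way of doing the cache flushing I mentioned:

    # ceph.conf -- the setting that caused my trouble; removing it and letting
    # mkfs use the default inode size is what the reformat fixed
    [osd]
    osd mkfs options xfs = -i size=64k

    # check whether an existing OSD filesystem was built with 64k inodes
    xfs_info /var/lib/ceph/osd/ceph-0 | grep isize

    # the symptom in the kernel logs
    dmesg | grep "possible memory allocation deadlock in kmem_alloc"

    # flushing the page/dentry/inode caches as a stop-gap (run as root)
    echo 3 > /proc/sys/vm/drop_caches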
After I finished the reformat, I had problems because of deep-scrubbing. While reformatting, I disabled deep-scrubbing. Once I re-enabled it, Ceph wanted to deep-scrub the whole cluster, and sometimes 90% of my OSDs would be deep-scrubbing at the same time. I'm manually deep-scrubbing now, trying to spread the schedule out a bit (there's a rough sketch of what I mean in the P.S. at the bottom of this mail). Once this finishes in a few days, I should be able to re-enable deep-scrubbing and keep my HEALTH_OK.

My primary cluster has always been well behaved; it completed the re-format without any problems. The two clusters are nearly identical, the biggest difference being that the secondary carried a higher sustained load due to a replication backlog.

On Sat, Nov 15, 2014 at 12:38 PM, Erik Logtenberg <e...@logtenberg.eu> wrote:
> Hi,
>
> Thanks for the tip, I applied these configuration settings and it does
> lower the load during rebuilding a bit. Are there settings like these
> that also tune Ceph down a bit during regular operations? The slow
> requests, timeouts and OSD suicides are killing me.
>
> If I allow the cluster to regain consciousness and stay idle for a bit, it
> all seems to settle down nicely, but as soon as I apply some load it
> immediately starts to overstress and complain like crazy.
>
> I'm also seeing this behaviour: http://tracker.ceph.com/issues/9844
> This was reported by Dmitry Smirnov 26 days ago, but the report has had no
> response yet. Any ideas?
>
> In my experience, OSDs are quite unstable in Giant and very easily
> stressed, causing chain effects that further worsen the issues. It would
> be nice to know whether other users are seeing this too.
>
> Thanks,
>
> Erik.
>
>
> On 11/10/2014 08:40 PM, Craig Lewis wrote:
> > Have you tuned any of the recovery or backfill parameters? My ceph.conf
> > has:
> >
> >     [osd]
> >     osd max backfills = 1
> >     osd recovery max active = 1
> >     osd recovery op priority = 1
> >
> > Still, if it's running for a few hours and then failing, it sounds like
> > there might be something else at play. OSDs use a lot of RAM during
> > recovery. How much RAM and how many OSDs do you have in these nodes?
> > What does memory usage look like after a fresh restart, and what does it
> > look like when the problems start? Even better if you know what it
> > looks like 5 minutes before the problems start.
> >
> > Is there anything interesting in the kernel logs? OOM killers, or
> > memory deadlocks?
> >
> >
> > On Sat, Nov 8, 2014 at 11:19 AM, Erik Logtenberg <e...@logtenberg.eu
> > <mailto:e...@logtenberg.eu>> wrote:
> >
> >     Hi,
> >
> >     I have some OSDs that keep committing suicide. My cluster has ~1.3M
> >     misplaced objects, and it can't really recover, because OSDs keep
> >     failing before recovery finishes. The load on the hosts is quite high,
> >     but the cluster currently has no other tasks than the
> >     backfilling/recovering.
> >
> >     I attached the logfile from a failed OSD. It shows the suicide, the
> >     recent events, and also me starting the OSD again after some time.
> >
> >     It'll keep running for a couple of hours and then fail again, for the
> >     same reason.
> >
> >     I noticed a lot of timeouts. Apparently ceph stresses the hosts to the
> >     limit with the recovery tasks, so much so that they time out and can't
> >     finish. I don't understand why. Can I somehow throttle ceph a bit so
> >     that it doesn't keep overrunning itself? I kinda feel like it should
> >     chill out a bit and simply recover one step at a time instead of going
> >     full force and then failing.
> >
> >     Thanks,
> >
> >     Erik.
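P.S. For anyone curious, "manually deep-scrubbing" above means something like the loop below. It's a sketch rather than my exact script: the PG listing and the 60-second pause are things to adapt to your own cluster and disks.

    # scheduled deep-scrubs can be toggled cluster-wide:
    ceph osd set nodeep-scrub      # disable (what I did during the reformat)
    ceph osd unset nodeep-scrub    # re-enable (what I'll do once the backlog is done)

    # work through the backlog by hand, one PG at a time, spread out over time
    for pg in $(ceph pg dump pgs_brief 2>/dev/null | awk '/^[0-9]+\./ {print $1}'); do
        ceph pg deep-scrub "$pg"
        sleep 60
    done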
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com