I had a similar problem with some relatively underpowered servers (2x E5-2603, 6 cores at 1.7 GHz, no HT; 12-14 2 TB OSDs per server; 32 GB RAM). There was a process on a couple of the servers that would hang and chew up all available CPU. When that happened, I started getting scrub errors on those servers.
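For what it's worth, a quick sanity check along these lines can help spot both a runaway process and any OOM kills. This is only a sketch of my own: the 80% CPU cut-off, the ps/dmesg parsing, and the use of Python are my choices (not anything from this thread), it assumes Linux with procps, and dmesg usually needs root:

    import subprocess

    CPU_THRESHOLD = 80.0  # percent -- arbitrary cut-off for "chewing up all available CPU"

    def top_cpu_hogs():
        # highest CPU consumers first (Linux procps "ps")
        out = subprocess.run(
            ["ps", "-eo", "pid,pcpu,comm", "--sort=-pcpu", "--no-headers"],
            capture_output=True, text=True, check=True,
        ).stdout
        for line in out.splitlines():
            pid, pcpu, comm = line.split(None, 2)
            if float(pcpu) >= CPU_THRESHOLD:
                print("high CPU: pid=%s cpu=%s%% cmd=%s" % (pid, pcpu, comm))

    def oom_events():
        # the kernel logs OOM kills to dmesg; reading it usually needs root
        out = subprocess.run(["dmesg"], capture_output=True, text=True).stdout
        for line in out.splitlines():
            if "Out of memory" in line or "oom-killer" in line:
                print("OOM event: " + line.strip())

    if __name__ == "__main__":
        top_cpu_hogs()
        oom_events()

Running it periodically while the cluster is under load (snap deletion, recovery, deep scrub) makes it easier to tie scrub errors back to a CPU hog or an OOM-killed ceph-osd.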
On Mon, Mar 5, 2018 at 8:45 AM, Jan Marquardt <j...@artfiles.de> wrote:

> On 05.03.18 at 13:13, Ronny Aasen wrote:
> > I had some similar issues when I started my proof of concept. Especially
> > the snapshot deletion I remember well.
> >
> > The rule of thumb for filestore, which I assume you are running, is 1 GB of
> > RAM per TB of OSD. So with 8 x 4 TB OSDs you are looking at 32 GB of RAM
> > for the OSDs, plus some GB for the mon service, plus some GB for the OS itself.
> >
> > I suspect that if you inspect your dmesg log and memory graphs you will find
> > that the out-of-memory killer ends your OSDs when the snap deletion (or
> > any other high-load task) runs.
> >
> > I ended up reducing the number of OSDs per node, since the old
> > mainboard I used was maxed out for memory.
>
> Well, thanks for the broad hint. Somehow I assumed we fulfilled the
> recommendations, but of course you are right. We'll check if our boards
> support 48 GB RAM. Unfortunately, there are currently no corresponding
> messages in dmesg, but I can't rule out that there have been some.
>
> > Corruptions occurred for me as well, and they were normally associated with
> > disks dying or giving read errors. Ceph often managed to fix them, but
> > sometimes I had to just remove the failing OSD disk.
> >
> > Have some graphs to look at. Personally I used munin/munin-node, since
> > it was just an apt-get away from functioning graphs.
> >
> > Also, I used smartmontools to send me emails about failing disks,
> > and smartctl to check all disks for errors.
>
> I'll check the S.M.A.R.T. stuff. I am wondering if scrubbing errors are
> always caused by disk problems or if they could also be triggered
> by flapping OSDs or other circumstances.
>
> > Good luck with Ceph!
>
> Thank you!
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
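P.S. On the smartctl point: a rough sketch of how one might sweep every disk on a node. The device discovery via lsblk and the matched output lines are just my choices, and it assumes smartmontools and util-linux are installed and that it runs as root:

    import subprocess

    def list_disks():
        # whole disks only, no partitions (lsblk is part of util-linux)
        out = subprocess.run(
            ["lsblk", "-d", "-n", "-o", "NAME,TYPE"],
            capture_output=True, text=True, check=True,
        ).stdout
        disks = []
        for line in out.splitlines():
            name, devtype = line.split()
            if devtype == "disk":
                disks.append("/dev/" + name)
        return disks

    def smart_health(dev):
        # "smartctl -H" prints the drive's overall health assessment
        res = subprocess.run(["smartctl", "-H", dev], capture_output=True, text=True)
        for line in res.stdout.splitlines():
            if "overall-health" in line or "Health Status" in line:
                print(dev, line.strip())

    if __name__ == "__main__":
        for dev in list_disks():
            smart_health(dev)

Run from cron with the output mailed somewhere, it gives roughly the same effect as the email alerting Ronny mentions, though smartd from smartmontools already does that properly.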