Peter,

There are 624 PGs across 4 pools:
pool 0 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 2505 flags hashpspool stripe_width 0 removed_snaps [1~3]
pool 3 'fsdata' erasure size 14 min_size 11 crush_ruleset 3 object_hash rjenkins pg_num 512 pgp_num 512 last_change 154 lfor 153 flags hashpspool crash_replay_interval 45 tiers 5 read_tier 5 write_tier 5 stripe_width 4160
pool 4 'fsmeta' replicated size 4 min_size 3 crush_ruleset 0 object_hash rjenkins pg_num 16 pgp_num 16 last_change 144 flags hashpspool stripe_width 0
pool 5 'fscache' replicated size 3 min_size 2 crush_ruleset 4 object_hash rjenkins pg_num 32 pgp_num 32 last_change 1016 flags hashpspool,incomplete_clones tier_of 3 cache_mode writeback target_bytes 100000000000 hit_set bloom{false_positive_probability: 0.05, target_size: 0, seed: 0} 86400s x4 decay_rate 0 search_last_n 0 stripe_width 0

Here's the ceph.conf. We're back to no extra configuration for bluestore caching, but previously we had attempted setting the bluestore_cache_size directive as low as 1073741.

[global]
fsid = c4b3b4ec-fbc2-4861-913f-295ff64f70ad
auth client required = cephx
auth cluster required = cephx
auth service required = cephx
cephx require signatures = true
public network = 10.42.0.0/16
cluster network = 10.43.100.0/24
mon_initial_members = benjamin, jake, jennifer
mon_host = 10.42.5.38,10.42.5.37,10.42.5.36

[osd]
osd crush update on start = false
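For reference, the earlier attempt would have looked roughly like the snippet below. I'm reconstructing it from memory, so the exact section it went in ([osd] vs. [global]) may not be accurate:

[osd]
# since removed; we're back to defaults for bluestore caching
bluestore_cache_size = 1073741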
Thanks,
-Aaron

On Sat, Apr 15, 2017 at 5:39 AM, Peter Maloney <peter.malo...@brockmann-consult.de> wrote:

> How many PGs do you have? And did you change any config, like mds cache size? Show your ceph.conf.
>
> On 04/15/17 07:34, Aaron Ten Clay wrote:
>
> Hi all,
>
> Our cluster is experiencing a very odd issue and I'm hoping for some guidance on troubleshooting steps and/or suggestions to mitigate the issue. tl;dr: individual ceph-osd processes try to allocate >90GiB of RAM and are eventually nuked by oom_killer.
>
> I'll try to explain the situation in detail:
>
> We have 24x 4TB bluestore HDD OSDs and 4x 600GB SSD OSDs. The SSD OSDs are in a different CRUSH "root", used as a cache tier for the main storage pools, which are erasure coded and used for cephfs. The OSDs are spread across two identical machines with 128GiB of RAM each, and there are three monitor nodes on different hardware.
>
> Several times we've encountered crippling bugs with previous Ceph releases when we were on RCs or betas, or using non-recommended configurations, so in January we abandoned all previous Ceph usage, deployed LTS Ubuntu 16.04, and went with stable Kraken 11.2.0 with the configuration mentioned above. Everything was fine until the end of March, when one day we found all but a couple of OSDs "down" inexplicably. Investigation revealed that oom_killer had come along and nuked almost all the ceph-osd processes.
>
> We've gone through a bunch of iterations of restarting the OSDs, trying to bring them up one at a time gradually, all at once, and with various configuration settings to reduce cache size, as suggested in this ticket: http://tracker.ceph.com/issues/18924...
>
> I don't know whether that ticket really pertains to our situation or not; I have no experience with memory allocation debugging. I'd be willing to try if someone can point me to a guide or walk me through the process.
>
> I've even tried, just to see if the situation was transitory, adding over 300GiB of swap to both OSD machines. The OSD processes managed to allocate, in a matter of 5-10 minutes, more than 300GiB of RAM pressure and became oom_killer victims once again.
>
> No software or hardware changes took place around the time this problem started, and no significant data changes occurred either. We added about 40GiB of ~1GiB files a week or so before the problem started, and that's the last time data was written.
>
> I can only assume we've found another crippling bug of some kind; this level of memory usage is entirely unprecedented. What can we do?
>
> Thanks in advance for any suggestions.
> -Aaron
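P.S. On the memory allocation debugging front: if heap profiles from one of the ballooning OSDs would help, my understanding from the Ceph memory profiling docs is that the tcmalloc profiler is driven with commands along these lines (osd.0 is just a placeholder for whichever OSD is affected, and I haven't actually run these yet, so please correct me if this isn't the right procedure):

ceph tell osd.0 heap start_profiler
ceph tell osd.0 heap stats
ceph tell osd.0 heap dump
ceph tell osd.0 heap stop_profiler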
--
Aaron Ten Clay
https://aarontc.com

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com