Do you think 4 GB RAM for two OSDs is low? Even with 1 OSD and 4 GB of memory I encounter OOM during recovery.
This is just a test setup to try out Ceph. I am planning to use Ceph on a 36-drive system with a Xeon E5 v2 processor, 32 GB RAM, and a 10Gb NIC, so I am trying to optimize memory usage on the test setup, which can later be applied if we proceed with Ceph.

On Sun 22 Sep, 2019, 12:02 PM Ashley Merrick, <singap...@amerrick.co.uk> wrote:

> I'm not aware of any memory settings that control rebuild memory usage.
>
> You are running very low on RAM. Have you tried adding more swap or
> adjusting /proc/sys/vm/swappiness?
>
>
> ---- On Fri, 20 Sep 2019 20:41:09 +0800 Amudhan P <amudha...@gmail.com> wrote ----
>
> Hi,
>
> I am using Ceph Mimic in a small test setup with the below configuration.
>
> OS: Ubuntu 18.04
>
> 1 node running (mon, mds, mgr) + 4-core CPU, 4 GB RAM, and 1 Gb LAN
> 3 nodes each having 2 OSDs on 2 TB disks + 2-core CPU, 4 GB RAM, and 1 Gb LAN
> 1 node acting as a CephFS client + 2-core CPU, 4 GB RAM, and 1 Gb LAN
>
> I configured cephfs_metadata_pool (3 replicas) and cephfs_data_pool as
> erasure-coded 2+1.
>
> When running a script that creates many folders, Ceph started throwing
> slow/late IO errors due to the high metadata workload. Once folder
> creation completed, PGs were degraded. I am waiting for the PGs to
> finish recovery, but my OSDs keep crashing due to OOM and restarting
> after some time.
>
> Now my question is: I can wait for recovery to complete, but how do I
> stop the OOM kills and OSD crashes? Basically, I want to know how to
> control memory usage during recovery and make it stable.
>
> I have also set very low PG counts: 8 for the metadata pool and 16 for
> the data pool.
>
> I have already set "osd memory target" to 1 GB, and I have raised
> max-backfills from 1 to 8.
>
> Attached is the "kern.log" output from one of the nodes; a snippet of
> the error message is below.
>
> ---------- error msg snippet ----------
> -bash: fork: Cannot allocate memory
>
> Sep 18 19:01:57 test-node1 kernel: [341246.765644] msgr-worker-0 invoked
> oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null),
> order=0, oom_score_adj=0
> Sep 18 19:02:00 test-node1 kernel: [341246.765645] msgr-worker-0 cpuset=/
> mems_allowed=0
> Sep 18 19:02:00 test-node1 kernel: [341246.765650] CPU: 1 PID: 1737 Comm:
> msgr-worker-0 Not tainted 4.15.0-45-generic #48-Ubuntu
>
> Sep 18 19:02:02 test-node1 kernel: [341246.765833] Out of memory: Kill
> process 1727 (ceph-osd) score 489 or sacrifice child
> Sep 18 19:02:03 test-node1 kernel: [341246.765919] Killed process 1727
> (ceph-osd) total-vm:3483844kB, anon-rss:1992708kB, file-rss:0kB,
> shmem-rss:0kB
> Sep 18 19:02:03 test-node1 kernel: [341246.899395] oom_reaper: reaped
> process 1727 (ceph-osd), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
> Sep 18 22:09:57 test-node1 kernel: [352529.433155] perf: interrupt took
> too long (4965 > 4938), lowering kernel.perf_event_max_sample_rate to 40250
>
> regards
> Amudhan
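A minimal sketch of the tuning discussed in this thread, assuming a Mimic cluster with the centralized config database; the option names are standard Ceph and kernel settings, but the values are illustrative for 4 GB nodes, not tested recommendations. Note that raising max-backfills from 1 to 8, as mentioned above, is likely to increase memory pressure during recovery, so lowering it back is usually the first step:

    # Cap the per-OSD memory autotuning target (osd_memory_target
    # defaults to 4 GB; below roughly 1.5-2 GB there is little room
    # left for BlueStore caches). 1610612736 bytes = 1.5 GB.
    ceph config set osd osd_memory_target 1610612736

    # Throttle recovery/backfill so fewer PGs are worked on concurrently:
    ceph config set osd osd_max_backfills 1
    ceph config set osd osd_recovery_max_active 1

    # Per Ashley's suggestion, add swap as a safety net on each OSD host
    # (/swapfile and the 4G size are hypothetical choices):
    fallocate -l 4G /swapfile
    chmod 600 /swapfile
    mkswap /swapfile
    swapon /swapfile
    sysctl -w vm.swappiness=60

With two OSDs sharing 4 GB of RAM, even these settings may leave too little headroom; swap softens the OOM kills rather than fixing the undersizing.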
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io