Hi! I once ran into this behavior on my home cluster. At the time my OSDs went down, I noticed the node was using swap despite having sufficient memory. Tuning /proc/sys/vm/swappiness to 0 solved the problem for me.
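In case it helps, this is roughly how I applied it (assuming a sysctl-based distro; the /etc/sysctl.d file name is just the one I picked):

    # Apply immediately on the running node
    sysctl -w vm.swappiness=0
    # Persist across reboots
    echo 'vm.swappiness = 0' > /etc/sysctl.d/99-swappiness.conf
    # Verify the current value
    cat /proc/sys/vm/swappiness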
Fri, 7 Aug 2015 at 20:41, Tuomas Juntunen <tuomas.juntu...@databasement.fi>:

> Thanks
>
> We'll play with the values a bit and see what happens.
>
> Br,
> Tuomas
>
> From: Quentin Hartman [mailto:qhart...@direwolfdigital.com]
> Sent: 7 August 2015 20:32
> To: Tuomas Juntunen
> Cc: ceph-users
> Subject: Re: [ceph-users] Flapping OSD's when scrubbing
>
> That kind of behavior is usually caused by the OSDs getting busy enough that they aren't answering heartbeats in a timely fashion. It can also happen if you have any network flakiness and heartbeats are getting lost because of that.
>
> I think (I'm not positive though) that increasing your heartbeat interval may help. Also, looking at the number of threads you have for your OSDs, that seems potentially problematic. If you've got 24 OSDs per machine and each one is running 12 threads, that's 288 threads on 12 cores for just the requests. Plus the disk threads, plus the filestore op threads... That level of thread contention seems like it might be contributing to missing the heartbeats. But again, that's conjecture. I've not worked with a setup as dense as yours.
>
> QH
>
> On Fri, Aug 7, 2015 at 11:21 AM, Tuomas Juntunen <tuomas.juntu...@databasement.fi> wrote:
>
> Hi
>
> We are experiencing an annoying problem where scrubs make OSDs flap down and cause the Ceph cluster to be unusable for a couple of minutes.
>
> Our cluster consists of three nodes connected with 40gbit InfiniBand using IPoIB, with 2x 6-core X5670 CPUs and 64GB of memory.
> Each node has 6 SSDs holding journals for 12 OSDs on 2TB disks (fast pools) and another 12 OSDs on 4TB disks (archive pools), which have the journal on the same disk.
>
> It seems that our cluster is constantly scrubbing; we rarely see only active+clean. Below is the status at the moment:
>
>     cluster a2974742-3805-4cd3-bc79-765f2bddaefe
>      health HEALTH_OK
>      monmap e16: 4 mons at {lb1=10.20.60.1:6789/0,lb2=10.20.60.2:6789/0,nc1=10.20.50.2:6789/0,nc2=10.20.50.3:6789/0}
>             election epoch 1838, quorum 0,1,2,3 nc1,nc2,lb1,lb2
>      mdsmap e7901: 1/1/1 up {0=lb1=up:active}, 4 up:standby
>      osdmap e104824: 72 osds: 72 up, 72 in
>       pgmap v12941402: 5248 pgs, 9 pools, 19644 GB data, 4810 kobjects
>             59067 GB used, 138 TB / 196 TB avail
>                 5241 active+clean
>                    7 active+clean+scrubbing
>
> When OSDs go down, the load on a node first goes high during scrubbing, then some OSDs go down and come back up 30 seconds later. They are not really going down, but are marked as down. Then it takes around a couple of minutes for everything to be OK again.
>
> Any suggestions on how to fix this? We can't go to production while this behavior exists.
>
> Our config is below:
>
> [global]
> fsid = a2974742-3805-4cd3-bc79-765f2bddaefe
> mon_initial_members = lb1,lb2,nc1,nc2
> mon_host = 10.20.60.1,10.20.60.2,10.20.50.2,10.20.50.3
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
> filestore_xattr_use_omap = true
>
> osd pool default pg num = 128
> osd pool default pgp num = 128
>
> public network = 10.20.0.0/16
>
> osd_op_threads = 12
> osd_op_num_threads_per_shard = 2
> osd_op_num_shards = 6
> #osd_op_num_sharded_pool_threads = 25
> filestore_op_threads = 12
> ms_nocrc = true
> filestore_fd_cache_size = 64
> filestore_fd_cache_shards = 32
> ms_dispatch_throttle_bytes = 0
> throttler_perf_counter = false
>
> mon osd min down reporters = 25
>
> [osd]
> osd scrub max interval = 1209600
> osd scrub min interval = 604800
> osd scrub load threshold = 3.0
> osd max backfills = 1
> osd recovery max active = 1
> # IO Scheduler settings
> osd scrub sleep = 1.0
> osd disk thread ioprio class = idle
> osd disk thread ioprio priority = 7
> osd scrub chunk max = 1
> osd scrub chunk min = 1
> osd deep scrub stride = 1048576
> filestore queue max ops = 10000
> filestore max sync interval = 30
> filestore min sync interval = 29
>
> osd deep scrub interval = 2592000
> osd heartbeat grace = 240
> osd heartbeat interval = 12
> osd mon report interval max = 120
> osd mon report interval min = 5
>
> osd_client_message_size_cap = 0
> osd_client_message_cap = 0
> osd_enable_op_tracker = false
>
> osd crush update on start = false
>
> [client]
> rbd cache = true
> rbd cache size = 67108864 # 64mb
> rbd cache max dirty = 50331648 # 48mb
> rbd cache target dirty = 33554432 # 32mb
> rbd cache writethrough until flush = true # It's by default
> rbd cache max dirty age = 2
> admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok
>
> Br,
> Tuomas
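Regarding the thread-contention and heartbeat points quoted above: before changing ceph.conf it may be worth confirming what the OSDs are actually running with, and experimenting at runtime first. A rough sketch (osd.0 is just an example id, and the grace value is only illustrative, not a recommendation; as far as I know the shard/thread settings only take effect after an OSD restart):

    # On the node hosting the OSD: show effective settings via the admin socket
    ceph daemon osd.0 config show | grep -E 'op_threads|op_num_shards|heartbeat'
    # Runtime experiment with a longer heartbeat grace (reverts on restart)
    ceph tell osd.* injectargs '--osd_heartbeat_grace 60'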
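And regarding the scrub settings in the quoted config: to confirm that scrubbing really is the trigger, it can help to pause scrubs cluster-wide while watching for flapping. A minimal sketch (the flags affect the whole cluster, so don't leave them set long-term):

    # Stop scheduling new scrubs / deep scrubs
    ceph osd set noscrub
    ceph osd set nodeep-scrub
    # Watch whether OSDs still get marked down with scrubs paused
    ceph -w
    # Re-enable scrubbing afterwards
    ceph osd unset noscrub
    ceph osd unset nodeep-scrub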
_______________________________________________ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com