Hi!

I once faced similar behavior on my home cluster. When my OSDs went down, I
noticed that the node was using swap despite having sufficient memory. Tuning
/proc/sys/vm/swappiness to 0 solved the problem.
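
For reference, the usual way to apply that with sysctl (standard Linux
commands, adjust for your distro):

    # check the current value
    cat /proc/sys/vm/swappiness
    # set it to 0 on the running system
    sysctl -w vm.swappiness=0
    # persist it across reboots
    echo 'vm.swappiness = 0' >> /etc/sysctl.conf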

Fri, 7 Aug 2015 at 20:41, Tuomas Juntunen <tuomas.juntu...@databasement.fi>:

> Thanks
>
>
>
> We play with the values a bit and see what happens.
>
>
>
> Br,
>
> Tuomas
>
>
>
>
>
> *From:* Quentin Hartman [mailto:qhart...@direwolfdigital.com]
> *Sent:* 7 August 2015 20:32
> *To:* Tuomas Juntunen
> *Cc:* ceph-users
> *Subject:* Re: [ceph-users] Flapping OSD's when scrubbing
>
>
>
> That kind of behavior is usually caused by the OSDs getting busy enough
> that they aren't answering heartbeats in a timely fashion. It can also
> happen if you have any network flakiness and heartbeats are getting lost
> because of that.
>
>
>
> I think (though I'm not positive) that increasing your heartbeat interval
> may help. Also, the number of threads you have configured per OSD seems
> potentially problematic. If you've got 24 OSDs per machine and each one is
> running 12 op threads, that's 288 threads on 12 cores just for the
> requests, plus the disk threads, plus the filestore op threads... That
> level of thread contention seems like it might be contributing to missed
> heartbeats. But again, that's conjecture; I've not worked with a setup as
> dense as yours.
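>
> As a rough, untested sketch of the direction I mean (these values are
> purely illustrative, not a recommendation, and would need testing against
> your workload):
>
>     [global]
>     # move back toward the defaults so 24 OSDs aren't fighting over 12 cores
>     osd_op_threads = 2
>     filestore_op_threads = 2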
>
>
>
> QH
>
>
>
> On Fri, Aug 7, 2015 at 11:21 AM, Tuomas Juntunen <
> tuomas.juntu...@databasement.fi> wrote:
>
> Hi
>
>
>
> We are experiencing an annoying problem where scrubs make OSDs flap down
> and leave the Ceph cluster unusable for a couple of minutes.
>
>
>
> Our cluster consists of three nodes connected with 40 Gbit InfiniBand
> using IPoIB, each with 2x 6-core X5670 CPUs and 64 GB of memory.
>
> Each node has 6 SSDs serving as journals for 12 OSDs on 2 TB disks (Fast
> pools), plus another 12 OSDs on 4 TB disks (Archive pools) which keep
> their journals on the same disk.
>
>
>
> It seems that our cluster is constantly scrubbing; we rarely see only
> active+clean. Below is the status at the moment.
>
>
>
>     cluster a2974742-3805-4cd3-bc79-765f2bddaefe
>
>      health HEALTH_OK
>
>      monmap e16: 4 mons at {lb1=
> 10.20.60.1:6789/0,lb2=10.20.60.2:6789/0,nc1=10.20.50.2:6789/0,nc2=10.20.50.3:6789/0
> }
>
>             election epoch 1838, quorum 0,1,2,3 nc1,nc2,lb1,lb2
>
>      mdsmap e7901: 1/1/1 up {0=lb1=up:active}, 4 up:standby
>
>      osdmap e104824: 72 osds: 72 up, 72 in
>
>       pgmap v12941402: 5248 pgs, 9 pools, 19644 GB data, 4810 kobjects
>
>             59067 GB used, 138 TB / 196 TB avail
>
>                 5241 active+clean
>
>                    7 active+clean+scrubbing
>
>
>
> When the OSDs go down, the load on a node first goes high during
> scrubbing, then some OSDs go down and come back up about 30 seconds later.
> They are not really going down, but are marked as down. It then takes
> around a couple of minutes for everything to be OK again.
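>
> (One way to double-check that the OSDs are only being marked down rather
> than crashing is to grep the OSD logs on an affected node, e.g., assuming
> the default log locations:
>
>     grep heartbeat_check /var/log/ceph/ceph-osd.*.log
>     grep "wrongly marked me down" /var/log/ceph/ceph-osd.*.log
>
> the first shows missed heartbeats around the event, the second shows OSDs
> that the monitors marked down while they were still running.)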
>
>
>
> Any suggestions on how to fix this? We can't go to production while this
> behavior exists.
>
>
>
> Our config is below:
>
>
>
> [global]
>
> fsid = a2974742-3805-4cd3-bc79-765f2bddaefe
>
> mon_initial_members = lb1,lb2,nc1,nc2
>
> mon_host = 10.20.60.1,10.20.60.2,10.20.50.2,10.20.50.3
>
> auth_cluster_required = cephx
>
> auth_service_required = cephx
>
> auth_client_required = cephx
>
> filestore_xattr_use_omap = true
>
>
>
> osd pool default pg num = 128
>
> osd pool default pgp num = 128
>
>
>
> public network = 10.20.0.0/16
>
>
>
>         osd_op_threads = 12
>
>         osd_op_num_threads_per_shard = 2
>
>         osd_op_num_shards = 6
>
>         #osd_op_num_sharded_pool_threads = 25
>
>         filestore_op_threads = 12
>
>         ms_nocrc = true
>
>         filestore_fd_cache_size = 64
>
>         filestore_fd_cache_shards = 32
>
>         ms_dispatch_throttle_bytes = 0
>
>         throttler_perf_counter = false
>
>
>
> mon osd min down reporters = 25
>
>
>
> [osd]
>
> osd scrub max interval = 1209600
>
> osd scrub min interval = 604800
>
> osd scrub load threshold = 3.0
>
> osd max backfills = 1
>
> osd recovery max active = 1
>
> # IO Scheduler settings
>
> osd scrub sleep = 1.0
>
> osd disk thread ioprio class = idle
>
> osd disk thread ioprio priority = 7
>
> osd scrub chunk max = 1
>
> osd scrub chunk min = 1
>
> osd deep scrub stride = 1048576
>
> filestore queue max ops = 10000
>
> filestore max sync interval = 30
>
> filestore min sync interval = 29
>
>
>
> osd deep scrub interval = 2592000
>
>         osd heartbeat grace = 240
>
>         osd heartbeat interval = 12
>
>         osd mon report interval max = 120
>
>         osd mon report interval min = 5
>
>
>
>        osd_client_message_size_cap = 0
>
>         osd_client_message_cap = 0
>
>         osd_enable_op_tracker = false
>
>
>
>         osd crush update on start = false
>
>
>
> [client]
>
>         rbd cache = true
>
>         rbd cache size = 67108864 # 64mb
>
>         rbd cache max dirty = 50331648 # 48mb
>
>         rbd cache target dirty = 33554432 # 32mb
>
> rbd cache writethrough until flush = true # this is the default
>
>         rbd cache max dirty age = 2
>
>         admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok
>
>
>
>
>
> Br,
>
> Tuomas
>
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
