During high-load testing I'm only seeing combined user and sys CPU load around 60%, so the load on the hosts doesn't look like anything crazy, and iowait stays between 6 and 10%. My `ceph osd perf` numbers look very good too.
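For anyone who wants to reproduce that view, something like this on each OSD node during the fio runs gives the same picture (a rough sketch; the intervals and the sort column are arbitrary choices, and sar needs the sysstat package installed):

    sar -u 5 12                                # user/sys/iowait breakdown over about a minute
    iostat -x 5                                # per-disk utilisation and await
    ceph osd perf | sort -n -k3 | tail -20     # 20 worst OSDs by fs_apply_latency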
I am using 10.2.11 Jewel. (My follow-up on the deep-scrub suspicion is at the bottom, below the quoted message.)

On Wed, Aug 22, 2018 at 11:30 PM Christian Balzer <ch...@gol.com> wrote:

> Hello,
>
> On Wed, 22 Aug 2018 23:00:24 -0400 Tyler Bishop wrote:
>
> > Hi, I've been fighting to get good stability on my cluster for about
> > 3 weeks now. I am running into intermittent issues with OSDs flapping,
> > marking other OSDs down, then going back to a stable state for hours
> > and days.
> >
> > The cluster is 4x Cisco UCS S3260 with dual E5-2660, 256GB RAM, and 40G
> > networking to 40G Brocade VDX switches. The OSDs are 6TB HGST SAS drives
> > with 400GB HGST SAS 12G SSDs. My configuration is 4 journals per host
> > with 12 disks per journal, for a total of 56 disks per system and 52 OSDs.
> >
> Any denser and you'd have a storage black hole.
>
> You already pointed your finger in the (or at least one) right direction
> and everybody will agree that this setup is woefully underpowered in the
> CPU department.
>
> > I am using CentOS 7 with kernel 3.10 and the Red Hat tuned-adm profile
> > for throughput-performance enabled.
> >
> Ceph version would be interesting as well...
>
> > I have these sysctls set:
> >
> > kernel.pid_max = 4194303
> > fs.file-max = 6553600
> > vm.swappiness = 0
> > vm.vfs_cache_pressure = 50
> > vm.min_free_kbytes = 3145728
> >
> > I feel like my issue is directly related to the high number of OSDs per
> > host, but I'm not sure what issue I'm really running into. I believe I
> > have ruled out network issues: I can get 38Gbit consistently via iperf
> > testing, and jumbo-frame pings succeed with the don't-fragment flag set
> > and an 8972-byte packet size.
> >
> The fact that it all works for days at a time suggests this as well, but
> you need to verify these things when they're happening.
>
> > From fio testing I seem to be able to get 150-200k write IOPS from my
> > RBD clients on 1Gbit networking... This is about what I expected due to
> > the write penalty and my underpowered CPUs for the number of OSDs.
> >
> > I get these messages, which I believe are normal?
> > 2018-08-22 10:33:12.754722 7f7d009f5700 0 -- 10.20.136.8:6894/718902
> > >> 10.20.136.10:6876/490574 pipe(0x55aed77fd400 sd=192 :40502 s=2
> > pgs=1084 cs=53 l=0 c=0x55aed805bc80).fault with nothing to send, going
> > to standby
> >
> Ignore.
>
> > Then randomly I'll get a storm of this every few days for 20 minutes
> > or so:
> > 2018-08-22 15:48:32.631186 7f44b7514700 -1 osd.127 37333
> > heartbeat_check: no reply from 10.20.142.11:6861 osd.198 since back
> > 2018-08-22 15:48:08.052762 front 2018-08-22 15:48:31.282890 (cutoff
> > 2018-08-22 15:48:12.630773)
> >
> Randomly is unlikely.
> Again, catch it in the act; atop in huge terminal windows (showing all
> CPUs and disks) for all nodes should be very telling. Collecting and
> graphing this data might work, too.
>
> My suspects would be deep scrubs and/or high IOPS spikes when this is
> happening, starving out OSD processes (CPU-wise; RAM should be fine, one
> supposes).
>
> Christian
>
> > Please help!!!
>
> --
> Christian Balzer        Network/Systems Engineer
> ch...@gol.com           Rakuten Communications
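On the deep-scrub theory: the obvious next step is to check whether deep scrubs line up with the heartbeat_check window and, if they do, throttle scrubbing for a test run. Roughly this (a sketch only; the grep window matches the log excerpt above, the injected values are guesses rather than recommendations, and on Jewel some scrub settings may need an OSD restart to actually take effect):

    # do deep scrubs coincide with the 15:48 storm window in the cluster log?
    grep deep-scrub /var/log/ceph/ceph.log | grep '2018-08-22 15:4'

    # if so, throttle scrubbing cluster-wide and watch whether the flapping stops
    ceph tell osd.* injectargs '--osd_max_scrubs 1 --osd_scrub_sleep 0.1'

(osd_max_scrubs already defaults to 1, so the scrub sleep is the more interesting knob here.)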