Marc,

As with you, this problem manifests itself only when a bluestore OSD is involved in some form of deep scrub. Does anybody have any insight into what might be causing this?
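In case anybody else wants to check the correlation on their own cluster, here is a rough sketch using only the standard CLI and admin-socket commands (osd.219 is just the example id from my earlier mail):

    # list PGs that are currently scrubbing or deep-scrubbing, with their acting OSDs
    ceph pg dump pgs_brief 2>/dev/null | grep -i scrub

    # on the host of the flagged OSD, look at what the in-flight and recent ops are waiting on
    ceph daemon osd.219 dump_ops_in_flight
    ceph daemon osd.219 dump_historic_ops

If the flagged OSD shows up in the acting set of a deep-scrubbing PG every time the warning fires, that would at least confirm the scrub connection.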
-Brett

On Mon, Sep 3, 2018 at 4:13 AM, Marc Schöchlin <m...@256bit.org> wrote:
> Hi,
>
> we have also been experiencing this type of behavior for some weeks on our
> not so performance-critical hdd pools.
> We haven't spent much time on this problem yet, because there are currently
> more important tasks - but here are a few details:
>
> Running the following loop produces the following output:
>
> while true; do ceph health | grep -q HEALTH_OK || (date; ceph health detail); sleep 2; done
>
> Sun Sep 2 20:59:47 CEST 2018
> HEALTH_WARN 4 slow requests are blocked > 32 sec
> REQUEST_SLOW 4 slow requests are blocked > 32 sec
>     4 ops are blocked > 32.768 sec
>     osd.43 has blocked requests > 32.768 sec
> Sun Sep 2 20:59:50 CEST 2018
> HEALTH_WARN 4 slow requests are blocked > 32 sec
> REQUEST_SLOW 4 slow requests are blocked > 32 sec
>     4 ops are blocked > 32.768 sec
>     osd.43 has blocked requests > 32.768 sec
> Sun Sep 2 20:59:52 CEST 2018
> HEALTH_OK
> Sun Sep 2 21:00:28 CEST 2018
> HEALTH_WARN 1 slow requests are blocked > 32 sec
> REQUEST_SLOW 1 slow requests are blocked > 32 sec
>     1 ops are blocked > 32.768 sec
>     osd.41 has blocked requests > 32.768 sec
> Sun Sep 2 21:00:31 CEST 2018
> HEALTH_WARN 7 slow requests are blocked > 32 sec
> REQUEST_SLOW 7 slow requests are blocked > 32 sec
>     7 ops are blocked > 32.768 sec
>     osds 35,41 have blocked requests > 32.768 sec
> Sun Sep 2 21:00:33 CEST 2018
> HEALTH_WARN 7 slow requests are blocked > 32 sec
> REQUEST_SLOW 7 slow requests are blocked > 32 sec
>     7 ops are blocked > 32.768 sec
>     osds 35,51 have blocked requests > 32.768 sec
> Sun Sep 2 21:00:35 CEST 2018
> HEALTH_WARN 7 slow requests are blocked > 32 sec
> REQUEST_SLOW 7 slow requests are blocked > 32 sec
>     7 ops are blocked > 32.768 sec
>     osds 35,51 have blocked requests > 32.768 sec
>
> Our details:
>
> * System details:
>   * Ubuntu 16.04
>   * Kernel 4.13.0-39
>   * 30 * 8 TB disks (SEAGATE/ST8000NM0075)
>   * 3 * Dell PowerEdge R730xd (firmware 2.50.50.50)
>   * Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
>   * 2 * 10 GBit/s SFP+ network adapters
>   * 192 GB RAM
> * Pools are using replication factor 3, 2 MB object size, 85% write load,
>   1700 write IOPS (ops mainly between 4k and 16k size), 300 read IOPS
> * We have the impression that this appears during deep-scrub/scrub activity.
> * Ceph 12.2.5; we already played with the following OSD settings
>   (our assumption was that the problem is related to rocksdb compaction):
>     bluestore cache kv max = 2147483648
>     bluestore cache kv ratio = 0.9
>     bluestore cache meta ratio = 0.1
>     bluestore cache size hdd = 10737418240
> * This type of problem only appears on hdd/bluestore OSDs; ssd/bluestore
>   OSDs have never experienced it.
> * The system is healthy: no swapping, no high load, no errors in dmesg.
>
> I attached a log excerpt of osd.35 - it is probably useful for investigating
> the problem if someone has deeper bluestore knowledge.
> (The slow requests appeared on Sun Sep 2 21:00:35.)
>
> Regards
> Marc
>
>
> On 02.09.2018 at 15:50, Brett Chancellor wrote:
> > The warnings look like this.
> >
> > 6 ops are blocked > 32.768 sec on osd.219
> > 1 osds have slow requests
> >
> > On Sun, Sep 2, 2018, 8:45 AM Alfredo Deza <ad...@redhat.com> wrote:
> >
> > > On Sat, Sep 1, 2018 at 12:45 PM, Brett Chancellor
> > > <bchancel...@salesforce.com> wrote:
> > > > Hi Cephers,
> > > > I am in the process of upgrading a cluster from Filestore to bluestore,
> > > > but I'm concerned about frequent warnings popping up against the new
> > > > bluestore devices. I'm frequently seeing messages like this; although
> > > > the specific OSD changes, it's always one of the few hosts I've
> > > > converted to bluestore.
> > > >
> > > > 6 ops are blocked > 32.768 sec on osd.219
> > > > 1 osds have slow requests
> > > >
> > > > I'm running 12.2.4 - have any of you seen similar issues? It seems as
> > > > though these messages pop up more frequently when one of the bluestore
> > > > PGs is involved in a scrub. I'll include my bluestore creation process
> > > > below, in case that might cause an issue. (sdb, sdc, sdd are SATA;
> > > > sde and sdf are SSD.)
> > >
> > > Would be useful to include what those warnings say. The ceph-volume
> > > commands look OK to me.
> > >
> > > > ## Process used to create osds
> > > > sudo ceph-disk zap /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf
> > > > sudo ceph-volume lvm zap /dev/sdb
> > > > sudo ceph-volume lvm zap /dev/sdc
> > > > sudo ceph-volume lvm zap /dev/sdd
> > > > sudo ceph-volume lvm zap /dev/sde
> > > > sudo ceph-volume lvm zap /dev/sdf
> > > > sudo sgdisk -n 0:2048:+133GiB -t 0:FFFF -c 1:"ceph block.db sdb" /dev/sdf
> > > > sudo sgdisk -n 0:0:+133GiB -t 0:FFFF -c 2:"ceph block.db sdc" /dev/sdf
> > > > sudo sgdisk -n 0:0:+133GiB -t 0:FFFF -c 3:"ceph block.db sdd" /dev/sdf
> > > > sudo sgdisk -n 0:0:+133GiB -t 0:FFFF -c 4:"ceph block.db sde" /dev/sdf
> > > > sudo ceph-volume lvm create --bluestore --crush-device-class hdd --data /dev/sdb --block.db /dev/sdf1
> > > > sudo ceph-volume lvm create --bluestore --crush-device-class hdd --data /dev/sdc --block.db /dev/sdf2
> > > > sudo ceph-volume lvm create --bluestore --crush-device-class hdd --data /dev/sdd --block.db /dev/sdf3
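For what it's worth, below are the scrub-throttling options that usually get suggested when scrubs hurt HDD-backed bluestore OSDs. This is only a sketch - the values are illustrative, not tested recommendations for either of our clusters:

    # apply at runtime to all OSDs; persist under [osd] in ceph.conf if it helps
    ceph tell osd.* injectargs '--osd_scrub_sleep 0.1'            # sleep between scrub chunks
    ceph tell osd.* injectargs '--osd_scrub_chunk_max 5'          # fewer objects per scrub chunk
    ceph tell osd.* injectargs '--osd_deep_scrub_stride 1048576'  # larger reads during deep scrub

It wouldn't explain why only the bluestore HDD OSDs are affected, but it might keep the slow-request warnings down while the root cause gets figured out.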
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com