Besides checking what David told you, you can tune the scrub operation (your ceph -s shows 2 deep scrub operations in progress, which could have an impact on your user traffic). For instance, you could set the following parameters:

osd scrub chunk max = 5
osd scrub chunk min = 1
osd scrub sleep = 0.1

You can display the current values for an OSD using the "ceph -n osd.xy --show-config" command and check the current values of the above parameters there.
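For example (untested, and assuming osd.0 is one of your OSDs - adjust the id), something along these lines should let you check the current values and change them at runtime without restarting the daemons:

# show the current scrub settings of one OSD (run on the node hosting osd.0)
ceph -n osd.0 --show-config | grep -E 'osd_scrub_(chunk_max|chunk_min|sleep)'

# push the new values to all running OSDs
ceph tell osd.* injectargs '--osd_scrub_chunk_max 5 --osd_scrub_chunk_min 1 --osd_scrub_sleep 0.1'

Keep in mind that injectargs changes are not persistent (and for some options it may warn that the change is not observed until restart), so also put the same values in the [osd] section of ceph.conf if you want to keep them across restarts.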
Kind regards,
Laszlo

On 25.03.2018 17:30, David Turner wrote:
> I recommend that people check their disk controller caches/batteries as well
> as checking for subfolder splitting on filestore (which is the only option on
> Jewel). The former leads to high await, the latter contributes to blocked
> requests.
>
> On Sun, Mar 25, 2018, 3:36 AM Sam Huracan <nowitzki.sa...@gmail.com> wrote:
>
> Thank you all.
>
> 1. Here is my ceph.conf file:
> https://pastebin.com/xpF2LUHs
>
> 2. Here is the result of ceph -s:
> root@ceph1:/etc/ceph# ceph -s
>     cluster 31154d30-b0d3-4411-9178-0bbe367a5578
>      health HEALTH_OK
>      monmap e3: 3 mons at {ceph1=10.0.30.51:6789/0,ceph2=10.0.30.52:6789/0,ceph3=10.0.30.53:6789/0}
>             election epoch 18, quorum 0,1,2 ceph1,ceph2,ceph3
>      osdmap e2473: 63 osds: 63 up, 63 in
>             flags sortbitwise,require_jewel_osds
>       pgmap v34069952: 4096 pgs, 6 pools, 21534 GB data, 5696 kobjects
>             59762 GB used, 135 TB / 194 TB avail
>                 4092 active+clean
>                    2 active+clean+scrubbing
>                    2 active+clean+scrubbing+deep
>   client io 36096 kB/s rd, 41611 kB/s wr, 1643 op/s rd, 1634 op/s wr
>
> 3. We use 1 SSD for journaling 7 HDDs (/dev/sdi); I set 16 GB for each
> journal. Here is the result of the ceph-disk list command:
>
> /dev/sda :
>  /dev/sda1 ceph data, active, cluster ceph, osd.0, journal /dev/sdi1
> /dev/sdb :
>  /dev/sdb1 ceph data, active, cluster ceph, osd.1, journal /dev/sdi2
> /dev/sdc :
>  /dev/sdc1 ceph data, active, cluster ceph, osd.2, journal /dev/sdi3
> /dev/sdd :
>  /dev/sdd1 ceph data, active, cluster ceph, osd.3, journal /dev/sdi4
> /dev/sde :
>  /dev/sde1 ceph data, active, cluster ceph, osd.4, journal /dev/sdi5
> /dev/sdf :
>  /dev/sdf1 ceph data, active, cluster ceph, osd.5, journal /dev/sdi6
> /dev/sdg :
>  /dev/sdg1 ceph data, active, cluster ceph, osd.6, journal /dev/sdi7
> /dev/sdh :
>  /dev/sdh3 other, LVM2_member
>  /dev/sdh1 other, vfat, mounted on /boot/efi
> /dev/sdi :
>  /dev/sdi1 ceph journal, for /dev/sda1
>  /dev/sdi2 ceph journal, for /dev/sdb1
>  /dev/sdi3 ceph journal, for /dev/sdc1
>  /dev/sdi4 ceph journal, for /dev/sdd1
>  /dev/sdi5 ceph journal, for /dev/sde1
>  /dev/sdi6 ceph journal, for /dev/sdf1
>  /dev/sdi7 ceph journal, for /dev/sdg1
>
> 4. With iostat, we just run "iostat -x 2"; /dev/sdi is the journal SSD,
> /dev/sdh is the OS disk, and the rest are OSD disks.
> root@ceph1:/etc/ceph# lsblk
> NAME                            MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
> sda                               8:0    0   3.7T  0 disk
> └─sda1                            8:1    0   3.7T  0 part /var/lib/ceph/osd/ceph-0
> sdb                               8:16   0   3.7T  0 disk
> └─sdb1                            8:17   0   3.7T  0 part /var/lib/ceph/osd/ceph-1
> sdc                               8:32   0   3.7T  0 disk
> └─sdc1                            8:33   0   3.7T  0 part /var/lib/ceph/osd/ceph-2
> sdd                               8:48   0   3.7T  0 disk
> └─sdd1                            8:49   0   3.7T  0 part /var/lib/ceph/osd/ceph-3
> sde                               8:64   0   3.7T  0 disk
> └─sde1                            8:65   0   3.7T  0 part /var/lib/ceph/osd/ceph-4
> sdf                               8:80   0   3.7T  0 disk
> └─sdf1                            8:81   0   3.7T  0 part /var/lib/ceph/osd/ceph-5
> sdg                               8:96   0   3.7T  0 disk
> └─sdg1                            8:97   0   3.7T  0 part /var/lib/ceph/osd/ceph-6
> sdh                               8:112  0 278.9G  0 disk
> ├─sdh1                            8:113  0   512M  0 part /boot/efi
> └─sdh3                            8:115  0 278.1G  0 part
>   ├─hnceph--hdd1--vg-swap (dm-0) 252:0   0  59.6G  0 lvm  [SWAP]
>   └─hnceph--hdd1--vg-root (dm-1) 252:1   0 218.5G  0 lvm  /
> sdi                               8:128  0 185.8G  0 disk
> ├─sdi1                            8:129  0  16.6G  0 part
> ├─sdi2                            8:130  0  16.6G  0 part
> ├─sdi3                            8:131  0  16.6G  0 part
> ├─sdi4                            8:132  0  16.6G  0 part
> ├─sdi5                            8:133  0  16.6G  0 part
> ├─sdi6                            8:134  0  16.6G  0 part
> └─sdi7                            8:135  0  16.6G  0 part
>
> Could you give me some ideas to continue checking?
>
> 2018-03-25 12:25 GMT+07:00 Budai Laszlo <laszlo.bu...@gmail.com>:
>
> Could you post the result of "ceph -s"? Besides the health status
> there are other details that could help, like the status of your PGs. Also,
> the result of "ceph-disk list" would be useful to understand how your disks
> are organized. For instance, with 1 SSD for 7 HDDs the SSD could be the
> bottleneck.
> From the outputs you gave us we don't know which are the spinning
> disks and which is the SSD (looking at the numbers I suspect that sdi is your
> SSD). We also don't know what parameters you were using when you ran the
> iostat command.
>
> Unfortunately it's difficult to help you without knowing more about
> your system.
>
> Kind regards,
> Laszlo
>
> On 24.03.2018 20:19, Sam Huracan wrote:
> > This is from iostat:
> >
> > I'm using Ceph Jewel; there are no HW errors.
> > Ceph health is OK, we've only used 50% of the total volume.
> >
> > 2018-03-24 22:20 GMT+07:00 <c...@elchaka.de>:
> >
> > I would also check the utilization of your disks with tools like atop.
> > Perhaps something related in dmesg or the like?
> >
> > - Mehmet
> >
> > On 24 March 2018 08:17:44 CET, Sam Huracan <nowitzki.sa...@gmail.com> wrote:
> >
> > Hi guys,
> > We are running a production OpenStack backed by Ceph.
> >
> > At present, we are facing an issue with high iowait in VMs: in some MySQL VMs
> > we see that iowait sometimes reaches abnormally high peaks, which leads to an
> > increase in slow queries, even though the load is stable (we test with a script
> > simulating real load), as you can see in the graph.
> > https://prnt.sc/ivndni
> >
> > The MySQL VMs are placed on the Ceph HDD cluster, with 1 SSD journal
> > for 7 HDDs. In this cluster, iowait on each ceph host is about 20%.
> > https://prnt.sc/ivne08
> >
> > Can you guys help me find the root cause of this issue, and
> > how to eliminate this high iowait?
> >
> > Thanks in advance.
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com