Hi,

On one of our test clusters, I have a node with 4 OSDs on SAS / non-SSD drives (sdb, sdc, sdd, sde) and 2 SSD drives (sdf and sdg) serving as journals for the 4 OSDs (2 journals per SSD). The SSDs are partitioned as follows:
Model: ATA ST100FM0012 (scsi)
Disk /dev/sdf: 100GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt

Number  Start   End     Size    File system  Name          Flags
 1      1049kB  10.7GB  10.7GB               ceph journal
 2      10.7GB  21.5GB  10.7GB               ceph journal

Model: ATA ST100FM0012 (scsi)
Disk /dev/sdg: 100GB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt

Number  Start   End     Size    File system  Name          Flags
 1      1049kB  10.7GB  10.7GB               ceph journal
 2      10.7GB  21.5GB  10.7GB               ceph journal

When I ran a rados bench test, I noticed that the two SSD journal drives are constantly overloaded with I/O requests, so performance is very bad. A "ceph tell osd.X bench" test gives only around 25 MB/s of throughput, and iostat shows the two SSD journal drives saturated:

====
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.41    0.00    1.65    3.00    0.00   92.93

Device:  rrqm/s  wrqm/s    r/s     w/s   rkB/s     wkB/s  avgrq-sz  avgqu-sz   await  r_await  w_await  svctm     %util
sda        0.00    0.00   0.00    0.00    0.00      0.00      0.00      0.00    0.00     0.00     0.00   0.00      0.00
sdc        0.00    5.67   3.33  166.00   22.67  14923.50    176.53     25.16  148.58     9.60   151.37   2.39     40.40
sdb        0.00    0.00   2.00    0.33   13.33      2.17     13.29      0.01    6.29     4.67    16.00   5.14      1.20
sdd        0.00    0.00   2.00   20.33   10.67   1526.33    137.64      0.12    5.49    10.00     5.05   4.72     10.53
sde        0.00    5.00   4.67   67.67   34.67   5837.00    162.35      8.92  124.06    14.00   131.65   2.49     18.00
sdg        0.00    0.00   0.00   54.67    0.00  25805.33    944.10     36.41  655.88     0.00   655.88  18.29  *100.00*
sdf        0.00    0.00   0.00   53.67    0.00  25252.00    941.07     35.61  636.07     0.00   636.07  18.63  *100.00*

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.01    0.00    1.28    2.15    0.00   94.56

Device:  rrqm/s  wrqm/s    r/s     w/s   rkB/s     wkB/s  avgrq-sz  avgqu-sz   await  r_await  w_await  svctm     %util
sda        0.00    0.00   0.00    0.00    0.00      0.00      0.00      0.00    0.00     0.00     0.00   0.00      0.00
sdc        0.00    0.00   2.67   14.33   13.33      4.17      2.06      0.03    2.12     8.50     0.93   1.33      2.27
sdb        0.00    0.00   2.00   12.33   10.67   4130.00    577.77      0.09    6.33     4.00     6.70   4.09      5.87
sdd        0.00    0.00   2.33   36.67   18.67  12425.17    638.15      3.77   96.58     9.14   102.15   3.93     15.33
sde        0.00    0.33   1.67  104.33    9.33  11484.00    216.86     11.96  161.61    33.60   163.65   2.93     31.07
sdg        0.00    0.00   0.00   54.33    0.00  25278.67    930.50     33.55  644.54     0.00   644.54  18.38   *99.87*
sdf        0.00    0.00   0.00   58.33    0.00  25493.33    874.06     22.67  422.26     0.00   422.26  17.10   *99.73*

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.19    0.00    1.60    3.95    0.00   92.26

Device:  rrqm/s  wrqm/s    r/s     w/s   rkB/s     wkB/s  avgrq-sz  avgqu-sz   await  r_await  w_await  svctm     %util
sda        0.00    0.00   0.00    0.00    0.00      0.00      0.00      0.00    0.00     0.00     0.00   0.00      0.00
sdc        0.00    0.00   2.67    9.00   17.33   1431.00    248.29      0.07    6.17     8.00     5.63   5.03      5.87
sdb        0.00    0.33   1.67   88.33    8.00  30435.33    676.52     17.16  139.64    85.60   140.66   3.30     29.73
sdd        0.00    0.00   2.33   17.33   13.33   3040.17    310.53      0.11    5.42     7.43     5.15   4.47      8.80
sde        0.00    0.00   2.67    7.67   14.67   2767.00    538.39      0.08    8.00     8.50     7.83   5.16      5.33
sdg        0.00    0.00   0.00   60.00    0.00  24841.33    828.04     21.26  332.27     0.00   332.27  16.51     99.07
sdf        0.00    0.00   0.00   56.33    0.00  24365.33    865.04     25.00  449.21     0.00   449.21  17.70     99.73
====

Can anyone advise what the problem could be? The SSDs are 100 GB each and the journal size is only 10 GB, so the two journals occupy only 20 GB of each disk. The server uses SATA 3 connectors. I am not able to do a dd test on the SSDs since they are not mounted as filesystems, but dd on the OSD (non-SSD) drives gives normal results. I am running Ceph v0.67.7, the latest stable Dumpling release.

Any advice is appreciated. Looking forward to your reply, thank you.

Cheers.
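P.S. In case anyone wants to reproduce this, the benchmark and monitoring commands I ran were roughly of the following form (the pool name, OSD id, duration, and iostat interval below are placeholders, not necessarily the exact values I used):

====
# 4 MB object writes against a test pool for 60 seconds (pool name is a placeholder)
rados bench -p testpool 60 write

# per-OSD write benchmark (the test that reported ~25 MB/s)
ceph tell osd.0 bench

# extended per-device stats in kB, refreshed every 3 seconds
iostat -xk 3
====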
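P.P.S. On the dd point: if it would help, I could still run a read-only dd directly against the raw SSD device even though it has no filesystem, something like the sketch below (the device, block size, and count are just illustrative). A proper write test would need a spare partition or a stopped and flushed journal, which I have not tried yet.

====
# read 1 GB straight from the raw SSD, bypassing the page cache
dd if=/dev/sdf of=/dev/null bs=4M count=256 iflag=direct
====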