Hi Irek,

Good day to you, and thank you for your e-mail.
Is there a better way than patching the kernel? I would like to avoid having to compile a custom kernel for my OS. (My understanding of the patch + cache_type procedure is in a P.S. at the very end of this e-mail.)

I read that I can disable write-caching on the drives using hdparm:

hdparm -W0 /dev/sdf
hdparm -W0 /dev/sdg

I tested this on one of my test servers and it seems I can disable it with that command.

Current setup, write-caching is on:

====
root@ceph-osd-09:/home/indra# hdparm -W /dev/sdg

/dev/sdg:
 write-caching =  1 (on)
====

I tried to disable write-caching, and it was successful:

====
root@ceph-osd-09:/home/indra# hdparm -W0 /dev/sdg

/dev/sdg:
 setting drive write-caching to 0 (off)
 write-caching =  0 (off)
====

I checked again, and write-caching is now disabled:

====
root@ceph-osd-09:/home/indra# hdparm -W /dev/sdg

/dev/sdg:
 write-caching =  0 (off)
====

Would the above give the same result as the patch? If yes, I will try it on our running cluster tonight.

May I also know how I can confirm whether my SSD comes with a "volatile cache", as mentioned in your article? I checked my SSD's data sheet and there is no information on whether it has a volatile cache or not.

I have also read that disabling write-caching increases the risk of data loss. Can you comment on that?
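For reference, below is roughly what I plan to run on both journal SSDs to check and disable the cache. This is just a sketch: the device names are the ones from my examples above, and I am assuming that the "Write cache" entry in the hdparm -I feature list is the volatile cache your article refers to (as far as I understand, a drive with power-loss protection may still report that feature, so I will also double-check with the vendor).

====
#!/bin/sh
# Sketch: inspect and disable the write cache on both journal SSDs.
# Device names are from my test server; adjust as needed.
for dev in /dev/sdf /dev/sdg; do
    # Current write-cache state (same as the outputs shown above)
    hdparm -W "$dev"

    # Feature list; a "Write cache" line marked with '*' means the
    # drive has a write cache and it is currently enabled.
    hdparm -I "$dev" | grep -i 'write cache'

    # Disable the write cache (what I intend to do tonight)
    hdparm -W0 "$dev"
done
====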
Looking forward to your reply, thank you.

Cheers.


On Mon, Apr 28, 2014 at 7:49 PM, Irek Fasikhov <malm...@gmail.com> wrote:

> This is my article :).
> Apply the patch to the kernel
> (http://www.theirek.com/downloads/code/CMD_FLUSH.diff).
> After rebooting, run the following command:
>
> echo temporary write through > /sys/class/scsi_disk/<disk>/cache_type
>
>
> 2014-04-28 15:44 GMT+04:00 Indra Pramana <in...@sg.or.id>:
>
>> Hi Irek,
>>
>> Thanks for the article. Do you have any other web sources pertaining to
>> the same issue that are in English?
>>
>> Looking forward to your reply, thank you.
>>
>> Cheers.
>>
>>
>> On Mon, Apr 28, 2014 at 7:40 PM, Irek Fasikhov <malm...@gmail.com> wrote:
>>
>>> Most likely you need to apply a patch to the kernel.
>>>
>>> http://www.theirek.com/blog/2014/02/16/patch-dlia-raboty-s-enierghoniezavisimym-keshiem-ssd-diskov
>>>
>>>
>>> 2014-04-28 15:20 GMT+04:00 Indra Pramana <in...@sg.or.id>:
>>>
>>>> Hi Udo and Irek,
>>>>
>>>> Good day to you, and thank you for your e-mails.
>>>>
>>>> >perhaps due to IOs from the journal?
>>>> >You can test with iostat (like "iostat -dm 5 sdg").
>>>>
>>>> Yes, I have shared the iostat results earlier in this same thread. At
>>>> times the utilisation of the two journal drives hits 100%, especially
>>>> when I simulate writing data using the rados bench command. Any
>>>> suggestions on what could be the cause of the I/O issue?
>>>>
>>>> ====
>>>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>>>            1.85    0.00    1.65    3.14    0.00   93.36
>>>>
>>>> Device:  rrqm/s  wrqm/s    r/s    w/s   rkB/s     wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
>>>> sdg        0.00    0.00   0.00  55.00    0.00  25365.33   922.38    34.22  568.90    0.00  568.90  17.82  98.00
>>>> sdf        0.00    0.00   0.00  55.67    0.00  25022.67   899.02    29.76  500.57    0.00  500.57  17.60  98.00
>>>>
>>>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>>>            2.10    0.00    1.37    2.07    0.00   94.46
>>>>
>>>> Device:  rrqm/s  wrqm/s    r/s    w/s   rkB/s     wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
>>>> sdg        0.00    0.00   0.00  56.67    0.00  25220.00   890.12    23.60  412.14    0.00  412.14  17.62  99.87
>>>> sdf        0.00    0.00   0.00  52.00    0.00  24637.33   947.59    33.65  587.41    0.00  587.41  19.23 100.00
>>>>
>>>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>>>            2.21    0.00    1.77    6.75    0.00   89.27
>>>>
>>>> Device:  rrqm/s  wrqm/s    r/s    w/s   rkB/s     wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
>>>> sdg        0.00    0.00   0.00  54.33    0.00  24802.67   912.98    25.75  486.36    0.00  486.36  18.40 100.00
>>>> sdf        0.00    0.00   0.00  53.00    0.00  24716.00   932.68    35.26  669.89    0.00  669.89  18.87 100.00
>>>>
>>>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>>>            1.87    0.00    1.67    5.25    0.00   91.21
>>>>
>>>> Device:  rrqm/s  wrqm/s    r/s    w/s   rkB/s     wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
>>>> sdg        0.00    0.00   0.00  94.33    0.00  26257.33   556.69    18.29  208.44    0.00  208.44  10.50  99.07
>>>> sdf        0.00    0.00   0.00  51.33    0.00  24470.67   953.40    32.75  684.62    0.00  684.62  19.51 100.13
>>>>
>>>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>>>            1.51    0.00    1.34    7.25    0.00   89.89
>>>>
>>>> Device:  rrqm/s  wrqm/s    r/s    w/s   rkB/s     wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
>>>> sdg        0.00    0.00   0.00  52.00    0.00  22565.33   867.90    24.73  446.51    0.00  446.51  19.10  99.33
>>>> sdf        0.00    0.00   0.00  64.67    0.00  24892.00   769.86    19.50  330.02    0.00  330.02  15.32  99.07
>>>> ====
>>>>
>>>> >What model SSD?
>>>>
>>>> For this one, I am using a Seagate 100GB SSD, model HDS-2TM-ST100FM0012.
>>>>
>>>> >Which version of the kernel?
>>>>
>>>> Ubuntu 13.04, Linux kernel version: 3.8.0-19-generic #30-Ubuntu SMP Wed
>>>> May 1 16:35:23 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
>>>>
>>>> Looking forward to your reply, thank you.
>>>>
>>>> Cheers.
>>>>
>>>>
>>>> On Mon, Apr 28, 2014 at 4:45 PM, Irek Fasikhov <malm...@gmail.com> wrote:
>>>>
>>>>> What model SSD?
>>>>> Which version of the kernel?
>>>>>
>>>>>
>>>>> 2014-04-28 12:35 GMT+04:00 Udo Lembke <ulem...@polarzone.de>:
>>>>>
>>>>>> Hi,
>>>>>> perhaps due to IOs from the journal?
>>>>>> You can test with iostat (like "iostat -dm 5 sdg").
>>>>>>
>>>>>> On Debian, iostat is in the package sysstat.
>>>>>>
>>>>>> Udo
>>>>>>
>>>>>> On 28.04.2014 07:38, Indra Pramana wrote:
>>>>>> > Hi Craig,
>>>>>> >
>>>>>> > Good day to you, and thank you for your enquiry.
>>>>>> >
>>>>>> > As per your suggestion, I have created a 3rd partition on the SSDs
>>>>>> > and did the dd test directly on the device, and the result is very
>>>>>> > slow.
>>>>>> >
>>>>>> > ====
>>>>>> > root@ceph-osd-08:/mnt# dd bs=1M count=128 if=/dev/zero of=/dev/sdg3
>>>>>> > conv=fdatasync oflag=direct
>>>>>> > 128+0 records in
>>>>>> > 128+0 records out
>>>>>> > 134217728 bytes (134 MB) copied, 19.5223 s, 6.9 MB/s
>>>>>> >
>>>>>> > root@ceph-osd-08:/mnt# dd bs=1M count=128 if=/dev/zero of=/dev/sdf3
>>>>>> > conv=fdatasync oflag=direct
>>>>>> > 128+0 records in
>>>>>> > 128+0 records out
>>>>>> > 134217728 bytes (134 MB) copied, 5.34405 s, 25.1 MB/s
>>>>>> > ====
>>>>>> >
>>>>>> > I did a test on another server with exactly the same specification
>>>>>> > and a similar SSD drive (Seagate SSD 100 GB) that is not yet added
>>>>>> > to the cluster (thus no load), and the result is fast:
>>>>>> >
>>>>>> > ====
>>>>>> > root@ceph-osd-09:/home/indra# dd bs=1M count=128 if=/dev/zero of=/dev/sdf1
>>>>>> > conv=fdatasync oflag=direct
>>>>>> > 128+0 records in
>>>>>> > 128+0 records out
>>>>>> > 134217728 bytes (134 MB) copied, 0.742077 s, 181 MB/s
>>>>>> > ====
>>>>>> >
>>>>>> > Does the Ceph journal really take up that much of the SSD's
>>>>>> > resources? I don't understand how the performance can drop so
>>>>>> > significantly, especially since the two Ceph journals only occupy
>>>>>> > the first 20 GB of the SSD's 100 GB total capacity.
>>>>>> >
>>>>>> > Any advice is greatly appreciated.
>>>>>> >
>>>>>> > Looking forward to your reply, thank you.
>>>>>> >
>>>>>> > Cheers.
>>>>>>
>>>>>
>>>>> --
>>>>> Best regards, Фасихов Ирек Нургаязович
>>>>> Mob.: +79229045757
>>>
>>> --
>>> Best regards, Фасихов Ирек Нургаязович
>>> Mob.: +79229045757
>
> --
> Best regards, Фасихов Ирек Нургаязович
> Mob.: +79229045757
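P.S. As mentioned at the top, here is my understanding of your patch + cache_type instruction, written out for both journal SSDs. This is only a sketch of what I think you mean: I am assuming the per-disk entries can be reached via /sys/block/<dev>/device/scsi_disk/, and that the "temporary" prefix only changes how the kernel treats the cache and does not persist across reboots. Please correct me if this is not what you intended.

====
#!/bin/sh
# Sketch: after booting the patched kernel, set cache_type to
# "temporary write through" for both journal SSDs, as suggested in
# Irek's e-mail above.
for dev in sdf sdg; do
    for f in /sys/block/$dev/device/scsi_disk/*/cache_type; do
        [ -e "$f" ] || continue
        echo "temporary write through" > "$f"
    done
done

# Verify the new setting:
cat /sys/block/sdf/device/scsi_disk/*/cache_type
cat /sys/block/sdg/device/scsi_disk/*/cache_type
====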
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com