Yeah, they're 3TB SAS disks.

*German Anders*
Storage System Engineer Leader
*Despegar* | IT Team
*office* +54 11 4894 3500 x3408
*mobile* +54 911 3493 7262
*mail* gand...@despegar.com

2015-07-02 9:04 GMT-03:00 Jan Schermer <j...@schermer.cz>:

> And those disks are spindles?
> Looks like there are simply too few of them….
>
> Jan
>
> On 02 Jul 2015, at 13:49, German Anders <gand...@despegar.com> wrote:
>
> output from iostat:
>
> *CEPHOSD01:*
>
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> sdc(ceph-0)       0.00     0.00    1.00  389.00     0.00    35.98   188.96    60.32  120.12   16.00  120.39   1.26  49.20
> sdd(ceph-1)       0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> sdf(ceph-2)       0.00     1.00    6.00  521.00     0.02    60.72   236.05   143.10  309.75  484.00  307.74   1.90 100.00
> sdg(ceph-3)       0.00     0.00   11.00  535.00     0.04    42.41   159.22   139.25  279.72  394.18  277.37   1.83 100.00
> sdi(ceph-4)       0.00     1.00    4.00  560.00     0.02    54.87   199.32   125.96  187.07  562.00  184.39   1.65  93.20
> sdj(ceph-5)       0.00     0.00    0.00  566.00     0.00    61.41   222.19   109.13  169.62    0.00  169.62   1.53  86.40
> sdl(ceph-6)       0.00     0.00    8.00    0.00     0.09     0.00    23.00     0.12   12.00   12.00    0.00   2.50   2.00
> sdm(ceph-7)       0.00     0.00    2.00  481.00     0.01    44.59   189.12   116.64  241.41  268.00  241.30   2.05  99.20
> sdn(ceph-8)       0.00     0.00    1.00    0.00     0.00     0.00     8.00     0.01    8.00    8.00    0.00   8.00   0.80
> fioa              0.00     0.00    0.00 1016.00     0.00    19.09    38.47     0.00    0.06    0.00    0.06   0.00   0.00
>
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> sdc(ceph-0)       0.00     1.00   10.00  278.00     0.04    26.07   185.69    60.82  257.97  309.60  256.12   2.83  81.60
> sdd(ceph-1)       0.00     0.00    2.00    0.00     0.02     0.00    20.00     0.02   10.00   10.00    0.00  10.00   2.00
> sdf(ceph-2)       0.00     1.00    6.00  579.00     0.02    54.16   189.68   142.78  246.55  328.67  245.70   1.71 100.00
> sdg(ceph-3)       0.00     0.00   10.00   75.00     0.05     5.32   129.41     4.94  185.08   11.20  208.27   4.05  34.40
> sdi(ceph-4)       0.00     0.00   19.00  147.00     0.09    12.61   156.63    17.88  230.89  114.32  245.96   3.37  56.00
> sdj(ceph-5)       0.00     1.00    2.00  629.00     0.01    43.66   141.72   143.00  223.35  426.00  222.71   1.58 100.00
> sdl(ceph-6)       0.00     0.00   10.00    0.00     0.04     0.00     8.00     0.16   18.40   18.40    0.00   5.60   5.60
> sdm(ceph-7)       0.00     0.00   11.00    4.00     0.05     0.01     8.00     0.48   35.20   25.82   61.00  14.13  21.20
> sdn(ceph-8)       0.00     0.00    9.00    0.00     0.07     0.00    15.11     0.07    8.00    8.00    0.00   4.89   4.40
> fioa              0.00     0.00    0.00 6415.00     0.00   125.81    40.16     0.00    0.14    0.00    0.14   0.00   0.00
>
> *CEPHOSD02:*
>
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> sdc1(ceph-9)      0.00     0.00   13.00    0.00     0.11     0.00    16.62     0.17   13.23   13.23    0.00   4.92   6.40
> sdd1(ceph-10)     0.00     0.00   15.00    0.00     0.13     0.00    18.13     0.26   17.33   17.33    0.00   1.87   2.80
> sdf1(ceph-11)     0.00     0.00   22.00  650.00     0.11    51.75   158.04   143.27  212.07  308.55  208.81   1.49 100.00
> sdg1(ceph-12)     0.00     0.00   12.00  282.00     0.05    54.60   380.68    13.16  120.52  352.00  110.67   2.91  85.60
> sdi1(ceph-13)     0.00     0.00    1.00    0.00     0.00     0.00     8.00     0.01    8.00    8.00    0.00   8.00   0.80
> sdj1(ceph-14)     0.00     0.00   20.00    0.00     0.08     0.00     8.00     0.26   12.80   12.80    0.00   3.60   7.20
> sdl1(ceph-15)     0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> sdm1(ceph-16)     0.00     0.00   20.00  424.00     0.11    32.20   149.05    89.69  235.30  243.00  234.93   2.14  95.20
> sdn1(ceph-17)     0.00     0.00    5.00  411.00     0.02    45.47   223.94    98.32  182.28 1057.60  171.63   2.40 100.00
>
> Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> sdc1(ceph-9)      0.00     0.00   26.00  383.00     0.11    34.32   172.44    86.92  258.64  297.08  256.03   2.29  93.60
> sdd1(ceph-10)     0.00     0.00    8.00   31.00     0.09     1.86   101.95     0.84  178.15   94.00  199.87   6.46  25.20
> sdf1(ceph-11)     0.00     1.00    5.00  409.00     0.05    48.34   239.34    90.94  219.43  383.20  217.43   2.34  96.80
> sdg1(ceph-12)     0.00     0.00    0.00  238.00     0.00     1.64    14.12    58.34  143.60    0.00  143.60   1.83  43.60
> sdi1(ceph-13)     0.00     0.00   11.00    0.00     0.05     0.00    10.18     0.16   14.18   14.18    0.00   5.09   5.60
> sdj1(ceph-14)     0.00     0.00    1.00    0.00     0.00     0.00     8.00     0.02   16.00   16.00    0.00  16.00   1.60
> sdl1(ceph-15)     0.00     0.00    1.00    0.00     0.03     0.00    64.00     0.01   12.00   12.00    0.00  12.00   1.20
> sdm1(ceph-16)     0.00     1.00    4.00  587.00     0.03    50.09   173.69   143.32  244.97  296.00  244.62   1.69 100.00
> sdn1(ceph-17)     0.00     0.00    0.00  375.00     0.00    23.68   129.34    69.76  182.51    0.00  182.51   2.47  92.80
>
> The other OSD server had pretty much the same load.
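>
> As a rough sanity check on those numbers (assuming the 100-110 IOPS per
> spindle figure quoted below in the thread):
>
>   9 spindles/host x ~110 IOPS ~= 1,000 write IOPS ceiling per host
>
> Several individual disks are already at 400-600 w/s with avgqu-sz > 100
> and await in the 200-300 ms range, so the spindles look saturated rather
> than misbehaving.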
>
> The configuration of the OSD servers is as follows:
>
> - 2x Intel Xeon E5-2609 v2 @ 2.50GHz (4 cores each)
> - 128GB RAM
> - 2x 120GB Intel SSDSC2BB12 SSDs (RAID-1) for the OS
> - 2x dual-port 10GbE adapters
> - Journals are configured to run on a RAM disk (tmpfs), except on the first
> OSD server, where the journals go to a battery-backed FusionIO card
> (/dev/fioa); a sketch of the relevant ceph.conf lines is below.
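>
> For reference, journal placement is just the usual ceph.conf setting; a
> minimal sketch (the paths here are illustrative, not our exact ones):
>
> [osd]
> # journal on a RAM disk (volatile: contents are lost on power failure)
> osd journal = /mnt/ramdisk/$cluster-$id/journal
> osd journal size = 10240
> # on cephosd01 the journals instead point at the FusionIO card,
> # e.g. osd journal = /dev/fioa1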
>
> The CRUSH map is the following:
>
> # begin crush map
> tunable choose_local_tries 0
> tunable choose_local_fallback_tries 0
> tunable choose_total_tries 50
> tunable chooseleaf_descend_once 1
>
> # devices
> device 0 osd.0
> device 1 osd.1
> device 2 osd.2
> device 3 osd.3
> device 4 osd.4
> device 5 osd.5
> device 6 osd.6
> device 7 osd.7
> device 8 osd.8
> device 9 osd.9
> device 10 osd.10
> device 11 osd.11
> device 12 osd.12
> device 13 osd.13
> device 14 osd.14
> device 15 osd.15
> device 16 osd.16
> device 17 osd.17
> device 18 osd.18
> device 19 osd.19
> device 20 osd.20
> device 21 osd.21
> device 22 osd.22
> device 23 osd.23
> device 24 osd.24
> device 25 osd.25
> device 26 osd.26
> device 27 osd.27
> device 28 osd.28
> device 29 osd.29
> device 30 osd.30
> device 31 osd.31
> device 32 osd.32
> device 33 osd.33
> device 34 osd.34
> device 35 osd.35
>
> # types
> type 0 osd
> type 1 host
> type 2 chassis
> type 3 rack
> type 4 row
> type 5 pdu
> type 6 pod
> type 7 room
> type 8 datacenter
> type 9 region
> type 10 root
>
> # buckets
> host cephosd03 {
>     id -4        # do not change unnecessarily
>     # weight 24.570
>     alg straw
>     hash 0    # rjenkins1
>     item osd.18 weight 2.730
>     item osd.19 weight 2.730
>     item osd.20 weight 2.730
>     item osd.21 weight 2.730
>     item osd.22 weight 2.730
>     item osd.23 weight 2.730
>     item osd.24 weight 2.730
>     item osd.25 weight 2.730
>     item osd.26 weight 2.730
> }
> host cephosd04 {
>     id -5        # do not change unnecessarily
>     # weight 24.570
>     alg straw
>     hash 0    # rjenkins1
>     item osd.27 weight 2.730
>     item osd.28 weight 2.730
>     item osd.29 weight 2.730
>     item osd.30 weight 2.730
>     item osd.31 weight 2.730
>     item osd.32 weight 2.730
>     item osd.33 weight 2.730
>     item osd.34 weight 2.730
>     item osd.35 weight 2.730
> }
> root default {
>     id -1        # do not change unnecessarily
>     # weight 49.140
>     alg straw
>     hash 0    # rjenkins1
>     item cephosd03 weight 24.570
>     item cephosd04 weight 24.570
> }
> host cephosd01 {
>     id -2        # do not change unnecessarily
>     # weight 24.570
>     alg straw
>     hash 0    # rjenkins1
>     item osd.0 weight 2.730
>     item osd.1 weight 2.730
>     item osd.2 weight 2.730
>     item osd.3 weight 2.730
>     item osd.4 weight 2.730
>     item osd.5 weight 2.730
>     item osd.6 weight 2.730
>     item osd.7 weight 2.730
>     item osd.8 weight 2.730
> }
> host cephosd02 {
>     id -3        # do not change unnecessarily
>     # weight 24.570
>     alg straw
>     hash 0    # rjenkins1
>     item osd.9 weight 2.730
>     item osd.10 weight 2.730
>     item osd.11 weight 2.730
>     item osd.12 weight 2.730
>     item osd.13 weight 2.730
>     item osd.14 weight 2.730
>     item osd.15 weight 2.730
>     item osd.16 weight 2.730
>     item osd.17 weight 2.730
> }
> root fusionio {
>     id -6        # do not change unnecessarily
>     # weight 49.140
>     alg straw
>     hash 0    # rjenkins1
>     item cephosd01 weight 24.570
>     item cephosd02 weight 24.570
> }
>
> # rules
> rule replicated_ruleset {
>     ruleset 0
>     type replicated
>     min_size 1
>     max_size 10
>     step take default
>     step chooseleaf firstn 0 type host
>     step emit
> }
> rule fusionio_ruleset {
>     ruleset 1
>     type replicated
>     min_size 0
>     max_size 10
>     step take fusionio
>     step chooseleaf firstn 1 type host
>     step emit
>     step take default
>     step chooseleaf firstn -1 type host
>     step emit
> }
>
> # end crush map
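>
> Two notes on the map: the 2.730 item weights are simply the 3TB drive
> size expressed in TiB (3*10^12 / 2^40 ~= 2.73), and the fusionio_ruleset
> places the first (primary) replica on one of the two hosts under the
> fusionio root, with the remaining replicas chosen from hosts under
> default. Compiling/injecting the map and pointing a pool at the rule
> would look something like this (the pool name "rbd" is only an example):
>
> crushtool -c crushmap.txt -o crushmap.bin
> ceph osd setcrushmap -i crushmap.bin
> ceph osd pool set rbd crush_ruleset 1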
>
>
> *German*
>
> 2015-07-02 8:15 GMT-03:00 Lionel Bouton <lionel+c...@bouton.name>:
>
>> On 07/02/15 12:48, German Anders wrote:
>> > The idea is to cache RBD at the host level. It could also be possible
>> > to cache at the OSD level. We have high iowait and need to lower it a
>> > bit, since we are already getting the maximum out of our SAS disks,
>> > 100-110 IOPS per disk (3TB OSDs). Any advice? Flashcache?
>>
>> It's hard to suggest anything without knowing more about your setup. Is
>> your I/O mostly reads or writes? Reads: can you add enough RAM on your
>> guests or on your OSDs to cache your working set? Writes: do you use SSDs
>> for journals already?
>>
>> Lionel
>>
>
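
Regarding Lionel's read/write question above: the iostat output already
suggests the load is almost entirely writes (r/s mostly in the single
digits to low tens vs. w/s of 300-600 on the busy disks). A quick
cluster-wide cross-check is the client io line in the status output:

ceph -s | grep 'client io'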
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
