I think what Nick is suggesting is that you create N x 5GB partitions on the
SSDs (where N is the number of OSDs you want to have fast journals for), and
use the rest of the space for OSDs that would form the SSD pool.
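
For example, a rough sketch with sgdisk (the device name, partition labels, and
N=3 here are just assumptions for illustration; adjust to your setup):

  # assuming /dev/sdb is the SSD and three spinners need journals
  sgdisk -n 1:0:+5G -c 1:"journal-osd0" /dev/sdb
  sgdisk -n 2:0:+5G -c 2:"journal-osd1" /dev/sdb
  sgdisk -n 3:0:+5G -c 3:"journal-osd2" /dev/sdb
  # remainder of the SSD becomes the OSD for the SSD pool
  sgdisk -n 4:0:0 -c 4:"ssd-pool-osd" /dev/sdb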

Bill

On Tue, Nov 24, 2015 at 10:56 AM, Marek Dohojda <
mdoho...@altitudedigital.com> wrote:

> Oh, well in that case you made my life easier, I like that :)
>
> I thought the journal needed to be on a physical device though, not within a
> raw rbd pool.  Was I mistaken?
>
> On Tue, Nov 24, 2015 at 11:51 AM, Nick Fisk <n...@fisk.me.uk> wrote:
>
>> Ok, but it’s probably a bit of a waste. The journals for each disk will
>> probably require 200-300 IOPS from each SSD and maybe 5GB of space.
>> Personally I would keep the SSD pool, maybe use it for high-perf VMs?
>>
>>
>>
>> Typically VMs will generate more random, smaller IOs, so a default rados
>> bench might not be a true example of expected performance.
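>>
>> A minimal sketch of what I mean (the pool name "rbd" is just a placeholder):
>>
>>   rados bench -p rbd 60 write -b 4096 -t 16 --no-cleanup
>>
>> A 4KB block size and a modest queue depth is much closer to typical VM IO
>> than the default 4MB writes.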
>>
>>
>>
>> *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf
>> Of *Marek Dohojda
>> *Sent:* 24 November 2015 18:47
>> *To:* Nick Fisk <n...@fisk.me.uk>
>>
>> *Cc:* ceph-users@lists.ceph.com
>> *Subject:* Re: [ceph-users] Performance question
>>
>>
>>
>> I dunno, I think I’ll just go sit in my Lotus and mull this over ;) (I wish)
>>
>> This is storage for KVM, and we have quite a few boxes.  While right now
>> none are suffering from IO load, I am seeing slowdowns personally and know
>> that sooner or later others will notice as well.
>>
>>
>>
>> I think what I will do is remove the SSD from the cluster, and put
>> journals on it.
>>
>>
>>
>> On Tue, Nov 24, 2015 at 11:42 AM, Nick Fisk <n...@fisk.me.uk> wrote:
>>
>> Separate would be best, but as with many things in life we are not all
>> driving around in sports cars!!
>>
>>
>>
>> Moving the journals to the SSDs that are also OSDs themselves will be fine.
>> SSDs tend to be more bandwidth limited than IOPS limited, and the reverse is
>> true for disks, so you will get maybe a 2x improvement for the disk pool and
>> you probably won’t even notice the impact on the SSD pool.
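>>
>> For reference, a rough sketch of the usual FileStore journal move for one
>> OSD (the OSD id 12 and the partition path are placeholders; repeat per OSD):
>>
>>   ceph osd set noout
>>   # stop the OSD via your init system, e.g. service ceph stop osd.12
>>   ceph-osd -i 12 --flush-journal
>>   ln -sf /dev/disk/by-partuuid/<journal-partition-uuid> /var/lib/ceph/osd/ceph-12/journal
>>   ceph-osd -i 12 --mkjournal
>>   # start the OSD again, then unset noout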
>>
>>
>>
>> Can I just ask what your workload will be? There may be other things that
>> can be done.
>>
>>
>>
>> *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf
>> Of *Marek Dohojda
>> *Sent:* 24 November 2015 18:32
>> *To:* Alan Johnson <al...@supermicro.com>
>> *Cc:* ceph-users@lists.ceph.com; Nick Fisk <n...@fisk.me.uk>
>>
>>
>> *Subject:* Re: [ceph-users] Performance question
>>
>>
>>
>> Thank you! I will do that.  Would you suggest getting another SSD drive,
>> or moving the journal to the SSD OSD?
>>
>>
>>
>> (Sorry if that is a stupid question.)
>>
>>
>>
>> On Tue, Nov 24, 2015 at 11:25 AM, Alan Johnson <al...@supermicro.com>
>> wrote:
>>
>> Or separate the journals, as this will bring the workload on the spinners
>> down to 3X rather than 6X.
>>
>>
>>
>> *From:* Marek Dohojda [mailto:mdoho...@altitudedigital.com]
>> *Sent:* Tuesday, November 24, 2015 1:24 PM
>> *To:* Nick Fisk
>> *Cc:* Alan Johnson; ceph-users@lists.ceph.com
>>
>>
>> *Subject:* Re: [ceph-users] Performance question
>>
>>
>>
>> Crap, I think you are 100% correct:
>>
>>
>>
>> rrqm/s  wrqm/s    r/s      w/s    rkB/s      wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
>>   0.00  369.00  33.00  1405.00   132.00  135656.00    188.86      5.61   4.02    21.94     3.60   0.70  100.00
>>
>>
>>
>> I was kind of suspecting this might be the case, which is why I was
>> wondering whether I should be doing much in the way of troubleshooting at all.
>>
>>
>>
>> So basically what you are saying is that I need to wait for a new version?
>>
>>
>>
>>
>>
>> Thank you very much everybody!
>>
>>
>>
>>
>>
>> On Tue, Nov 24, 2015 at 9:35 AM, Nick Fisk <n...@fisk.me.uk> wrote:
>>
>> You haven’t stated what replication size you are running. Keep in mind that
>> with a replication factor of 3, you will be writing 6x the amount of data
>> down to the disks compared to what the benchmark says (3x replication x 2
>> for the data + journal write).
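>>
>> As a worked example (assuming size=3 and co-located journals): the ~50 MB/s
>> of client writes in your bench becomes roughly 50 x 3 x 2 = 300 MB/s of raw
>> writes, i.e. about 300 / 7 ≈ 43 MB/s landing on each of the 7 spinners.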
>>
>>
>>
>> You might actually be near the hardware maximums. What does iostat look
>> like whilst you are running rados bench? Are the disks getting maxed out?
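>>
>> Something like the following (device names are just an example) will show it:
>>
>>   iostat -xm 2 /dev/sdb /dev/sdc
>>
>> Watch %util and await on the journal and data devices while the bench runs.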
>>
>>
>>
>> *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf
>> Of *Marek Dohojda
>> *Sent:* 24 November 2015 16:27
>> *To:* Alan Johnson <al...@supermicro.com>
>>
>>
>> *Cc:* ceph-users@lists.ceph.com
>> *Subject:* Re: [ceph-users] Performance question
>>
>>
>>
>> 7 servers in total, with a 20 Gbit pipe between servers, for both reads and
>> writes.  The network itself has plenty of headroom left; it is averaging
>> 40 Mbit/s.
>>
>>
>>
>> Rados bench, SAS pool, 30-second write test:
>>
>>  Total time run:         30.591927
>>
>> Total writes made:      386
>>
>> Write size:             4194304
>>
>> Bandwidth (MB/sec):     50.471
>>
>>
>>
>> Stddev Bandwidth:       48.1052
>>
>> Max bandwidth (MB/sec): 160
>>
>> Min bandwidth (MB/sec): 0
>>
>> Average Latency:        1.25908
>>
>> Stddev Latency:         2.62018
>>
>> Max latency:            21.2809
>>
>> Min latency:            0.029227
>>
>>
>>
>> Rados bench, SSD pool, write test:
>>
>>  Total time run:         20.425192
>>
>> Total writes made:      1405
>>
>> Write size:             4194304
>>
>> Bandwidth (MB/sec):     275.150
>>
>>
>>
>> Stddev Bandwidth:       122.565
>>
>> Max bandwidth (MB/sec): 576
>>
>> Min bandwidth (MB/sec): 0
>>
>> Average Latency:        0.231803
>>
>> Stddev Latency:         0.190978
>>
>> Max latency:            0.981022
>>
>> Min latency:            0.0265421
>>
>>
>>
>>
>>
>> As you can see, the SSD pool is better, but not as much better as I would expect SSDs to be.
>>
>>
>>
>>
>>
>>
>>
>> On Tue, Nov 24, 2015 at 9:10 AM, Alan Johnson <al...@supermicro.com>
>> wrote:
>>
>> Hard to know without more config details, such as the number of servers and
>> the network – GigE or 10 GigE. Also not sure how you are measuring (reads or
>> writes); you could try RADOS bench as a baseline. I would expect more
>> performance with 7 x 10K spinners journaled to SSDs. The fact that the SSDs
>> did not perform much better may point to a bottleneck elsewhere – network
>> perhaps?
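>>
>> A quick way to rule the network out (hostnames are placeholders):
>>
>>   iperf3 -s                    # on one OSD node
>>   iperf3 -c osd-node-1 -P 4    # from another node
>>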
>>
>> *From:* Marek Dohojda [mailto:mdoho...@altitudedigital.com]
>> *Sent:* Tuesday, November 24, 2015 10:37 AM
>> *To:* Alan Johnson
>> *Cc:* Haomai Wang; ceph-users@lists.ceph.com
>>
>>
>> *Subject:* Re: [ceph-users] Performance question
>>
>>
>>
>> Yeah they are; that is one thing I was planning on changing. What I am
>> really interested in at the moment is a rough idea of expected performance.
>> I mean, is 100MB/s around normal, very low, or "could be better"?
>>
>>
>>
>> On Tue, Nov 24, 2015 at 8:02 AM, Alan Johnson <al...@supermicro.com>
>> wrote:
>>
>> Are the journals on the same device? It might be better to use the SSDs for
>> journaling, since you are not getting better performance with the SSDs.
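>>
>> (You can check where each journal currently lives with something like:
>>   ls -l /var/lib/ceph/osd/ceph-*/journal
>> on each node – with FileStore it is a file or a symlink to a partition.)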
>>
>>
>>
>> *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf
>> Of *Marek Dohojda
>> *Sent:* Monday, November 23, 2015 10:24 PM
>> *To:* Haomai Wang
>> *Cc:* ceph-users@lists.ceph.com
>> *Subject:* Re: [ceph-users] Performance question
>>
>>
>>
>> Sorry, I should have specified that the SAS pool is the one doing 100 MB/s :),
>> but to be honest the SSD pool isn't much faster.
>>
>>
>>
>> On Mon, Nov 23, 2015 at 7:38 PM, Haomai Wang <haomaiw...@gmail.com>
>> wrote:
>>
>> On Tue, Nov 24, 2015 at 10:35 AM, Marek Dohojda
>> <mdoho...@altitudedigital.com> wrote:
>> > No, SSD and SAS are in two separate pools.
>> >
>> > On Mon, Nov 23, 2015 at 7:30 PM, Haomai Wang <haomaiw...@gmail.com>
>> wrote:
>> >>
>> >> On Tue, Nov 24, 2015 at 10:23 AM, Marek Dohojda
>> >> <mdoho...@altitudedigital.com> wrote:
>> >> > I have a Hammer Ceph cluster on 7 nodes with 14 OSDs in total, 7 of
>> >> > which are SSD and 7 of which are SAS 10K drives.  I typically get
>> >> > about 100MB/s IO rates on this cluster.
>>
>> So which pool do you get the 100 MB/s with?
>>
>>
>> >>
>> >> Did you mix SAS and SSD in one pool?
>> >>
>> >> >
>> >> > I have a simple question.  Is 100MB/s what I should expect with my
>> >> > configuration, or should it be higher? I am not sure if I should be
>> >> > looking for issues, or just accept what I have.
>> >> >
>> >> > _______________________________________________
>> >> > ceph-users mailing list
>> >> > ceph-users@lists.ceph.com
>> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >>
>> >>
>> >>
>> >> --
>> >> Best Regards,
>> >>
>> >> Wheat
>> >
>> >
>>
>> --
>> Best Regards,
>>
>> Wheat
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
