Next week I'm going to install a small 3-node test SSD cluster; I have some Intel S3500 and Crucial M550 drives. I'll try to benchmark them with Firefly and master.

Is a Debian Wheezy gitbuilder repository available? (I'm a bit lazy to compile all the packages.)
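If one exists, the sources.list entry would presumably follow the usual gitbuilder URL scheme; the line below is only a guess at that pattern (branch, distro and path are assumptions to verify against the Ceph docs), not a confirmed repository:

# /etc/apt/sources.list.d/ceph-gitbuilder.list (assumed URL pattern, please verify)
deb http://gitbuilder.ceph.com/ceph-deb-wheezy-x86_64-basic/ref/master wheezy main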
----- Original Message -----
From: "Sebastien Han" <sebastien....@enovance.com>
To: "Alexandre DERUMIER" <aderum...@odiso.com>
Cc: ceph-users@lists.ceph.com, "Cédric Lemarchand" <c.lemarch...@yipikai.org>
Sent: Tuesday, 2 September 2014 15:25:05
Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3,2K IOPS

Well, the last time I ran two processes in parallel I got half of the total amount available, so 1.7K IOPS per client.

On 02 Sep 2014, at 15:19, Alexandre DERUMIER <aderum...@odiso.com> wrote:

> Do you get the same results if you launch 2 fio benchmarks in parallel on 2 different RBD volumes?
>
> ----- Original Message -----
> From: "Sebastien Han" <sebastien....@enovance.com>
> To: "Cédric Lemarchand" <c.lemarch...@yipikai.org>
> Cc: "Alexandre DERUMIER" <aderum...@odiso.com>, ceph-users@lists.ceph.com
> Sent: Tuesday, 2 September 2014 13:59:13
> Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3,2K IOPS
>
> @Dan, oops, my bad, I forgot to use these settings. I'll try again and see how much I can get on the read performance side.
> @Mark, thanks again, and yes, I believe that due to some hardware variance we get different results. I won't say the deviation is negligible, but the results are close enough to say that we are experiencing the same limitations (at the Ceph level).
> @Cédric, yes I did, and what fio was showing was consistent with the iostat output; the same goes for disk utilisation.
>
> On 02 Sep 2014, at 12:44, Cédric Lemarchand <c.lemarch...@yipikai.org> wrote:
>
>> Hi Sébastien,
>>
>>> On 2 Sep 2014, at 10:41, Sebastien Han <sebastien....@enovance.com> wrote:
>>>
>>> Hey,
>>>
>>> Well, I ran an fio job that simulates (more or less) what Ceph is doing (journal writes with dsync and O_DIRECT), and the SSD gave me 29K IOPS too.
>>> I could do this, but to me it definitely looks like a major waste, since we don't even get a third of the SSD's performance.
>>
>> Did you have a look at whether the raw SSD IOPS (using iostat -x, for example) show the same results during the fio bench?
>>
>> Cheers
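A raw-device fio job along the lines of the journal simulation mentioned above (small synchronous O_DIRECT writes) might look like the sketch below. This is an assumed reconstruction, not the exact command that was run; /dev/sdX is a placeholder, and the job writes directly to the device, so only point it at a disk whose contents can be destroyed.

# approximate the Ceph journal write pattern: 4K sequential writes with
# O_DIRECT and synchronous completion (--sync=1 uses O_SYNC, a close
# stand-in for the journal's O_DSYNC behaviour)
fio --name=journal-sim --filename=/dev/sdX \
    --rw=write --bs=4k --direct=1 --sync=1 \
    --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting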
>>>> On 02 Sep 2014, at 09:38, Alexandre DERUMIER <aderum...@odiso.com> wrote:
>>>>
>>>> Hi Sebastien,
>>>>
>>>>>> I got 6340 IOPS on a single OSD SSD (journal and data on the same partition).
>>>>
>>>> Shouldn't it be better to have 2 partitions, 1 for the journal and 1 for the data?
>>>> (I'm thinking about filesystem write syncs.)
>>>>
>>>> ----- Original Message -----
>>>> From: "Sebastien Han" <sebastien....@enovance.com>
>>>> To: "Somnath Roy" <somnath....@sandisk.com>
>>>> Cc: ceph-users@lists.ceph.com
>>>> Sent: Tuesday, 2 September 2014 02:19:16
>>>> Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3,2K IOPS
>>>>
>>>> Mark and all, Ceph IOPS performance has definitely improved with Giant.
>>>> With this version: ceph version 0.84-940-g3215c52 (3215c520e1306f50d0094b5646636c02456c9df4) on Debian 7.6 with kernel 3.14-0.
>>>>
>>>> I got 6340 IOPS on a single OSD SSD (journal and data on the same partition).
>>>> So basically twice the amount of IOPS that I was getting with Firefly.
>>>>
>>>> Rand reads 4K went from 12431 to 10201, so I'm a bit disappointed here.
>>>>
>>>> The SSD is still under-utilised:
>>>>
>>>> Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
>>>> sdp1 0.00 540.37 0.00 5902.30 0.00 47.14 16.36 0.87 0.15 0.00 0.15 0.07 40.15
>>>> sdp2 0.00 0.00 0.00 4454.67 0.00 49.16 22.60 0.31 0.07 0.00 0.07 0.07 30.61
>>>>
>>>> Thanks a ton for all your comments and assistance, guys :).
>>>>
>>>> One last question for Sage (or others who might know): what's the status of the F2FS implementation? (Or maybe we are waiting for F2FS to provide atomic transactions?)
>>>> I tried to run the OSD on f2fs, however ceph-osd mkfs got stuck on an xattr test:
>>>>
>>>> fremovexattr(10, "user.test@5848273") = 0
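On the journal layout question above (one partition for the journal and one for the data on the same SSD), a minimal ceph.conf sketch would look something like the following; the device paths and OSD id are placeholders, not the ones used in the test:

[osd.0]
    # hypothetical layout: journal on the first partition of the SSD,
    # data filesystem mounted from the second partition
    osd journal = /dev/sdp1
    osd data = /var/lib/ceph/osd/ceph-0    # backed by a filesystem on /dev/sdp2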
>>>>> On 01 Sep 2014, at 11:13, Sebastien Han <sebastien....@enovance.com> wrote:
>>>>>
>>>>> Mark, thanks a lot for experimenting with this for me.
>>>>> I'm gonna try master soon and will tell you how much I can get.
>>>>>
>>>>> It's interesting to see that using 2 SSDs brings more performance, even though both SSDs are under-utilised...
>>>>> They should be able to sustain both loads at the same time (journal and OSD data).
>>>>>
>>>>>> On 01 Sep 2014, at 09:51, Somnath Roy <somnath....@sandisk.com> wrote:
>>>>>>
>>>>>> As I said, the 107K is with the IOs being served from memory, not hitting the disk..
>>>>>>
>>>>>> From: Jian Zhang [mailto:amberzhan...@gmail.com]
>>>>>> Sent: Sunday, August 31, 2014 8:54 PM
>>>>>> To: Somnath Roy
>>>>>> Cc: Haomai Wang; ceph-users@lists.ceph.com
>>>>>> Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3,2K IOPS
>>>>>>
>>>>>> Somnath,
>>>>>> on the small workload performance, 107K is higher than the theoretical IOPS of 520, any idea why?
>>>>>>
>>>>>>>> Single client is ~14K iops, but scaling as number of clients increases. 10 clients ~107K iops. ~25 cpu cores are used.
>>>>>>
>>>>>> 2014-09-01 11:52 GMT+08:00 Jian Zhang <amberzhan...@gmail.com>:
>>>>>> Somnath,
>>>>>> on the small workload performance,
>>>>>>
>>>>>> 2014-08-29 14:37 GMT+08:00 Somnath Roy <somnath....@sandisk.com>:
>>>>>>
>>>>>> Thanks Haomai!
>>>>>>
>>>>>> Here is some of the data from my setup.
>>>>>>
>>>>>> Setup:
>>>>>> --------
>>>>>> 32-core CPU with HT enabled, 128 GB RAM, one SSD (both journal and data) -> one OSD. 5 client machines with 12-core CPUs, each running two instances of ceph_smalliobench (10 clients total). Network is 10GbE.
>>>>>>
>>>>>> Workload:
>>>>>> -------------
>>>>>> Small workload – 20K objects of 4K size, and io_size is also 4K RR. The intent is to serve the IOs from memory so that it can uncover the performance problems within a single OSD.
>>>>>>
>>>>>> Results from Firefly:
>>>>>> --------------------------
>>>>>> Single client throughput is ~14K IOPS, but as the number of clients increases the aggregated throughput does not increase. 10 clients ~15K IOPS. ~9-10 CPU cores are used.
>>>>>>
>>>>>> Results with latest master:
>>>>>> ------------------------------
>>>>>> Single client is ~14K IOPS, but it scales as the number of clients increases. 10 clients ~107K IOPS. ~25 CPU cores are used.
>>>>>>
>>>>>> More realistic workload:
>>>>>> -----------------------------
>>>>>> Let's see how it performs while > 90% of the IOs are served from disks.
>>>>>>
>>>>>> Setup:
>>>>>> -------
>>>>>> 40-CPU-core server as a cluster node (single-node cluster) with 64 GB RAM. 8 SSDs -> 8 OSDs. One similar node for monitor and rgw. Another node for the client running fio/vdbench. 4 RBDs are configured with the 'noshare' option. 40GbE network.
>>>>>>
>>>>>> Workload:
>>>>>> ------------
>>>>>> 8 SSDs are populated, so 8 * 800 GB = ~6.4 TB of data. io_size = 4K RR.
>>>>>>
>>>>>> Results from Firefly:
>>>>>> ------------------------
>>>>>> Aggregated output while 4 RBD clients stress the cluster in parallel is ~20-25K IOPS; CPU cores used ~8-10 (maybe less, can't remember precisely).
>>>>>>
>>>>>> Results from latest master:
>>>>>> --------------------------------
>>>>>> Aggregated output while 4 RBD clients stress the cluster in parallel is ~120K IOPS; the CPU is 7% idle, i.e. ~37-38 CPU cores used.
>>>>>>
>>>>>> Hope this helps.
>>>>>>
>>>>>> Thanks & Regards
>>>>>> Somnath
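For the 4K random-read runs against the 4 RBD images mapped with the 'noshare' option, the client side might be sketched as below. Pool and image names are placeholders and this is an assumed reconstruction, not the exact fio/vdbench job that was used.

# map each image with its own client instance so the mappings do not
# share one connection to the cluster (kernel rbd 'noshare' option)
rbd map bench/img1 -o noshare
rbd map bench/img2 -o noshare
rbd map bench/img3 -o noshare
rbd map bench/img4 -o noshare

# 4K random reads spread across the four block devices
fio --name=rbd-4k-randread --ioengine=libaio --direct=1 \
    --rw=randread --bs=4k --iodepth=32 --numjobs=4 \
    --filename=/dev/rbd0:/dev/rbd1:/dev/rbd2:/dev/rbd3 \
    --runtime=300 --time_based --group_reporting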
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Haomai Wang [mailto:haomaiw...@gmail.com]
>>>>>> Sent: Thursday, August 28, 2014 8:01 PM
>>>>>> To: Somnath Roy
>>>>>> Cc: Andrey Korolyov; ceph-users@lists.ceph.com
>>>>>> Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3,2K IOPS
>>>>>>
>>>>>> Hi Roy,
>>>>>>
>>>>>> I already scanned your merged code for "fdcache" and "optimizing for lfn_find/lfn_open"; could you share some performance improvement data for it? I fully agree with your direction, do you have any update on it?
>>>>>>
>>>>>> As for the messenger level, I have some very early work on it (https://github.com/yuyuyu101/ceph/tree/msg-event); it contains a new messenger implementation which supports different event mechanisms.
>>>>>> It looks like at least one more week to make it work.
>>>>>>
>>>>>>> On Fri, Aug 29, 2014 at 5:48 AM, Somnath Roy <somnath....@sandisk.com> wrote:
>>>>>>>
>>>>>>> Yes, what I saw is that the messenger-level bottleneck is still huge!
>>>>>>> Hopefully the RDMA messenger will resolve that, and the performance gain will be significant for reads (on SSDs). For writes we need to uncover the OSD bottlenecks first to take advantage of the improved upstream.
>>>>>>> What I experienced is that until you remove the very last bottleneck, the performance improvement will not be visible, and that can be confusing, because you might think that the upstream improvement you made is not valid (which is not the case).
>>>>>>>
>>>>>>> Thanks & Regards
>>>>>>> Somnath
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Andrey Korolyov [mailto:and...@xdel.ru]
>>>>>>> Sent: Thursday, August 28, 2014 12:57 PM
>>>>>>> To: Somnath Roy
>>>>>>> Cc: David Moreau Simard; Mark Nelson; ceph-users@lists.ceph.com
>>>>>>> Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3,2K IOPS
>>>>>>>
>>>>>>>> On Thu, Aug 28, 2014 at 10:48 PM, Somnath Roy <somnath....@sandisk.com> wrote:
>>>>>>>>
>>>>>>>> Nope, this will not be backported to Firefly, I guess.
>>>>>>>>
>>>>>>>> Thanks & Regards
>>>>>>>> Somnath
>>>>>>>
>>>>>>> Thanks for sharing this; the first thing that came to mind when I looked at this thread was your patches :)
>>>>>>> If Giant incorporates them, both the RDMA support and those patches should give a huge performance boost for RDMA-enabled Ceph backnets.
>>>>>>
>>>>>> --
>>>>>> Best Regards,
>>>>>>
>>>>>> Wheat
Cheers.
––––
Sébastien Han
Cloud Architect

"Always give 100%. Unless you're giving blood."

Phone: +33 (0)1 49 70 99 72
Mail: sebastien....@enovance.com
Address: 11 bis, rue Roquépine - 75008 Paris
Web: www.enovance.com - Twitter: @enovance

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com