Hi Sebastien,

>> I got 6340 IOPS on a single OSD SSD. (journal and data on the same
>> partition).

Shouldn't it be better to have 2 partitions, one for the journal and one
for the data? (I'm thinking about filesystem write syncs.)
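Something like the following is what I have in mind (just a sketch; the
device name and OSD id are made up for the example, not taken from your
setup):

  # /dev/sdX1: small raw partition used as the journal
  # /dev/sdX2: XFS partition mounted as the OSD data directory
  [osd.0]
  osd journal = /dev/sdX1                # raw block-device journal
  osd data = /var/lib/ceph/osd/ceph-0    # /dev/sdX2 mounted here

The idea being that journal writes then go straight to the raw partition
and bypass the data filesystem entirely.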
----- Original Message -----
From: "Sebastien Han" <sebastien....@enovance.com>
To: "Somnath Roy" <somnath....@sandisk.com>
Cc: ceph-users@lists.ceph.com
Sent: Tuesday, 2 September 2014 02:19:16
Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

Mark and all,

Ceph IOPS performance has definitely improved with Giant. With this
version: ceph version 0.84-940-g3215c52
(3215c520e1306f50d0094b5646636c02456c9df4) on Debian 7.6 with kernel
3.14-0, I got 6340 IOPS on a single OSD SSD (journal and data on the same
partition). So basically twice the amount of IOPS that I was getting with
Firefly.

Random 4K reads went from 12431 to 10201 IOPS, so I'm a bit disappointed
here. The SSD is still under-utilised:

Device:  rrqm/s  wrqm/s   r/s      w/s    rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sdp1      0.00   540.37   0.00  5902.30    0.00  47.14     16.36      0.87   0.15     0.00     0.15   0.07  40.15
sdp2      0.00     0.00   0.00  4454.67    0.00  49.16     22.60      0.31   0.07     0.00     0.07   0.07  30.61

Thanks a ton for all your comments and assistance, guys :).

One last question for Sage (or others that might know): what's the status
of the f2fs implementation? (Or maybe we are waiting for f2fs to provide
atomic transactions?) I tried to run the OSD on f2fs, however ceph-osd
mkfs got stuck on an xattr test:

fremovexattr(10, "user.test@5848273") = 0

On 01 Sep 2014, at 11:13, Sebastien Han <sebastien....@enovance.com> wrote:

> Mark, thanks a lot for experimenting with this for me.
> I'm gonna try master soon and will tell you how much I can get.
>
> It's interesting to see that using 2 SSDs brings more performance, even
> though both SSDs are under-utilized...
> They should be able to sustain both loads at the same time (journal and
> OSD data).
>
> On 01 Sep 2014, at 09:51, Somnath Roy <somnath....@sandisk.com> wrote:
>
>> As I said, that is 107K with IOs served from memory, not hitting the disk.
>>
>> From: Jian Zhang [mailto:amberzhan...@gmail.com]
>> Sent: Sunday, August 31, 2014 8:54 PM
>> To: Somnath Roy
>> Cc: Haomai Wang; ceph-users@lists.ceph.com
>> Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS
>>
>> Somnath,
>> on the small-workload performance: 107K is higher than the theoretical
>> IOPS of the 520, any idea why?
>>
>>>> Single client is ~14K iops, but it scales as the number of clients
>>>> increases. 10 clients ~107K iops. ~25 cpu cores are used.
>>
>> 2014-09-01 11:52 GMT+08:00 Jian Zhang <amberzhan...@gmail.com>:
>> Somnath,
>> on the small workload performance,
>>
>> 2014-08-29 14:37 GMT+08:00 Somnath Roy <somnath....@sandisk.com>:
>>
>> Thanks Haomai !
>>
>> Here is some of the data from my setup.
>>
>> --------------------------------------------------------------------
>>
>> Set up:
>> --------
>>
>> 32-core CPU with HT enabled, 128 GB RAM, one SSD (both journal and data)
>> -> one OSD. 5 client machines with 12-core CPUs, each running two
>> instances of ceph_smalliobench (10 clients total). Network is 10GbE.
>>
>> Workload:
>> -------------
>>
>> Small workload: 20K objects of 4K size, and io_size is also 4K random
>> read. The intent is to serve the IOs from memory so that it can uncover
>> the performance problems within a single OSD.
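>> (As an aside: if you want to approximate this kind of 4K random-read
>> load without building ceph_smalliobench, plain rados bench against a
>> test pool gets reasonably close. The pool name and queue depth below are
>> just examples:
>>
>>   rados bench -p testpool 120 write -b 4096 -t 32 --no-cleanup
>>   rados bench -p testpool 120 rand -t 32
>>
>> The first run populates the pool with 4K objects, the second reads them
>> back in random order.)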
>> Results from Firefly:
>> --------------------------
>>
>> Single-client throughput is ~14K iops, but as the number of clients
>> increases the aggregated throughput does not increase. 10 clients ~15K
>> iops. ~9-10 cpu cores are used.
>>
>> Results with latest master:
>> ------------------------------
>>
>> Single client is ~14K iops, but it scales as the number of clients
>> increases. 10 clients ~107K iops. ~25 cpu cores are used.
>>
>> --------------------------------------------------------------------
>>
>> More realistic workload:
>> -----------------------------
>>
>> Let's see how it performs while >90% of the IOs are served from disks.
>>
>> Setup:
>> -------
>>
>> 40-core server as a cluster node (single-node cluster) with 64 GB RAM.
>> 8 SSDs -> 8 OSDs. One similar node for monitor and rgw. Another node for
>> the client running fio/vdbench. 4 rbds are configured with the 'noshare'
>> option. 40GbE network.
>>
>> Workload:
>> ------------
>>
>> 8 SSDs are populated, so 8 * 800 GB = ~6.4 TB of data. io_size = 4K
>> random read.
>>
>> Results from Firefly:
>> ------------------------
>>
>> Aggregated output with 4 rbd clients stressing the cluster in parallel
>> is ~20-25K IOPS; ~8-10 cpu cores are used (maybe less, I can't remember
>> precisely).
>>
>> Results from latest master:
>> --------------------------------
>>
>> Aggregated output with 4 rbd clients stressing the cluster in parallel
>> is ~120K IOPS; the cpu is 7% idle, i.e. ~37-38 cpu cores are used.
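>> (The fio side is nothing exotic: plain 4K random reads against the rbd
>> images. A sketch of the kind of job file, assuming the images are mapped
>> through krbd; the device name, queue depth and runtime here are
>> illustrative only:
>>
>>   [global]
>>   ioengine=libaio
>>   direct=1
>>   rw=randread
>>   bs=4k
>>   iodepth=32
>>   runtime=300
>>   time_based
>>
>>   [rbd0]
>>   filename=/dev/rbd0
>>
>> Run one such job section per mapped image to stress all 4 rbds in
>> parallel.)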
>> Hope this helps.
>>
>> Thanks & Regards
>> Somnath
>>
>> -----Original Message-----
>> From: Haomai Wang [mailto:haomaiw...@gmail.com]
>> Sent: Thursday, August 28, 2014 8:01 PM
>> To: Somnath Roy
>> Cc: Andrey Korolyov; ceph-users@lists.ceph.com
>> Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS
>>
>> Hi Roy,
>>
>> I have already scanned your merged code for "fdcache" and "optimizing
>> lfn_find/lfn_open"; could you give some performance-improvement data for
>> it? I fully agree with your direction. Do you have any update on it?
>>
>> As for the messenger level, I have some very early work on it
>> (https://github.com/yuyuyu101/ceph/tree/msg-event); it contains a new
>> messenger implementation that supports different event mechanisms.
>> It looks like it will take at least one more week to make it work.
>>
>> On Fri, Aug 29, 2014 at 5:48 AM, Somnath Roy <somnath....@sandisk.com> wrote:
>>
>>> Yes, what I saw is that the messenger-level bottleneck is still huge!
>>> Hopefully the RDMA messenger will resolve that, and the performance
>>> gain will be significant for reads (on SSDs). For writes we need to
>>> uncover the OSD bottlenecks first to take advantage of the improved
>>> upstream.
>>>
>>> What I experienced is that until you remove the very last bottleneck,
>>> the performance improvement will not be visible, and that can be
>>> confusing because you might think that the upstream improvement you
>>> made is not valid (which is not the case).
>>>
>>> Thanks & Regards
>>> Somnath
>>>
>>> -----Original Message-----
>>> From: Andrey Korolyov [mailto:and...@xdel.ru]
>>> Sent: Thursday, August 28, 2014 12:57 PM
>>> To: Somnath Roy
>>> Cc: David Moreau Simard; Mark Nelson; ceph-users@lists.ceph.com
>>> Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS
>>>
>>> On Thu, Aug 28, 2014 at 10:48 PM, Somnath Roy <somnath....@sandisk.com> wrote:
>>>
>>>> Nope, this will not be backported to Firefly, I guess.
>>>>
>>>> Thanks & Regards
>>>> Somnath
>>>
>>> Thanks for sharing this; the first thing that came to mind when I
>>> looked at this thread was your patches :)
>>>
>>> If Giant incorporates them, both the RDMA support and those patches
>>> should give a huge performance boost for RDMA-enabled Ceph backends.
>>
>> --
>> Best Regards,
>>
>> Wheat
>
> Cheers.
> ––––
> Sébastien Han
> Cloud Architect

Cheers.
––––
Sébastien Han
Cloud Architect

"Always give 100%. Unless you're giving blood."

Phone: +33 (0)1 49 70 99 72
Mail: sebastien....@enovance.com
Address: 11 bis, rue Roquépine - 75008 Paris
Web: www.enovance.com - Twitter: @enovance

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com