Not specifically set; according to the docs the default is off... I am using the async qemu Cuttlefish rpm. Maybe it does cache something, but I think not: specifically setting writeback on in the client config did yield different results. In our DEV environment we had issues with the virtual machines becoming unreachable under heavy IO load, so I have not enabled it on the live machines.
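For reference, enabling the RBD writeback cache on the client side would look roughly like the snippet below in ceph.conf. This is a minimal sketch: the option names are the ones documented for Ceph around this release, and the size values are purely illustrative assumptions, not recommendations.

    [client]
        # enable RBD writeback caching in librbd
        rbd cache = true
        # stay in writethrough mode until the guest issues its first flush
        rbd cache writethrough until flush = true
        # illustrative sizes only (bytes)
        rbd cache size = 33554432
        rbd cache max dirty = 25165824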
I am currently at home, but I could run a few tests tomorrow at work. Is there a way to do small random IO write tests with the rados tools? I gave it a quick glance and it looked like it only did large sequential writes. (A rough sketch of one possible approach is appended after the quoted thread below.)

Cheers,
Robert van Leeuwen

Sent from my iPad

> On 3 dec. 2013, at 17:02, "Mike Dawson" <mike.daw...@cloudapt.com> wrote:
>
> Robert,
>
> Do you have the rbd writeback cache enabled on these volumes? That could certainly explain the higher than expected write performance. Any chance you could re-test with rbd writeback on vs. off?
>
> Thanks,
> Mike Dawson
>
>> On 12/3/2013 10:37 AM, Robert van Leeuwen wrote:
>> Hi Mike,
>>
>> I am using filebench within a KVM virtual machine (like an actual workload we will have):
>> 100% synchronous 4k writes to a 50GB file on a 100GB volume with 32 writer threads.
>> Also tried from multiple KVM machines on multiple hosts.
>> Aggregated performance stays at 2k+ IOPS.
>>
>> The disks are 7200 RPM 2.5 inch drives, no RAID whatsoever.
>> I agree the number of IOPS seems high.
>> Maybe the journals on SSD (2 x Intel 3500) help a bit in this regard, but the SSDs were not maxed out yet.
>> The writes seem to be limited by the spinning disks:
>> as soon as the benchmark starts they are at 100% utilization.
>> The usage also drops to 0% pretty much immediately after the benchmark, so it looks like they are not lagging behind the journal.
>>
>> Did not really test reads yet; since we have so much read cache (128 GB per node) I assume we will mostly be write limited.
>>
>> Cheers,
>> Robert van Leeuwen
>>
>> Sent from my iPad
>>
>>> On 3 dec. 2013, at 16:15, "Mike Dawson" <mike.daw...@cloudapt.com> wrote:
>>>
>>> Robert,
>>>
>>> Interesting results on the effect of the number of PGs/PGPs. My cluster struggles a bit under the strain of heavy random small-sized writes.
>>>
>>> The IOPS you mention seem high to me given 30 drives and 3x replication, unless they were pure reads or on high-rpm drives. Instead of assuming, I want to pose a few questions:
>>>
>>> - How are you testing? rados bench, rbd bench, rbd bench with writeback cache, etc.?
>>>
>>> - Were the 2000-2500 random 4k IOPS more reads than writes? If you test 100% 4k random reads, what do you get? If you test 100% 4k random writes, what do you get?
>>>
>>> - What drives do you have? Any RAID involved under your OSDs?
>>>
>>> Thanks,
>>> Mike Dawson
>>>
>>>> On 12/3/2013 1:31 AM, Robert van Leeuwen wrote:
>>>>
>>>>> On 2 dec. 2013, at 18:26, "Brian Andrus" <brian.and...@inktank.com> wrote:
>>>>>
>>>>> Setting your pg_num and pgp_num to say... 1024 would A) increase data granularity, B) likely lend no noticeable increase in resource consumption, and C) allow some room for future OSDs to be added while still staying within the range of acceptable PG numbers. You could probably safely double even that number if you plan on expanding at a rapid rate and want to avoid splitting PGs every time a node is added.
>>>>>
>>>>> In general, you can conservatively err on the larger side when it comes to pg_num/pgp_num. Any excess resource utilization will be negligible (up to a certain point). If you have a comfortable amount of available RAM, you could experiment with increasing the multiplier in the equation you are using and see how it affects your final number.
>>>>>
>>>>> The pg_num and pgp_num parameters can safely be changed before or after your new nodes are integrated.
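To put the quoted advice in concrete terms, raising the placement group count on an existing pool takes two commands. A minimal sketch, assuming a pool named "volumes" (a placeholder) and the 1024 figure from the text above:

    # rough rule of thumb from the Ceph docs: (number of OSDs * 100) / replica count
    #   30 OSDs * 100 / 3 replicas = 1000, rounded up to the next power of two
    ceph osd pool set volumes pg_num 1024
    # pgp_num has to be raised as well before data is actually rebalanced
    ceph osd pool set volumes pgp_num 1024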
>>>>
>>>> I would be a bit conservative with the PGs / PGPs.
>>>> I've experimented with the PG number a bit and noticed the following random IO performance drop.
>>>> (This could be something specific to our setup, but since the PG count is easily increased and impossible to decrease, I would be conservative.)
>>>>
>>>> The setup:
>>>> 3 OSD nodes with 128 GB RAM, 2 x 6 core CPUs (12 with HT).
>>>> Nodes have 10 OSDs running on 1 TB disks and 2 SSDs for journals.
>>>>
>>>> We use a replica count of 3, so the optimum according to the formula is about 1000.
>>>> With 1000 PGs I got about 2000-2500 random 4k IOPS.
>>>>
>>>> Because the nodes are fast enough and I expect the cluster to be expanded with 3 more nodes, I set the PGs to 2000.
>>>> Performance dropped to about 1200-1400 IOPS.
>>>>
>>>> I noticed that the spinning disks were no longer maxing out at 100% usage.
>>>> Memory and CPU did not seem to be a problem.
>>>> Since I had the option to recreate the pool and I was not using the recommended settings, I did not really dive into the issue.
>>>> I will not stray too far from the recommended settings in the future though :)
>>>>
>>>> Cheers,
>>>> Robert van Leeuwen
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
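On the open question above about small random IO write tests with the rados tools: a minimal sketch of one rough approximation, assuming a throwaway pool named "scratch" (a placeholder) and that the installed rados version supports the -b and -t options. Note that rados bench writes whole new objects rather than random offsets inside an image, so it is only a crude stand-in for the filebench workload inside the guest:

    # throwaway pool for benchmarking; 128/128 PGs is just a placeholder choice
    ceph osd pool create scratch 128 128
    # 60 seconds of 4 KB object writes with 32 concurrent operations
    rados -p scratch bench 60 write -b 4096 -t 32

Running fio with a 4k random-write profile against a file on the RBD volume inside the guest would stay closer to the original filebench test.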