Hi Mike/Warren,

Thanks for helping out here. I am running the fio command below to test this, with 4 jobs and an iodepth of 128:
fio --time_based --name=benchmark --size=4G --filename=/mnt/test.bin --ioengine=libaio --randrepeat=0 --iodepth=128 --direct=1 --invalidate=1 --verify=0 --verify_fatal=0 --numjobs=4 --rw=randwrite --blocksize=4k --group_reporting

The QEMU instance is created using Nova; the settings I can see in the config are below:

<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='writeback'/>
  <auth username='$$'>
    <secret type='ceph' uuid='$$'/>
  </auth>
  <source protocol='rbd' name='ssd_volume/volume-$$'>
    <host name='$$' port='6789'/>
    <host name='$$' port='6789'/>
    <host name='$$' port='6789'/>
  </source>
  <target dev='vde' bus='virtio'/>
  <serial>$$</serial>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
</disk>

The below shows the output from running fio:

# fio --time_based --name=benchmark --size=4G --filename=/mnt/test.bin --ioengine=libaio --randrepeat=0 --iodepth=128 --direct=1 --invalidate=1 --verify=0 --verify_fatal=0 --numjobs=4 --rw=randwrite --blocksize=4k --group_reporting
fio: time_based requires a runtime/timeout setting
benchmark: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=128
...
benchmark: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=128
fio-2.0.13
Starting 4 processes
Jobs: 3 (f=3): [_www] [99.7% done] [0K/36351K/0K /s] [0 /9087 /0 iops] [eta 00m:03s]
benchmark: (groupid=0, jobs=4): err= 0: pid=8547: Thu Nov 19 05:16:31 2015
  write: io=16384MB, bw=19103KB/s, iops=4775 , runt=878269msec
    slat (usec): min=4 , max=2339.4K, avg=807.17, stdev=12460.02
    clat (usec): min=1 , max=2469.6K, avg=106265.05, stdev=138893.39
     lat (usec): min=67 , max=2469.8K, avg=107073.04, stdev=139377.68
    clat percentiles (usec):
     |  1.00th=[ 1928],  5.00th=[ 9408], 10.00th=[12352], 20.00th=[18816],
     | 30.00th=[43776], 40.00th=[64768], 50.00th=[78336], 60.00th=[89600],
     | 70.00th=[102912], 80.00th=[123392], 90.00th=[216064], 95.00th=[370688],
     | 99.00th=[733184], 99.50th=[782336], 99.90th=[1044480], 99.95th=[2088960],
     | 99.99th=[2342912]
    bw (KB/s)  : min=    4, max=14968, per=26.11%, avg=4987.39, stdev=1947.67
    lat (usec) : 2=0.01%, 20=0.01%, 50=0.01%, 100=0.05%, 250=0.30%
    lat (usec) : 500=0.24%, 750=0.11%, 1000=0.08%
    lat (msec) : 2=0.23%, 4=0.46%, 10=4.47%, 20=15.08%, 50=11.28%
    lat (msec) : 100=35.47%, 250=23.52%, 500=5.92%, 750=1.96%, 1000=0.70%
    lat (msec) : 2000=0.06%, >=2000=0.06%
  cpu          : usr=0.62%, sys=2.42%, ctx=1602209, majf=1, minf=101
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued    : total=r=0/w=4194304/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
  WRITE: io=16384MB, aggrb=19102KB/s, minb=19102KB/s, maxb=19102KB/s, mint=878269msec, maxt=878269msec

Disk stats (read/write):
  vde: ios=1119/4330437, merge=0/105599, ticks=556/121755054, in_queue=121749666, util=99.86%

The below shows lspci from within the guest:

# lspci | grep -i scsi
00:04.0 SCSI storage controller: Red Hat, Inc Virtio block device
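One thing I noticed: fio warned that time_based requires a runtime/timeout setting, so the run above actually ended once each job had written its 4G file rather than after a fixed period. For the next run I plan to add an explicit runtime, roughly like this (the 300 second value is just an example I picked, not anything prescribed):

fio --time_based --runtime=300 --name=benchmark --size=4G --filename=/mnt/test.bin \
    --ioengine=libaio --randrepeat=0 --iodepth=128 --direct=1 --invalidate=1 \
    --verify=0 --verify_fatal=0 --numjobs=4 --rw=randwrite --blocksize=4k --group_reporting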
Thanks

On Wed, Nov 18, 2015 at 7:05 PM, Warren Wang - ISD <warren.w...@walmart.com> wrote:

> What were you using for iodepth and numjobs? If you’re getting an average
> of 2ms per operation, and you’re single threaded, I’d expect about 500 IOPS
> / thread, until you hit the limit of your QEMU setup, which may be a single
> IO thread. That’s also what I think Mike is alluding to.
>
> Warren
>
> From: Sean Redmond <sean.redmo...@gmail.com>
> Date: Wednesday, November 18, 2015 at 6:39 AM
> To: ceph-us...@ceph.com
> Subject: [ceph-users] All SSD Pool - Odd Performance
>
> Hi,
>
> I have a performance question for anyone running an SSD-only pool. Let me
> detail the setup first.
>
> 12 x Dell PowerEdge R630 (2 x 2620v3, 64GB RAM)
> 8 x Intel DC S3710 800GB
> Dual-port Solarflare 10Gb/s NIC (one front and one back)
> Ceph 0.94.5
> Ubuntu 14.04 (3.13.0-68-generic)
>
> The above is in one pool that is used for QEMU guests. A 4k fio test on
> the SSD directly yields around 55k IOPS; the same test inside a QEMU guest
> seems to hit a limit around 4k IOPS. If I deploy multiple guests they can
> all reach 4k IOPS simultaneously.
>
> I don't see any evidence of a bottleneck on the OSD hosts. Is this limit
> inside the guest expected, or am I just not looking deep enough yet?
>
> Thanks
>
> This email and any files transmitted with it are confidential and intended
> solely for the individual or entity to whom they are addressed. If you have
> received this email in error destroy it immediately. *** Walmart
> Confidential ***
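On the point about the limit possibly being a single IO thread in QEMU: one test I could run to take the guest block layer out of the picture is fio's rbd engine, run directly from a client node against the same pool. Something roughly like the below. It needs a fio build with rbd support (newer than the 2.0.13 in the guest), the client name and volume name here are placeholders for the masked values above, and I believe older rbd-engine builds want invalidate=0 since the engine doesn't support cache invalidation:

fio --name=rbd-bench --ioengine=rbd --clientname=admin --pool=ssd_volume --rbdname=volume-XXXX \
    --invalidate=0 --rw=randwrite --bs=4k --iodepth=128 --numjobs=4 \
    --time_based --runtime=300 --group_reporting

If that also tops out around 4k IOPS it would point at the Ceph/librbd side; if it goes much higher, it would point at the virtio/QEMU path (e.g. a single IO thread), which matches what you and Mike are suggesting.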