I think the answer is that with one thread you can only ever write to one journal at a time. Theoretically, you would need 10 threads to be able to write to 10 nodes at the same time.

Jake
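A quick way to check this on a running cluster is to compare the single-threaded benchmark used throughout this thread with a run that uses one thread per OSD node. Both commands simply reuse the rados bench invocation quoted further down; the pool name rbd and the count of 10 follow the example in the question below:

rados bench -p rbd 60 write -b 4M -t 1    # one thread: only one 4M write in flight at a time
rados bench -p rbd 60 write -b 4M -t 10   # one thread per node: up to ten writes in flight, spread across the cluster

If aggregate bandwidth scales with -t while the -t 1 figure stays flat, that is consistent with the explanation above.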
On Thursday, July 21, 2016, w...@globe.de <w...@globe.de> wrote:

> What I do not really understand is this:
>
> Let's say the Intel P3700 does 200 MByte/s in a single-threaded rados bench... see Nick's results below...
>
> Suppose we have multiple OSD nodes, for example 10 nodes, and every node has exactly 1x P3700 NVMe built in.
>
> Why is the single-thread performance on the rbd client still exactly 200 MByte/s with a 10-node OSD cluster???
>
> I think it should be 10 nodes * 200 MByte/s = 2000 MByte/s.
>
> Everyone can check this on their own cluster:
>
> dstat -D sdb,sdc,sdd,sdX ....
>
> You will see that Ceph stripes the data over all OSDs in the cluster if you test from the client side with rados bench:
>
> rados bench -p rbd 60 write -b 4M -t 1
>
>
> Am 21.07.16 um 14:38 schrieb w...@globe.de:
>
> Is there not a way to enable the Linux page cache, i.e. to not use D_SYNC?
>
> Then performance would improve dramatically.
>
>
> Am 21.07.16 um 14:33 schrieb Nick Fisk:
>
> -----Original Message-----
> From: w...@globe.de [mailto:w...@globe.de]
> Sent: 21 July 2016 13:23
> To: n...@fisk.me.uk; 'Horace Ng' <hor...@hkisl.net>
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Ceph + VMware + Single Thread Performance
>
> Okay, and what is your plan now to speed things up?
>
> Now that I have come up with a lower-latency hardware design, there is not much further improvement to be had until persistent RBD caching is implemented, as that will move the SSD/NVMe closer to the client. But I'm happy with what I can achieve at the moment. You could also experiment with bcache on the RBD.
>
> Would it help to put multiple P3700s in each OSD node to improve performance for a single thread (for example Storage vMotion)?
>
> Most likely not; it's all the other parts of the puzzle that are causing the latency. ESXi was designed for storage arrays that service IOs in the 100us-1ms range; Ceph is probably about 10x slower than this, hence the problem. Disable the BBWC on a RAID controller or SAN and you will see the same behaviour.
>
> Regards
>
>
> Am 21.07.16 um 14:17 schrieb Nick Fisk:
>
> -----Original Message-----
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of w...@globe.de
> Sent: 21 July 2016 13:04
> To: n...@fisk.me.uk; 'Horace Ng' <hor...@hkisl.net>
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Ceph + VMware + Single Thread Performance
>
> Hi,
>
> hmm, I think 200 MByte/s is really bad. Is your cluster in production right now?
>
> It's just been built, not running yet.
>
> So if you start a storage migration you only get 200 MByte/s, right?
>
> I wish. My current cluster (not this new one) would storage migrate at ~10-15 MB/s. Serial latency is the problem: without being able to buffer, ESXi waits on an ack for each IO before sending the next. It also submits the migrations in 64 kB chunks unless you get VAAI working.
> I think ESXi will try and do them in parallel, which will help as well.
>
> I think it would be awesome if you got 1000 MByte/s.
>
> Where is the bottleneck?
>
> Latency serialisation: without a buffer, you can't drive the devices to 100%. With buffered IO (or high queue depths) I can max out the journals.
>
> A fio test from Sebastien Han gives us 400 MByte/s raw performance from the P3700:
>
> https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/
>
> How can it be that the rbd client performance is 50% slower?
>
> Regards
>
>
> Am 21.07.16 um 12:15 schrieb Nick Fisk:
>
> I've had a lot of pain with this; smaller block sizes are even worse. You want to try and minimise latency at every point, as there is no buffering happening in the iSCSI stack. This means:
>
> 1. Fast journals (NVMe or NVRAM)
> 2. 10GB or better networking
> 3. Fast CPUs (GHz)
> 4. Fix CPU C-states to C1
> 5. Fix CPU frequency to max
>
> Also, I can't be sure, but I think there is a metadata update happening with VMFS, particularly if you are using thin VMDKs; this can also be a major bottleneck. For my use case I've switched over to NFS, as it has given much more performance at scale and less headache.
>
> For the RADOS run, here you go (400GB P3700):
>
> Total time run:         60.026491
> Total writes made:      3104
> Write size:             4194304
> Object size:            4194304
> Bandwidth (MB/sec):     206.842
> Stddev Bandwidth:       8.10412
> Max bandwidth (MB/sec): 224
> Min bandwidth (MB/sec): 180
> Average IOPS:           51
> Stddev IOPS:            2
> Max IOPS:               56
> Min IOPS:               45
> Average Latency(s):     0.0193366
> Stddev Latency(s):      0.00148039
> Max latency(s):         0.0377946
> Min latency(s):         0.015909
>
> Nick
>
> -----Original Message-----
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Horace
> Sent: 21 July 2016 10:26
> To: w...@globe.de
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Ceph + VMware + Single Thread Performance
>
> Hi,
>
> Same here. I've read a blog saying that VMware will frequently verify the locking on VMFS over iSCSI, hence it has much slower performance than NFS (which uses a different locking mechanism).
>
> Regards,
> Horace Ng
>
> ----- Original Message -----
> From: w...@globe.de
> To: ceph-users@lists.ceph.com
> Sent: Thursday, July 21, 2016 5:11:21 PM
> Subject: [ceph-users] Ceph + VMware + Single Thread Performance
>
> Hi everyone,
>
> we see relatively slow single-thread performance on the iSCSI nodes of our cluster.
>
> Our setup:
>
> 3 racks:
>
> 18x data nodes, 3 mon nodes, 3 iSCSI gateway nodes with tgt (rbd cache off).
>
> 2x Samsung SM863 enterprise SSDs for the journal (3 OSDs per SSD) and 6x WD Red 1TB per data node as OSDs.
>
> Replication = 3
>
> chooseleaf = 3 type rack in the crush map
>
> We get only ca. 90 MByte/s on the iSCSI gateway servers with:
>
> rados bench -p rbd 60 write -b 4M -t 1
>
> If we test with:
>
> rados bench -p rbd 60 write -b 4M -t 32
>
> we get ca. 600 - 700 MByte/s.
>
> We plan to replace the Samsung SSDs with Intel DC P3700 PCIe NVMe drives for the journal to get better single-thread performance.
> Is anyone of you out there who has an Intel P3700 for the journal and can give me test results for:
>
> rados bench -p rbd 60 write -b 4M -t 1
>
> Thank you very much!!
>
> Kind Regards!!
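Two of the points raised in the quoted thread can be checked directly from the command line. First, the Sebastien Han post linked above measures how an SSD or NVMe device behaves under single-job synchronous writes, which is roughly the pattern a Ceph journal produces; a sketch of such a fio run is below. The device path is a placeholder, and writing to a raw device destroys its contents, so only point it at an unused journal device:

# single job, queue depth 1, synchronous writes straight to the device
fio --filename=/dev/nvme0n1 --direct=1 --sync=1 --rw=write --bs=4k \
    --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting \
    --name=journal-test

Raising --bs to 4M roughly mirrors the write size used in the rados bench runs above. Second, for items 4 and 5 in Nick's list (C-states and CPU frequency), one common approach (an assumption here, not something taken from the thread) is the performance governor plus capping C-states on the kernel command line:

# keep cores at maximum frequency
cpupower frequency-set -g performance
# and boot with: intel_idle.max_cstate=1 processor.max_cstate=1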
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com