I think the answer is that with one thread you can only ever write to one journal at a time. Theoretically, you would need 10 threads to be able to write to 10 nodes at the same time.

Jake
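A quick way to check this on a running cluster is to compare the single-threaded benchmark used throughout this thread with a run that uses one thread per OSD node. Both commands simply reuse the rados bench invocation quoted further down; the pool name rbd and the count of 10 follow the example in the question below:

rados bench -p rbd 60 write -b 4M -t 1    # one thread: only one 4M write in flight at a time
rados bench -p rbd 60 write -b 4M -t 10   # one thread per node: up to ten writes in flight, spread across the cluster

If aggregate bandwidth scales with -t while the -t 1 figure stays flat, that is consistent with the explanation above.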
On Thursday, July 21, 2016, w...@globe.de <w...@globe.de> wrote:

> What I do not really understand is this:
>
> Let's say the Intel P3700 does 200 MByte/s in a single-threaded rados bench... see Nick's results below...
>
> Suppose we have multiple OSD nodes, for example 10 nodes, and every node has exactly 1x P3700 NVMe built in.
>
> Why is the single-thread performance on the rbd client still exactly 200 MByte/s with a 10-node OSD cluster???
>
> I think it should be 10 nodes * 200 MByte/s = 2000 MByte/s.
>
> Everyone can check this on their own cluster:
>
> dstat -D sdb,sdc,sdd,sdX ....
>
> You will see that Ceph stripes the data over all OSDs in the cluster if you test from the client side with rados bench:
>
> rados bench -p rbd 60 write -b 4M -t 1
>
>
> Am 21.07.16 um 14:38 schrieb w...@globe.de:
>
> Is there not a way to enable the Linux page cache, i.e. to not use D_SYNC?
>
> Then performance would improve dramatically.
>
>
> Am 21.07.16 um 14:33 schrieb Nick Fisk:
>
> -----Original Message-----
> From: w...@globe.de [mailto:w...@globe.de]
> Sent: 21 July 2016 13:23
> To: n...@fisk.me.uk; 'Horace Ng' <hor...@hkisl.net>
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Ceph + VMware + Single Thread Performance
>
> Okay, and what is your plan now to speed things up?
>
> Now that I have come up with a lower-latency hardware design, there is not much further improvement to be had until persistent RBD caching is implemented, as that will move the SSD/NVMe closer to the client. But I'm happy with what I can achieve at the moment. You could also experiment with bcache on the RBD.
>
> Would it help to put multiple P3700s in each OSD node to improve performance for a single thread (for example Storage vMotion)?
>
> Most likely not; it's all the other parts of the puzzle that are causing the latency. ESXi was designed for storage arrays that service IOs in the 100us-1ms range; Ceph is probably about 10x slower than this, hence the problem. Disable the BBWC on a RAID controller or SAN and you will see the same behaviour.
>
> Regards
>
>
> Am 21.07.16 um 14:17 schrieb Nick Fisk:
>
> -----Original Message-----
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of w...@globe.de
> Sent: 21 July 2016 13:04
> To: n...@fisk.me.uk; 'Horace Ng' <hor...@hkisl.net>
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Ceph + VMware + Single Thread Performance
>
> Hi,
>
> hmm, I think 200 MByte/s is really bad. Is your cluster in production right now?
>
> It's just been built, not running yet.
>
> So if you start a storage migration you only get 200 MByte/s, right?
>
> I wish. My current cluster (not this new one) would storage migrate at ~10-15 MB/s. Serial latency is the problem: without being able to buffer, ESXi waits on an ack for each IO before sending the next. It also submits the migrations in 64 kB chunks unless you get VAAI working.
> I think ESXi will try and do them in parallel, which will help as well.
>
> I think it would be awesome if you got 1000 MByte/s.
>
> Where is the bottleneck?
>
> Latency serialisation: without a buffer, you can't drive the devices to 100%. With buffered IO (or high queue depths) I can max out the journals.
>
> A fio test from Sebastien Han gives us 400 MByte/s raw performance from the P3700:
>
> https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/
>
> How can it be that the rbd client performance is 50% slower?
>
> Regards
>
>
> Am 21.07.16 um 12:15 schrieb Nick Fisk:
>
> I've had a lot of pain with this; smaller block sizes are even worse. You want to try and minimise latency at every point, as there is no buffering happening in the iSCSI stack. This means:
>
> 1. Fast journals (NVMe or NVRAM)
> 2. 10GB or better networking
> 3. Fast CPUs (GHz)
> 4. Fix CPU C-states to C1
> 5. Fix CPU frequency to max
>
> Also, I can't be sure, but I think there is a metadata update happening with VMFS, particularly if you are using thin VMDKs; this can also be a major bottleneck. For my use case I've switched over to NFS, as it has given much more performance at scale and less headache.
>
> For the RADOS run, here you go (400GB P3700):
>
> Total time run:         60.026491
> Total writes made:      3104
> Write size:             4194304
> Object size:            4194304
> Bandwidth (MB/sec):     206.842
> Stddev Bandwidth:       8.10412
> Max bandwidth (MB/sec): 224
> Min bandwidth (MB/sec): 180
> Average IOPS:           51
> Stddev IOPS:            2
> Max IOPS:               56
> Min IOPS:               45
> Average Latency(s):     0.0193366
> Stddev Latency(s):      0.00148039
> Max latency(s):         0.0377946
> Min latency(s):         0.015909
>
> Nick
>
> -----Original Message-----
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Horace
> Sent: 21 July 2016 10:26
> To: w...@globe.de
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Ceph + VMware + Single Thread Performance
>
> Hi,
>
> Same here. I've read a blog saying that VMware will frequently verify the locking on VMFS over iSCSI, hence it has much slower performance than NFS (which uses a different locking mechanism).
>
> Regards,
> Horace Ng
>
> ----- Original Message -----
> From: w...@globe.de
> To: ceph-users@lists.ceph.com
> Sent: Thursday, July 21, 2016 5:11:21 PM
> Subject: [ceph-users] Ceph + VMware + Single Thread Performance
>
> Hi everyone,
>
> we see relatively slow single-thread performance on the iSCSI nodes of our cluster.
>
> Our setup:
>
> 3 racks:
>
> 18x data nodes, 3 mon nodes, 3 iSCSI gateway nodes with tgt (rbd cache off).
>
> 2x Samsung SM863 enterprise SSDs for the journal (3 OSDs per SSD) and 6x WD Red 1TB per data node as OSDs.
>
> Replication = 3
>
> chooseleaf = 3 type rack in the crush map
>
> We get only ca. 90 MByte/s on the iSCSI gateway servers with:
>
> rados bench -p rbd 60 write -b 4M -t 1
>
> If we test with:
>
> rados bench -p rbd 60 write -b 4M -t 32
>
> we get ca. 600 - 700 MByte/s.
>
> We plan to replace the Samsung SSDs with Intel DC P3700 PCIe NVMe drives for the journal to get better single-thread performance.
> Is anyone of you out there who has an Intel P3700 for the journal and can give me test results for:
>
> rados bench -p rbd 60 write -b 4M -t 1
>
> Thank you very much!!
>
> Kind Regards!!
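Two of the points raised in the quoted thread can be checked directly from the command line. First, the Sebastien Han post linked above measures how an SSD or NVMe device behaves under single-job synchronous writes, which is roughly the pattern a Ceph journal produces; a sketch of such a fio run is below. The device path is a placeholder, and writing to a raw device destroys its contents, so only point it at an unused journal device:

# single job, queue depth 1, synchronous writes straight to the device
fio --filename=/dev/nvme0n1 --direct=1 --sync=1 --rw=write --bs=4k \
    --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting \
    --name=journal-test

Raising --bs to 4M roughly mirrors the write size used in the rados bench runs above. Second, for items 4 and 5 in Nick's list (C-states and CPU frequency), one common approach (an assumption here, not something taken from the thread) is the performance governor plus capping C-states on the kernel command line:

# keep cores at maximum frequency
cpupower frequency-set -g performance
# and boot with: intel_idle.max_cstate=1 processor.max_cstate=1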
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com