Also, what are you getting locally on your filesystem? Looking at the specs
for an 840 Pro, it should do ~520 MB/s, and based on the numbers you stated
earlier you aren't getting close to that, so there may be a problem at the
server. Once you start seeing better numbers locally, retry your iSCSI
targets.
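
For example, a quick local check with fio against one of the SSD filesystems
might look like this (the test file path and size are just placeholders, not
your actual layout):

  fio --name=localtest --filename=/mnt/ssd/fio.test --size=1G \
      --rw=write --bs=4M --iodepth=4 --direct=1 --ioengine=libaio

If that already falls well short of the drive's rated sequential throughput,
the iSCSI and RBD layers aren't the first place to look.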
On Mar 17, 2015 6:02 PM, "Nick Fisk" <n...@fisk.me.uk> wrote:

> Hi Robin,
>
> Just a few things to try:-
>
> 1. Increase the number of worker threads for tgt (it's a parameter of tgtd,
> so modify it wherever tgtd is being started)
> 2. Disable librbd caching in ceph.conf
> 3. Do you see the same performance problems exporting a krbd as a block
> device via tgt? (Rough sketches of all three are below.)
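>
> For example (pool/image names and the thread count are illustrative, not
> your real values):
>
> 1. Start tgtd with more backing-store threads, if your build supports the
>    option:
>        tgtd --nr_iothreads=64
> 2. In ceph.conf, under [client]:
>        rbd cache = false
> 3. Map the image with krbd and export the resulting block device instead of
>    using bs-type rbd:
>        rbd map rbd/testimg
>        tgtadm --lld iscsi --mode logicalunit --op new --tid 1 --lun 2 \
>            --backing-store /dev/rbd0 --bstype rdwr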
>
> Nick
>
> > -----Original Message-----
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> > Robin H. Johnson
> > Sent: 17 March 2015 18:25
> > To: ceph-users@lists.ceph.com
> > Subject: [ceph-users] Terrible iSCSI tgt RBD performance
> >
> > I'm trying to get better performance out of exporting RBD volumes via tgt
> > for iSCSI consumers...
> >
> > By terrible, I mean I'm getting <5MB/sec reads, <50 IOPS. I'm pretty sure
> > neither RBD nor iSCSI itself is the problem, as they individually perform
> > well:
> >
> > iSCSI to RAM-backed: >60MB/sec, >500 IOPS
> > iSCSI to SSD-backed: >50MB/sec, >300 IOPS
> > iSCSI to RBD-backed: <5MB/sec,  <50 IOPS
> >
> > Cluster:
> > 4 nodes (ceph1..4):
> > - Supermicro 6027TR-D70RF+ (2U twin systems)
> >   - Chassis A: ceph1, ceph2
> >   - Chassis B: ceph3, ceph4
> > - 2x E5-2650
> > - 256GB RAM
> > - 4x 4TB Seagate ST4000NM0023 SAS, dedicated to Ceph
> > - 2x 512GB Samsung 840 PRO
> >   - MD RAID1
> >   - LVM
> >   - LV: OS on 'root', 20GiB
> >   - LV: Ceph Journals, 8GB, one per Ceph disk
> > - 2x Bonded 1GbE network
> > - 10GbE network:
> >   - port1: to switch
> >   - port2: direct-connect pairs: ceph1/3 ceph2/4 (vertical between chassis)
> > - All 4 nodes run OSPF
> >   - ceph1/2; ceph3/4: ~9.8Gbit bandwidth confirmed
> >   - ceph1/3; ceph2/4: ~18.2Gbit bandwidth confirmed
> > - The nodes also co-house VMs with Ganeti, backed onto the SSDs w/ DRBD;
> > - S3 is the main Ceph use-case, and it works well from the VMs.
> >
> > Direct performance on the nodes is reasonably good, but it would be nice
> > if the random performance were better.
> >
> > # rbd bench-write XXXXX
> > bench-write  io_size 4096 io_threads 16 bytes 1073741824 pattern seq ...
> > elapsed:    36  ops:   246603  ops/sec:  6681.20  bytes/sec: 29090920.91
> > # rbd bench-write XXXXX
> > bench-write  io_size 4096 io_threads 16 bytes 1073741824 pattern seq ...
> > elapsed:    48  ops:   246585  ops/sec:  5070.70  bytes/sec: 22080207.55
> > # rbd bench-write test.libraries.coop --io-pattern rand
> > bench-write  io_size 4096 io_threads 16 bytes 1073741824 pattern rand ...
> > elapsed:   324  ops:   246178  ops/sec:   757.74  bytes/sec: 3305000.99
> > # rbd bench-write test.libraries.coop --io-threads 16 --io-pattern rand --io-size 32768
> > bench-write  io_size 32768 io_threads 16 bytes 1073741824 pattern rand ...
> > elapsed:    86  ops:    30141  ops/sec:   347.39  bytes/sec: 12375512.34
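> >
> > (As a cross-check of librbd outside tgt, fio's rbd engine can run the same
> > random pattern directly against an image; the client/pool/image names here
> > are placeholders:
> >
> > # fio --ioengine=rbd --clientname=admin --pool=rbd --rbdname=testimg \
> >       --rw=randwrite --bs=4k --iodepth=16 --direct=1 --name=rbdrand
> >
> > If that lands near the bench-write numbers, the extra slowdown is being
> > added at the tgt layer rather than inside RBD.)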
> >
> > Yes, I know the data below seems small; I have another, older cluster of
> > data that I still have to merge onto this newer hardware.
> >
> > # ceph -w
> >     cluster 401a58ef-5075-49ec-9615-1c2973624252
> >      health HEALTH_WARN 6 pgs stuck unclean; recovery 8472/241829 objects
> > degraded (3.503%); mds cluster is degraded; mds ceph1 is laggy
> >      monmap e3: 3 mons at {ceph1=10.77.10.41:6789/0,ceph2=10.77.10.42:6789/0,
> >             ceph4=10.77.10.44:6789/0}, election epoch 11486, quorum 0,1,2 ceph1,ceph2,ceph4
> >      mdsmap e1496661: 1/1/1 up {0=ceph1=up:replay(laggy or crashed)}
> >      osdmap e4323895: 16 osds: 16 up, 16 in
> >       pgmap v14695205: 481 pgs, 17 pools, 186 GB data, 60761 objects
> >             1215 GB used, 58356 GB / 59571 GB avail
> >             8472/241829 objects degraded (3.503%)
> >                    6 active
> >                  475 active+clean
> >   client io 67503 B/s rd, 7297 B/s wr, 13 op/s
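> >
> > (The stuck PGs and the laggy MDS can be inspected with, e.g.:
> >
> > # ceph health detail
> > # ceph pg dump_stuck unclean
> >
> > The MDS only matters for CephFS, but clearing the 6 stuck PGs before
> > benchmarking again would remove one variable.)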
> >
> >
> > TGT setups:
> > Target 1: rbd.XXXXXXXXXXX
> >     System information:
> >         Driver: iscsi
> >         State: ready
> >     I_T nexus information:
> >         I_T nexus: 11
> >             Initiator: iqn.1993-08.org.debian:01:6b14da6a48b6 alias:
> > XXXXXXXXXXXXXXXX
> >             Connection: 0
> >                 IP Address: 10.77.110.6
> >     LUN information:
> >         LUN: 0
> >             Type: controller
> >             SCSI ID: IET     00010000
> >             SCSI SN: beaf10
> >             Size: 0 MB, Block size: 1
> >             Online: Yes
> >             Removable media: No
> >             Prevent removal: No
> >             Readonly: No
> >             SWP: No
> >             Thin-provisioning: No
> >             Backing store type: null
> >             Backing store path: None
> >             Backing store flags:
> >         LUN: 1
> >             Type: disk
> >             SCSI ID: IET     00010001
> >             SCSI SN: beaf11
> >             Size: 161061 MB, Block size: 512
> >             Online: Yes
> >             Removable media: No
> >             Prevent removal: No
> >             Readonly: No
> >             SWP: No
> >             Thin-provisioning: No
> >             Backing store type: rbd
> >             Backing store path: XXXXXXXXXXXXXXXXXXXXXXx
> >             Backing store flags:
> >     Account information:
> >     ACL information:
> >         XXXXXXXXXXXXXXXXXXXXXXXXXXXxx
> >
> > # tgtadm --lld iscsi --mode target --op show --tid 1
> > MaxRecvDataSegmentLength=8192
> > HeaderDigest=None
> > DataDigest=None
> > InitialR2T=Yes
> > MaxOutstandingR2T=1
> > ImmediateData=Yes
> > FirstBurstLength=65536
> > MaxBurstLength=262144
> > DataPDUInOrder=Yes
> > DataSequenceInOrder=Yes
> > ErrorRecoveryLevel=0
> > IFMarker=No
> > OFMarker=No
> > DefaultTime2Wait=2
> > DefaultTime2Retain=20
> > OFMarkInt=Reject
> > IFMarkInt=Reject
> > MaxConnections=1
> > RDMAExtensions=Yes
> > TargetRecvDataSegmentLength=262144
> > InitiatorRecvDataSegmentLength=262144
> > MaxOutstandingUnexpectedPDUs=0
> > MaxXmitDataSegmentLength=8192
> > MaxQueueCmd=128
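> >
> > (Side note: some of those negotiated values, e.g. MaxXmitDataSegmentLength=8192,
> > look small. If that turns out to matter, I assume a target parameter can be
> > raised with something along the lines of:
> >
> > # tgtadm --lld iscsi --mode target --op update --tid 1 \
> >          --name MaxRecvDataSegmentLength --value 262144
> >
> > followed by an initiator re-login so the session renegotiates.)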
> >
> >
> > --
> > Robin Hugh Johnson
> > Gentoo Linux: Developer, Infrastructure Lead
> > E-Mail     : robb...@gentoo.org
> > GnuPG FP   : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
