On 20 December 2015 at 19:23, Francois Lafont <flafdiv...@free.fr> wrote:
> On 20/12/2015 22:51, Don Waterloo wrote:
>
> > All nodes have 10Gbps to each other
>
> Even the link client node <---> cluster nodes?
>
> > OSD:
> > $ ceph osd tree
> > ID WEIGHT  TYPE NAME        UP/DOWN REWEIGHT PRIMARY-AFFINITY
> > -1 5.48996 root default
> > -2 0.89999     host nubo-1
> >  0 0.89999         osd.0         up  1.00000          1.00000
> > -3 0.89999     host nubo-2
> >  1 0.89999         osd.1         up  1.00000          1.00000
> > -4 0.89999     host nubo-3
> >  2 0.89999         osd.2         up  1.00000          1.00000
> > -5 0.92999     host nubo-19
> >  3 0.92999         osd.3         up  1.00000          1.00000
> > -6 0.92999     host nubo-20
> >  4 0.92999         osd.4         up  1.00000          1.00000
> > -7 0.92999     host nubo-21
> >  5 0.92999         osd.5         up  1.00000          1.00000
> >
> > Each contains 1 x Samsung 850 Pro 1TB SSD (on sata).
> >
> > Each runs Ubuntu 15.10 with the 4.3.0-040300-generic kernel.
> > Each runs ceph 0.94.5-0ubuntu0.15.10.1.
> >
> > nubo-1/nubo-2/nubo-3 are 2x X5650 @ 2.67GHz w/ 96GB ram.
> > nubo-19/nubo-20/nubo-21 are 2x E5-2699 v3 @ 2.30GHz, w/ 576GB ram.
> >
> > The connections are to the chipset sata in each case.
> > The fio test to the underlying xfs disk
> > (e.g. cd /var/lib/ceph/osd/ceph-1; fio --randrepeat=1 --ioengine=libaio
> > --direct=1 --gtod_reduce=1 --name=readwrite --filename=rw.data --bs=4k
> > --iodepth=64 --size=5000MB --readwrite=randrw --rwmixread=50)
> > shows ~22K IOPS on each disk.
> >
> > nubo-1/2/3 are also the mon and the mds:
> > $ ceph status
> >     cluster b23abffc-71c4-4464-9449-3f2c9fbe1ded
> >      health HEALTH_OK
> >      monmap e1: 3 mons at {nubo-1=10.100.10.60:6789/0,nubo-2=10.100.10.61:6789/0,nubo-3=10.100.10.62:6789/0}
> >             election epoch 1104, quorum 0,1,2 nubo-1,nubo-2,nubo-3
> >      mdsmap e621: 1/1/1 up {0=nubo-3=up:active}, 2 up:standby
> >      osdmap e2459: 6 osds: 6 up, 6 in
> >       pgmap v127331: 840 pgs, 6 pools, 144 GB data, 107 kobjects
> >             289 GB used, 5332 GB / 5622 GB avail
> >                  840 active+clean
> >   client io 0 B/s rd, 183 kB/s wr, 54 op/s
>
> And you have "replica size == 3" in your cluster, correct?
> Do you have specific mount options or specific options in ceph.conf
> concerning ceph-fuse?
>
> So the hardware configuration of your cluster seems to me globally much
> better than my cluster (config given in my first message), because you have
> 10Gb links (between the client and the cluster I have just 1Gb) and you
> have full SSD OSDs.
>
> I have tried to put _all_ of cephfs on my SSD: ie the pools "cephfsdata"
> _and_ "cephfsmetadata" are on the SSD. The performance is slightly improved
> because I have ~670 iops now (with the fio command of my first message
> again), but it still seems bad to me.
>
> In fact, I'm curious to have the opinion of "cephfs" experts to know what
> iops we can expect. Maybe ~700 iops is in fact correct for our hardware
> configuration and we are searching for a problem which doesn't exist...

All nodes are interconnected on 10G (actually 8x10G, so 80Gbps, but I have 7
of the links disabled for this test). I have done an iperf test w/ TCP and
verified I can achieve ~9Gbps between each pair. I have jumbo frames enabled
(so 9000 MTU, 8982 route MTU).

I have replica 2. My 2 cephfs pools are:

pool 12 'cephfs_metadata' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 2239 flags hashpspool stripe_width 0
pool 13 'cephfs_data' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 2243 flags hashpspool crash_replay_interval 45 stripe_width 0

W/ cephfs-fuse, I used the defaults except added noatime.
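
In case it helps to compare, the mount is essentially the stock ceph-fuse
invocation with noatime passed through (the /mnt/cephfs mount point and the
single mon address below are just examples, not necessarily my exact paths),
and the same fio job from above can then be pointed at a file on the mount to
get a comparable cephfs number:

# mount cephfs via ceph-fuse, passing noatime through to fuse
$ ceph-fuse -m 10.100.10.60:6789 /mnt/cephfs -o noatime
# run the same 4k randrw fio job as above, but against the fuse mount
$ cd /mnt/cephfs
$ fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 \
      --name=readwrite --filename=rw.data --bs=4k --iodepth=64 \
      --size=5000MB --readwrite=randrw --rwmixread=50

The replica count per pool can also be confirmed with
"ceph osd pool get cephfs_data size" (which reports "size: 2" here).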
My ceph.conf is:

[global]
fsid = XXXX
mon_initial_members = nubo-2, nubo-3, nubo-1
mon_host = 10.100.10.61,10.100.10.62,10.100.10.60
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true
osd_pool_default_size = 2
public_network = 10.100.10.0/24
osd op threads = 6
osd disk threads = 6

[mon]
mon clock drift allowed = .600
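
(If it's useful, the values that actually took effect can be double-checked on
a running daemon through the admin socket; osd.0 and the grep pattern here are
just an example:)

# dump the running config of one OSD and pick out the thread settings
$ ceph daemon osd.0 config show | grep -E 'osd_op_threads|osd_disk_threads'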