On Tuesday, April 28, 2015, Nick Fisk <n...@fisk.me.uk> wrote:
>
> > -----Original Message-----
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> > Dominik Hannen
> > Sent: 28 April 2015 15:30
> > To: Jake Young
> > Cc: ceph-users@lists.ceph.com
> > Subject: Re: [ceph-users] Cost- and Powerefficient OSD-Nodes
> >
> > >> Interconnect as currently planned:
> > >> 4 x 1Gbit LACP bonds over a pair of MLAG-capable switches (planned:
> > >> EX3300)
> >
> > > One problem with LACP is that it will only allow you to have 1Gbps
> > > between any two IPs or MACs (depending on your switch config). This
> > > will most likely limit the throughput of any client to 1Gbps, which is
> > > equivalent to 125MBps storage throughput. It is not really equivalent
> > > to a 4Gbps interface or 2x 2Gbps interfaces (if you plan to have a
> > > client network and cluster network).
> >
> > 2 x (2 x 1Gbit) was on my mind, with cluster/public separated, in case
> > 4 x 1Gbit LACP does not deliver the performance.
> > Regarding source-IP/dest-IP hashing with LACP: wouldn't it be sufficient
> > to give each OSD process its own IP for cluster/public then?
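On the hashing question: whether two hosts can ever use more than one link
depends on the transmit hash policy on both the bond and the switch. As a
rough, untested sketch (Debian-style /etc/network/interfaces; the interface
names and address are only placeholders), a 4-port 802.3ad bond with
layer3+4 hashing would look something like this:

    auto bond0
    iface bond0 inet static
        address 192.168.10.11
        netmask 255.255.255.0
        bond-slaves eth0 eth1 eth2 eth3
        bond-mode 802.3ad
        bond-miimon 100
        bond-lacp-rate fast
        # layer3+4 hashes on IP/port pairs, so separate TCP connections
        # between the same two hosts can land on different member links
        bond-xmit-hash-policy layer3+4

Even with layer3+4, any single TCP connection is still capped at 1Gbps, and
the EX3300 side has its own hash settings that would need to match.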
I'm not sure giving each OSD its own IP is supported. It would probably
require a custom CRUSH map, and I don't know if a host bucket can support
multiple IPs. It is a good idea though; I wish I had thought of it last
year! (There is a rough ceph.conf sketch of the idea at the bottom of this
mail.)

> > I am not sure if 4-link LACP will be problematic with enough systems in
> > the cluster. Maybe 8 OSD nodes will not be enough to balance it out.
> > It is not important if every client is able to get peak performance out
> > of it.
>
> > > I have implemented a small cluster with no SSD journals, and the
> > > performance is pretty good.
> > >
> > > 42 OSDs, 3x replication, 40Gb NICs; rados bench shows me 2000 IOPS at
> > > 4k writes and 500MBps at 4M writes.
> > >
> > > I would trade your SSD journals for 10Gb NICs and switches. I started
> > > out with the same 4x 1Gb LACP config, and things like
> > > rebalancing/recovery were terribly slow, as well as the throughput
> > > limit I mentioned above.
> >
> > The SSDs are about 100 USD apiece. I tried to find cost-efficient 10G
> > switches. There is also the power efficiency in question: a 10GBase-T
> > port burns about 3-5 W on its own, which would put SFP+ ports
> > (0.7 W/port) on the table.
>
> I think the latest switches/NICs reduce this slightly more if you enable
> the power saving options and keep the cable length short.
>
> > Can you recommend a 'cheap' 10G switch/NICs?
>
> I'm using the Dell N4032s. They seem to do the job and aren't too
> expensive. For the server side, we got servers with 10GBase-T built in
> for almost the same cost as the 4x1Gb models.

I'm using a pair of Cisco Nexus 5672UP switches. There are other Nexus 5000
models that are less expensive, but it's pretty affordable for 48 10Gb
ports and 6 40Gb uplinks. I have Cisco UCS servers that have the Cisco
VICs.

> > > When you get more funding next quarter/year, you can choose to add the
> > > SSD journals or more OSD nodes. Moving to 10Gb networking after you
> > > get the cluster up and running will be much harder.
> >
> > My thinking was that the switches (EX3300), with their 10G uplinks,
> > would deliver in case I want to add some 10G switches and hosts later.
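Re the per-OSD IP idea above: a rough sketch of what it could look like in
ceph.conf (untested on my side; the addresses are made up, and each one
would have to be bound as an extra IP on the node's bond):

    [osd.0]
    # hypothetical addressing; one public/cluster pair per OSD daemon
    public addr = 192.168.10.101
    cluster addr = 192.168.20.101

    [osd.1]
    public addr = 192.168.10.102
    cluster addr = 192.168.20.102

Something like 'rados bench -p rbd 60 write -t 16 -b 4096' run before and
after would show whether it actually changes the 4k numbers.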
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com