Hi,

> On 18 Mar 2015, at 05:29, Christian Balzer <ch...@gol.com> wrote:
> 
> Hello,
> 
> On Wed, 18 Mar 2015 03:52:22 +0100 Josef Johansson wrote:
> 
>> Hi,
>> 
>> I'm planning a Ceph SSD cluster. I know that we won't get the full
>> performance from the SSDs in this case, but SATA won't cut it as backend
>> storage and SAS is the same price as SSD now.
>> 
> Have you actually tested SATA with SSD journals?
> Given a big enough cluster (in terms of the number of OSDs) you should be
> able to come close to the SSD performance currently achievable with regard
> to a single client.
> 
Yeah, the problem is really the latency when the backing storage is fully
utilised, especially while rebalancing data and deep scrubbing. MySQL is
currently living on SSD journals + SATA backing storage, so this is the
problem I'm trying to solve.

>> The backend network will be 10GbE active/passive, but will be used
>> mainly for MySQL, so we're aiming for swallowing IO.
>> 
> Is this a single MySQL instance or are we talking various VMs here?
> If you're flexible in regards to the network, Infiniband will give you
> lower latency, especially with the RDMA support being developed currently
> for Ceph (I'd guess a year or so out).
> Because with single (or few) clients, IOPS per client/thread are limited
> by the latency of the network (and of course the whole Ceph stack) more
> than anything else, so on a single thread you're never going to see
> performance anywhere near what a local SSD could deliver.
> 
We're going to run 150-200 MySQL clients, one on each VM, so the load should
be good for Ceph =) Sadly I'm in no position to use RDMA etc., as 10GBase-T
has already been decided on. Really liking the SM servers with 4x 10GBase-T =)
Thanks for the recommendation though.
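To put the latency point above into rough numbers: a single thread issuing
synchronous writes can never exceed about 1/latency IOPS, however fast the
SSDs are. A minimal sketch, with purely illustrative latency figures
(assumptions, not measurements from this setup):

# Single-thread IOPS ceiling implied by per-operation round-trip latency.
# All latency figures below are illustrative assumptions, not measurements.
scenarios_ms = {
    "local SSD, direct":    0.05,
    "10GbE + Ceph stack":   0.50,
    "lower-latency fabric": 0.25,
}

for name, latency_ms in scenarios_ms.items():
    iops_ceiling = 1000.0 / latency_ms  # one outstanding synchronous op at a time
    print(f"{name:22s} ~{iops_ceiling:7.0f} IOPS per thread")

With 150-200 MySQL VMs the aggregate queue depth hides that per-thread
ceiling, which is why many small clients suit Ceph better than one very
latency-sensitive instance.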
>> So, for 10x SSD drives, what kind of CPU would that need? Just go all
>> out with two 10-core 3.5GHz ones? I read somewhere that you should use
>> as fast CPUs as you can afford.
>> 
> Indeed.
> With 10 SSDs even that will probably be CPU bound with small IOPS and
> current stable Ceph versions.
> See my list archive URL below.
> 
> What size SSDs?
> Is the number of SSDs a result of needing the space, or is it there to get
> more OSDs and thus performance?
> 
Both, performance and space. So 1TB drives (well, 960GB in this case):
100GB of MySQL for each of 100 VMs, calculated with a replication factor
of 3.

>> Planning on using the Samsung 845 DC EVO, is anyone using these in
>> current Ceph clusters?
>> 
> I'm using them in a DRBD cluster where they were a good fit, as their write
> endurance was a match for the use case, I needed lots of space (960GB ones)
> and the relatively low price was within my budget.
> 
> While I'm not tearing out my hair and cursing the day I ever considered
> using them, their speed, endurance and some spurious errors I'm not seeing
> with Intel DC S3700s in the same server have me considering DC S3610s
> instead for the next cluster of this type I'm currently planning.
> 
> Compare the Intel DC S3610 and DC S3700 with the Samsung 845 DC Pro if
> you're building a Ceph cluster; the write amplification I'm seeing with
> SSD-backed Ceph clusters will turn your EVOs into scrap metal in no time.
> 
> Consider what you think your IO load (writes) generated by your client(s)
> will be, multiply that by your replication factor, and divide by the number
> of OSDs; that will give you the base load per OSD.
> Then multiply by 2 (journal on OSD) per OSD.
> Finally, based on my experience and measurements (link below), multiply
> that by at least 6, probably 10 to be on the safe side. Use that number to
> find the SSD that can handle this write load for the time period you're
> budgeting that cluster for.
> http://lists.opennebula.org/pipermail/ceph-users-ceph.com/2014-October/043949.html
> 
It feels like I can't go with anything less than the S3610, especially with
a replication factor of 2. I haven't done much reading about the S3610 yet,
so I will go into depth on it. (I've sketched a rough version of that
endurance math further down.)

>> We thought of doing a cluster with 3 servers; any recommendations for
>> Supermicro servers would be appreciated.
>> 
> Why 3, replication of 3?
> With Intel SSDs and diligent (SMART/Nagios) wear level monitoring I'd
> personally feel safe with a replication factor of 2.
> 
I've seen recommendations of replication 2 as well. The Intel SSDs are
indeed durable. This is only with Intel SSDs, I assume?

> I used one of these chassis for the DRBD cluster mentioned above, the
> version with Infiniband actually:
> http://www.supermicro.com.tw/products/system/2U/2028/SYS-2028TP-DC0TR.cfm
> 
> It's compact, and the LSI can be flashed into IT mode (or you can demand IT
> mode from your SM vendor) so all the SSDs are directly accessible and thus
> capable of being (fs)TRIM'ed. Not that this matters much with Intel DCs.
> 
> SM also has 1U servers that fit this drive density bill, but compared to
> the 2U servers their 1U rails are very dingy (comes with the size I
> guess). ^o^
> 
Yeah, IT mode is the way to go. I tried using RAID 0 to make use of the RAID
cache, but then you have problems with not being able to plug in a new drive
easily, etc.
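Coming back to the endurance math above, here is a rough back-of-the-envelope
sketch of it (client writes x replication / OSDs x 2 for the journal x 6-10
amplification). Every input number is a placeholder assumption, not a figure
from this thread:

# Back-of-the-envelope SSD endurance estimate following the formula above.
# All input values are placeholder assumptions, not figures from this thread.
client_write_mb_s = 20.0   # sustained average client writes, whole cluster
replication       = 3
num_osds          = 30     # e.g. 3 servers x 10 SSDs
journal_factor    = 2      # journal co-located on each OSD SSD
amplification     = 10     # "at least 6, probably 10 to be on the safe side"
drive_size_tb     = 0.96   # 960GB drives

per_osd_mb_s = (client_write_mb_s * replication / num_osds
                * journal_factor * amplification)
tb_per_day   = per_osd_mb_s * 86400 / 1e6
dwpd_needed  = tb_per_day / drive_size_tb

print(f"per-OSD write load : {per_osd_mb_s:.1f} MB/s")
print(f"written per OSD/day: {tb_per_day:.2f} TB")
print(f"required endurance : ~{dwpd_needed:.1f} drive writes per day (DWPD)")

Compare the resulting DWPD against the vendor's rated endurance (DWPD or TBW)
when weighing the 845 DC EVO/Pro against the DC S3610/S3700.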
This 1U box,
http://www.supermicro.com.tw/products/system/1U/1028/SYS-1028U-TR4T_.cfm,
is really nice, though it's missing the SuperDOM peripherals, so you really
get 8 drives if you need two for the OS. And the rails... don't get me
started, but lately they do just snap into the racks, no screws needed.
That's a welcome refresh from the earlier 1U SM rails.

Thanks!
Josef

> 
> Regards,
> 
> Christian
> 
>> Cheers,
>> Josef
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> -- 
> Christian Balzer        Network/Systems Engineer
> ch...@gol.com           Global OnLine Japan/Fusion Communications
> http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com