Hi,

> On 18 Mar 2015, at 05:29, Christian Balzer <ch...@gol.com> wrote:
> 
> 
> Hello,
> 
> On Wed, 18 Mar 2015 03:52:22 +0100 Josef Johansson wrote:
> 
>> Hi,
>> 
>> I’m planning a Ceph SSD cluster, I know that we won’t get the full
>> performance from the SSD in this case, but SATA won’t cut it as backend
>> storage and SAS is the same price as SSD now.
>> 
> Have you actually tested SATA with SSD journals?
> Given a big enough cluster (number of OSDs) you should be able to come
> close to the SSD performance currently achievable with regard to a single
> client.
> 
Yeah.
The problem is really the latency when the backing storage is fully utilised,
especially while rebalancing data and deep scrubbing.
The MySQL databases are currently living on SSD journal + SATA backing storage,
so this is exactly the problem I’m trying to solve.
>> The backend network will be a 10GbE active/passive setup, but it will be
>> used mainly for MySQL, so we’re aiming at absorbing IO.
>> 
> Is this a single MySQL instance or are we talking various VMs here?
> If you're flexible in regards to the network, Infiniband will give you
> lower latency, especially with the RDMA stuff being developed currently
> for Ceph (I'd guess a year or so out).
> Because with single (or few) clients, IOPS per client/thread are limited
> by the latency of the network (and of course the whole Ceph stack) more
> than anything else, so on a single thread you're never going to see
> performance anywhere near what a local SSD could deliver.
> 
We’re going to run 150-200 MySQL clients, one on each VM, so the load should be
well spread for Ceph =)
And sadly I’m in no position to use RDMA etc., as it’s already been decided
that we’ll go with 10GBase-T.
Really liking the SM servers with 4x 10GBase-T =)
Thanks for the recommendation though.
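
Just to put your latency point into numbers for myself, here’s a quick sketch
(the 0.5 ms end-to-end write latency is only an assumed figure, not something
I’ve measured):

    # With queue depth 1, each write has to wait for the full network + Ceph
    # round trip, so latency alone caps the IOPS one synchronous thread can do.
    round_trip_ms = 0.5                       # assumed end-to-end write latency
    single_thread_iops = 1000.0 / round_trip_ms
    print("~%d IOPS per synchronous thread" % single_thread_iops)    # ~2000

So the per-thread ceiling is a couple of thousand IOPS regardless of the
drives; it’s only the aggregate over all the VMs that gets us anywhere.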

>> So, for 10x SSD drives, what kind of CPU would that need? Just go all
>> out with two 10-core 3.5GHz CPUs? I read somewhere that you should use
>> the fastest CPUs you can afford.
>> 
> Indeed. 
> With 10 SSDs even that will probably be CPU bound with small IOPS and
> current stable Ceph versions.
> See my list archive URL below.
> 
> What size SSDs?
> Is the number of SSDs a result of needing the space, or is it there to get
> more OSDs and thus performance?
Both performance and space, hence 1TB drives (well, 960GB in this case):
100GB of MySQL per VM for 100 VMs, calculated with a replication factor of 3.
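
Back-of-the-envelope on that sizing (my own rough numbers, ignoring filesystem
and nearfull headroom):

    # Raw space needed vs. available for 100 VMs x 100 GB MySQL, replication 3,
    # on 3 nodes with 10 x 960 GB SSDs each (no filesystem/nearfull headroom).
    vms, gb_per_vm, replicas = 100, 100, 3
    raw_needed_gb = vms * gb_per_vm * replicas        # 30,000 GB
    raw_available_gb = 3 * 10 * 960                   # 28,800 GB
    print("need %d GB raw, have %d GB raw" % (raw_needed_gb, raw_available_gb))

Tight once you leave any headroom, which is exactly why the space matters as
much as the performance here.
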
> 
>> Planning on using the Samsung 845 DC EVO, anyone using these in current
>> ceph clusters? 
>> 
> I'm using them in a DRBD cluster where they were a good fit: their write
> endurance was a match for the use case, I needed lots of space (960GB ones),
> and the relatively low price was within my budget.
> 
> While I'm not tearing out my hair and cursing the day I ever considered
> using them, their speed, endurance and some spurious errors I'm not seeing
> with Intel DC S3700s in the same server have me considering DC S3610s
> instead for the next cluster of this type I'm currently planning.
> 
> Compare those Intel DC S3610 and DC S3700 with the Samsung 845 DC Pro if
> you're building a Ceph cluster; the write amplification I'm seeing with SSD
> backed Ceph clusters will turn your EVOs into scrap metal in no time.
> 
> Consider what you think the IO load (writes) generated by your client(s)
> will be, multiply that by your replication factor, and divide by the number
> of OSDs; that gives you the base write load per OSD.
> Then multiply by 2 (journal on the same OSD).
> Finally, based on my experience and measurements (link below), multiply that
> by at least 6, probably 10 to be on the safe side. Use that number to find
> an SSD that can handle this write load for the time period you're budgeting
> that cluster for.
> http://lists.opennebula.org/pipermail/ceph-users-ceph.com/2014-October/043949.html
It feels like I can’t go with anything less than the S3610, especially with a
replication factor of 2.
I haven’t done much reading on the S3610 yet, so I will go into depth on them.
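
To make your method concrete for our case, here’s a rough sketch with a purely
hypothetical aggregate client write rate (the 10 MB/s is an assumption, not a
measured number):

    # Christian's method: client writes x replication / OSDs, x2 for the
    # journal on the same OSD, then x6-10 for observed write amplification.
    client_writes_mb_s = 10.0       # assumption: aggregate writes from all VMs
    replication = 3
    osds = 3 * 10                   # 3 nodes x 10 SSDs
    journal_factor = 2
    amplification = 10              # the "safe side" factor

    per_osd_mb_s = (client_writes_mb_s * replication / osds
                    * journal_factor * amplification)
    tb_per_ssd_per_year = per_osd_mb_s * 86400 * 365 / 1e6
    print("%.0f MB/s per OSD, ~%.0f TB written per SSD per year"
          % (per_osd_mb_s, tb_per_ssd_per_year))      # 20 MB/s, ~631 TB/year

Even with a modest client load that lands in a range where I’d want to check
it against the rated TBW/DWPD of the drive, which is what pushes me towards
the S3610/S3700 class.
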
> 
>> We though of doing a cluster with 3 servers, and any recommendation of
>> supermicro servers would be appreciated.
>> 
> Why 3, replication of 3? 
> With Intel SSDs and diligent (SMART/NAGIOS) wear level monitoring I'd
> personally feel safe with a replication factor of 2.
> 
I’ve seen recommendations of replication 2 as well! The Intel SSDs do indeed
have the endurance for it.
I assume that only holds with Intel SSDs?
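
For the wear level monitoring, I’m thinking of something along these lines as
a NAGIOS-style check; only a minimal sketch, and it assumes the drive exposes
Intel’s Media_Wearout_Indicator attribute (other models name it differently):

    #!/usr/bin/env python
    # Minimal NAGIOS-style wear check: parse `smartctl -A` output and alert
    # when the normalised Media_Wearout_Indicator (100 = new) gets too low.
    import subprocess, sys

    DEV, CRIT = "/dev/sda", 20      # device and threshold are examples only

    out = subprocess.check_output(["smartctl", "-A", DEV]).decode()
    for line in out.splitlines():
        if "Media_Wearout_Indicator" in line:      # attribute 233 on Intel DCs
            value = int(line.split()[3])           # current normalised value
            if value < CRIT:
                print("CRITICAL: %s wearout indicator at %d" % (DEV, value))
                sys.exit(2)
            print("OK: %s wearout indicator at %d" % (DEV, value))
            sys.exit(0)
    print("UNKNOWN: no wearout attribute found on %s" % DEV)
    sys.exit(3)

Hooked into NAGIOS that should flag worn drives well before a replication
factor of 2 gets scary.
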
> I used one of these chassis for the DRBD cluster mentioned above, the
> version with Infiniband actually:
> http://www.supermicro.com.tw/products/system/2U/2028/SYS-2028TP-DC0TR.cfm
> 
> It's compact, and the LSI can be flashed into IT mode (or you can demand IT
> mode from your SM vendor), so all the SSD drives are directly accessible and
> thus capable of being (fs)TRIM'ed. Not that this matters much with Intel DCs.
> 
> SM also has 1U servers that fit this drive density bill, but compared to
> the 2U servers their 1U rails are very dingy (comes with the size I
> guess). ^o^
Yeah, IT mode is the way to go. I tried using single-drive RAID 0 volumes to
utilise the RAID cache, but then you have problems with not being able to
simply plug in a new drive, etc.

This 1U box is really nice:
http://www.supermicro.com.tw/products/system/1U/1028/SYS-1028U-TR4T_.cfm
It’s missing the SuperDOM connectors though.. so you really only get 8 data
drives if you need two bays for the OS.
And the rails.. don’t get me started, but lately they do just snap into the
racks! No screws needed. That’s a refreshing change from the earlier 1U SM
rails.

Thanks!

Josef
> 
> Regards,
> 
> Christian
> 
>> Cheers,
>> Josef
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> -- 
> Christian Balzer        Network/Systems Engineer                
> ch...@gol.com         Global OnLine Japan/Fusion Communications
> http://www.gol.com/

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
