Completely agree with Steve.

1. Intel NVMe looks like the best bet if your hardware is modern enough to 
support NVMe. Otherwise, consider e.g. the S3700 mentioned elsewhere.

2. RAID controllers. 

We have roughly 10-12 RAID controllers here and roughly 25-30 SSDs, spread 
across various machines. This might give people an idea of where the risk 
lies in the path from disk to CPU. 

We've had 2 RAID card failures in the last 12 months that nuked the array with 
days of downtime, and 2 problems with batteries suddenly becoming useless or 
suddenly reporting wildly varying temperatures/overheating. There may have been 
other RAID problems I don't know about. 

Our IT dept were replacing Seagate HDDs last year at a rate of 2-3 per week (I 
guess they have 100-200 disks?). We also have about 25-30 Hitachi/HGST HDDs.

So by my estimates:
30% annual problem rate with RAID controllers.
30-50% failure rate with Seagate HDDs (Backblaze saw similar results).
0% failure rate with HGST HDDs. 
0% failure rate with our SSDs. (To be fair, our one Samsung SSD apparently has 
a bug in TRIM under Linux, and I still need to investigate whether we have 
been affected by it.) 
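
For anyone who wants to redo the arithmetic with their own numbers, here is a 
rough sketch of how the RAID figure falls out of the counts above. The helper 
name annual_rate is made up for illustration, and I'm assuming the SSD 
observations cover the same 12-month window as the RAID ones, which is only 
approximately true.

```python
def annual_rate(incidents, units, months=12):
    """Incidents per unit per year, as a fraction (0.30 == 30%)."""
    return incidents / units * (12 / months)

# Counts from this post: 2 card failures + 2 battery problems in 12 months,
# across roughly 10-12 controllers.
raid_incidents = 2 + 2
raid_low = annual_rate(raid_incidents, 12)   # if we have 12 controllers
raid_high = annual_rate(raid_incidents, 10)  # if we have 10 controllers

# No failures seen across roughly 25-30 SSDs (assumed: same 12-month window).
ssd_rate = annual_rate(0, 25)

print(f"RAID controllers: {raid_low:.0%} to {raid_high:.0%} per year")
print(f"SSDs: {ssd_rate:.0%} per year")
```

With these inputs the RAID range comes out at about a third to 40% per year, 
in the same ballpark as the 30% above. With populations this small, the 
numbers are indicative rather than statistically meaningful.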

Also, RAID controllers aren't free: the cost is not just the money but also 
the management of them. (Ever tried writing a complex install script that 
interacts with MegaCLI? It can be done, but it's not much fun.) Just take a 
look at the MegaCLI manual and ask yourself: is this even worth it, if you 
have a good MTBF on an enterprise SSD?

RAID was meant to be about ensuring availability of data. I have trouble 
believing that these days....

Graeme Bell


On 06 Jul 2015, at 18:56, Steve Crawford <scrawf...@pinpointresearch.com> wrote:

> 
> 2. We don't typically have redundant electronic components in our servers. 
> Sure, we have dual power supplies and dual NICs (though generally to handle 
> external failures) and ECC-RAM but no hot-backup CPU or redundant RAM banks 
> and...no backup RAID card. Intel Enterprise SSD already have power-fail 
> protection so I don't need a RAID card to give me BBU. Given the MTBF of good 
> enterprise SSD I'm left to wonder if placing a RAID card in front merely adds 
> a new point of failure and scheduled-downtime-inducing hands-on maintenance 
> (I'm looking at you, RAID backup battery).



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
