Re: [PERFORM] New server: SSD/RAID recommendations?

Graeme B. Bell Tue, 07 Jul 2015 10:04:13 -0700

> 
> This raises another interesting question. Does anyone here have a document 
> explaining how their BBU cache works EXACTLY (at cache / sata level) on their 
> server? Because I haven't been able to find any for mine (Dell PERC 
> H710/H710P). Can anyone tell me with godlike authority and precision, what 
> exactly happens inside that BBU post-power failure?



(And if you have that manual, how can you know it's accurate, that the 
implementation matches the manual and is free of bugs? My M500s didn't match 
the packaging, and neither did an H710 we bought: Dell had advertised 
features in some marketing material that were only present on the H710P.)

And I see UBER (unrecoverable bit error) rates for SSDs and HDDs, but has 
anyone ever seen them for the flash-based cache on their raid controller?

Sleep well, friends.

Graeme. 

On 07 Jul 2015, at 18:54, Graeme B. Bell <[email protected]> wrote:

> 
> That is a very good question, which I have raised elsewhere on the postgresql 
> lists previously.
> 
> In practice: I have *never* managed to make diskchecker fail with the BBU 
> enabled in front of the drives, and I spent days trying with plug pulls 
> until I reached the point where, statistically, a failure just can't be 
> that likely. That's not to say it can't ever happen, just that I've taken 
> all the reasonable measures I could to find out, given the time and money 
> budget I had available. 
> 
> In theory: It may be that the BBU makes the drives run at about half 
> speed, so that the capacitors go a good bit further toward emptying the 
> cache. After all, without the BBU in the way, the drive manages to save 
> everything but the last fragment of writes. But I also suspect that the 
> controller itself may be replaying the last set of writes from around the 
> time of power loss. 
> 
> Anyway I'm 50/50 on those two explanations. Any other thoughts welcome. 
> 
> This raises another interesting question. Does anyone here have a document 
> explaining how their BBU cache works EXACTLY (at cache / sata level) on their 
> server? Because I haven't been able to find any for mine (Dell PERC 
> H710/H710P). Can anyone tell me with godlike authority and precision, what 
> exactly happens inside that BBU post-power failure?
> 
> There is rather too much magic involved for me to be happy.
> 
> G
> 
> On 07 Jul 2015, at 18:27, Vitalii Tymchyshyn <[email protected]> wrote:
> 
>> Hi.
>> 
>> How would a BBU cache help you if the drive lies about fsync? I suppose any 
>> RAID controller removes data from the BBU cache after it was fsynced by the 
>> drive. As far as I know, there is no other "magic command" for the drive to 
>> tell the controller that the data is safe now and can be removed from the 
>> BBU cache.
>> 
>> Tue, 7 Jul 2015 11:59 Graeme B. Bell <[email protected]> wrote:
>> 
>> Yikes. I would not be able to sleep tonight if it were not for the BBU cache 
>> in front of these disks...
>> 
>> diskchecker.pl consistently reported several examples of corruption 
>> post-power-loss (usually 10-30) on unprotected M500s/M550s, so I think 
>> it's pretty much open to debate what types of madness and corruption you'll 
>> find if you look closely enough.
>> 
>> G
>> 
>> 
>> On 07 Jul 2015, at 16:59, Heikki Linnakangas <[email protected]> wrote:
>> 
>>> 
>>> So it lies about fsync()... The next question is, does it nevertheless 
>>> enforce the correct ordering of persisting fsync'd data? If you write to 
>>> file A and fsync it, then write to another file B and fsync it too, is it 
>>> guaranteed that if B is persisted, A is as well? Because if it isn't, you 
>>> can end up with filesystem (or database) corruption anyway.
>>> 
>>> - Heikki
>> 
>> 
>> 
>> --
>> Sent via pgsql-performance mailing list ([email protected])
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-performance
> 
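
The ordering question Heikki raises above (write A, fsync, write B, fsync: if B
is persisted, is A too?) can be turned into a simple check. What follows is
only a minimal sketch under stated assumptions, not how diskchecker.pl works:
the function names, the file names "A" and "B", and the one-record-per-line
layout are all hypothetical. The idea is to run run_sequence() on the device
under test, pull the plug mid-run, and call ordering_violated() after reboot.

```python
import os


def write_and_fsync(path, payload):
    """Append payload to path; return only after fsync() has reported success."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, payload)
        os.fsync(fd)  # the call the drive or controller may be lying about
    finally:
        os.close(fd)
    # Note: this sketch does not fsync the containing directory, which a
    # careful test would also do so the new files' directory entries persist.


def run_sequence(dir_path, rounds):
    """For each i: write record i to A and fsync, then record i to B and fsync."""
    a = os.path.join(dir_path, "A")
    b = os.path.join(dir_path, "B")
    for i in range(rounds):
        record = f"{i}\n".encode()
        write_and_fsync(a, record)
        write_and_fsync(b, record)


def count_records(path):
    """Count complete (newline-terminated) records; a missing file counts as 0."""
    if not os.path.exists(path):
        return 0
    with open(path, "rb") as f:
        return f.read().count(b"\n")


def ordering_violated(dir_path):
    """Record i reaches B only after record i in A was fsynced, so finding
    more complete records in B than in A means the ordering was not honoured."""
    a = count_records(os.path.join(dir_path, "A"))
    b = count_records(os.path.join(dir_path, "B"))
    return b > a
```

A False result from ordering_violated() after a plug pull proves nothing by
itself; as with the diskchecker runs described above, it only becomes
meaningful after many repetitions.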

