> This raises another interesting question. Does anyone here have a document
> explaining how their BBU cache works EXACTLY (at cache / SATA level) on their
> server? Because I haven't been able to find any for mine (Dell PERC
> H710/H710P). Can anyone tell me with godlike authority and precision, what
> exactly happens inside that BBU post-power failure?

(And if you have that manual, how can you know it's accurate? That the
implementation matches the manual and is free of bugs? I ask because my M500s
didn't match the packaging, and neither did an H710 we bought: Dell had
advertised features in some marketing material that were only present on the
H710P.)

And I see UBER (unrecoverable bit error) rates for SSDs and HDDs, but has
anyone ever seen them for the flash-based cache on their RAID controller?

Sleep well, friends.

Graeme.

On 07 Jul 2015, at 18:54, Graeme B. Bell <graeme.b...@nibio.no> wrote:

> That is a very good question, which I have raised elsewhere on the postgresql
> lists previously.
>
> In practice: I have *never* managed to make diskchecker fail with the BBU
> enabled in front of the drives, and I spent days trying with plug pulls until
> I reached the point where, as a statistical event, it just can't be that
> likely at all. That's not to say it can't ever happen, just that I've taken
> all reasonable measures that I can to find out on the time and money budget I
> had available.
>
> In theory: It may be the fact that the BBU makes the drives run at about half
> speed, so that the capacitors go a good bit further to empty the cache. After
> all, without the BBU in the way, the drive manages to save everything but the
> last fragment of writes. But I also suspect that the controller itself may be
> replaying the last set of writes from around the time of power loss.
>
> Anyway, I'm 50/50 on those two explanations. Any other thoughts welcome.
>
> This raises another interesting question. Does anyone here have a document
> explaining how their BBU cache works EXACTLY (at cache / SATA level) on their
> server? Because I haven't been able to find any for mine (Dell PERC
> H710/H710P). Can anyone tell me with godlike authority and precision, what
> exactly happens inside that BBU post-power failure?
>
> There is rather too much magic involved for me to be happy.
>
> G
>
> On 07 Jul 2015, at 18:27, Vitalii Tymchyshyn <v...@tym.im> wrote:
>
>> Hi.
>>
>> How would BBU cache help you if it lies about fsync? I suppose any RAID
>> controller removes data from BBU cache after it was fsynced by the drive. As
>> far as I know, there is no other "magic command" for the drive to tell the
>> controller that the data is safe now and can be removed from BBU cache.
>>
>> Tue, 7 Jul 2015 11:59 Graeme B. Bell <graeme.b...@nibio.no> wrote:
>>
>> Yikes. I would not be able to sleep tonight if it were not for the BBU cache
>> in front of these disks...
>>
>> diskchecker.pl consistently reported several examples of corruption
>> post-power-loss (usually 10 - 30) on unprotected M500s/M550s, so I think
>> it's pretty much open to debate what types of madness and corruption you'll
>> find if you look close enough.
>>
>> G
>>
>>
>> On 07 Jul 2015, at 16:59, Heikki Linnakangas <hlinn...@iki.fi> wrote:
>>
>>>
>>> So it lies about fsync()... The next question is, does it nevertheless
>>> enforce the correct ordering of persisting fsync'd data? If you write to
>>> file A and fsync it, then write to another file B and fsync it too, is it
>>> guaranteed that if B is persisted, A is as well? Because if it isn't, you
>>> can end up with filesystem (or database) corruption anyway.
>>>
>>> - Heikki
>>
>>
>>
>> --
>> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-performance
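
For anyone who wants to poke at Heikki's ordering question (write A, fsync,
write B, fsync: if B survived, did A?), here is a rough Python sketch of one
way it might be checked. This is not how diskchecker.pl works, and the helper
names (append_synced, write_pairs, ordering_violations) and the record format
are invented for illustration. It assumes you run write_pairs() on the machine
under test, pull the plug, and run ordering_violations() after reboot. It does
not fsync the containing directory, so it says nothing about whether a freshly
created file itself survives.

```python
import os
import struct

REC = struct.Struct("<Q")  # one 8-byte little-endian sequence number per record


def append_synced(path, seq):
    """Append one sequence number to path and fsync before returning."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.write(fd, REC.pack(seq))
        os.fsync(fd)  # only as good as the drive/controller's handling of the flush
    finally:
        os.close(fd)


def write_pairs(path_a, path_b, count):
    """For each sequence number: write+fsync A first, then write+fsync B."""
    for seq in range(count):
        append_synced(path_a, seq)
        append_synced(path_b, seq)


def read_seqs(path):
    """Return the complete sequence numbers found in path (a torn tail is ignored)."""
    try:
        with open(path, "rb") as f:
            data = f.read()
    except FileNotFoundError:
        return []
    usable = len(data) - len(data) % REC.size
    return [REC.unpack_from(data, off)[0] for off in range(0, usable, REC.size)]


def ordering_violations(path_a, path_b):
    """Sequence numbers that survived in B but are missing from A."""
    in_a = set(read_seqs(path_a))
    return sorted(s for s in read_seqs(path_b) if s not in in_a)
```

An empty list from ordering_violations() after a plug pull only means no
reordering was caught on that run; as with diskchecker, it would take many
pulls before a clean result starts to mean much.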