Re: ECC and DMA to/from disk controllers

2007-09-14 Thread KELEMEN Peter
* Alan Cox ([EMAIL PROTECTED]) [20070910 14:54]: Alan, Thanks for your interest (and Bruce, for posting). > - The ECC level on the drive processors and memory cache vary > by vendor. Good luck getting any information on this although > maybe if you are Cern sized they will talk Do you have any

Re: ECC and DMA to/from disk controllers

2007-09-12 Thread Bruce Allen
Alan, Robert, Dick, Thank you all for the informed and helpful response! Alan, I'll pass your comments on to Peter Kelemen. Not sure if he follows LKML. I think he'll be interested in your characterization of the error types. I'll point him to the thread. (I think Peter and his collaborat

Re: ECC and DMA to/from disk controllers

2007-09-10 Thread Robert Hancock
Bruce Allen wrote: Dear LKML, Apologies in advance for potential mis-use of LKML, but I don't know where else to ask. An ongoing study on datasets of several Petabytes have shown that there can be 'silent data corruption' at rates much larger than one might naively expect from the expected

Re: ECC and DMA to/from disk controllers

2007-09-10 Thread linux-os (Dick Johnson)
On Mon, 10 Sep 2007, Bruce Allen wrote: > Dear LKML, > > Apologies in advance for potential mis-use of LKML, but I don't know where > else to ask. > > An ongoing study on datasets of several Petabytes have shown that there > can be 'silent data corruption' at rates much larger than one might > na

Re: ECC and DMA to/from disk controllers

2007-09-10 Thread Alan Cox
> In thinking about this, I began to wonder about the following. Suppose > that a (possibly RAID) disk controller correctly reads data from disk and > has correct data in the controller memory and buffers. However when that > data is DMA'd into system memory some errors occur (cosmic rays, >

ECC and DMA to/from disk controllers

2007-09-10 Thread Bruce Allen
Dear LKML, Apologies in advance for potential mis-use of LKML, but I don't know where else to ask. An ongoing study on datasets of several Petabytes have shown that there can be 'silent data corruption' at rates much larger than one might naively expect from the expected error rates in RAID