Bob Friesenhahn wrote:
> On Mon, 28 Jul 2008, BG wrote:
>
>> Indeed, that's one of the nice things: ZFS is picky about data and
>> alerts you immediately. Before, some files became corrupt and one was
>> wondering what happened and how this was possible, since everything
>> seemed fine for months :)
>
> Unfortunately, ZFS does not detect or correct memory errors. Memory
> reliability is currently an Achilles' heel for ZFS, which blows MTTDL
> models which are based on disk media reliability alone.
>
We can (and do) model systems complete with the data path from CPU to
memory to PCI* to HBA to disk and back. Basically, the results will show
that you want ECC memory and PCI-Express as major technology components.
FWIW, Sun no longer sells computers without ECC memory.

But ZFS can do better. I filed CR6674679 which basically says that if
redundant copies of data have the same, wrong checksum, then ZFS should
issue an e-report to that effect. This will allow you to move suspicion
away from the disks as a root cause towards a common cause, like memory,
shared HBA or bus, etc. It won't be able to recover the data, but it can
help debug the system.

 -- richard

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
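
To make the idea behind CR6674679 concrete, here is a rough sketch of the triage rule: if every redundant copy fails verification and all copies fail the same way, suspect a common cause rather than the disks. This is not ZFS code and not necessarily how the CR would be implemented. The names `Suspect`, `checksum` and `classify` are hypothetical, and SHA-256 is used only for illustration.

```python
import hashlib
from enum import Enum
from typing import Sequence


class Suspect(Enum):
    # Hypothetical labels for where to look first.
    NONE = "all copies verify"
    DEVICE = "copies disagree; suspect individual disks"
    COMMON_CAUSE = "copies agree but are wrong; suspect memory, HBA or bus"


def checksum(data: bytes) -> str:
    # Illustrative checksum only; a stand-in for the block checksum.
    return hashlib.sha256(data).hexdigest()


def classify(expected: str, copies: Sequence[bytes]) -> Suspect:
    """Compare each redundant copy against the expected checksum."""
    sums = [checksum(c) for c in copies]
    bad = [s for s in sums if s != expected]
    if not bad:
        return Suspect.NONE
    # Every copy is wrong, and they are all wrong in the same way:
    # independent disk failures are an unlikely explanation.
    if len(sums) > 1 and len(bad) == len(sums) and len(set(bad)) == 1:
        return Suspect.COMMON_CAUSE
    return Suspect.DEVICE
```

For example, `classify(checksum(b"good"), [b"oops", b"oops"])` gives `Suspect.COMMON_CAUSE`, while one good and one bad copy gives `Suspect.DEVICE`. As in the proposal, the common-cause case cannot recover the data; it only tells you where to point the finger.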