Adam Cheal wrote:
Just submitted the bug yesterday, on James's advice, so I don't have a bug number I can
refer you to yet...the "change request" number is 6894775, if that helps or is
directly related to the future bugid.

From what I've seen and read, this problem has been around for a while but only rears
its ugly head under heavy IO with large filesets, probably related to the large
metadata sets you spoke of. We are using snv_118 x64, but from what I've read here it
appears in snv_123 and snv_125 as well.
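
For anyone wanting to check whether they're hitting the same thing: the timeouts
land in the system log, so something along these lines (assuming the default
Solaris log location) should surface them:

  # grep -i mpt /var/adm/messages | grep -i timeout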

We've tried installing SSDs to act as a read cache for the pool to reduce the metadata
hits on the physical disks, and as a last-ditch effort we even tried switching to the
"latest" LSI-supplied itmpt driver from 2007 (after reading
http://enginesmith.wordpress.com/2009/08/28/ssd-faults-finally-resolved/) and disabling
the mpt driver, but we ended up with the same timeout issues. In our case, the drives in
the JBODs are all WD (model WD1002FBYS-18A6B0) 1TB 7.2k SATA drives.
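
For what it's worth, the SSDs were added as cache devices with the usual zpool
syntax, roughly like this (pool and device names here are illustrative, not our
actual ones):

  # zpool add tank cache c2t0d0
  # zpool status tank    (the SSD then shows up under a separate "cache" heading)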

In revisiting our architecture, we compared it to Sun's x4540 Thumper offering, which
uses the same controller with similar (though apparently customized) firmware and 48
disks. The difference is that they use 6 x LSI1068e controllers, each of which has to
deal with only 8 disks...obviously better for performance, but that architecture could
be "hiding" the real IO issue by distributing the IO across so many controllers.

Hi Adam,
I was watching the incoming queues all day yesterday for the
bug, but missed seeing it; I'm not sure why.

I've now moved the bug to the appropriate category so it will
get attention from the right people.


Thanks,
James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp       http://www.jmcp.homeunix.com/blog
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss