Adam Cheal wrote:
Just submitted the bug yesterday, on James's advice, so I don't have a number you can
refer to yet... the "change request" number is 6894775 if that helps, or it may be
directly related to the future bugid.
From what I've seen and read, this problem has been around for a while but only rears
its ugly head under heavy IO with large filesets, probably related to the large
metadata sets you spoke of. We are using snv_118 x64, but from what I've read here it
appears in snv_123 and snv_125 as well.
We've tried installing SSDs to act as a read cache for the pool to reduce the metadata
hits on the physical disks.
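For reference, the cache device was added the standard way with zpool; the pool and
device names below are placeholders for our actual setup:

    # Add an SSD as a cache (L2ARC) device to the pool
    zpool add tank cache c2t5d0

    # Confirm the cache vdev is present and watch its activity
    zpool status tank
    zpool iostat -v tank 5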
"latest" LSI-supplied itmpt driver from 2007 (from reading
http://enginesmith.wordpress.com/2009/08/28/ssd-faults-finally-resolved/) and disabling
the mpt driver but we ended up with the same timeout issues. In our case, the drives in
the JBODs are all WD (model WD1002FBYS-18A6B0) 1TB 7.2k SATA drives.
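In case anyone wants to try the same driver swap, this is roughly the sequence we used.
The commands are standard Solaris, but the PCI ID below is a placeholder; check
/etc/driver_aliases or prtconf -pv for the ID your controller actually reports:

    # Unbind the HBA's PCI ID from the bundled mpt driver
    # ("pci1000,58" is a placeholder; use your controller's ID)
    update_drv -d -i '"pci1000,58"' mpt

    # After installing the LSI itmpt package, bind the ID to itmpt
    update_drv -a -i '"pci1000,58"' itmpt

    # Reconfiguration reboot so the new binding takes effect
    reboot -- -r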
In revisiting our architecture, we compared it to Sun's x4540 Thumper offering, which
uses the same controller with similar (though apparently customized) firmware and 48
disks. The difference is that the x4540 uses 6 x LSI1068e controllers, each of which
has to deal with only 8 disks. That is obviously better for performance, but the
architecture could also be "hiding" the real IO issue by distributing the load across
so many controllers.
Hi Adam,
I was watching the incoming queues for the bug all day
yesterday, but missed seeing it; I'm not sure why.
I've now moved the bug to the appropriate category so it will
get attention from the right people.
Thanks,
James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp http://www.jmcp.homeunix.com/blog