[EMAIL PROTECTED] wrote:
> I suspect that if you have a bottleneck in your system, it would be due
> to the available bandwidth on the PCI bus.
Mm, yeah, that's what I was worried about too (mostly through ignorance of the issues), which is why I was hoping HyperTransport and PCIe were going to give that data enough room on the bus. But after others expressed the opinion that the Areca PCIe cards were overkill, I'm now looking at putting some PCI-X cards on a different (probably slower) motherboard.

> I dug up a copy of the S2895 block diagram and asked Bill Moore about
> it.  He said that you should be able to get about 700 MB/s off each of
> the PCI-X channels, and that you only need about 100 MB/s to saturate a
> GigE link.  He also observed that the RAID card you were using was
> unnecessary and would probably hamper performance.  He recommended
> non-RAID SATA cards based upon the Marvell chipset.
>
> Here's the e-mail trail on this list where he discusses Marvell SATA
> cards in a bit more detail:
>
> http://mail.opensolaris.org/pipermail/zfs-discuss/2006-March/016874.html
>
> It sounds like if getting disk -> network is the concern, you'll have
> plenty of bandwidth, assuming you have a reasonable controller card.

Well, if that isn't from the horse's mouth, I don't know what is.

Elsewhere in the thread, I mention that I'm leaning towards a simpler system (well, one less dependent upon PCIe), namely the S2892, which has the added benefit of a NIC that is less maligned in the community. From what I can tell of the block diagram, the PCI-X subsystem looks similar enough (except that it's shared with the NIC). Using the Marvell chips on the oft-cited SuperMicro cards is sounding like a safe compromise to me.
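
Just to sanity-check the numbers for myself (back of the envelope only, and I'm assuming 64-bit/133 MHz PCI-X slots here, which may not be what the S2892 actually runs them at):

    PCI-X theoretical:    64 bits x 133 MHz / 8   ~= 1064 MB/s per bus
    realistic, per Bill:                           ~  700 MB/s per channel
    GigE wire speed:      1 Gb/s / 8               ~=  125 MB/s

So even with the NIC hanging off the same PCI-X bridge, one saturated GigE link should only need a small fraction of what the bus can move.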

> Caching isn't going to be a huge help for writes, unless there's another
> thread reading simultaneously from the same file.
>
> Prefetch will definitely use the additional RAM to try to boost the
> performance of sequential reads.  However, in the interest of full
> disclosure, there is a pathology that we've seen where the number of
> sequential readers exceeds the available space in the cache.  In this
> situation, sometimes the competing prefetches for the different streams
> will cause more temporally favorable data to be evicted from the cache
> and performance will drop.  The workaround right now is just to disable
> prefetch.  We're looking into more comprehensive solutions.
Interesting. So noted. I expect I'll have to test thoroughly.
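
For my own notes (and someone please correct me if I've got the tunable wrong): I gather the prefetch kill switch is the zfs_prefetch_disable tunable, which I could either set persistently in /etc/system or flip on the live kernel while benchmarking, e.g.

    # /etc/system (takes effect on reboot)
    set zfs:zfs_prefetch_disable = 1

    # or poke it into the running kernel
    echo zfs_prefetch_disable/W0t1 | mdb -kw

That at least gives me something easy to toggle per benchmark run if I do hit the multi-stream case.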

> If you run across this problem and are willing to let me debug on your
> system, shoot me an e-mail.  We've only seen this in a couple of
> situations, and it was combined with another problem where we were seeing
> excessive overhead for kcopyout.  It's unlikely but possible that you'll
> hit this.

That's one heck of an offer. I'd have no problem with this, nor with taking requests for particular benchmarks from the community. It's essentially a research machine, and if it can help others out, I'm all for it.

Now time to check on the project budget... :)

thanks,
adam
