Re: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Bruce Allen
Hi Bill, I see similar results on my test systems Thanks for this report and for confirming our observations. Could you please confirm that a single-port bidirectional UDP link runs at wire speed? This helps to localize the problem to the TCP stack or interaction of the TCP stack with the e10

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Bruce Allen
Hi David, Could this be an issue with pause frames? At a previous job I remember having issues with a similar configuration using two broadcom sb1250 3 gigE port devices. If I ran bidirectional tests on a single pair of ports connected via cross over, it was slower than when I gave each dire

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Bruce Allen
Hi Bill, I see similar results on my test systems Thanks for this report and for confirming our observations. Could you please confirm that a single-port bidirectional UDP link runs at wire speed? This helps to localize the problem to the TCP stack or interaction of the TCP stack with the

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-31 Thread Bruce Allen
Hi Sangtae, Thanks for joining this discussion -- it's good to have a CUBIC author and expert here! In our application (cluster computing) we use a very tightly coupled high-speed low-latency network. There is no 'wide area traffic'. So it's hard for me to understand why any networking componen

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-30 Thread Bruce Allen
Hi Stephen, Indeed, we are not asking to see 1000 Mb/s. We'd be happy to see 900 Mb/s. Netperf is transmitting a large buffer in MTU-sized packets (min 1500 bytes). Since the acks are only about 60 bytes in size, they should be around 4% of the total traffic. Hence we would not expect to see
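
The 4% figure quoted above can be checked with a short back-of-the-envelope sketch. The 1500-byte data packets and 60-byte acks come from the message; everything else is an assumption made for illustration: one ack per data packet by default, no Ethernet preamble or inter-frame gap, and the hypothetical helper names `ack_share` and `reverse_headroom_mbps`.

```python
# Back-of-the-envelope estimate of how much of the reverse direction of a
# full-duplex link is consumed by TCP acks. Sizes are taken from the message;
# the one-ack-per-packet default and the helper names are assumptions.

def ack_share(data_bytes=1500, ack_bytes=60, segments_per_ack=1):
    """Ack bytes sent in the reverse direction per data byte sent forward."""
    return ack_bytes / (data_bytes * segments_per_ack)


def reverse_headroom_mbps(link_mbps=1000.0, **kwargs):
    """Bandwidth left in the reverse direction once acks are accounted for,
    assuming the forward stream itself runs at full link speed."""
    return link_mbps * (1.0 - ack_share(**kwargs))


if __name__ == "__main__":
    print("ack share, one ack per packet:  ", ack_share())
    print("ack share, one ack per 2 packets:", ack_share(segments_per_ack=2))
    print("reverse headroom (Mb/s):        ", reverse_headroom_mbps())
```

Under these assumptions the acks cost about 4% of the reverse direction (about 2% if the receiver acks every second packet), which is consistent with hoping for roughly 900 Mb/s in each direction.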

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-30 Thread Bruce Allen
Hi Stephen, Thanks for your helpful reply and especially for the literature pointers. Indeed, we are not asking to see 1000 Mb/s. We'd be happy to see 900 Mb/s. Netperf is transmitting a large buffer in MTU-sized packets (min 1500 bytes). Since the acks are only about 60 bytes in size, they s

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-30 Thread Bruce Allen
Hi David, Thanks for your note. (The performance of a full duplex stream should be close to 1Gb/s in both directions.) This is not a reasonable expectation. ACKs take up space on the link in the opposite direction of the transfer. So the link usage in the opposite direction of the transfer

Re: e1000 full-duplex TCP performance well below wire speed

2008-01-30 Thread Bruce Allen
Hi Andi, Thanks for the reply. You forgot to specify what user programs you used to get to the benchmark results. e.g. if the user space does not use large enough reads/writes then performance will be not optimal. We used netperf (as stated in the first paragraph of the original post). Tell

e1000 full-duplex TCP performance well below wire speed

2008-01-30 Thread Bruce Allen
ther plots): We're happy to do additional testing, if that would help, and very grateful for any advice! Bruce Allen Carsten Aulbert Henning Fehrmann

Re: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen

2007-11-06 Thread Bruce Allen
All these are caused by smartd. Updating should fix the problem. Okay, but there is no newer smartd than what I'm using. (5.37) Bruce? Original thread can be read from... http://thread.gmane.org/gmane.linux.kernel/588972 The fixes were added in smartmontools CVS, but there hasn't been a r

Re: ECC and DMA to/from disk controllers

2007-09-12 Thread Bruce Allen
Alan, Robert, Dick, Thank you all for the informed and helpful response! Alan, I'll pass your comments on to Peter Kelemen. Not sure if he follows LKML. I think he'll be interested in your characterization of the error types. I'll point him to the thread. (I think Peter and his collaborat

ECC and DMA to/from disk controllers

2007-09-10 Thread Bruce Allen
Dear LKML, Apologies in advance for potential mis-use of LKML, but I don't know where else to ask. An ongoing study on datasets of several Petabytes has shown that there can be 'silent data corruption' at rates much larger than one might naively expect from the expected error rates in RAID

Re: SMART problems in 2.6.22

2007-07-16 Thread Bruce Allen
Tejun: thanks for pointing out this patch. Kai, Klaus: thanks for testing the patch! Petr: thanks for fixing the SMART 2.6.22 problems! Jeff: two users (Kai, Klaus) both saw the SMART STATUS problem disappear when they tested this libata patch. I hope you stick it into your own source tree.

Re: SMART problems in 2.6.22

2007-07-10 Thread Bruce Allen
On Tue, 10 Jul 2007, Douglas Gilbert wrote: Kai Makisara wrote: I have done some more debugging on this one. An easy way to reproduce the The log shows that the sense data returned by the commands differ: with 2.6.22 the bytes 4f and 2c (tf.lbam and tf.lbah) are not returned. Both of th

Re: SMART problems in 2.6.22

2007-07-09 Thread Bruce Allen
Hi Jeff, It's possible that the recent addition of ACPI support will cause disks to be in different modes than previously expected. ACPI supplies ATA taskfiles to be pushed to the disk, and who knows what's in there... Is there a simple way I can have affected users test this? Is there a k

Re: SMART problems in 2.6.22

2007-07-09 Thread Bruce Allen
Hi David, http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg164863.html This is mine and although it's a 'real' problem, it is something that's easy to hack around by having the suspend script turn on smart after it is resumed. (Of course I can't use resume until a skge wol bug is

Re: SMART problems in 2.6.22

2007-07-09 Thread Bruce Allen
On Sun, 8 Jul 2007, Jeff Garzik wrote: Jeff, thanks for the quick feedback. On the base point, libata has never enabled SMART on its own. That's always up to the BIOS, etc. OK, clear. It's possible that the recent addition of ACPI support will cause disks to be in different modes than prev

Re: SMART problems in 2.6.22

2007-07-08 Thread Bruce Allen
grade Bruce On Sun, 8 Jul 2007, Bruce Allen wrote: Mark, David, Doug, Tejun, Alan, Jeff, LKML, I'm afraid that there may be some problem with SMART + libata in the 2.6.22 kernel. An hour ago I discovered that I missed a month of correspondence (some LKML, some private) about this problem

SMART problems in 2.6.22

2007-07-08 Thread Bruce Allen
Mark, David, Doug, Tejun, Alan, Jeff, LKML, I'm afraid that there may be some problem with SMART + libata in the 2.6.22 kernel. An hour ago I discovered that I missed a month of correspondence (some LKML, some private) about this problem which Alan, Tejun, Jeff, Mark and others copied to me -