What I think is happening is this: the initial transfer rate you are seeing is the burst rate; averaged over a long run, the sustained transfer rate emerges. Like George said, you should use a proven tool to measure your bandwidth. We use netperf, a free tool from HP.
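A quick way to get a trustworthy number with netperf (hedged sketch — `remotehost` is a placeholder for one of your nodes, and the exact output layout varies by netperf version):

```shell
# On the receiving node, start the netperf daemon:
netserver

# On the sending node, run a 30-second TCP bulk-transfer test against it;
# by default the throughput column is reported in 10^6 bits/s:
netperf -H remotehost -t TCP_STREAM -l 30
```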
That said, Ethernet is not a good candidate for HPC (one reason people don't use it in backplanes, despite its low cost). Do the math yourself: there is a 54-byte overhead (14 B Ethernet + 20 B IP + 20 B TCP) on every packet sent over a socket. That is why protocols like uDAPL over InfiniBand are gaining in popularity.

Durga

On 10/23/06, Jayanta Roy <j...@ncra.tifr.res.in> wrote:
Hi,

I have tried lamboot with a host file in which odd and even nodes talk among themselves over eth0 and talk across the groups over eth1. My transfer runs at 230 MB/s at the start, but after a few transfers the rate falls to ~130 MB/s, and over a long run it finally settles at ~54 MB/s. Why does the network slow down like this over time?

Regards,
Jayanta

On Mon, 23 Oct 2006, Durga Choudhury wrote:

> Did you try channel bonding? If your OS is Linux, there are plenty of
> "howtos" on the internet that will tell you how to do it.
>
> However, your CPU might be the bottleneck in this case. How much CPU
> horsepower is available at 140 MB/s?
>
> If the CPU *is* the bottleneck, changing your network driver (e.g. from
> interrupt-based to poll-based packet transfer) might help. If you are
> unfamiliar with writing network drivers for your OS, this may not be a
> trivial task, though.
>
> Oh, and as I pointed out last time, if all of the above seem OK, try
> putting your second link to a separate PC and see if you can get twice the
> throughput. If so, then the ECMP implementation of your IP stack is what is
> causing the problem. This is the hardest one to fix. You could rewrite a few
> routines in the IPv4 processing and recompile the kernel, if you are
> familiar with kernel building and your OS is Linux.
>
>
> On 10/23/06, Jayanta Roy <j...@ncra.tifr.res.in> wrote:
>>
>> Hi,
>>
>> Some time ago I posted doubts about making full use of dual-gigabit
>> support. I get a ~140 MB/s full-duplex transfer rate in each of the
>> following runs:
>>
>> mpirun --mca btl_tcp_if_include eth0 -n 4 -bynode -hostfile host a.out
>>
>> mpirun --mca btl_tcp_if_include eth1 -n 4 -bynode -hostfile host a.out
>>
>> How can I combine these two ports, or use a proper routing table in
>> place of the host file? I am using the openmpi-1.1 version.
>>
>> -Jayanta
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> --
> Devil wanted omnipresence;
> He therefore created communists.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Jayanta Roy
National Centre for Radio Astrophysics | Phone : +91-20-25697107
Tata Institute of Fundamental Research | Fax : +91-20-25692149
Pune University Campus, Pune 411 007 | e-mail : j...@ncra.tifr.res.in
India
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
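Putting a number on the 54-byte overhead mentioned above: with the standard 1500-byte MTU, a minimal back-of-the-envelope sketch of the per-frame payload efficiency (this ignores the preamble, FCS, and inter-frame gap, so the true figure is slightly lower):

```shell
#!/bin/sh
# Useful payload per full-sized TCP/Ethernet frame, using the header
# sizes cited in this thread (14 B Ethernet + 20 B IP + 20 B TCP).
MTU=1500                      # standard Ethernet MTU (size of the IP packet)
PAYLOAD=$((MTU - 20 - 20))    # TCP payload: MTU minus IP and TCP headers
FRAME=$((MTU + 14))           # bytes on the wire incl. Ethernet header
echo "payload per frame: $PAYLOAD bytes"
awk -v p="$PAYLOAD" -v f="$FRAME" \
    'BEGIN { printf "efficiency: %.1f%% of raw link rate\n", 100 * p / f }'
```

So even before any burst-vs-sustained effects, roughly 3-4% of the raw link rate is lost to headers on every full-sized frame, and small messages pay proportionally much more.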
--
Devil wanted omnipresence;
He therefore created communists.
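A hedged note on the question that started this thread: Open MPI's TCP BTL can be pointed at several interfaces at once, in which case it opens sockets over each and stripes large messages across them. Whether the comma-separated form below behaves this way in openmpi-1.1 specifically is an assumption to verify against that release's documentation:

```shell
# Sketch: let the TCP BTL use both GigE ports in a single run
# (same hostfile and eth0/eth1 names as in the original post).
mpirun --mca btl_tcp_if_include eth0,eth1 -n 4 -bynode -hostfile host a.out
```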