What I think is happening is this:

The initial transfer rate you are seeing is the burst rate; the sustained
rate only emerges after averaging over a long run. As George said, you
should use a proven tool to measure your bandwidth. We use netperf, a free
tool from HP.
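For example, a point-to-point TCP throughput test with netperf looks roughly
like this (the hostname `node2` is a placeholder; the receiving node must be
running the `netserver` daemon first):

```shell
# On the receiving node (e.g. node2), start the listener:
#   netserver
# On the sending node, run a 60-second TCP bulk-transfer test toward it:
netperf -H node2 -l 60 -t TCP_STREAM
# A longer -l duration averages out the initial burst and reports the
# sustained rate rather than the startup transient.
```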

That said, Ethernet is not a good candidate for HPC (one reason people
don't use it in backplanes, despite its low cost). Do the math yourself:
for socket communication there is a 54-byte overhead (14 B Ethernet + 20 B
IP + 20 B TCP) on every packet sent. That is why protocols like uDAPL over
InfiniBand are gaining in popularity.
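That arithmetic is easy to check. A minimal shell calculation follows; the
1500-byte MTU and the absence of TCP/IP options are my assumptions, not
from the original mail:

```shell
# Back-of-the-envelope check of the per-packet header overhead quoted above,
# assuming a standard 1500-byte Ethernet MTU and no TCP or IP options.
mtu=1500
eth=14; ip=20; tcp=20                 # header bytes: Ethernet + IP + TCP
overhead=$((eth + ip + tcp))          # 54 bytes of headers per packet
payload=$((mtu - ip - tcp))           # 1460 bytes of user data per frame
# Wire efficiency in tenths of a percent, using integer math:
eff=$(( payload * 1000 / (payload + overhead) ))
echo "overhead=${overhead}B payload=${payload}B efficiency=${eff}/1000"
```

Even in this best case (full-size frames) efficiency is only about 96%, and
on the wire the 4-byte FCS, 8-byte preamble, and 12-byte inter-frame gap
cost more still; small messages fare far worse.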

Durga


On 10/23/06, Jayanta Roy <j...@ncra.tifr.res.in> wrote:

Hi,

I have tried lamboot with a host file where odd-even node pairs talk
among themselves using eth0 and across pairs using eth1. The transfer
starts at ~230MB/s, but after a few transfers the rate falls to ~130MB/s,
and over a long run it finally settles at ~54MB/s. Why does the network
slow down like this over time?

Regards,
Jayanta

On Mon, 23 Oct 2006, Durga Choudhury wrote:

> Did you try channel bonding? If your OS is Linux, there are plenty of
> "howto" guides on the internet that will tell you how to do it.
>
> However, your CPU might be the bottleneck in this case. How much CPU
> horsepower is still available at 140MB/s?
>
> If the CPU *is* the bottleneck, changing your network driver (e.g. from
> interrupt-based to poll-based packet transfer) might help. If you are
> unfamiliar with writing network drivers for your OS, this may not be a
> trivial task, though.
>
> Oh, and as I pointed out last time, if all of the above seem OK, try
> putting your second link on a separate PC and see if you can get twice
> the throughput. If so, then the ECMP implementation of your IP stack is
> what is causing the problem. This is the hardest one to fix: you could
> rewrite a few routines in the IPv4 processing code and recompile the
> kernel, if you are familiar with kernel building and your OS is Linux.
>
>
> On 10/23/06, Jayanta Roy <j...@ncra.tifr.res.in> wrote:
>>
>> Hi,
>>
>> Sometime ago I posted doubts about fully using the dual gigabit
>> support. I get a ~140MB/s full-duplex transfer rate in each of the
>> following runs:
>>
>> mpirun --mca btl_tcp_if_include eth0 -n 4 -bynode -hostfile host a.out
>>
>> mpirun --mca btl_tcp_if_include eth1 -n 4 -bynode -hostfile host a.out
>>
>> How do I combine these two ports, or use a proper routing table in
>> place of the host file? I am using openmpi-1.1.
>>
>> -Jayanta
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>
>
> --
> Devil wanted omnipresence;
> He therefore created communists.
>

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Jayanta Roy
National Centre for Radio Astrophysics  |  Phone  : +91-20-25697107
Tata Institute of Fundamental Research  |  Fax    : +91-20-25692149
Pune University Campus, Pune 411 007    |  e-mail : j...@ncra.tifr.res.in
India
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~



