Re: [OMPI users] very bad parallel scaling of vasp using openmpi

Craig Plaisance Tue, 18 Aug 2009 13:34:14 -0400

I ran a test of tcp using NetPIPE and got throughput of 850 Mb/s at message sizes of 128 Kb. The latency was 50 us. At message sizes above 1000 Kb, the throughput oscillated wildly between 850 Mb/s and values as low as 200 Mb/s. This test was done with no other network traffic.

I then ran four tests simultaneously between different pairs of compute nodes and saw a drastic decrease in performance. The highest stable (non-oscillating) throughput was about 500 Mb/s at a message size of 16 Kb. The throughput then oscillated wildly, with the maximum value climbing to 850 Mb/s at a message size greater than 128 Kb and dropping to values as low as 100 Mb/s.

The code I am using (VASP) has 100 to 1000 double complex (16 byte) arrays containing 100,000 to 1,000,000 elements each. Typically, the arrays are distributed among the nodes. The most communication intensive part involves executing an MPI_alltoall to redistribute the arrays so that node i contains the ith block of each array. The default message size is 1000 elements (128 Kb), so according to the NetPIPE test, I should be getting very good throughput when there is no other network traffic.

I will run a NetPIPE test with openmpi and mpich2 now and post the results. So, does anyone know what causes the wild oscillations in the throughput at larger message sizes and higher network traffic? Thanks!
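
For reference, the message-size arithmetic above can be checked with a short sketch. It assumes decimal prefixes (1 Kb = 1000 bits, 1 Mb/s = 1 bit per microsecond) and a crude cost model of latency plus size divided by throughput; that model is an illustration only, not how NetPIPE or VASP account for time, and the function names are made up for this example.

```python
# Rough check of the message-size figures quoted in the post.
# Assumptions (not from the post): decimal prefixes, and a simple
# "latency + size / throughput" cost model for a single message.

BYTES_PER_ELEMENT = 16  # double complex, as stated in the post


def message_bits(elements, bytes_per_element=BYTES_PER_ELEMENT):
    """Size of one message in bits."""
    return elements * bytes_per_element * 8


def transfer_time_us(bits, throughput_mbps, latency_us=50.0):
    """Estimated time for one message in microseconds.

    1 Mb/s equals 1 bit per microsecond, so bits / Mb/s gives microseconds.
    """
    return latency_us + bits / throughput_mbps


if __name__ == "__main__":
    bits = message_bits(1000)  # default block of 1000 elements
    print(f"message size: {bits} bits ({bits / 1000:.0f} Kb)")
    # Throughput values reported in the NetPIPE runs above.
    for rate in (850, 500, 100):
        t = transfer_time_us(bits, rate)
        print(f"{rate} Mb/s -> about {t:.0f} us per message")
```

Under these assumptions, 1000 elements of 16 bytes come to 128,000 bits, which matches the 128 Kb figure, and the per-message time grows several-fold between the best and worst throughputs observed.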
