Here is an example of my data, measured in seconds: communication overhead = commuT + migraT + printT, compuT is the computational cost, totalT = compuT + communication overhead, and overhead% denotes the percentage of communication overhead.
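For context, below is a minimal sketch of how such per-iteration timings could be gathered with MPI_Wtime; the routines exchange_data(), migrate_particles(), print_results(), and compute() are hypothetical placeholders for the real work in each loop iteration, not the actual code.

/* Minimal sketch (hypothetical, not the actual code): per-iteration timing
 * with MPI_Wtime. exchange_data(), migrate_particles(), print_results() and
 * compute() stand in for the real communication, migration, output and
 * computation steps. */
#include <mpi.h>
#include <stdio.h>

static void exchange_data(void)     { /* communication step (placeholder) */ }
static void migrate_particles(void) { /* migration step (placeholder) */ }
static void print_results(void)     { /* output step (placeholder) */ }
static void compute(void)           { /* computation step (placeholder) */ }

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    for (int iter = 0; iter < 10000; ++iter) {
        double t0 = MPI_Wtime();
        exchange_data();
        double commuT = MPI_Wtime() - t0;

        t0 = MPI_Wtime();
        migrate_particles();
        double migraT = MPI_Wtime() - t0;

        t0 = MPI_Wtime();
        print_results();
        double printT = MPI_Wtime() - t0;

        t0 = MPI_Wtime();
        compute();
        double compuT = MPI_Wtime() - t0;

        double overhead = commuT + migraT + printT;  /* communication overhead */
        double totalT   = compuT + overhead;
        if ((iter + 1) % 2000 == 0)                  /* report every 2000 iterations */
            printf("%4d  %e  %e  %e  %e  %e  %e\n",
                   iter, commuT, migraT, printT, compuT, totalT,
                   100.0 * overhead / totalT);
    }

    MPI_Finalize();
    return 0;
}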
intelmpi (walltime=00:03:51)
iter  commuT        migraT        printT        compuT        totalT        overhead%
3999  4.945993e-03  2.689362e-04  1.440048e-04  1.689100e-02  2.224994e-02  2.343795e+01
5999  4.938126e-03  1.451969e-04  2.689362e-04  1.663089e-02  2.198315e-02  2.312373e+01
7999  4.904985e-03  1.490116e-04  1.451969e-04  1.678491e-02  2.198410e-02  2.298933e+01
9999  4.915953e-03  1.380444e-04  1.490116e-04  1.687193e-02  2.207494e-02  2.289473e+01

openmpi (walltime=00:04:32)
iter  commuT        migraT        printT        compuT        totalT        overhead%
3999  3.574133e-03  1.139641e-04  1.089573e-04  1.598001e-02  1.977706e-02  1.864836e+01
5999  3.574848e-03  1.189709e-04  1.139641e-04  1.599526e-02  1.980305e-02  1.865278e+01
7999  3.571033e-03  1.168251e-04  1.189709e-04  1.601100e-02  1.981783e-02  1.860879e+01
9999  3.587008e-03  1.258850e-04  1.168251e-04  1.596618e-02  1.979589e-02  1.875587e+01

It can be seen that Open MPI is faster than Intel MPI in both communication and computation as measured by the MPI_Wtime calls, yet the wall time reported by PBS Pro is larger.

Beichuan

-----Original Message-----
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Beichuan Yan
Sent: Thursday, March 20, 2014 15:15
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI job initializing problem

As for the performance, my 4-node (64-process) 3-hour job indicates that Intel MPI and Open MPI perform similarly: Intel MPI takes 2:53 while Open MPI takes 3:10. Interestingly, all my MPI_Wtime calls show Open MPI is faster in communication (up to twice as fast or more) than Intel MPI for a single loop, yet over roughly 500K loops Open MPI is about 10% slower in overall wall time. The computing times are nearly the same. This is a little confusing. I may set up and run a new test.

Beichuan

-----Original Message-----
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Jeff Squyres (jsquyres)
Sent: Thursday, March 20, 2014 11:15
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI job initializing problem

On Mar 20, 2014, at 12:48 PM, Beichuan Yan <beichuan....@colorado.edu> wrote:

> 2. http://www.open-mpi.org/community/lists/users/2011/11/17684.php
>    In the upcoming OMPI v1.7, we revamped the shared memory setup code such that it'll actually use /dev/shm properly, or use some other mechanism other than a mmap file backed in a real filesystem. So the issue goes away. Woo hoo!
>    my comment: up to OMPI v1.7.4, this shmem issue is still there. However, it is resolved in OMPI v1.7.5rc5. This is surprising.
>
> Anyway, OMPI v1.7.5rc5 works well for multi-processes-on-one-node (shmem) mode on Spirit. There is no need to tune TCP or IB parameters to use it. My code just runs well:

Great!

> My test data takes 20 minutes to run with OMPI v1.7.4, but needs less than 1 minute with OMPI v1.7.5rc5. I don't know what the magic is. I am wondering when OMPI v1.7.5 final will be released.

Wow -- that sounds like a fundamental difference there. Could be something to do with the NFS tmp directory...? I could see how that could cause oodles of unnecessary network traffic.

1.7.5 should be released ...imminently...

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/