Re: [OMPI users] [Open MPI Announce] Open MPI v1.3.3 released
Does use of 1.3.3 require recompilation of applications that were compiled using 1.3.2?

Jim

-----Original Message-----
From: announce-boun...@open-mpi.org [mailto:announce-boun...@open-mpi.org] On Behalf Of Ralph Castain
Sent: Tuesday, July 14, 2009 2:11 PM
To: OpenMPI Announce
Subject: [Open MPI Announce] Open MPI v1.3.3 released

The Open MPI Team, representing a consortium of research, academic, and industry partners, is pleased to announce the release of Open MPI version 1.3.3. This release is mainly a bug fix release over the v1.3.2 release, but there are a few new features, including support for Microsoft Windows. We strongly recommend that all users upgrade to version 1.3.3 if possible.

Version 1.3.3 can be downloaded from the main Open MPI web site or any of its mirrors (mirrors will be updating shortly).

Here is a list of changes in v1.3.3 as compared to v1.3.2:

- Fix a number of issues with the openib BTL (OpenFabrics) RDMA CM, including a memory corruption bug, a shutdown deadlock, and a route timeout. Thanks to David McMillen and Hal Rosenstock for help in tracking down the issues.
- Change the behavior of the EXTRA_STATE parameter that is passed to Fortran attribute callback functions: this value is now stored internally in MPI -- it no longer references the original value passed by MPI_*_CREATE_KEYVAL.
- Allow overriding of RFC1918 and RFC3330 for the specification of "private" networks, thereby influencing Open MPI's TCP "reachability" computations.
- Improve flow control in the sm BTL, both by tweaking the shared memory progression rules and by enabling the "sync" collective to barrier every 1,000th collective.
- Various fixes for the IBM XL C/C++ v10.1 compiler.
- Allow explicit disabling of ptmalloc2 hooks at runtime (e.g., enable support for Debian's builtroot system). Thanks to Manuel Prinz and the rest of the Debian crew for helping identify and fix this issue.
- Various minor fixes for the I/O forwarding subsystem.
- Big-endian iWARP fixes in the OpenFabrics RDMA CM support.
- Update support for various OpenFabrics devices in the openib BTL's .ini file.
- Fixed an undefined symbol issue with Open MPI's parallel debugger message queue support so it can be compiled by Sun Studio compilers.
- Update MPI_SUBVERSION to 1 in the Fortran bindings.
- Fix the MPI_GRAPH_CREATE Fortran 90 binding.
- Fix MPI_GROUP_COMPARE behavior with regards to MPI_IDENT. Thanks to Geoffrey Irving for identifying the problem and supplying the fix.
- Silence gcc 4.1 compiler warnings about type punning. Thanks to Number Cruncher for the fix.
- Added more Valgrind and other memory-cleanup fixes. Thanks to various Open MPI users for help with these issues.
- Miscellaneous VampirTrace fixes.
- More fixes for openib credits in heavy-congestion scenarios.
- Slightly decrease the latency in the openib BTL in some conditions (add "send immediate" support to the openib BTL).
- Allow MPI_REQUEST_GET_STATUS to accept an MPI_STATUS_IGNORE parameter. Thanks to Shaun Jackman for the bug report.
- Added Microsoft Windows support. See the README.WINDOWS file for details.
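One rough, hedged way to approach Jim's "do I need to recompile?" question for a particular binary is to check which Open MPI installation it actually resolves at run time; in the sketch below, ./my_mpi_app is a placeholder for your own executable and the grep patterns are just examples:

    # Show which Open MPI installation is first on the PATH and its version
    which mpirun
    ompi_info | grep "Open MPI:"

    # Show which libmpi an already-built application picks up at run time
    # (./my_mpi_app is a placeholder for your own executable)
    ldd ./my_mpi_app | grep -i libmpi

If the reported library comes from the new 1.3.3 prefix and the application starts and runs cleanly, a rebuild may not be necessary; when in doubt, recompiling against the new install is the safe route.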
Re: [OMPI users] Open MPI: Problem with 64-bit openMPI and intel compiler
You can avoid the "library confusion problem" by building 64-bit and 32-bit versions of Open MPI in two different directories and then using mpi-selector (on your head and compute nodes) to switch between the two.

Just my $0.02

Jim

-----Original Message-----
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Jeff Squyres
Sent: Friday, July 24, 2009 7:22 AM
To: Open MPI Users
Subject: Re: [OMPI users] Open MPI: Problem with 64-bit openMPI and intel compiler

On Jul 23, 2009, at 11:14 PM, Ralph Castain wrote:

> 3. get a multi-node allocation and run "pbsdsh echo $LD_LIBRARY_PATH"
> and see what libs you are defaulting to on the other nodes.

Be careful with this one; you want to ensure that your local shell doesn't expand $LD_LIBRARY_PATH and simply display the same value on all nodes. It might be easiest to write a 2-line script and run that:

$ cat myscript
#!/bin/sh
echo LD_LIB_PATH on `hostname` is: $LD_LIBRARY_PATH
$ chmod +x myscript
$ pdsh myscript

--
Jeff Squyres
jsquy...@cisco.com
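For anyone wanting to try the two-install approach Jim describes, a rough sketch follows. The install prefixes, the -m32/-m64 flags, and the names shown to mpi-selector are all assumptions that will differ per site; mpi-selector (shipped with OFED) must already be installed, with each build registered to it:

    # Build and install 64-bit and 32-bit Open MPI into separate prefixes
    # (prefixes and compiler flags are examples; adjust for your compilers)
    cd openmpi-1.3.3
    ./configure --prefix=/opt/openmpi-1.3.3-64 CFLAGS=-m64 CXXFLAGS=-m64 FFLAGS=-m64 FCFLAGS=-m64
    make -j4 all install
    make distclean
    ./configure --prefix=/opt/openmpi-1.3.3-32 CFLAGS=-m32 CXXFLAGS=-m32 FFLAGS=-m32 FCFLAGS=-m32
    make -j4 all install

    # On the head node and on every compute node, switch between the
    # registered builds (names are whatever was registered on your system)
    mpi-selector --list
    mpi-selector --set openmpi-1.3.3-64
    mpi-selector --query

The switch typically takes effect in a new login shell, since mpi-selector works through the shell startup environment.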
Re: [OMPI users] very bad parallel scaling of vasp using openmpi
Gbit Ethernet is well known to perform poorly for fine-grained code like VASP. The latencies for Gbit Ethernet are much too high. If you want good scaling in a cluster for VASP, you'll need to run InfiniBand or some other high-speed/low-latency network.

Jim

-----Original Message-----
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Jeff Squyres
Sent: Monday, August 17, 2009 9:24 PM
To: Open MPI Users
Cc: David Hibbitts
Subject: Re: [OMPI users] very bad parallel scaling of vasp using openmpi

You might want to run some performance testing of your TCP stacks and the switch -- use a non-MPI application such as NetPIPE (or others -- google around) and see what kind of throughput you get. Try it between individual server peers and then try running it simultaneously between a bunch of peers and see if the results are different, etc.

On Aug 17, 2009, at 5:51 PM, Craig Plaisance wrote:

> Hi - I have compiled vasp 4.6.34 using the Intel fortran compiler 11.1 with openmpi 1.3.3 on a cluster of 104 nodes running Rocks 5.2, with two quad core opterons per node connected by Gbit ethernet. Running in parallel on one node (8 cores) runs very well, faster than any other cluster I have run it on. However, running on 2 nodes in parallel only improves the performance by 10% over the one node case, while running on 4 and 8 nodes yields no improvement over the two node case. Furthermore, when running multiple (3-4) jobs simultaneously, the performance decreases by around 50% compared to running only a single job on the entire cluster. The nodes are connected by a Dell Powerconnect 6248 managed switch. I get the same performance with mpich2, so I don't think it is a problem specific to openmpi. Other vasp users have reported very good scaling up to 4 nodes on a similar cluster, so I don't think the problem is vasp either. Could something be wrong with the way mpi is configured to work with the switch? Or is the operating system not configured to work with the switch properly? Or does the switch itself need to be configured? Thanks!

--
Jeff Squyres
jsquy...@cisco.com
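To reproduce the kind of point-to-point test Jeff suggests, a minimal NetPIPE TCP run looks roughly like the sketch below; the node names are placeholders, and NPtcp is the TCP benchmark binary that a NetPIPE build produces:

    # On the receiving node (e.g. node01), start the NetPIPE TCP receiver
    ./NPtcp

    # On the sending node (e.g. node02), point the transmitter at the receiver;
    # NetPIPE then sweeps message sizes and reports latency and throughput
    ./NPtcp -h node01

    # To approximate a loaded switch, start several such pairs between
    # different node pairs at the same time and compare the curves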
Re: [OMPI users] very bad parallel scaling of vasp using openmpi
Gus,

You hit the nail on the head. CPMD and VASP are both fine-grained parallel quantum mechanics molecular dynamics codes. I believe CPMD has implemented the domain decomposition methodology found in gromacs (a classical fine-grained molecular dynamics code), which significantly diminishes the scaling problem. I do not believe VASP has done the same.

Jim

-----Original Message-----
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Gus Correa
Sent: Tuesday, August 18, 2009 6:43 PM
To: Open MPI Users
Subject: Re: [OMPI users] very bad parallel scaling of vasp using openmpi

Hi Craig, list

Independent of any issues with your GigE switch, which you may need to address, you may want to take a look at the performance of the default Open MPI MPI_Alltoall algorithm, which you say is a cornerstone of VASP. You can perhaps try alternative algorithms for different message sizes, using Open MPI tuned collectives.

Please see this long thread from last May, where it was reported that the CPMD code (which seems to be another molecular dynamics code, like VASP, right?), which also uses MPI_Alltoall, didn't perform well for not-so-large messages, and the scaling was poor. I suppose your messages also get smaller when you increase the number of processors, assuming the problem size is kept constant, right? The thread suggests diagnostics and solutions, and I found it quite helpful:

http://www.open-mpi.org/community/lists/users/2009/05/9355.php

Sorry, we're not computational chemists here, but our programs also use MPI collectives.

Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------

Craig Plaisance wrote:
> I ran a test of tcp using NetPIPE and got throughput of 850 Mb/s at message sizes of 128 Kb. The latency was 50 us. At message sizes above 1000 Kb, the throughput oscillated wildly between 850 Mb/s and values as low as 200 Mb/s. This test was done with no other network traffic. I then ran four tests simultaneously between different pairs of compute nodes and saw a drastic decrease in performance. The highest stable (non-oscillating) throughput was about 500 Mb/s at a message size of 16 Kb. The throughput then oscillated wildly, with the maximum value climbing to 850 Mb/s at a message size greater than 128 Kb and dropping to values as low as 100 Mb/s. The code I am using (VASP) has 100 to 1000 double complex (16 byte) arrays containing 100,000 to 1,000,000 elements each. Typically, the arrays are distributed among the nodes. The most communication intensive part involves executing an MPI_alltoall to redistribute the arrays so that node i contains the ith block of each array. The default message size is 1000 elements (128 Kb), so according to the NetPIPE test, I should be getting very good throughput when there is no other network traffic. I will run a NetPIPE test with openmpi and mpich2 now and post the results. So, does anyone know what causes the wild oscillations in the throughput at larger message sizes and higher network traffic? Thanks!
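To experiment with alternative MPI_Alltoall algorithms as Gus suggests, the "tuned" collective component exposes MCA parameters that can be set on the mpirun command line. The sketch below is only an illustration: the process count, hostfile, executable name, and the particular algorithm index are placeholders, and the exact parameter names should be confirmed against your own build with ompi_info first:

    # List the alltoall tunables exposed by the "tuned" collective component
    ompi_info --param coll tuned | grep alltoall

    # Enable dynamic rules and force one specific alltoall algorithm
    # (the index 2 is just an example value to try; see the ompi_info output)
    mpirun -np 64 --hostfile ./hosts \
        --mca coll_tuned_use_dynamic_rules 1 \
        --mca coll_tuned_alltoall_algorithm 2 \
        ./vasp

Comparing wall-clock time for the same VASP job across a few algorithm settings is usually enough to show whether the default alltoall decision is part of the bottleneck.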