Gus Correa wrote:
> Hi Craig, list
>
> I suppose WRF uses MPI collective calls (MPI_Reduce,
> MPI_Bcast, MPI_Alltoall, etc.),
> just like the climate models we run here do.
> A recursive grep on the source code will tell.

I will check this out. I am not the WRF expert, but I was under the
impression that most weather models rely on nearest-neighbor
communication, not collectives.
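Presumably a check along these lines is what you mean (a sketch; I am
guessing "WRFV3" as the name of the top-level source directory):

  # recursively search the Fortran sources for common MPI collectives
  grep -rniE 'call +mpi_(allreduce|reduce|bcast|alltoall|allgather|gather|scatter)' WRFV3/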
> If that is the case, you may need to tune the collectives dynamically.
> We are experimenting with tuned collectives here also.
>
> Specifically, we had a scaling problem with the MITgcm
> (also running on an IB cluster)
> that is probably due to collectives.
> Similar problems were reported on this list before,
> with computational chemistry software.
> See these threads:
> http://www.open-mpi.org/community/lists/users/2009/07/10045.php
> http://www.open-mpi.org/community/lists/users/2009/05/9419.php
>
> If WRF outputs timing information, particularly the time spent on MPI
> routines, you may also want to compare how the OpenMPI and
> MVAPICH versions fare w.r.t. MPI collectives.
>
> I hope this helps.

I will look into this. Thanks for the ideas.
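For the archives, my understanding is that Open MPI's "tuned"
collective component can be steered through MCA parameters, something
along these lines (untested by me here; the rules-file path below is
made up):

  # list the tuning knobs the tuned coll component exposes
  ompi_info --param coll tuned

  # e.g. enable dynamic rules and force a particular allreduce algorithm
  mpirun --mca coll_tuned_use_dynamic_rules 1 \
         --mca coll_tuned_allreduce_algorithm 2 ... ./wrf.exe

  # or select algorithms per message size from a rules file
  mpirun --mca coll_tuned_use_dynamic_rules 1 \
         --mca coll_tuned_dynamic_rules_filename /path/to/coll_rules.conf ... ./wrf.exe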
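PS: re-reading my original post below, I suspect the "no available btl
components" error was self-inflicted. If I understand the MCA selection
syntax correctly, a leading "^" means "everything except these", so
"^tcp,openib,sm,self" excludes every BTL, including self. The two forms
would be:

  # exclusive form: use every BTL except tcp
  mpirun --mca btl ^tcp ...

  # inclusive form: use exactly these (what my timed runs used)
  mpirun --mca btl openib,sm,self ...

Since the inclusive form leaves no tcp fallback, a multi-node run that
completes with "--mca btl openib,sm,self" should itself be evidence
that the traffic is going over IB.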
Craig

> Gus Correa
> ---------------------------------------------------------------------
> Gustavo Correa
> Lamont-Doherty Earth Observatory - Columbia University
> Palisades, NY, 10964-8000 - USA
> ---------------------------------------------------------------------

> Craig Tierney wrote:
>> I am running openmpi-1.3.3 on my cluster which is using
>> OFED-1.4.1 for Infiniband support. I am comparing performance
>> between this version of OpenMPI and Mvapich2, and seeing a
>> very large difference in performance.
>>
>> The code I am testing is WRF v3.0.1. I am running the
>> 12km benchmark.
>>
>> The two builds use the exact same code and configuration
>> files. All I did differently was use modules to switch versions
>> of MPI, and recompile the code.
>>
>> Performance:
>>
>> Cores   Mvapich2   Openmpi
>> ---------------------------
>>     8       17.3      13.9
>>    16       31.7      25.9
>>    32       62.9      51.6
>>    64      110.8      92.8
>>   128      219.2     189.4
>>   256      384.5     317.8
>>   512      687.2     516.7
>>
>> The performance numbers are GFlops (so larger is better).
>>
>> I am calling openmpi as:
>>
>> /opt/openmpi/1.3.3-intel/bin/mpirun --mca plm_rsh_disable_qrsh 1 \
>>     --mca btl openib,sm,self \
>>     -machinefile /tmp/6026489.1.qntest.q/machines -x LD_LIBRARY_PATH \
>>     -np $NSLOTS /home/ctierney/bin/noaa_affinity ./wrf.exe
>>
>> So,
>>
>> Is this expected? Are there common-sense optimizations to use?
>> Is there a way to verify that I am really using the IB? When
>> I try:
>>
>> -mca btl ^tcp,openib,sm,self
>>
>> I get the errors:
>> --------------------------------------------------------------------------
>> No available btl components were found!
>>
>> This means that there are no components of this type installed on your
>> system or all the components reported that they could not be used.
>>
>> This is a fatal error; your MPI process is likely to abort. Check the
>> output of the "ompi_info" command and ensure that components of this
>> type are available on your system. You may also wish to check the
>> value of the "component_path" MCA parameter and ensure that it has at
>> least one directory that contains valid MCA components.
>> --------------------------------------------------------------------------
>>
>> But ompi_info is telling me that I have openib support:
>>
>> MCA btl: openib (MCA v2.0, API v2.0, Component v1.3.3)
>>
>> Note, I did rebuild OFED and put it in a different directory
>> and did not rebuild OpenMPI. However, since ompi_info isn't
>> complaining and the libraries are available, I am thinking that
>> it isn't a problem. I could be wrong.
>>
>> Thanks,
>> Craig

> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

--
Craig Tierney (craig.tier...@noaa.gov)