Re: [OMPI users] [Open MPI Announce] Open MPI v1.3.3 released
> Does use of 1.3.3 require recompilation of applications that > were compiled using 1.3.2? > -Original Message- > From: users-boun...@open-mpi.org > [mailto:users-boun...@open-mpi.org] On Behalf Of jimkress_58 > Sent: Tuesday, July 14, 2009 3:05 PM > To: us...@open-mpi.org > Subject: Re: [OMPI users] [Open MPI Announce] Open MPI v1.3.3 released > > Does use of 1.3.3 require recompilation of applications that > were compiled using 1.3.2? > > Jim > > -Original Message- > From: announce-boun...@open-mpi.org > [mailto:announce-boun...@open-mpi.org] > On Behalf Of Ralph Castain > Sent: Tuesday, July 14, 2009 2:11 PM > To: OpenMPI Announce > Subject: [Open MPI Announce] Open MPI v1.3.3 released > > The Open MPI Team, representing a consortium of research, > academic, and industry partners, is pleased to announce the > release of Open MPI version 1.3.3. This release is mainly a > bug fix release over the v1.3.3 release, but there are few > new features, including support for Microsoft Windows. We > strongly recommend that all users upgrade to version 1.3.3 if > possible. > > Version 1.3.3 can be downloaded from the main Open MPI web > site or any of its mirrors (mirrors will be updating shortly). > > Here is a list of changes in v1.3.3 as compared to v1.3.2: > > - Fix a number of issues with the openib BTL (OpenFabrics) RDMA CM, >including a memory corruption bug, a shutdown deadlock, and a route >timeout. Thanks to David McMillen and Hal Rosenstock for help in >tracking down the issues. > - Change the behavior of the EXTRA_STATE parameter that is passed to >Fortran attribute callback functions: this value is now stored >internally in MPI -- it no longer references the original value >passed by MPI_*_CREATE_KEYVAL. > - Allow the overriding RFC1918 and RFC3330 for the specification of >"private" networks, thereby influencing Open MPI's TCP >"reachability" computations. > - Improve flow control issues in the sm btl, by both tweaking the >shared memory progression rules and by enabling the "sync" > collective >to barrier every 1,000th collective. > - Various fixes for the IBM XL C/C++ v10.1 compiler. > - Allow explicit disabling of ptmalloc2 hooks at runtime (e.g., enable >support for Debain's builtroot system). Thanks to Manuel Prinz and >the rest of the Debian crew for helping identify and fix > this issue. > - Various minor fixes for the I/O forwarding subsystem. > - Big endian iWARP fixes in the Open Fabrics RDMA CM support. > - Update support for various OpenFabrics devices in the openib BTL's >.ini file. > - Fixed undefined symbol issue with Open MPI's parallel debugger >message queue support so it can be compiled by Sun Studio > compilers. > - Update MPI_SUBVERSION to 1 in the Fortran bindings. > - Fix MPI_GRAPH_CREATE Fortran 90 binding. > - Fix MPI_GROUP_COMPARE behavior with regards to MPI_IDENT. Thanks to >Geoffrey Irving for identifying the problem and supplying the fix. > - Silence gcc 4.1 compiler warnings about type punning. Thanks to >Number Cruncher for the fix. > - Added more Valgrind and other memory-cleanup fixes. Thanks to >various Open MPI users for help with these issues. > - Miscellaneous VampirTrace fixes. > - More fixes for openib credits in heavy-congestion scenarios. > - Slightly decrease the latency in the openib BTL in some conditions >(add "send immediate" support to the openib BTL). > - Ensure to allow MPI_REQUEST_GET_STATUS to accept an >MPI_STATUS_IGNORE parameter. Thanks to Shaun Jackman for the bug >report. > - Added Microsoft Windows support. 
See README.WINDOWS file for >details. > > ___ > announce mailing list > annou...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/announce > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI users] [Open MPI Announce] Open MPI v1.3.3 released
Did not see any other email on the list wrt this topic. Thanks for your response. Jim > -Original Message- > From: users-boun...@open-mpi.org > [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain > Sent: Wednesday, July 15, 2009 4:26 PM > To: Open MPI Users > Subject: Re: [OMPI users] [Open MPI Announce] Open MPI v1.3.3 released > > I believe that was the intent, per other emails on that subject. > > However, I am not personally aware of people who have tested > it - though they may well exist. > > > > On Wed, Jul 15, 2009 at 2:18 PM, Jim Kress > wrote: > > >> Does use of 1.3.3 require recompilation of applications that > > were compiled using 1.3.2? > > > -Original Message- > > From: users-boun...@open-mpi.org > > [mailto:users-boun...@open-mpi.org] On Behalf Of jimkress_58 > > Sent: Tuesday, July 14, 2009 3:05 PM > > To: us...@open-mpi.org > > Subject: Re: [OMPI users] [Open MPI Announce] Open > MPI v1.3.3 released > > > > Does use of 1.3.3 require recompilation of applications that > > were compiled using 1.3.2? > > > > Jim > > > > > -Original Message- > > From: announce-boun...@open-mpi.org > > [mailto:announce-boun...@open-mpi.org] > > On Behalf Of Ralph Castain > > Sent: Tuesday, July 14, 2009 2:11 PM > > To: OpenMPI Announce > > Subject: [Open MPI Announce] Open MPI v1.3.3 released > > > > The Open MPI Team, representing a consortium of research, > > academic, and industry partners, is pleased to announce the > > release of Open MPI version 1.3.3. This release is mainly a > > bug fix release over the v1.3.3 release, but there are few > > new features, including support for Microsoft Windows. We > > strongly recommend that all users upgrade to version 1.3.3 if > > possible. > > > > Version 1.3.3 can be downloaded from the main Open MPI web > > site or any of its mirrors (mirrors will be updating shortly). > > > > Here is a list of changes in v1.3.3 as compared to v1.3.2: > > > > - Fix a number of issues with the openib BTL > (OpenFabrics) RDMA CM, > >including a memory corruption bug, a shutdown > deadlock, and a route > >timeout. Thanks to David McMillen and Hal > Rosenstock for help in > >tracking down the issues. > > - Change the behavior of the EXTRA_STATE parameter > that is passed to > >Fortran attribute callback functions: this value > is now stored > >internally in MPI -- it no longer references the > original value > >passed by MPI_*_CREATE_KEYVAL. > > - Allow the overriding RFC1918 and RFC3330 for the > specification of > >"private" networks, thereby influencing Open MPI's TCP > >"reachability" computations. > > - Improve flow control issues in the sm btl, by both > tweaking the > >shared memory progression rules and by enabling the "sync" > > collective > >to barrier every 1,000th collective. > > - Various fixes for the IBM XL C/C++ v10.1 compiler. > > - Allow explicit disabling of ptmalloc2 hooks at > runtime (e.g., enable > >support for Debain's builtroot system). Thanks to > Manuel Prinz and > >the rest of the Debian crew for helping identify and fix > > this issue. > > - Various minor fixes for the I/O forwarding subsystem. > > - Big endian iWARP fixes in the Open Fabrics RDMA CM support. > > - Update support for various OpenFabrics devices in > the openib BTL's > >.ini file. > > - Fixed undefined symbol issue with Open MPI's > parallel debugger > >message queue support so it can be compiled by Sun Studio > > compilers. > > - Update MPI_SUBVERSION to 1 in the Fortran bindings. > > - Fix MPI_GRAPH_CREATE Fortran 90 binding. 
> > - Fix MPI_GROUP_COMPARE behavior with regards to > MPI_IDENT. Thanks to > >Geoffrey Irving for identifying the problem and > supplying the fix. > > - Silence gcc 4.1 compiler warnings about type > punning. Thanks to > >Number Cruncher for the fix. > > - Added more Valgrind and other memory-cleanup fixes. > Thanks to > >
Re: [OMPI users] ifort and gfortran module
Why not generate an ifort version with a prefix of _intel and the gfortran version with a prefix of _gcc? That's what I do and then use mpi-selector to switch between versions as required. Jim -Original Message- From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Martin Siegert Sent: Friday, July 17, 2009 3:29 PM To: Open MPI Users Subject: [OMPI users] ifort and gfortran module Hi, I am wondering whether it is possible to support both the Intel compiler ifort and gfortran within a single compiled version of openmpi. E.g., 1. compile openmpi using ifort as the Fortran compiler and install it in /usr/local/openmpi-1.3.3 2. compile openmpi using gfortran, but do not install it; only copy mpi.mod to /usr/local/openmpi-1.3.3/include/gfortran Is there a way to cause mpif90 to include /usr/local/openmpi-1.3.3/include/gfortran before including /usr/local/openmpi-1.3.3/include if OMPI_FC is set to gfortran (more precisely if `basename $OMPI_FC` = gfortran)? Or is there another way of accomplishing this? Cheers, Martin -- Martin Siegert Head, Research Computing WestGrid Site Lead IT Services phone: 778 782-4691 Simon Fraser University fax: 778 782-4242 Burnaby, British Columbia email: sieg...@sfu.ca Canada V5A 1S6 ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users
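A minimal sketch of the two-install approach Jim describes, assuming example prefixes /opt/openmpi-1.3.3_intel and /opt/openmpi-1.3.3_gcc and an OFED-style mpi-selector setup (all paths and registration names here are illustrative, not anyone's actual configuration):

    # Build one Open MPI per Fortran compiler, each under its own prefix.
    # Intel Fortran build:
    ./configure --prefix=/opt/openmpi-1.3.3_intel FC=ifort F77=ifort
    make all install

    # GNU Fortran build, from a freshly cleaned source tree:
    make distclean
    ./configure --prefix=/opt/openmpi-1.3.3_gcc FC=gfortran F77=gfortran
    make all install

    # Switch between the installs with mpi-selector (each one must first
    # be registered; see mpi-selector --help for the exact syntax):
    mpi-selector --list
    mpi-selector --set openmpi-1.3.3_gcc
    mpi-selector --query

Separate installs sidestep the underlying problem: mpi.mod files are compiler-specific, so a module built by ifort cannot be read by gfortran, and a single install's include directory can only serve one of the two.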
[OMPI users] single data/ multiple processes
I need to use openMPI in a mode where the input and output data reside on one node of my cluster while all the other nodes are just used for computation and send data to/from the head node. All I can find in the documentation are cases showing how to use openMPI for cases where input and output data reside on all the nodes. Will anyone please show me the command line I need to use to accomplish my single data/ multiple node calculation? Thanks. Jim
Re: [OMPI users] single data/ multiple processes
Hi Jody, I did not explain my problem very well. I have an application called mdrun. It was compiled and linked using openMPI. I want to run mdrun on 8 nodes of my cluster in parallel. Just me. Not multiple users. So I want to launch the openMPI version of mdrun so that it only uses the input files on the node from which it is launched (the head node) and does not look for any input files on the other seven nodes where it spawns the 7 other processes. With MPICH I use the procgroup capability and mdrun only looks for its input files on the node from which it is launched, using the same mdrun parameters as the ones I tried to use with openMPI So, how can I replicate the behavior of MPICH with openMPI? Thanks for your help. Jim On Sat, 2009-01-03 at 17:46 +0100, jody wrote: > Hi Jim > If all of your workers can mount a directory on your head node, > all can access the data there. > > Jody > > > On Sat, Jan 3, 2009 at 4:13 PM, Jim Kress wrote: > > I need to use openMPI in a mode where the input and output data reside > > on one node of my cluster while all the other nodes are just used for > > computation and send data to/from the head node. > > > > All I can find in the documentation are cases showing how to use openMPI > > for cases where input and output data reside on all the nodes. > > > > Will anyone please show me the command line I need to use to accomplish > > my single data/ multiple node calculation? > > > > Thanks. > > > > Jim > > > > > > ___ > > users mailing list > > us...@open-mpi.org > > http://www.open-mpi.org/mailman/listinfo.cgi/users > >
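A minimal sketch of the shared-directory approach Jody suggests, assuming the head node exports a directory that every compute node mounts at the same path, plus a hostfile named myhosts (all names and paths below are examples):

    # myhosts lists the head node and the seven compute nodes, e.g.
    #   head   slots=1
    #   node01 slots=1
    #   (one line per remaining node)

    # Launch from inside the exported directory on the head node; every
    # rank then sees the same input files without copying them around:
    cd /data/mdrun_job
    mpirun -np 8 --hostfile myhosts -wdir /data/mdrun_job mdrun

If the files really must stay only on the head node, the other option is for the application's rank 0 to read them and distribute the data itself (MPI_Bcast/MPI_Scatter); whether mdrun already does this is an application question rather than an Open MPI one.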
Re: [OMPI users] single data/ multiple processes
Never mind. I figured it out for myself. Jim On Sat, 2009-01-03 at 13:51 -0500, Jim Kress wrote: > Hi Jody, > > I did not explain my problem very well. > > I have an application called mdrun. It was compiled and linked using > openMPI. I want to run mdrun on 8 nodes of my cluster in parallel. > Just me. Not multiple users. So I want to launch the openMPI version > of mdrun so that it only uses the input files on the node from which it > is launched (the head node) and does not look for any input files on the > other seven nodes where it spawns the 7 other processes. > > With MPICH I use the procgroup capability and mdrun only looks for its > input files on the node from which it is launched, using the same mdrun > parameters as the ones I tried to use with openMPI > > So, how can I replicate the behavior of MPICH with openMPI? > > Thanks for your help. > > Jim > > > > > > On Sat, 2009-01-03 at 17:46 +0100, jody wrote: > > Hi Jim > > If all of your workers can mount a directory on your head node, > > all can access the data there. > > > > Jody > > > > > > On Sat, Jan 3, 2009 at 4:13 PM, Jim Kress > > wrote: > > > I need to use openMPI in a mode where the input and output data reside > > > on one node of my cluster while all the other nodes are just used for > > > computation and send data to/from the head node. > > > > > > All I can find in the documentation are cases showing how to use openMPI > > > for cases where input and output data reside on all the nodes. > > > > > > Will anyone please show me the command line I need to use to accomplish > > > my single data/ multiple node calculation? > > > > > > Thanks. > > > > > > Jim > > > > > > > > > ___ > > > users mailing list > > > us...@open-mpi.org > > > http://www.open-mpi.org/mailman/listinfo.cgi/users > > >
Re: [OMPI users] 50% performance reduction due to OpenMPI v 1.3.2 forcing all MPI traffic over Ethernet instead of using Infiniband
I assume you are referring to the openmpi-mca-params.conf file. As I indicated previously, my first run was with the line btl=self,openib as the only entry in the openmpi-mca-params.conf file. This is my default setting and was what I used, and it worked well, for v 1.2.8. Then I tried btl=self,openib mpi_yield_when_idle=0 as the only entries in the openmpi-mca-params.conf file. No difference in the results. Then I tried btl=self,openib mpi_yield_when_idle=0 as the only entries in the openmpi-mca-params.conf file and also set the environment variable OMPI_MCA_mpi_leave_pinned=0. No difference in the results. What else can I provide? By the way, did you read the message where I retracted my assumption about MPI traffic being forced over Ethernet? Jim > -Original Message- > From: users-boun...@open-mpi.org > [mailto:users-boun...@open-mpi.org] On Behalf Of Pavel Shamis (Pasha) > Sent: Tuesday, June 23, 2009 7:24 AM > To: Open MPI Users > Subject: Re: [OMPI users] 50% performance reduction due to > OpenMPI v 1.3.2 forcing all MPI traffic over Ethernet instead > of using Infiniband > > Jim, > Can you please share with us your mca conf file. > > Pasha. > Jim Kress ORG wrote: > > For the app I am using, ORCA (a Quantum Chemistry program), when it > > was compiled using openMPI 1.2.8 and run under 1.2.8 with the > > following in the openmpi-mca-params.conf file: > > > > btl=self,openib > > > > the app ran fine with no traffic over my Ethernet network and all > > traffic over my Infiniband network. > > > > However, now that ORCA has been recompiled with openMPI > v1.3.2 and run > > under 1.3.2 (using the same openmpi-mca-params.conf file), the > > performance has been reduced by 50% and all the MPI traffic > is going > > over the Ethernet network. > > > > As a matter of fact, the openMPI v1.3.2 performance now > looks exactly > > like the performance I get if I use MPICH 1.2.7. > > > > Anyone have any ideas: > > > > 1) How could this have happened? > > > > 2) How can I fix it? > > > > a 50% reduction in performance is just not acceptable. Ideas/ > > suggestions would be appreciated. > > > > Jim > > > > ___ > > users mailing list > > us...@open-mpi.org > > http://www.open-mpi.org/mailman/listinfo.cgi/users > > > > > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users
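For reference, the same BTL selection can be expressed three equivalent ways; the process count and executable name below are placeholders:

    # 1. Site-wide default file (the one being edited above):
    #      <prefix>/etc/openmpi-mca-params.conf
    #    containing the line:  btl = self,openib

    # 2. Per-run, on the mpirun command line:
    mpirun --mca btl self,openib -np 16 ./a.out

    # 3. Per-shell, via the environment; any MCA parameter can be set by
    #    prefixing its name with OMPI_MCA_:
    export OMPI_MCA_btl=self,openib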
Re: [OMPI users] 50% performance reduction due to OpenMPI v 1.3.2 forcing all MPI traffic over Ethernet instead of using Infiniband
Are you speaking of the configure for the application or for OpenMPI? I have no control over the application since it is provided as an executable only. Jim > -Original Message- > From: users-boun...@open-mpi.org > [mailto:users-boun...@open-mpi.org] On Behalf Of Gus Correa > Sent: Tuesday, June 23, 2009 2:01 PM > To: Open MPI Users > Subject: Re: [OMPI users] 50% performance reduction due to > OpenMPI v 1.3.2 forcing all MPI traffic over Ethernet instead > of using Infiniband > > Hi Jim, list > > Have you checked if configure caught your IB libraries properly? > IIRR there has been some changes since 1.2.8 on how configure > searches for libraries (e.g. finding libnuma was a problem, > now fixed). > Chances are that if you used some old script or command line > to run configure, it may not have worked as you expected. > > Check the output of ompi_info -config. > It should show -lrdmacm -libverbs, otherwise it skipped IB. > In this case you can reconfigure, pointing to the IB library location. > > If you have a log of your configure step you can also search > it for openib, libverbs, etc, to see if it did what you expected. > > I hope this helps, > Gus Correa > - > Gustavo Correa > Lamont-Doherty Earth Observatory - Columbia University > Palisades, NY, 10964-8000 - USA > - > > > Pavel Shamis (Pasha) wrote: > > Jim, > > Can you please share with us you mca conf file. > > > > Pasha. > > Jim Kress ORG wrote: > >> For the app I am using, ORCA (a Quantum Chemistry > program), when it > >> was compiled using openMPI 1.2.8 and run under 1.2.8 with the > >> following in the openmpi-mca-params.conf file: > >> > >> btl=self,openib > >> > >> the app ran fine with no traffic over my Ethernet network and all > >> traffic over my Infiniband network. > >> > >> However, now that ORCA has been recompiled with openMPI v1.3.2 and > >> run under 1.3.2 (using the same openmpi-mca-params.conf file), the > >> performance has been reduced by 50% and all the MPI > traffic is going > >> over the Ethernet network. > >> > >> As a matter of fact, the openMPI v1.3.2 performance now > looks exactly > >> like the performance I get if I use MPICH 1.2.7. > >> > >> Anyone have any ideas: > >> > >> 1) How could this have happened? > >> > >> 2) How can I fix it? > >> > >> a 50% reduction in performance is just not acceptable. Ideas/ > >> suggestions would be appreciated. > >> > >> Jim > >> > >> ___ > >> users mailing list > >> us...@open-mpi.org > >> http://www.open-mpi.org/mailman/listinfo.cgi/users > >> > >> > > > > ___ > > users mailing list > > us...@open-mpi.org > > http://www.open-mpi.org/mailman/listinfo.cgi/users > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI users] 50% performance reduction due to OpenMPI v 1.3.2 forcing all MPI traffic over Ethernet instead of using Infiniband
OK. I'll try that, too. Also, > BTW: did you set that mpi_show_mca_params option to ensure > the app is actually seeing these params? I'm working to get to a point where I can get some time to try that. Hopefully it will be before 5PM EDT. Jim > -Original Message- > From: users-boun...@open-mpi.org > [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain > Sent: Tuesday, June 23, 2009 2:43 PM > To: Open MPI Users > Subject: Re: [OMPI users] 50% performance reduction due to > OpenMPI v 1.3.2forcing all MPI traffic over Ethernet instead > of using Infiniband > > Assuming you aren't oversubscribing your nodes, set > mpi_paffinity_alone=1. > > BTW: did you set that mpi_show_mca_params option to ensure > the app is actually seeing these params? > > > > On Tue, Jun 23, 2009 at 12:35 PM, Jim Kress > wrote: > > > I assume you a referring to the openmpi-mca-params.conf file > > As I indicated previously, my first run was with the line > > btl=self,openib > > As the only entry in the openmpi-mca-params.conf file. > This my default > setting and was what I used, and it worked well, for v 1.2.8 > > Then I tried > > btl=self,openib > mpi_yield_when_idle=0 > > As the only entries in the openmpi-mca-params.conf > file. No difference in > the results. > > Then I tried > > btl=self,openib > mpi_yield_when_idle=0 > > As the only entries in the openmpi-mca-params.conf file > and also set the > environment variable OMPI_MCA_mpi_leave_pinned=0 > No difference in the results. > > What else can I provide? > > By the way, did you read the message where I retracted > my assumption about > MPI traffic being forced over Ethernet? > > Jim > > > > -Original Message- > > From: users-boun...@open-mpi.org > > [mailto:users-boun...@open-mpi.org] On Behalf Of > Pavel Shamis (Pasha) > > Sent: Tuesday, June 23, 2009 7:24 AM > > To: Open MPI Users > > Subject: Re: [OMPI users] 50% performance reduction due to > > OpenMPI v 1.3.2 forcing all MPI traffic over Ethernet instead > > of using Infiniband > > > > Jim, > > Can you please share with us you mca conf file. > > > > Pasha. > > Jim Kress ORG wrote: > > > For the app I am using, ORCA (a Quantum Chemistry > program), when it > > > was compiled using openMPI 1.2.8 and run under > 1.2.8 with the > > > following in the openmpi-mca-params.conf file: > > > > > > btl=self,openib > > > > > > the app ran fine with no traffic over my Ethernet > network and all > > > traffic over my Infiniband network. > > > > > > However, now that ORCA has been recompiled with openMPI > > v1.3.2 and run > > > under 1.3.2 (using the same openmpi-mca-params.conf > file), the > > > performance has been reduced by 50% and all the MPI traffic > > is going > > > over the Ethernet network. > > > > > > As a matter of fact, the openMPI v1.3.2 performance now > > looks exactly > > > like the performance I get if I use MPICH 1.2.7. > > > > > > Anyone have any ideas: > > > > > > 1) How could this have happened? > > > > > > 2) How can I fix it? > > > > > > a 50% reduction in performance is just not > acceptable. Ideas/ > > > suggestions would be appreciated. > > > > > > Jim > > > > > > ___ > > > users mailing list > > > us...@open-mpi.org > > > http://www.open-mpi.org/mailman/listinfo.cgi/users > > > > > > > > > > ___ > > users mailing list > > us...@open-mpi.org > > http://www.open-mpi.org/mailman/listinfo.cgi/users > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users > > > >
Re: [OMPI users] 50% performance reduction due to OpenMPI v 1.3.2 forcing all MPI traffic over Ethernet instead of using Infiniband
Noam, Gus and List, Did you statically link your openmpi when you built it? If you did (the default is NOT to do this) then that could explain the discrepancy. Jim > -Original Message- > From: users-boun...@open-mpi.org > [mailto:users-boun...@open-mpi.org] On Behalf Of Noam Bernstein > Sent: Wednesday, June 24, 2009 9:38 AM > To: Open MPI Users > Subject: Re: [OMPI users] 50% performance reduction due to > OpenMPI v 1.3.2forcing all MPI traffic over Ethernet instead > of using Infiniband > > > On Jun 23, 2009, at 6:19 PM, Gus Correa wrote: > > > Hi Jim, list > > > > On my OpenMPI 1.3.2 ompi_info -config gives: > > > > Wrapper extra LIBS: -lrdmacm -libverbs -ltorque -lnuma -ldl -Wl,-- > > export-dynamic -lnsl -lutil -lm -ldl > > > > Yours doesn't seem to have the IB libraries: -lrdmacm -libverbs > > > > So, I would guess your OpenMPI 1.3.2 build doesn't have IB support. > > The second of these statements doesn't follow from the first. > > My "ompi_info -config" returns > > ompi_info -config | grep LIBS >Build LIBS: -lnsl -lutil -lm >Wrapper extra LIBS: -ldl -Wl,--export-dynamic > -lnsl -lutil - > lm -ldl > > But it does have openib > > ompi_info | grep openib > MCA btl: openib (MCA v2.0, API v2.0, > Component v1.3.2) > > and osu_bibw returns > > # OSU MPI Bi-Directional Bandwidth Test v3.0 > # Size Bi-Bandwidth (MB/s) > 41943041717.43 > > which it's sure not getting over ethernet. I think Jeff > Squyres' test (ompi_info | grep openib) must be more definitive. > > > Noam > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users
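Since ORCA ships as a binary only, a quick check of which MPI library it actually resolves at run time (assuming it was linked dynamically; the paths below are examples) is:

    # Which libmpi does the executable pick up?
    ldd ./orca | grep -i libmpi

    # Compare against the library of the install you intend it to use:
    ls -l /usr/local/openmpi-1.3.2/lib/libmpi.so*

    # A statically linked binary prints no libmpi line at all; in that
    # case the Open MPI inside it is whatever the vendor built against,
    # and the local install mainly matters for mpirun/orted.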
[OMPI users] Infiniband requirements
Is it correct to assume that, when one is configuring openmpi v1.3.2 and leaves out the --with-openib=/dir from the ./configure command line, InfiniBand support will NOT be built into openmpi v1.3.2? Then, if an Ethernet network is present that connects all the nodes, openmpi will use that network? Also, is it required to add --enable-static to the ./configure command line to make sure Infiniband support is available? If I do not, then the ompi_info --config command yields Wrapper extra LIBS: -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl Note the absence of -lrdmacm and -libverbs which, I am told, are essential for Infiniband support. Whereas if --enable-static IS included, the ompi_info --config command yields Wrapper extra LIBS: -lrdmacm -libverbs -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl And -lrdmacm and -libverbs are now present. Thanks for your help. Jim
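A sketch of one way to build and then verify InfiniBand support, assuming the OFED headers and libraries are under /usr (paths are examples). Note that the openib component is built as a run-time plugin, so its presence is better checked with ompi_info than by reading the wrapper link flags:

    # Ask configure explicitly for OpenFabrics support so it fails loudly
    # if the ibverbs headers/libraries cannot be found:
    ./configure --prefix=/usr/local/openmpi-1.3.2 --with-openib=/usr
    make all install

    # Definitive check: is the openib BTL plugin installed?
    ompi_info | grep openib
    #   expected output resembles:
    #   MCA btl: openib (MCA v2.0, API v2.0, Component v1.3.2)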
[OMPI users] 50% performance reduction due to OpenMPI v 1.3.2 forcing all MPI traffic over Ethernet instead of using Infiniband
For the app I am using, ORCA (a Quantum Chemistry program), when it was compiled using openMPI 1.2.8 and run under 1.2.8 with the following in the openmpi-mca-params.conf file: btl=self,openib the app ran fine with no traffic over my Ethernet network and all traffic over my Infiniband network. However, now that ORCA has been recompiled with openMPI v1.3.2 and run under 1.3.2 (using the same openmpi-mca-params.conf file), the performance has been reduced by 50% and all the MPI traffic is going over the Ethernet network. As a matter of fact, the openMPI v1.3.2 performance now looks exactly like the performance I get if I use MPICH 1.2.7. Anyone have any ideas: 1) How could this have happened? 2) How can I fix it? a 50% reduction in performance is just not acceptable. Ideas/ suggestions would be appreciated. Jim
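One way to rule Ethernet in or out for the MPI traffic itself is to forbid the TCP BTL outright; with the component-exclusion syntax the job aborts if InfiniBand cannot actually be used, instead of quietly falling back (hostfile and job names are examples):

    # Allow every BTL except TCP; a run that cannot reach its peers over
    # InfiniBand (or shared memory on-node) then fails instead of
    # silently using Ethernet:
    mpirun --mca btl '^tcp' -np 16 --hostfile myhosts ./orca_job

Note that with btl=self,openib already set in openmpi-mca-params.conf, the TCP BTL is never used for MPI messages anyway; any Ethernet activity in that configuration comes from the TCP out-of-band channel used to launch and wire up the job, or from the application's own file traffic, which is consistent with the retraction Jim mentions elsewhere in the thread.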
Re: [OMPI users] 50% performance reduction due to OpenMPI v 1.3.2 forcing all MPI traffic over Ethernet instead of using Infiniband
Thanks for the advice. Unfortunately the command line is internally generated by the app and then invoked so I can't see it. But, it doesn't matter anyway. It seems the Ethernet utilization "problem" I thought I had does not exist. So, I'm still looking for why my app using 1.2.8 is 50% faster than using 1.3.2. Jim On Mon, 2009-06-22 at 19:40 -0600, Ralph Castain wrote: > Sounds very strange, indeed. You might want to check that your app is > actually getting the MCA param that you think it is. Try adding: > > -mca mpi_show_mca_params file,env > > to your cmd line. This will cause rank=0 to output the MCA params it > thinks were set via the default files and/or environment (including > cmd line). > > Ralph > > On Jun 22, 2009, at 6:14 PM, Jim Kress ORG wrote: > > > For the app I am using, ORCA (a Quantum Chemistry program), when it > > was > > compiled using openMPI 1.2.8 and run under 1.2.8 with the following in > > the openmpi-mca-params.conf file: > > > > btl=self,openib > > > > the app ran fine with no traffic over my Ethernet network and all > > traffic over my Infiniband network. > > > > However, now that ORCA has been recompiled with openMPI v1.3.2 and run > > under 1.3.2 (using the same openmpi-mca-params.conf file), the > > performance has been reduced by 50% and all the MPI traffic is going > > over the Ethernet network. > > > > As a matter of fact, the openMPI v1.3.2 performance now looks exactly > > like the performance I get if I use MPICH 1.2.7. > > > > Anyone have any ideas: > > > > 1) How could this have happened? > > > > 2) How can I fix it? > > > > a 50% reduction in performance is just not acceptable. Ideas/ > > suggestions would be appreciated. > > > > Jim > > > > ___ > > users mailing list > > us...@open-mpi.org > > http://www.open-mpi.org/mailman/listinfo.cgi/users > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI users] 50% performance reduction due to OpenMPI v 1.3.2 forcing all MPI traffic over Ethernet instead of using Infiniband
Is there an environment variable (or variables) I can set to do the equivalent? Jim On Mon, 2009-06-22 at 19:40 -0600, Ralph Castain wrote: > Sounds very strange, indeed. You might want to check that your app is > actually getting the MCA param that you think it is. Try adding: > > -mca mpi_show_mca_params file,env > > to your cmd line. This will cause rank=0 to output the MCA params it > thinks were set via the default files and/or environment (including > cmd line). > > Ralph > > On Jun 22, 2009, at 6:14 PM, Jim Kress ORG wrote: > > > For the app I am using, ORCA (a Quantum Chemistry program), when it > > was > > compiled using openMPI 1.2.8 and run under 1.2.8 with the following in > > the openmpi-mca-params.conf file: > > > > btl=self,openib > > > > the app ran fine with no traffic over my Ethernet network and all > > traffic over my Infiniband network. > > > > However, now that ORCA has been recompiled with openMPI v1.3.2 and run > > under 1.3.2 (using the same openmpi-mca-params.conf file), the > > performance has been reduced by 50% and all the MPI traffic is going > > over the Ethernet network. > > > > As a matter of fact, the openMPI v1.3.2 performance now looks exactly > > like the performance I get if I use MPICH 1.2.7. > > > > Anyone have any ideas: > > > > 1) How could this have happened? > > > > 2) How can I fix it? > > > > a 50% reduction in performance is just not acceptable. Ideas/ > > suggestions would be appreciated. > > > > Jim > > > > ___ > > users mailing list > > us...@open-mpi.org > > http://www.open-mpi.org/mailman/listinfo.cgi/users > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users
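Any mpirun --mca option has an environment-variable equivalent: prefix the parameter name with OMPI_MCA_ and export it before the application constructs its internal mpirun command. For this particular case:

    # Same effect as adding "-mca mpi_show_mca_params file,env" to the
    # mpirun command line that the application generates internally:
    export OMPI_MCA_mpi_show_mca_params=file,env

As Ralph notes later in the thread, the file,env form of this parameter is only recognized from 1.3.2 on, so it does nothing when the job actually runs against 1.2.8.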
Re: [OMPI users] 50% performance reduction due to OpenMPI v 1.3.2 forcing all MPI traffic over Ethernet instead of using Infiniband
Ralph, I did the following: export OMPI_MCA_mpi_show_mca_params="file,env" then I checked and found it via the set command as OMPI_MCA_mpi_show_mca_params=file,env I then ran my application ./orca hexatriene_TDDFT_get_asa_input_parallel_1.inp > 1.2.8_test_crafted_input_file.out and got the expected ORCA output in the .out file but nothing at the command line or in the .out file about mca_params What did I do wrong? Jim On Mon, 2009-06-22 at 19:40 -0600, Ralph Castain wrote: > Sounds very strange, indeed. You might want to check that your app is > actually getting the MCA param that you think it is. Try adding: > > -mca mpi_show_mca_params file,env > > to your cmd line. This will cause rank=0 to output the MCA params it > thinks were set via the default files and/or environment (including > cmd line). > > Ralph > > On Jun 22, 2009, at 6:14 PM, Jim Kress ORG wrote: > > > For the app I am using, ORCA (a Quantum Chemistry program), when it > > was > > compiled using openMPI 1.2.8 and run under 1.2.8 with the following in > > the openmpi-mca-params.conf file: > > > > btl=self,openib > > > > the app ran fine with no traffic over my Ethernet network and all > > traffic over my Infiniband network. > > > > However, now that ORCA has been recompiled with openMPI v1.3.2 and run > > under 1.3.2 (using the same openmpi-mca-params.conf file), the > > performance has been reduced by 50% and all the MPI traffic is going > > over the Ethernet network. > > > > As a matter of fact, the openMPI v1.3.2 performance now looks exactly > > like the performance I get if I use MPICH 1.2.7. > > > > Anyone have any ideas: > > > > 1) How could this have happened? > > > > 2) How can I fix it? > > > > a 50% reduction in performance is just not acceptable. Ideas/ > > suggestions would be appreciated. > > > > Jim > > > > ___ > > users mailing list > > us...@open-mpi.org > > http://www.open-mpi.org/mailman/listinfo.cgi/users > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI users] 50% performance reduction due to OpenMPI v 1.3.2 forcing all MPI traffic over Ethernet instead of using Infiniband
value) [master.org:12011] carto_auto_detect_priority=11 (default value) [master.org:12011] carto_file_path= (default value) [master.org:12011] carto_file_priority=10 (default value) [master.org:12011] opal_cr_verbose=0 (default value) [master.org:12011] ft_cr_enabled=0 (default value) [master.org:12011] opal_cr_enable_timer=0 (default value) [master.org:12011] opal_cr_enable_timer_barrier=0 (default value) [master.org:12011] opal_cr_timer_target_rank=0 (default value) [master.org:12011] opal_cr_is_tool=0 (default value) [master.org:12011] opal_cr_signal=10 (default value) [master.org:12011] opal_cr_debug_sigpipe=0 (default value) [master.org:12011] opal_cr_tmp_dir=/tmp (default value) [master.org:12011] orte_base_help_aggregate=1 (default value) [master.org:12011] orte_tmpdir_base= (default value) [master.org:12011] orte_no_session_dirs= (default value) [master.org:12011] orte_debug=0 (default value) [master.org:12011] orte_debug_verbose=-1 (default value) [master.org:12011] orte_debug_daemons=0 (default value) [master.org:12011] orte_debug_daemons_file=0 (default value) [master.org:12011] orte_leave_session_attached=0 (default value) [master.org:12011] orte_do_not_launch=0 (default value) [master.org:12011] orte_daemon_spin=0 (default value) [master.org:12011] orte_daemon_fail=-1 (default value) [master.org:12011] orte_daemon_fail_delay=0 (default value) [master.org:12011] orte_heartbeat_rate=0 (default value) [master.org:12011] orte_startup_timeout=0 (default value) [master.org:12011] orte_timing=0 (default value) [master.org:12011] orte_base_user_debugger=totalview @mpirun@ -a @mpirun_args@ : ddt -n @np@ -start @executable@ @executable_argv@ @single_app@ : fxp @mpirun@ -a @mpirun_args@ (default value) [master.org:12011] orte_abort_timeout=1 (default value) [master.org:12011] orte_timeout_step=1000 (default value) [master.org:12011] orte_default_hostfile= (default value) [master.org:12011] orte_keep_fqdn_hostnames=0 (default value) [master.org:12011] orte_contiguous_nodes=2147483647 (default value) [master.org:12011] orte_tag_output=0 (default value) [master.org:12011] orte_xml_output=0 (default value) [master.org:12011] orte_timestamp_output=0 (default value) [master.org:12011] orte_output_filename= (default value) [master.org:12011] orte_show_resolved_nodenames=0 (default value) [master.org:12011] orte_hetero_apps=0 (default value) [master.org:12011] orte_launch_agent=orted (default value) [master.org:12011] orte_allocation_required=0 (default value) [master.org:12011] orte_xterm= (default value) [master.org:12011] orte_forward_job_control=0 (default value) [master.org:12011] ess=env (environment) [master.org:12011] ess_base_verbose=0 (default value) [master.org:12011] ess_env_priority=0 (default value) [master.org:12011] orte_ess_jobid=3004366849 (environment) [master.org:12011] orte_ess_vpid=0 (environment) [master.org:12011] rml_wrapper= (default value) [master.org:12011] rml= (default value) [master.org:12011] rml_base_verbose=0 (default value) [master.org:12011] oob= (default value) [master.org:12011] oob_base_verbose=0 (default value) [master.org:12011] oob_tcp_verbose=0 (default value) [master.org:12011] oob_tcp_peer_limit=-1 (default value) [master.org:12011] oob_tcp_peer_retries=60 (default value) [master.org:12011] oob_tcp_debug=0 (default value) [master.org:12011] oob_tcp_sndbuf=131072 (default value) [master.org:12011] oob_tcp_rcvbuf=131072 (default value) [master.org:12011] oob_tcp_if_include= (default value) [master.org:12011] oob_tcp_if_exclude= (default value) 
[master.org:12011] oob_tcp_connect_sleep=1 (default value) [master.org:12011] oob_tcp_listen_mode=event (default value) [master.org:12011] oob_tcp_listen_thread_max_queue=10 (default value) [master.org:12011] oob_tcp_listen_thread_wait_time=10 (default value) [master.org:12011] oob_tcp_port_min_v4=0 (default value) [master.org:12011] oob_tcp_port_range_v4=65535 (default value) [master.org:12011] oob_tcp_disable_family=0 (default value) [master.org:12011] oob_tcp_port_min_v6=0 (default value) [master.org:12011] oob_tcp_port_range_v6=65535 (default value) [master.org:12011] oob_tcp_priority=0 (default value) ... Are there any useful clues here? Please note, the app launches a number of parallel programs in a sequence determined by the input file. The same input file was used for both runs. Jim On Mon, 2009-06-22 at 19:40 -0600, Ralph Castain wrote: > Sounds very strange, indeed. You might want to check that your app is > actually getting the MCA param that you think it is. Try adding: > > -mca mpi_show_mca_params file,env > > to your cmd line. This will cause rank=0 to output the MCA params it > thinks were set via the default files and/or environment (including > cmd line). > > Ralph > > On Jun 22, 2009, at 6:14 PM, Jim Kress ORG wrote: > > > For the app I am using, ORCA (a Quantum Chemistry program), when it > > was > > compiled using openMPI 1.2.8 and run under 1
Re: [OMPI users] 50% performance reduction due to OpenMPI v 1.3.2 forcing all MPI traffic over Ethernet instead of using Infiniband
Sorry about the size of the last email. I wasn't aware the log file would be so lagre ... Jim On Tue, 2009-06-23 at 15:20 -0600, Ralph Castain wrote: > Hmmm...just to be clear - did you run this against OMPI 1.3.2, or > 1.2.8? I see a 1.2.8 in your app name, hence the question. > > This option only works with 1.3.2, I'm afraid - it was a new feature. > > Ralph > > On Jun 23, 2009, at 2:31 PM, Jim Kress ORG wrote: > > > Ralph, > > > > I did the following: > > > > export OMPI_MCA_mpi_show_mca_params="file,env" > > > > then I checked and found it via the set command as > > > > OMPI_MCA_mpi_show_mca_params=file,env > > > > I then ran my application > > > > ./orca hexatriene_TDDFT_get_asa_input_parallel_1.inp > > > 1.2.8_test_crafted_input_file.out > > > > and got the expected ORCA output in the .out file but nothing at the > > command line or in the .out file about mca_params > > > > What did I do wrong? > > > > Jim > > > > On Mon, 2009-06-22 at 19:40 -0600, Ralph Castain wrote: > >> Sounds very strange, indeed. You might want to check that your app is > >> actually getting the MCA param that you think it is. Try adding: > >> > >> -mca mpi_show_mca_params file,env > >> > >> to your cmd line. This will cause rank=0 to output the MCA params it > >> thinks were set via the default files and/or environment (including > >> cmd line). > >> > >> Ralph > >> > >> On Jun 22, 2009, at 6:14 PM, Jim Kress ORG wrote: > >> > >>> For the app I am using, ORCA (a Quantum Chemistry program), when it > >>> was > >>> compiled using openMPI 1.2.8 and run under 1.2.8 with the > >>> following in > >>> the openmpi-mca-params.conf file: > >>> > >>> btl=self,openib > >>> > >>> the app ran fine with no traffic over my Ethernet network and all > >>> traffic over my Infiniband network. > >>> > >>> However, now that ORCA has been recompiled with openMPI v1.3.2 and > >>> run > >>> under 1.3.2 (using the same openmpi-mca-params.conf file), the > >>> performance has been reduced by 50% and all the MPI traffic is going > >>> over the Ethernet network. > >>> > >>> As a matter of fact, the openMPI v1.3.2 performance now looks > >>> exactly > >>> like the performance I get if I use MPICH 1.2.7. > >>> > >>> Anyone have any ideas: > >>> > >>> 1) How could this have happened? > >>> > >>> 2) How can I fix it? > >>> > >>> a 50% reduction in performance is just not acceptable. Ideas/ > >>> suggestions would be appreciated. > >>> > >>> Jim > >>> > >>> ___ > >>> users mailing list > >>> us...@open-mpi.org > >>> http://www.open-mpi.org/mailman/listinfo.cgi/users > >> > >> ___ > >> users mailing list > >> us...@open-mpi.org > >> http://www.open-mpi.org/mailman/listinfo.cgi/users > > > > ___ > > users mailing list > > us...@open-mpi.org > > http://www.open-mpi.org/mailman/listinfo.cgi/users > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI users] 50% performance reduction due to OpenMPI v 1.3.2 forcing all MPI traffic over Ethernet instead of using Infiniband
For v 1.3.2: Here is the ompi_info -config output and I've attached a copy of the config.log file which seems to clearly indicate it found the infiniband libraries. [root@master ~]# ompi_info -config Configured by: root Configured on: Sun Jun 21 22:02:59 EDT 2009 Configure host: master.org Built by: root Built on: Sun Jun 21 22:10:07 EDT 2009 Built host: master.org C bindings: yes C++ bindings: yes Fortran77 bindings: yes (all) Fortran90 bindings: yes Fortran90 bindings size: small C compiler: gcc C compiler absolute: /usr/bin/gcc C char size: 1 C bool size: 1 C short size: 2 C int size: 4 C long size: 8 C float size: 4 C double size: 8 C pointer size: 8 C char align: 1 C bool align: 1 C int align: 4 C float align: 4 C double align: 8 C++ compiler: g++ C++ compiler absolute: /usr/bin/g++ Fortran77 compiler: gfortran Fortran77 compiler abs: /usr/bin/gfortran Fortran90 compiler: gfortran Fortran90 compiler abs: /usr/bin/gfortran Fort integer size: 4 Fort logical size: 4 Fort logical value true: 1 Fort have integer1: yes Fort have integer2: yes Fort have integer4: yes Fort have integer8: yes Fort have integer16: no Fort have real4: yes Fort have real8: yes Fort have real16: no Fort have complex8: yes Fort have complex16: yes Fort have complex32: no Fort integer1 size: 1 Fort integer2 size: 2 Fort integer4 size: 4 Fort integer8 size: 8 Fort integer16 size: -1 Fort real size: 4 Fort real4 size: 4 Fort real8 size: 8 Fort real16 size: -1 Fort dbl prec size: 4 Fort cplx size: 4 Fort dbl cplx size: 4 Fort cplx8 size: 8 Fort cplx16 size: 16 Fort cplx32 size: -1 Fort integer align: 4 Fort integer1 align: 1 Fort integer2 align: 2 Fort integer4 align: 4 Fort integer8 align: 8 Fort integer16 align: -1 Fort real align: 4 Fort real4 align: 4 Fort real8 align: 8 Fort real16 align: -1 Fort dbl prec align: 4 Fort cplx align: 4 Fort dbl cplx align: 4 Fort cplx8 align: 4 Fort cplx16 align: 8 Fort cplx32 align: -1 C profiling: yes C++ profiling: yes Fortran77 profiling: yes Fortran90 profiling: yes C++ exceptions: no Thread support: posix (mpi: no, progress: no) Sparse Groups: no Build CFLAGS: -O3 -DNDEBUG -finline-functions -fno-strict-aliasing -pthread -fvisibility=hidden Build CXXFLAGS: -O3 -DNDEBUG -finline-functions -pthread Build FFLAGS: Build FCFLAGS: Build LDFLAGS: -export-dynamic Build LIBS: -lnsl -lutil -lm Wrapper extra CFLAGS: -pthread Wrapper extra CXXFLAGS: -pthread Wrapper extra FFLAGS: -pthread Wrapper extra FCFLAGS: -pthread Wrapper extra LDFLAGS: Wrapper extra LIBS: -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl Internal debug support: no MPI parameter check: runtime Memory profiling support: no Memory debugging support: no libltdl support: yes Heterogeneous support: no mpirun default --prefix: no MPI I/O support: yes MPI_WTIME support: gettimeofday Symbol visibility support: yes FT Checkpoint support: no (checkpoint thread: no) [root@master ~]# On Tue, 2009-06-23 at 15:20 -0600, Ralph Castain wrote: > Hmmm...just to be clear - did you run this against OMPI 1.3.2, or > 1.2.8? I see a 1.2.8 in your app name, hence the question. > > This option only works with 1.3.2, I'm afraid - it was a new feature. 
> > Ralph > > On Jun 23, 2009, at 2:31 PM, Jim Kress ORG wrote: > > > Ralph, > > > > I did the following: > > > > export OMPI_MCA_mpi_show_mca_params="file,env" > > > > then I checked and found it via the set command as > > > > OMPI_MCA_mpi_show_mca_params=file,env > > > > I then ran my application > > > > ./orca hexatriene_TDDFT_get_asa_input_parallel_1.inp > > > 1.2.8_test_crafted_input_file.out > > > > and got the expected ORCA output in the .out file but nothing at the > > command line or in the .out file about mca_params > > > > What did I do wrong? > > > > Jim > > > > On Mon, 2009-06-22 at 19:40 -0600, Ralph Castain wrote: > >> Sounds very strange, indeed. You might want to check that your app is > >> actually getting the MCA param
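The "Wrapper extra LIBS" line in the output above only shows what the compiler wrappers add when linking new programs; it says little about the openib plugin itself. Two quick checks, shown here as a sketch using standard Open MPI commands:

    # Print the full link line mpicc would use, including any -lrdmacm
    # -libverbs the build added to the wrappers:
    mpicc -showme:link

    # The openib BTL is a dlopen'ed plugin, so it can be present even
    # when those flags are absent from the wrapper line; check with:
    ompi_info | grep openib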
Re: [OMPI users] 50% performance reduction due to OpenMPI v 1.3.2 forcing all MPI traffic over Ethernet instead of using Infiniband
This is what I get [root@master ~]# ompi_info | grep openib MCA btl: openib (MCA v2.0, API v2.0, Component v1.3.2) [root@master ~]# Jim On Tue, 2009-06-23 at 18:51 -0400, Jeff Squyres wrote: > openib (OpenFabrics) plugin is installed > and at least marginally opera
Re: [OMPI users] 50% performance reduction due to OpenMPI v 1.3.2 forcing all MPI traffic over Ethernet instead of using Infiniband
According to the author(s) it was compiled/linked against v1.3.2 Jim On Tue, 2009-06-23 at 19:29 -0400, Jeff Squyres wrote: > You mentioned that you only have a binary for your executable. Was it > compiled / linked against v1.3.2? > > We did not introduce ABI compatibility until v1.3.2 -- if the > executable was compiled/linked against any version prior to that, it's > pure luck that it works with the 1.3.2 shared libraries at all. > > > On Jun 23, 2009, at 7:25 PM, Jim Kress ORG wrote: > > > This is what I get > > > > [root@master ~]# ompi_info | grep openib > > MCA btl: openib (MCA v2.0, API v2.0, Component > > v1.3.2) > > [root@master ~]# > > > > Jim > > > > > > On Tue, 2009-06-23 at 18:51 -0400, Jeff Squyres wrote: > > > openib (OpenFabrics) plugin is installed > > > and at least marginally opera > > > > ___ > > users mailing list > > us...@open-mpi.org > > http://www.open-mpi.org/mailman/listinfo.cgi/users > > > >
Re: [OMPI users] 50% performance reduction due to OpenMPI v 1.3.2 forcing all MPI traffic over Ethernet instead of using Infiniband
rote: > Hi Jim > > > Jim Kress wrote: > > Noam, Gus and List, > > > > Did you statically link your openmpi when you built it? If you did (the > > default is NOT to do this) then that could explain the discrepancy. > > > > Jim > > No, I didn't link statically. > > Did you link statically? > > Actually, I tried to do it, and it didn't work. > I wouldn't get OpenMPI with IB if I tried to > link statically (i.e. by passing -static or equivalent to CFLAGS, > FFLAGS, etc). > When I removed the "-static" I got OpenMPI with IB. > I always dump the configure output (and the make output, etc) to > log files to check these things out after it is done. > I really suggest you do this, it pays off, saves time, costs nothing. > I don't remember exactly what symptoms I found on the log, > whether the log definitely said that there was no IB support, > or if it didn't have the right flags (-libverbs, etc) like yours. > However, when I suppressed the "-static" from the compiler flags > then I've got all the IB goodies! :) > > Here is how I run configure (CFLAGS etc only have optimization flags, > no "-static"): > > ./configure \ > --prefix=/my/directory \ > --with-libnuma=/usr \ > --with-tm=/usr \ > --with-openib=/usr \ > --enable-static \ > 2>&1 configure.log > > Note, "--enable-static" means OpenMPI will build static libraries > (besides the shared ones). > OpenMPI is not being linked statically to system libraries, > or to IB libraries, etc. > > Some switches may not be needed, > in particularly the explicit use of /usr directory. > However, at some point the OpenMPI configure > would not work without being > told this (at least for libnuma). > > BTW, I didn't claim your OpenMPI doesn't have IB support. > Not a categorical syllogism like > "you don't have the -libverbs flag, hence you don't have IB". > It is hard to make definitive statements like this > in a complex environment like this (OpenMPI build, parallel programs), > and with limited information via email. > After all, the list is peer reviewed! :) > Hence, I only guessed, as I usually do in these exchanges. > However, considering all the trouble you've been through, who knows, > maybe it was a guess in the right direction. > > I wonder if there may still be a glitch in the OpenMPI configure > script, on how it searches for and uses libraries like IB, NUMA, etc, > which may be causing the problem. > Jeff: Is this possible? > > In any case, we have different "Wrapper extra LIBS". > I have -lrdmacm -libverbs, you and Noam don't have them. > (Noam: I am not saying you don't have IB support! :)) > My configure explicitly asks for ib support, Noam's (and maybe yours) > doesn't. > Somehow, slight differences in how one invokes > the configure script seems to produce different results. 
> > I hope this helps, > Gus Correa > - > Gustavo Correa > Lamont-Doherty Earth Observatory - Columbia University > Palisades, NY, 10964-8000 - USA > - > > > >> -Original Message- > >> From: users-boun...@open-mpi.org > >> [mailto:users-boun...@open-mpi.org] On Behalf Of Noam Bernstein > >> Sent: Wednesday, June 24, 2009 9:38 AM > >> To: Open MPI Users > >> Subject: Re: [OMPI users] 50% performance reduction due to > >> OpenMPI v 1.3.2forcing all MPI traffic over Ethernet instead > >> of using Infiniband > >> > >> > >> On Jun 23, 2009, at 6:19 PM, Gus Correa wrote: > >> > >>> Hi Jim, list > >>> > >>> On my OpenMPI 1.3.2 ompi_info -config gives: > >>> > >>> Wrapper extra LIBS: -lrdmacm -libverbs -ltorque -lnuma -ldl -Wl,-- > >>> export-dynamic -lnsl -lutil -lm -ldl > >>> > >>> Yours doesn't seem to have the IB libraries: -lrdmacm -libverbs > >>> > >>> So, I would guess your OpenMPI 1.3.2 build doesn't have IB support. > >> The second of these statements doesn't follow from the first. > >> > >> My "ompi_info -config" returns > >> > >> ompi_info -config | grep LIBS > >>Build LIBS: -lnsl -lutil -lm > >>Wrapper extra LIBS: -ldl -Wl,--export-dynamic > >> -lnsl -lutil - > >> lm -ldl > >> > >> But it does have openib > >> > >>
Re: [OMPI users] 50% performance reduction due to OpenMPI v 1.3.2 forcing all MPI traffic over Ethernet instead of using Infiniband
Hey Gus. I was correct. If I did: ./configure --prefix=/my/dir --with-openib=/usr --enable-static make all install then reboot and use mpi-selector to choose openmpi-1.3.2, and then: [root@master ~]# ompi_info --config Configured by: root Configured on: Wed Jun 24 18:02:03 EDT 2009 Configure host: master.org Built by: root Built on: Wed Jun 24 18:17:29 EDT 2009 Built host: master.org C bindings: yes C++ bindings: yes Fortran77 bindings: yes (all) Fortran90 bindings: yes Fortran90 bindings size: small C compiler: gcc C compiler absolute: /usr/bin/gcc C char size: 1 C bool size: 1 C short size: 2 C int size: 4 C long size: 8 C float size: 4 C double size: 8 C pointer size: 8 C char align: 1 C bool align: 1 C int align: 4 C float align: 4 C double align: 8 C++ compiler: g++ C++ compiler absolute: /usr/bin/g++ Fortran77 compiler: gfortran Fortran77 compiler abs: /usr/bin/gfortran Fortran90 compiler: gfortran Fortran90 compiler abs: /usr/bin/gfortran Fort integer size: 4 Fort logical size: 4 Fort logical value true: 1 Fort have integer1: yes Fort have integer2: yes Fort have integer4: yes Fort have integer8: yes Fort have integer16: no Fort have real4: yes Fort have real8: yes Fort have real16: no Fort have complex8: yes Fort have complex16: yes Fort have complex32: no Fort integer1 size: 1 Fort integer2 size: 2 Fort integer4 size: 4 Fort integer8 size: 8 Fort integer16 size: -1 Fort real size: 4 Fort real4 size: 4 Fort real8 size: 8 Fort real16 size: -1 Fort dbl prec size: 4 Fort cplx size: 4 Fort dbl cplx size: 4 Fort cplx8 size: 8 Fort cplx16 size: 16 Fort cplx32 size: -1 Fort integer align: 4 Fort integer1 align: 1 Fort integer2 align: 2 Fort integer4 align: 4 Fort integer8 align: 8 Fort integer16 align: -1 Fort real align: 4 Fort real4 align: 4 Fort real8 align: 8 Fort real16 align: -1 Fort dbl prec align: 4 Fort cplx align: 4 Fort dbl cplx align: 4 Fort cplx8 align: 4 Fort cplx16 align: 8 Fort cplx32 align: -1 C profiling: yes C++ profiling: yes Fortran77 profiling: yes Fortran90 profiling: yes C++ exceptions: no Thread support: posix (mpi: no, progress: no) Sparse Groups: no Build CFLAGS: -O3 -DNDEBUG -finline-functions -fno-strict-aliasing -pthread -fvisibility=hidden Build CXXFLAGS: -O3 -DNDEBUG -finline-functions -pthread Build FFLAGS: Build FCFLAGS: Build LDFLAGS: -export-dynamic Build LIBS: -lnsl -lutil -lm Wrapper extra CFLAGS: -pthread Wrapper extra CXXFLAGS: -pthread Wrapper extra FFLAGS: -pthread Wrapper extra FCFLAGS: -pthread Wrapper extra LDFLAGS: Wrapper extra LIBS: -lrdmacm -libverbs -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl Internal debug support: no MPI parameter check: runtime Memory profiling support: no Memory debugging support: no libltdl support: yes Heterogeneous support: no mpirun default --prefix: no MPI I/O support: yes MPI_WTIME support: gettimeofday Symbol visibility support: yes FT Checkpoint support: no (checkpoint thread: no) [root@master ~]# Magically, -lrdmacm -libverbs appear. Well, that's one mystery solved. Thanks for your help. Jim On Wed, 2009-06-24 at 17:22 -0400, Gus Correa wrote: > Hi Jim > > > Jim Kress wrote: > > Noam, Gus and List, > > > > Did you statically link your openmpi when you built it? If you did (the > > default is NOT to do this) then that could explain the discrepancy. > > > > Jim > > No, I didn't link statically. > > Did you link statically? > > Actually, I tried to do it, and it didn't work. > I wouldn't get OpenMPI with IB if I tried to > link statically (i.e. by passing -static or equivalent to CFLAGS, > FFLAGS, etc). 
> When I removed the "-static" I got OpenMPI with IB. > I always dump the configure output (and the make output, etc) to > log files to check these things out after it is done. > I really suggest you do this, it pays off, saves time, costs nothing. > I don't remember exactly what symptoms I found on the log, > whether the log definitely said that there was no IB support, > or if it didn't
Re: [OMPI users] 50% performance reduction due to OpenMPI v 1.3.2 forcing all MPI traffic over Ethernet instead of using Infiniband
> Have you investigated Jeff's question on whether the code was > compiled/linked with the same OpenMPI version (1.3.2)? > I wonder if the underlying OFED libraries must be the same as well. I was told that 1.3.2 was used. However, I have not asked about which OFED libraries were used nor have I asked about the use of --enable-static for their 1.3.2 configurations. I will have to follow-up on that. Jim On Wed, 2009-06-24 at 19:30 -0400, Gus Correa wrote: > Hi Jim > > Jim Kress ORG wrote: > > Hey Gus. I was correct. > > > > If I did: > > > > ./configure --prefix=/my/dir --with-openib=/usr --enable-static > > make all install > > > ... > > Wrapper extra LIBS: -lrdmacm -libverbs -ldl > > -Wl,--export-dynamic -lnsl > > -lutil -lm -ldl > ... > > > > Magically, -lrdmacm -libverbs appear. > > > > > > Thank you for telling us! > I was too busylazy to try it once again myself. > I built OpenMPI a lot of times, different compilers, > versions, clusters ... > > In any case, the ORCA mystery remains, which is rather unsettling. > Have you investigated Jeff's question on whether the code was > compiled/linked with the same OpenMPI version (1.3.2)? > I wonder if the underlying OFED libraries must be the same as well. > > Gus > - > Gustavo Correa > Lamont-Doherty Earth Observatory - Columbia University > Palisades, NY, 10964-8000 - USA > - > > > Jim Kress ORG wrote: > > Hey Gus. I was correct. > > > > If I did: > > > > ./configure --prefix=/my/dir --with-openib=/usr --enable-static > > make all install > > > > then reboot and use mpi-selector to choose openmpi-1.3.2, and then: > > > > [root@master ~]# ompi_info --config > >Configured by: root > >Configured on: Wed Jun 24 18:02:03 EDT 2009 > > Configure host: master.org > > Built by: root > > Built on: Wed Jun 24 18:17:29 EDT 2009 > > Built host: master.org > > C bindings: yes > > C++ bindings: yes > > Fortran77 bindings: yes (all) > > Fortran90 bindings: yes > > Fortran90 bindings size: small > > C compiler: gcc > > C compiler absolute: /usr/bin/gcc > > C char size: 1 > > C bool size: 1 > > C short size: 2 > > C int size: 4 > > C long size: 8 > > C float size: 4 > >C double size: 8 > > C pointer size: 8 > > C char align: 1 > > C bool align: 1 > > C int align: 4 > >C float align: 4 > > C double align: 8 > > C++ compiler: g++ > >C++ compiler absolute: /usr/bin/g++ > > Fortran77 compiler: gfortran > > Fortran77 compiler abs: /usr/bin/gfortran > > Fortran90 compiler: gfortran > > Fortran90 compiler abs: /usr/bin/gfortran > >Fort integer size: 4 > >Fort logical size: 4 > > Fort logical value true: 1 > > Fort have integer1: yes > > Fort have integer2: yes > > Fort have integer4: yes > > Fort have integer8: yes > > Fort have integer16: no > > Fort have real4: yes > > Fort have real8: yes > > Fort have real16: no > > Fort have complex8: yes > > Fort have complex16: yes > > Fort have complex32: no > > Fort integer1 size: 1 > > Fort integer2 size: 2 > > Fort integer4 size: 4 > > Fort integer8 size: 8 > > Fort integer16 size: -1 > > Fort real size: 4 > > Fort real4 size: 4 > > Fort real8 size: 8 > > Fort real16 size: -1 > > Fort dbl prec size: 4 > > Fort cplx size: 4 > > Fort dbl cplx size: 4 > > Fort cplx8 size: 8 > > Fort cplx16 size: 16 > > Fort cplx32 size: -1 > > Fort integer align: 4 > > Fort integer1 align: 1 > > Fort integer2 align: 2 > > Fort integer4 align: 4 > > Fort integer8 align: 8 > > Fort integer16 align: -1 > > Fort real align: 4 > > Fort real4 align: 4 > > Fort real8 align: 8 > >Fort real16 align: -1 > > Fort dbl prec
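To follow up on the OFED question, a few commands that report the local stack for comparison with whatever the ORCA authors built against (ofed_info exists only on OFED-packaged systems, and the grep patterns are illustrative):

    # Open MPI version installed locally:
    ompi_info | grep "Open MPI:"

    # OFED release, when the OFED meta-package is installed:
    ofed_info | head -1

    # Confirm the HCA and its ports are visible and active:
    ibv_devinfo | grep -E "hca_id|state"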