Re: [OMPI users] Passing LD_LIBRARY_PATH to orted
Gus Correa wrote:
Hi Craig, George, list. Here is a quick and dirty solution I used before for a similar problem: link the Intel libraries statically, using the "-static-intel" flag. Other shared libraries continue to be dynamically linked. For instance:

    mpif90 -static-intel my_mpi_program.f90

What is not clear to me is why you would use orted instead of mpirun/mpiexec/orterun, which has a mechanism to pass environment variables to the hosts with "-x LD_LIBRARY_PATH=/my/intel/lib". I hope this helps.
Gus Correa

I am not calling orted directly. I am using mpirun. Mpirun launches orted on each node, and orted passes the LD_LIBRARY_PATH to the specified application. But mpirun does not pass LD_LIBRARY_PATH to orted itself, so orted fails to launch.

Craig
--
Craig Tierney (craig.tier...@noaa.gov)
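As an illustration, the -x mechanism Gus mentions looks like this on the mpirun command line (the hostfile, library path, and program name are placeholders):

    mpirun -np 8 --hostfile my_hosts \
        -x LD_LIBRARY_PATH=/opt/intel/fce/10.1.015/lib \
        ./my_mpi_program

The exported variable reaches the MPI application's environment; as Craig notes above, it does not help orted itself, which must start before the variable is applied.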
Re: [OMPI users] Passing LD_LIBRARY_PATH to orted
George Bosilca wrote:
Craig, this is a problem with the Intel libraries, not the Open MPI ones. You have to somehow make these libraries available on the compute nodes. What I usually do (though it's not the best way to solve this problem) is to copy these libraries somewhere in my home area and add that directory to my LD_LIBRARY_PATH.
george.

This is ok when you only ever use one compiler, but it isn't very flexible. I want to keep things as simple as possible for my users while having a maintainable system. The libraries are on the compute nodes; the problem is supporting multiple versions of compilers. I can't just list all of the lib paths in ld.so.conf, because then the user will never get the correct one. I can't specify a static LD_LIBRARY_PATH for the same reason. And I would prefer not to build my system libraries statically.

To the OpenMPI developers: what is your opinion on changing orterun/mpirun to pass LD_LIBRARY_PATH to the remote hosts when starting OpenMPI processes? By hand, all that would be done is:

    env LD_LIBRARY_PATH=$LD_LIBRARY_PATH $OMPIPATH/orted

This would ensure that orted is launched correctly. Or is it better to just build the OpenMPI tools statically? We also use other compilers (PGI, Lahey), so I need a solution that works for all of them.

Thanks,
Craig

On Oct 10, 2008, at 6:17 PM, Craig Tierney wrote:
I am having problems launching openmpi jobs on my system. I support multiple versions of MPI and compilers using GNU Modules. For the default compiler, everything is fine. For non-default compilers, I am having problems. I built Openmpi-1.2.6 (and 1.2.7) with the following configure options:

    # module load intel/10.1
    # ./configure CC=icc CXX=icpc F77=ifort FC=ifort F90=ifort \
        --prefix=/opt/openmpi/1.2.7-intel-10.1 --without-gridengine \
        --enable-io-romio --with-io-romio-flags=--with-file-sys=nfs+ufs \
        --with-openib=/opt/hjet/ofed/1.3.1

When I launch a job, I run the module command for the right compiler/MPI version to set the paths correctly. Mpirun passes LD_LIBRARY_PATH to the executable I am launching, but not to orted. When orted is launched on the remote system, the LD_LIBRARY_PATH doesn't come with it, and the Intel 10.1 libraries can't be found:

    /opt/openmpi/1.2.7-intel-10.1/bin/orted: error while loading shared libraries: libintlc.so.5: cannot open shared object file: No such file or directory

How do others solve this problem?

Thanks,
Craig
--
Craig Tierney (craig.tier...@noaa.gov)
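One way to get this behavior by hand with an unmodified build is to point the rsh/ssh launcher at a wrapper that re-exports the environment. This is only a sketch: it assumes the 1.2-series MCA parameter is named pls_rsh_agent (check with `ompi_info --param pls rsh`), and all paths are examples.

    # wrapper that carries the submitting node's LD_LIBRARY_PATH along
    cat > ~/bin/ssh-with-env <<'EOF'
    #!/bin/sh
    # mpirun invokes the agent as: <agent> <hostname> <orted command ...>
    host=$1; shift
    exec ssh "$host" env LD_LIBRARY_PATH="$LD_LIBRARY_PATH" "$@"
    EOF
    chmod +x ~/bin/ssh-with-env

    mpirun --mca pls_rsh_agent ~/bin/ssh-with-env -np 8 ./my_mpi_program

Because the wrapper runs on the submitting node, $LD_LIBRARY_PATH expands to whatever the module system set there, so orted starts with it already in place. The head-node-versus-compute-node objection George raises below still applies.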
Re: [OMPI users] Passing LD_LIBRARY_PATH to orted
The option to expand the remote LD_LIBRARY_PATH, in such a way that Open MPI related applications have their dependencies satisfied, is in the trunk. The fact that the compiler requires some LD_LIBRARY_PATH is out of the scope of an MPI implementation, and I don't think we should take care of it.

Passing the local LD_LIBRARY_PATH to the remote nodes doesn't make much sense: there are plenty of environments where the head node has a different configuration than the compute nodes. Again, in this case my original solution seems not that bad. If you copy the compiler shared libraries into the Open MPI lib directory (or symlink them, if you prefer), this will work.

george.
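As a concrete sketch of that (the Intel library directory is an example and varies by compiler version; it assumes orted resolves libraries from the Open MPI libdir via its rpath or an existing search path):

    # expose the compiler runtime from the Open MPI lib directory
    for lib in /opt/intel/fce/10.1.015/lib/*.so*; do
        ln -s "$lib" /opt/openmpi/1.2.7-intel-10.1/lib/
    done

The maintenance cost Craig raises below is that this must be repeated for every Open MPI build.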
Re: [OMPI users] Passing LD_LIBRARY_PATH to orted
George Bosilca wrote:
Passing the local LD_LIBRARY_PATH to the remote nodes doesn't make much sense: there are plenty of environments where the head node has a different configuration than the compute nodes. [...] If you copy the compiler shared libraries into the Open MPI lib directory (or symlink them, if you prefer), this will work.

This does work. It just increases maintenance for each new version of OpenMPI. And how often does a head node actually have a different configuration than the compute nodes? It would seem that such a case argues even more for passing LD_LIBRARY_PATH to the OpenMPI tools, precisely to support the heterogeneous configuration you described.

Thanks,
Craig
--
Craig Tierney (craig.tier...@noaa.gov)
Re: [OMPI users] Passing LD_LIBRARY_PATH to orted
You might consider using something like "module" - we use that system for exactly this reason. Works quite well and solves the multiple compiler issue.

Ralph
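A typical module-based job script then looks something like this (the init-script path and module names are site-specific examples):

    #!/bin/sh
    # pick a compiler and the matching MPI stack, then launch
    . /etc/profile.d/modules.sh
    module load intel/10.1 openmpi/1.2.7-intel-10.1
    mpirun -np 16 ./my_mpi_program

The catch, raised just below, is that this only sets up the environment on the node where the script itself runs.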
[OMPI users] Open MPI v1.2.8 released
The Open MPI Team, representing a consortium of research, academic, and industry partners, is pleased to announce the release of Open MPI version 1.2.8. This release is mainly a bug fix release over the v1.2.7 release, but there are few new features. We strongly recommend that all users upgrade to version 1.2.8 if possible. Version 1.2.8 can be downloaded from the main Open MPI web site or any of its mirrors (mirrors will be updating shortly).

Here is a list of changes in v1.2.8 as compared to v1.2.7:

- Tweaked one memory barrier in the openib component to be more conservative. May fix a problem observed on PPC machines. See ticket #1532.
- Fix OpenFabrics IB partition support. See ticket #1557.
- Restore v1.1 feature that sourced .profile on remote nodes if the default shell will not do so (e.g. /bin/sh and /bin/ksh). See ticket #1560.
- Fix segfault in MPI_Init_thread() if ompi_mpi_init() fails. See ticket #1562.
- Adjust SLURM support to first look for $SLURM_JOB_CPUS_PER_NODE instead of the deprecated $SLURM_TASKS_PER_NODE environment variable. This change may be *required* when using SLURM v1.2 and above. See ticket #1536.
- Fix the MPIR_Proctable to be in process rank order. See ticket #1529.
- Fix a regression introduced in 1.2.6 for the IBM eHCA. See ticket #1526.

--
Tim Mattox, Ph.D.
Open Systems Lab
Indiana University
Re: [OMPI users] Passing LD_LIBRARY_PATH to orted
I use modules too, but they only work locally. Or is there a feature in "module" to automatically load the list of currently loaded local modules remotely?

george.
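For what it's worth, the module system does record the loaded set in the colon-separated $LOADEDMODULES variable, so the list can at least be replayed by hand. A rough sketch (the modules init path is a guess for the remote side, and the hostname is an example):

    # re-load the locally loaded modules on a remote host
    mods=$(echo "$LOADEDMODULES" | tr ':' ' ')
    ssh node01 ". /etc/profile.d/modules.sh && module load $mods && env | grep LD_LIBRARY_PATH"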
Re: [OMPI users] Passing LD_LIBRARY_PATH to orted
I -think- there is...at least here, it does seem to behave that way on our systems. Not sure if there is something done locally to make it work.

Also, though, I have noted that LD_LIBRARY_PATH does seem to be getting forwarded on the 1.3 branch in some environments. OMPI isn't doing it directly to the best of my knowledge, but I think the base environment might be. Specifically, I noticed it on slurm earlier today. I'll check the others as far as I can.

Craig: what environment are you using? ssh?

Ralph
Re: [OMPI users] Passing LD_LIBRARY_PATH to orted
In Torque/PBS, the #PBS -V directive pushes the submitting shell's environment variables out to the job on the nodes. I don't know if that is what was happening with slurm.

Doug Reeder
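In a Torque/PBS job script, that looks like this (a minimal sketch; the resource request and program name are examples):

    #!/bin/sh
    #PBS -V                  # export the submitting shell's environment to the job
    #PBS -l nodes=2:ppn=4
    cd $PBS_O_WORKDIR
    mpirun ./my_mpi_program

Note that -V exports the environment to the job script and the MPI processes; whether it reaches orted still depends on how the remote processes are actually launched.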
Re: [OMPI users] Passing LD_LIBRARY_PATH to orted
Ralph Castain wrote:
You might consider using something like "module" - we use that system for exactly this reason. Works quite well and solves the multiple compiler issue.

This is the problem. We use modules to switch compilers/MPI stacks, but when a job is launched, whatever LD_LIBRARY_PATH is set in the current environment is not passed to orted for its own use; it is only given to orted to hand on to the launched executable.

Craig
--
Craig Tierney (craig.tier...@noaa.gov)
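A blunt per-build workaround in this direction is to wrap orted itself so it always restores the matching compiler runtime; a sketch with example paths:

    # rename the real daemon once per install...
    mv /opt/openmpi/1.2.7-intel-10.1/bin/orted \
       /opt/openmpi/1.2.7-intel-10.1/bin/orted.real
    # ...and put a wrapper in its place
    cat > /opt/openmpi/1.2.7-intel-10.1/bin/orted <<'EOF'
    #!/bin/sh
    LD_LIBRARY_PATH=/opt/intel/fce/10.1.015/lib:$LD_LIBRARY_PATH
    export LD_LIBRARY_PATH
    exec /opt/openmpi/1.2.7-intel-10.1/bin/orted.real "$@"
    EOF
    chmod +x /opt/openmpi/1.2.7-intel-10.1/bin/orted

Since each Open MPI build is tied to one compiler anyway, the hard-coded library path can match the build it wraps.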
Re: [OMPI users] Passing LD_LIBRARY_PATH to orted
Ralph Castain wrote:
Craig: what environment are you using? ssh?

We are using ssh (we do not use tight integration in SGE).

Craig
--
Craig Tierney (craig.tier...@noaa.gov)
Re: [OMPI users] Passing LD_LIBRARY_PATH to orted
You might want to look at something like the mpi-selector project that is part of OFED (but is easily separable; it's a small package); it might be helpful to you...?

http://www.openfabrics.org/git/?p=~jsquyres/mpi-selector.git;a=summary
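Basic mpi-selector usage is along these lines (the selection name is an example; double-check the option names against the package's own help):

    mpi-selector --list                # show registered MPI installations
    mpi-selector --set openmpi-1.2.7   # choose the default for future logins
    mpi-selector --query               # show the current selection

It works by hooking shell snippets into the login environment, so like modules it only helps where that login machinery actually runs.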
Re: [OMPI users] Passing LD_LIBRARY_PATH to orted
On Oct 14, 2008, at 11:18 PM, Craig Tierney wrote:
We are using ssh (we do not use tight integration in SGE).

Hi Craig,

may I ask why? You compiled Open MPI without SGE support, although in 1.2.7 it's in by default, AFAIK?

- Reuti
Re: [OMPI users] Passing LD_LIBRARY_PATH to orted
Reuti wrote:
may I ask why? You compiled Open MPI without SGE support, although in 1.2.7 it's in by default, AFAIK?

Only because we don't have it turned on. When we first started using SGE around 2002, we didn't use tight integration, and it has stayed that way. It is on our list of things to do, but it is not trivial to just turn on and validate. We compiled every version of OpenMPI we have used (1.2.4, 1.2.6, and 1.2.7) with --without-gridengine.

Craig
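Incidentally, whether a given build actually has gridengine support compiled in can be checked without rebuilding; in the 1.2 series the components show up in ompi_info output:

    # expect gridengine entries under the ras and pls frameworks when
    # support was compiled in (none with --without-gridengine)
    ompi_info | grep gridengine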
Re: [OMPI users] Passing LD_LIBRARY_PATH to orted
Am 14.10.2008 um 23:39 schrieb Craig Tierney: Reuti wrote: Am 14.10.2008 um 23:18 schrieb Craig Tierney: Ralph Castain wrote: I -think- there is...at least here, it does seem to behave that way on our systems. Not sure if there is something done locally to make it work. Also, though, I have noted that LD_LIBRARY_PATH does seem to be getting forwarded on the 1.3 branch in some environments. OMPI isn't doing it directly to the best of my knowledge, but I think the base environment might be. Specifically, I noticed it on slurm earlier today. I'll check the others as far as I can. Craig: what environment are you using? ssh? Ralph We are using ssh (we do not use tight integration in SGE). Hi Craig, may I ask why? You compiled Open MPI without SGE support, as in 1.2.7 it's in by default AFAIK? - Reuti Only because we don't have it on. When we first started using SGE around 2002, we hadn't used it. It is on our list of things to This was still Codine 5.3 - or already SGE? do, but it is not trivial to just turn on and validate. We compiled all versions of OpenMPI we have used (1.2.4,1.2.6, and 1.2.7) with --without- gridengine. It's built-in and you don't need any special start- or stop_proc_args, just /bin/true will do. It could even be, that copying the CODINE_* to SGE_* might make Open MPI usable with Codine. If you want to set some things for ssh login, you can put the necessary things in ~/.bashrc. -- Reuti Craig Craig On Oct 14, 2008, at 1:18 PM, George Bosilca wrote: I use modules too, but they only work locally. Or is there a feature in "module" to automatically load the list of currently loaded local modules remotely ? george. On Oct 14, 2008, at 3:03 PM, Ralph Castain wrote: You might consider using something like "module" - we use that system for exactly this reason. Works quite well and solves the multiple compiler issue. Ralph On Oct 14, 2008, at 12:56 PM, Craig Tierney wrote: George Bosilca wrote: The option to expand the remote LD_LIBRARY_PATH, in such a way that Open MPI related applications have their dependencies satisfied, is in the trunk. The fact that the compiler requires some LD_LIBRARY_PATH is out of the scope of an MPI implementation, and I don't think we should take care of it. Passing the local LD_LIBRARY_PATH to the remote nodes doesn't make much sense. There are plenty of environment, where the head node have a different configuration than the compute nodes. Again, in this case my original solution seems not that bad. If you copy (or make a link if you prefer) in the Open MPI lib directory to the compiler shared libraries, this will work. george. This does work. It just increases maintenance for each new version of OpenMPI. How often does a head node have a different configuration than the compute node? It would see that this would even more support the passing of LD_LIBRARY_PATH for OpenMPI tools to support a heterogeneous configuration as you described. Thanks, Craig On Oct 14, 2008, at 12:11 PM, Craig Tierney wrote: George Bosilca wrote: Craig, This is a problem with the Intel libraries and not the Open MPI ones. You have to somehow make these libraries available on the compute nodes. What I usually do (but it's not the best way to solve this problem) is to copy these libraries somewhere on my home area and to add the directory to my LD_LIBRARY_PATH. george. This is ok when you only ever use one compiler, but it isn't very flexible. I want to keep it as simple as possible for my users, while having a maintainable system. 
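As a sketch of the behaviour Craig is asking for, without modifying mpirun itself: the rsh/ssh launch agent can be swapped out through an MCA parameter (pls_rsh_agent in the 1.2 series), so a small wrapper can carry the local LD_LIBRARY_PATH across. This assumes the agent is invoked as "<agent> <hostname> <command ...>", that the path contains no spaces, and that the same directories exist on the compute nodes:

    #!/bin/sh
    # ssh-ldpath.sh (hypothetical wrapper) -- run the remote command with
    # the submitting shell's LD_LIBRARY_PATH prepended, so orted can
    # resolve the compiler's shared libraries.
    host=$1
    shift
    # $LD_LIBRARY_PATH expands locally, before ssh ships the command line.
    exec ssh "$host" env LD_LIBRARY_PATH="$LD_LIBRARY_PATH" "$@"

It would then be selected with something like: mpirun -mca pls_rsh_agent /path/to/ssh-ldpath.sh -np 4 ./a.out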
Re: [OMPI users] Passing LD_LIBRARY_PATH to orted
Reuti wrote:

> On 14.10.2008, at 23:39, Craig Tierney wrote:
>> Only because we don't have it on. When we first started using SGE around 2002, we hadn't used it.
>
> This was still Codine 5.3 - or already SGE?

We started with SGE 5.3. It was definitely SGE.

> It's built in, and you don't need any special start_proc_args or stop_proc_args; just /bin/true will do. It could even be that copying the CODINE_* variables to SGE_* would make Open MPI usable with Codine. If you want to set some things for ssh logins, you can put the necessary entries in ~/.bashrc.

Thanks for the tips; I will try it out when I get a chance (and now we are very off topic).

Craig

[...]
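For reference, tight integration along the lines Reuti describes amounts to rebuilding Open MPI without --without-gridengine and defining a parallel environment whose start and stop procedures are just /bin/true. A sketch in qconf -sp format (the PE name is arbitrary, and the exact field set varies with the SGE release):

    pe_name           orte
    slots             999
    user_lists        NONE
    xuser_lists       NONE
    start_proc_args   /bin/true
    stop_proc_args    /bin/true
    allocation_rule   $fill_up
    control_slaves    TRUE
    job_is_first_task FALSE

Jobs request it with qsub -pe orte <slots>, and Open MPI then starts its daemons through qrsh rather than plain ssh, so they run under SGE's control.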
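Finally, a sketch of Reuti's ~/.bashrc tip, which also bears on George's question about modules on remote nodes: bash reads ~/.bashrc for non-interactive commands started over ssh, so loading the required modules there puts the libraries on LD_LIBRARY_PATH before orted starts. The modules.sh location and the module names below are illustrative:

    # ~/.bashrc (sketch) -- read by bash for non-interactive ssh commands,
    # i.e. before a remotely launched orted runs.
    if [ -f /etc/profile.d/modules.sh ]; then
        . /etc/profile.d/modules.sh      # location varies by site
    fi
    # Load the toolchain matching the MPI build in use; names are
    # site-specific.
    module load intel/10.1 openmpi/1.2.7-intel-10.1 2>/dev/null

    # If LOADEDMODULES were forwarded (e.g. via ssh SendEnv/AcceptEnv), the
    # submitting shell's exact module set could be re-created instead:
    #   for m in $(echo "$LOADEDMODULES" | tr ':' ' '); do module load "$m"; done

The limitation is the one Craig already raised: a fixed choice in ~/.bashrc cannot follow whichever compiler/MPI pair a given job actually loaded.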