Re: [OMPI users] mpifort cannot find libgfortran.so at the correct path
Hello Davide,

On 11/28/17 15:58, Vanzo, Davide wrote:
> I am having a very weird problem with mpifort that I cannot understand.
> I am building OpenMPI 1.10.3 with GCC 5.4.0 with EasyBuild and everything
> goes fine until I try to use mpifort to build any MPI Fortran code, which
> fails with the error log you see below.
> The thing I do not understand is why the linker searches for libgfortran.so
> in /usr/lib64 instead of the actual GCC installation path, where the file
> is actually located. Both LD_LIBRARY_PATH and LIBRARY_PATH contain the
> correct path to the library. The gfortran compiler works correctly for
> compiling serial code.
> Attached you can also find the output of the same command with LD_DEBUG=1.
> Any idea what is going wrong?

At link time, the link editor ld still seems to search /usr/lib64 first.

> $ ll /accre/arch/easybuild/software/Core/GCCcore/5.4.0/lib64/libgfortran.so
> lrwxrwxrwx 1 buildbot buildbot 20 Nov 27 18:44 /accre/arch/easybuild/software/Core/GCCcore/5.4.0/lib64/libgfortran.so -> libgfortran.so.3.0.0

So I assume you mean to invoke the gfortran binary located at /accre/arch/easybuild/software/Core/GCCcore/5.4.0/bin/gfortran?

> $ mpifort multitask_mpi.f90

Which gfortran command does "mpifort -show" report? If the first part is simply gfortran, what does "which gfortran" print?

> /accre/arch/easybuild/software/Compiler/GCCcore/5.4.0/binutils/2.26/bin/ld.gold: error: cannot open /usr/lib64/libgfortran.so: No such file or directory

So obviously your build finds something named libgfortran.so in /usr/lib64 but cannot use it. Is it a defunct symbolic link?

> /tmp/ccpSxqE6.o:multitask_mpi.f90:function timestamp_: error: undefined reference to '_gfortran_date_and_time'
> [...]

These are just followup errors from not being able to find the correct libgfortran.

Regards, Thomas

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
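The "defunct symbolic link" question above can be checked directly. The following is only a sketch: `check_lib` is a hypothetical helper name, not part of GCC or Open MPI, and it assumes a POSIX shell. The `-print-file-name` option is a standard GCC driver option and reports which file the compiler driver itself would pick up.

```shell
#!/bin/sh
# check_lib (hypothetical helper): report whether a path is a usable file,
# a dangling symbolic link, or absent altogether.
check_lib() {
    path=$1
    if [ -L "$path" ] && [ ! -e "$path" ]; then
        # -L is true for any symlink; -e follows the link and fails if the target is gone
        echo "dangling symlink"
    elif [ -e "$path" ]; then
        echo "present"
    else
        echo "missing"
    fi
}

check_lib /usr/lib64/libgfortran.so

# Ask the gfortran driver which libgfortran.so it resolves (only if gfortran is on PATH).
if command -v gfortran >/dev/null 2>&1; then
    gfortran -print-file-name=libgfortran.so
fi
```

If the first command reports "missing", the path handed to the linker does not exist at all, which matches the "No such file or directory" error quoted above.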
Re: [OMPI users] OMPI 2.1.2 and SLURM compatibility
Hi Bennet

I suspect the problem here lies in the slurm PMIx plugin. Slurm 17.11 supports PMIx v2.0 as well as (I believe) PMIx v1.2. I’m not sure if slurm is somehow finding one of those on your system and building the plugin or not, but it looks like OMPI is picking up signs of PMIx being active and trying to use it - and hitting an incompatibility.

You can test this out by adding --mpi=pmi2 to your srun cmd line and see if that solves the problem (you may also need to add OMPI_MCA_pmix=s2 to your environment as slurm has a tendency to publish envars even when they aren’t being used).

> On Nov 29, 2017, at 5:44 AM, Bennet Fauber wrote:
>
> Howard,
>
> Thanks very much for the help identifying what information I should provide.
>
> This is some information about our SLURM version
>
> $ srun --mpi list
> srun: MPI types are...
> srun: pmi2
> srun: pmix_v1
> srun: openmpi
> srun: pmix
> srun: none
>
> $ srun --version
> slurm 17.11.0-0rc3
>
> This is the output from my build script, which should show all the
> configure options I used.
>
> Checking compilers and things
> OMPI is ompi
> COMP_NAME is gcc_4_8_5
> SRC_ROOT is /sw/src/arcts
> PREFIX_ROOT is /sw/arcts/centos7/apps
> PREFIX is /sw/arcts/centos7/apps/gcc_4_8_5/openmpi/2.1.2
> CONFIGURE_FLAGS are --disable-dlopen --enable-shared
> COMPILERS are CC=gcc CXX=g++ FC=gfortran F77=gfortran
> No modules loaded
> gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11)
> Copyright (C) 2015 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions. There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>
> ./configure
> --prefix=/sw/arcts/centos7/apps/gcc_4_8_5/openmpi/2.1.2
> --mandir=/sw/arcts/centos7/apps/gcc_4_8_5/openmpi/2.1.2/share/man
> --with-slurm
> --with-pmi
> --with-lustre
> --with-verbs
> --disable-dlopen --enable-shared
> CC=gcc CXX=g++ FC=gfortran F77=gfortran
>
> I remove the build directory and re-expand from the source tarball for
> each build, so there should not be lingering configuration files from
> prior trials.
>
> Here is the output of
>
> ompi_info | grep pmix
>
>   MCA pmix: s2 (MCA v2.1.0, API v2.0.0, Component v2.1.2)
>   MCA pmix: s1 (MCA v2.1.0, API v2.0.0, Component v2.1.2)
>   MCA pmix: pmix112 (MCA v2.1.0, API v2.0.0, Component v2.1.2)
>   MCA pmix base: ---
>   MCA pmix base: parameter "pmix" (current value: "", data source: default, level: 2 user/detail, type: string)
>                  Default selection set of components for the pmix framework (<none> means use all components that can be found)
>   MCA pmix base: ---
>   MCA pmix base: parameter "pmix_base_verbose" (current value: "error", data source: default, level: 8 dev/detail, type: int)
>                  Verbosity level for the pmix framework (default: 0)
>   MCA pmix base: parameter "pmix_base_async_modex" (current value: "false", data source: default, level: 9 dev/all, type: bool)
>   MCA pmix base: parameter "pmix_base_collect_data" (current value: "true", data source: default, level: 9 dev/all, type: bool)
>   MCA pmix s2: ---
>   MCA pmix s2: parameter "pmix_s2_priority" (current value: "20", data source: default, level: 9 dev/all, type: int)
>                Priority of the pmix s2 component (default: 20)
>   MCA pmix s1: ---
>   MCA pmix s1: parameter "pmix_s1_priority" (current value: "10", data source: default, level: 9 dev/all, type: int)
>                Priority of the pmix s1 component (default: 10)
>
> I also attach the hello-mpi.c file I am using as a test. I compiled it using
>
> $ mpicc -o hello-mpi hello-mpi.c
>
> and this is the information about the actual compile command
>
> $ mpicc --showme -o hello-mpi hello-mpi.c
> gcc -o hello-mpi hello-mpi.c
> -I/sw/arcts/centos7/apps/gcc_4_8_5/openmpi/2.1.2/include -pthread
> -L/usr/lib64 -Wl,-rpath -Wl,/usr/lib64 -Wl,-rpath
> -Wl,/sw/arcts/centos7/apps/gcc_4_8_5/openmpi/2.1.2/lib
> -Wl,--enable-new-dtags
> -L/sw/arcts/centos7/apps/gcc_4_8_5/openmpi/2.1.2/lib -lmpi
>
> I use some variation on the following submit script
>
> test.slurm
> -
> $ cat test.slurm
> #!/bin/bash
> #SBATCH -J JOBNAME
> #SBATCH --mail-user=ben...@umich.edu
> #SBATCH --mail-type=NONE
>
> #SBATCH -N 2
> #SBATCH --ntasks-per-node=1
> #SBATCH --mem-per-cpu=1g
> #SBATCH --cpus-per-task=1
> #SBATCH -A hpcstaff
> #SBATCH -p standard
>
> #Your code here
>
> cd /home/bennet/hello
> srun ./hello-mpi
> -
>
> The results are attached as slurm-114.out, where it looks to me like
> it is try
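The test suggested at the top of this message (force PMI2 on the Slurm side and pin Open MPI to its s2 component) could be sketched as follows. `run_pmi2` is a hypothetical wrapper name; only `--mpi=pmi2` and `OMPI_MCA_pmix=s2` come from the message itself. The wrapper falls back to printing the command on machines without srun.

```shell
#!/bin/sh
# Select Open MPI's "s2" (PMI2) pmix component through the environment.
OMPI_MCA_pmix=s2
export OMPI_MCA_pmix

# run_pmi2 (hypothetical wrapper): launch a program through Slurm's PMI2 plugin,
# or only print the command when srun is not installed.
run_pmi2() {
    if command -v srun >/dev/null 2>&1; then
        srun --mpi=pmi2 "$@"
    else
        echo "srun not found; would run: srun --mpi=pmi2 $*"
    fi
}
```

In the submit script quoted above, the last line would then become `run_pmi2 ./hello-mpi`, or simply `srun --mpi=pmi2 ./hello-mpi` after the export.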
Re: [OMPI users] mpifort cannot find libgfortran.so at the correct path
Thank you Åke, Reuti and Thomas for your replies.

Just to clarify. The reason why /usr/lib64/libgfortran.so does not exist is intentional because on our cluster we use a minimal CentOS installation and all libraries are provided through the software stack built via EasyBuild on a non-system path.

The output of `mpifort -show` shows the correct executable for `gfortran` (I checked with `which gfortran`):

$ mpifort multitask_mpi.f90 -show
gfortran multitask_mpi.f90 -I/accre/arch/easybuild/software/Compiler/GCC/5.4.0-2.26/OpenMPI/1.10.3/include -I/accre/arch/easybuild/software/Compiler/GCC/5.4.0-2.26/OpenMPI/1.10.3/lib -L/usr/lib64 -Wl,-rpath -Wl,/usr/lib64 -Wl,-rpath -Wl,/accre/arch/easybuild/software/Compiler/GCC/5.4.0-2.26/OpenMPI/1.10.3/lib -Wl,--enable-new-dtags -L/accre/arch/easybuild/software/Compiler/GCC/5.4.0-2.26/OpenMPI/1.10.3/lib -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi

If I try to compile the source code with the explicit command after removing the `-L/usr/lib64` flag it compiles, links and executes correctly.

For the sake of testing I have built OpenMPI in the same exact way on another machine with the same minimal OS. The only difference (at least that I am aware of) is the root prefix of the installation path. In this case it works correctly even without removing `-L/usr/lib64`.

To try to better understand, I have launched `mpifort -v`. Below you can find the paths in COMPILER_PATH and LIBRARY_PATH for both systems.
Take into account that on the broken system `/accre/arch` is a symlink to `/gpfs22/accre/optimized/haswell`. But I do not see how this can be the problem...

Working system:

COMPILER_PATH
/accre/arch/easybuild/software/Core/GCCcore/5.4.0/libexec/gcc/x86_64-unknown-linux-gnu/5.4.0/
/accre/arch/easybuild/software/Core/GCCcore/5.4.0/libexec/gcc/x86_64-unknown-linux-gnu/5.4.0/
/accre/arch/easybuild/software/Core/GCCcore/5.4.0/libexec/gcc/x86_64-unknown-linux-gnu/
/accre/arch/easybuild/software/Core/GCCcore/5.4.0/lib/gcc/x86_64-unknown-linux-gnu/5.4.0/
/accre/arch/easybuild/software/Core/GCCcore/5.4.0/lib/gcc/x86_64-unknown-linux-gnu/

LIBRARY_PATH
/accre/arch/easybuild/software/Core/GCCcore/5.4.0/lib64/../lib64/
/accre/arch/easybuild/software/Core/GCCcore/5.4.0/lib/../lib64/
/accre/arch/easybuild/software/Core/GCCcore/5.4.0/lib/gcc/x86_64-unknown-linux-gnu/5.4.0/
/accre/arch/easybuild/software/Core/GCCcore/5.4.0/lib/gcc/x86_64-unknown-linux-gnu/5.4.0/../../../../lib64/
/lib/../lib64/
/usr/lib/../lib64/
/accre/arch/easybuild/software/Compiler/GCC/5.4.0-2.26/OpenMPI/1.10.3/lib/
/accre/arch/easybuild/software/Compiler/GCC/5.4.0-2.26/hwloc/1.11.3/lib/
/accre/arch/easybuild/software/Compiler/GCCcore/5.4.0/numactl/2.0.11/lib/
/accre/arch/easybuild/software/Compiler/GCCcore/5.4.0/binutils/2.26/lib/
/accre/arch/easybuild/software/Core/GCCcore/5.4.0/lib64/
/accre/arch/easybuild/software/Core/GCCcore/5.4.0/lib/
/accre/arch/easybuild/software/Core/GCCcore/5.4.0/lib/gcc/x86_64-unknown-linux-gnu/5.4.0/../../../
/lib/
/usr/lib/

Broken system:

COMPILER_PATH
/gpfs22/accre/optimized/haswell/easybuild/software/Core/GCCcore/5.4.0/bin/../libexec/gcc/x86_64-unknown-linux-gnu/5.4.0/
/gpfs22/accre/optimized/haswell/easybuild/software/Core/GCCcore/5.4.0/bin/../libexec/gcc/

LIBRARY_PATH
/gpfs22/accre/optimized/haswell/easybuild/software/Core/GCCcore/5.4.0/bin/../lib/gcc/x86_64-unknown-linux-gnu/5.4.0/
/gpfs22/accre/optimized/haswell/easybuild/software/Core/GCCcore/5.4.0/bin/../lib/gcc/
/accre/arch/easybuild/software/Core/GCCcore/5.4.0/lib64/../lib64/
/accre/arch/easybuild/software/Core/GCCcore/5.4.0/lib/../lib64/
/gpfs22/accre/optimized/haswell/easybuild/software/Core/GCCcore/5.4.0/bin/../lib/gcc/x86_64-unknown-linux-gnu/5.4.0/../../../../lib64/
/lib/../lib64/
/usr/lib/../lib64/
/accre/arch/easybuild/software/Compiler/GCC/5.4.0-2.26/OpenMPI/1.10.3/lib/
/accre/arch/easybuild/software/Compiler/GCC/5.4.0-2.26/hwloc/1.11.3/lib/
/accre/arch/easybuild/software/Compiler/GCCcore/5.4.0/numactl/2.0.11/lib/
/accre/arch/easybuild/software/Compiler/GCCcore/5.4.0/binutils/2.26/lib/
/accre/arch/easybuild/software/Core/GCCcore/5.4.0/lib64/
/accre/arch/easybuild/software/Core/GCCcore/5.4.0/lib/
/gpfs22/accre/optimized/haswell/easybuild/software/Core/GCCcore/5.4.0/bin/../lib/gcc/x86_64-unknown-linux-gnu/5.4.0/../../../
/lib/
/usr/lib/

--
Davide Vanzo, PhD
Application Developer
Adjunct Assistant Professor of Chemical and Biomolecular Engineering
Advanced Computing Center for Research and Education (ACCRE)
Vanderbilt University - Hill Center 201
(615)-875-9137
www.accre.vanderbilt.edu
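The manual workaround described in this message (take the command printed by `mpifort -show`, drop `-L/usr/lib64`, and run the rest) could be scripted roughly as below. This is a sketch under assumptions: `strip_usr_lib64` is a hypothetical helper, and it assumes no argument in the printed command contains spaces. It removes only the stand-alone `-L/usr/lib64` token and leaves the `-Wl,/usr/lib64` rpath entry untouched.

```shell
#!/bin/sh
# strip_usr_lib64 (hypothetical helper): read a command line on stdin and
# print it without the exact token -L/usr/lib64.
strip_usr_lib64() {
    # one token per line, drop exact matches, join the tokens back with spaces
    tr ' ' '\n' | grep -v -x -F -e '-L/usr/lib64' | paste -s -d ' ' -
}

# Only attempt the build where the wrapper and the source file actually exist.
if command -v mpifort >/dev/null 2>&1 && [ -f multitask_mpi.f90 ]; then
    cmd=$(mpifort multitask_mpi.f90 -show | strip_usr_lib64)
    echo "running: $cmd"
    eval "$cmd"   # eval is acceptable here only because the input is the wrapper's own output
fi
```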
Re: [OMPI users] mpifort cannot find libgfortran.so at the correct path
FWIW, adding -L/usr/lib or -L/usr/lib64 is generally considered Bad, because it may usurp the default linker path order.

I note that you're using Open MPI 1.10.3 -- if you're unwilling/unable to upgrade to Open MPI 3.0.x, could you upgrade to Open MPI 1.10.7? We may well have fixed the issue in that time (i.e., do not have mpifort add -L/usr/lib64 to the command line).

> On Nov 29, 2017, at 11:30 AM, Vanzo, Davide wrote:
>
> [...]
Re: [OMPI users] mpifort cannot find libgfortran.so at the correct path
Jeff,

Thanks for your feedback.
Although tempting, changing the version of OpenMPI would mean a significant amount of changes in our software stack. Hence I would like to find out what the problem is and hopefully its solution.

Where is the -L/usr/lib64 injected? Is there a way to patch the code so that it does not get added to the list of options to gfortran?

--
Davide Vanzo, PhD

On 2017-11-29 14:31:48-06:00 Jeff Squyres (jsquyres) wrote:
> [...]
Re: [OMPI users] mpifort cannot find libgfortran.so at the correct path
On Nov 29, 2017, at 4:51 PM, Vanzo, Davide wrote:
>
> Although tempting, changing the version of OpenMPI would mean a significant
> amount of changes in our software stack.

Understood.

FWIW: the only differences between 1.10.3 and 1.10.7 were bug fixes (including, I'm assuming -- I haven't tested myself -- this -L issue). Hypothetically, it should be a fairly painless upgrade.

> Hence I would like to find out what the problem is and hopefully its solution.
>
> Where is the -L/usr/lib64 injected? Is there a way to patch the code so that
> it does not get added to the list of options to gfortran?

It's injected pretty deep inside configure.

We might be able to spelunk through the git logs to find the commit that fixes this issue and you could apply that as a patch, but it might be easier to just manually patch up the wrapper compiler data file after the build.

Specifically, it looks like OMPI 1.10.3 is installing faulty values in $prefix/share/openmpi/*-wrapper-data.txt. You can easily edit these files directly and remove the erroneous -L/usr/lib64. If you're unable to upgrade to 1.10.7, patching the installed *-wrapper-data.txt files is probably your best bet.

--
Jeff Squyres
jsquy...@cisco.com
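The post-install edit described above could be automated along these lines. This is a minimal sketch, not an official Open MPI procedure: `patch_wrapper_data` and the `OMPI_PREFIX` variable are hypothetical names, and the exact layout of the wrapper data files is assumed to be plain text with space-separated flags. Each file is backed up before it is rewritten.

```shell
#!/bin/sh
# patch_wrapper_data (hypothetical helper): remove the stand-alone -L/usr/lib64
# flag from every *-wrapper-data.txt file in the given directory.
patch_wrapper_data() {
    dir=$1
    for f in "$dir"/*-wrapper-data.txt; do
        [ -f "$f" ] || continue          # glob matched nothing
        cp "$f" "$f.orig"                # keep a backup of the installed file
        # drop the flag when followed by a space or at end of line, so that
        # longer paths such as -L/usr/lib64/foo are left alone
        sed -e 's#-L/usr/lib64 ##g' -e 's#-L/usr/lib64$##' "$f.orig" > "$f"
    done
}

# OMPI_PREFIX is an assumed variable pointing at the Open MPI install prefix.
if [ -n "${OMPI_PREFIX:-}" ]; then
    patch_wrapper_data "$OMPI_PREFIX/share/openmpi"
fi
```

Running `mpifort -show` afterwards should confirm whether the flag is gone from the generated command line.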
Re: [OMPI users] mpifort cannot find libgfortran.so at the correct path
Jeff,

Thanks for pointing me in the right direction. I have finally figured out what the problem is.

On the cluster we install Slurm via RPMs and the PMI/PMI2 libraries are in /usr/lib64. Hence the -L/usr/lib64 flag is the effect of the --with-pmi=/usr configure flag. The good thing is that even when the flag is omitted, the final binary is correctly linked to the PMI libraries.

The build worked on the other system I tested because Slurm is not installed there.

--
Davide Vanzo, PhD

On 2017-11-29 16:07:04-06:00 Jeff Squyres (jsquyres) wrote:
> [...]
Re: [OMPI users] mpifort cannot find libgfortran.so at the correct path
Ah. You might also try just --with-pmi (instead of --with-pmi=/usr). That might avoid adding the errant -L/usr/lib64 to the wrapper data files.

Our configure nomenclature is (from README):

-
Note that for many of Open MPI's --with-<foo> options, Open MPI will, by default, search for header files and/or libraries for <foo>. If the relevant files are found, Open MPI will build support for <foo>; if they are not found, Open MPI will skip building support for <foo>. However, if you specify --with-<foo> on the configure command line and Open MPI is unable to find relevant support for <foo>, configure will assume that it was unable to provide a feature that was specifically requested and will abort so that a human can resolve out the issue.
-

> On Nov 29, 2017, at 5:25 PM, Vanzo, Davide wrote:
>
> [...]

--
Jeff Squyres
jsquy...@cisco.com
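After rebuilding with plain --with-pmi, one way to confirm that the errant flag is gone is to scan the installed wrapper data files. The sketch below uses a hypothetical helper, `find_errant_flag`, and assumes the files are plain text in which flags are separated by spaces or follow an `=` sign; it prints the name of every file that still contains the exact token.

```shell
#!/bin/sh
# find_errant_flag (hypothetical helper): list wrapper data files in a directory
# that still contain the exact token -L/usr/lib64.
find_errant_flag() {
    dir=$1
    for f in "$dir"/*-wrapper-data.txt; do
        [ -f "$f" ] || continue
        # split on spaces and '=' so the flag is compared as a whole token
        if tr ' =' '\n\n' < "$f" | grep -q -x -F -e '-L/usr/lib64'; then
            echo "$f"
        fi
    done
}
```

For example, `find_errant_flag "$prefix/share/openmpi"` printing nothing would suggest the wrappers no longer inject the flag, where `$prefix` stands for the installation prefix passed to configure.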