Re: [OMPI users] Segfault on any MPI communication on head node

2011-09-24 Thread Jeff Squyres
Are you running the same OS version and Open MPI version between the head node and regular nodes? On Sep 23, 2011, at 5:27 PM, Vassenkov, Phillip wrote: > Hey all, > I’ve been racking my brains over this for several days and was hoping anyone > could enlighten me. I’ll describe only the relevan
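A minimal point-to-point test along the following lines is a common way to check whether basic MPI communication works between the head node and a compute node. This is only a sketch, not from the thread; the hostnames in the comments are placeholders.

    /* pingpong.c - minimal point-to-point test (illustrative sketch only).
     * Build:  mpicc pingpong.c -o pingpong
     * Run across the head node and one compute node, e.g.:
     *   mpirun -np 2 --host headnode,node01 ./pingpong   (hostnames are placeholders)
     */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, value = 42;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("rank 0: round trip completed, value = %d\n", value);
        } else if (rank == 1) {
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        }

        MPI_Finalize();
        return 0;
    }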

Re: [OMPI users] PATH settings

2011-09-24 Thread Jeff Squyres
On Sep 22, 2011, at 11:06 PM, Martin Siegert wrote: > I am trying to figure out how openmpi (1.4.3) sets its PATH > for executables. From the man page: > > Locating Files >If no relative or absolute path is specified for a file, Open MPI will >first look for files by searching the d

Re: [OMPI users] unresolvable R_X86_64_64 relocation against symbol `mpi_fortran_*

2011-09-24 Thread Jeff Squyres
How does the target application compile / link itself? Try running "file" on the Open MPI libraries and/or your target application .o files to see what their bitness is, etc. On Sep 22, 2011, at 3:15 PM, Dmitry N. Mikushin wrote: > Hi Jeff, > > You're right because I also tried 1.4.3, and it'

Re: [OMPI users] unresolvable R_X86_64_64 relocation against symbol `mpi_fortran_*

2011-09-24 Thread Dmitry N. Mikushin
Hi Jeff, Today I've verified this application on Fedora 15 x86_64, where I'm usually building OpenMPI from source using the same method. Result: no link errors there! So, the issue is likely Ubuntu-specific. The target application is compiled and linked with mpif90 pointing to /opt/openmpi_gcc-1.5.4/

Re: [OMPI users] unresolvable R_X86_64_64 relocation against symbol `mpi_fortran_*

2011-09-24 Thread Jeff Squyres
Can you compile / link simple OMPI applications without this problem? On Sep 24, 2011, at 7:54 AM, Dmitry N. Mikushin wrote: > Hi Jeff, > > Today I've verified this application on Fedora 15 x86_64, where > I'm usually building OpenMPI from source using the same method. > Result: no link erro

Re: [OMPI users] unresolvable R_X86_64_64 relocation against symbol `mpi_fortran_*

2011-09-24 Thread Dmitry N. Mikushin
Compile and link - yes, but it turns out there was some unnoticed compilation error, because ./hellompi fails with: error while loading shared libraries: libmpi_f77.so.1: cannot open shared object file: No such file or directory -- and this library does not exist. Hm. 2011/9/24 Jeff Squyres : > Can you compil

Re: [OMPI users] unresolvable R_X86_64_64 relocation against symbol `mpi_fortran_*

2011-09-24 Thread Jeff Squyres
Check the output from when you ran Open MPI's configure and "make all" -- did it decide to build the F77 interface? Also check that gcc and gfortran output .o files of the same bitness / type. On Sep 24, 2011, at 8:07 AM, Dmitry N. Mikushin wrote: > Compile and link - yes, but it turns out the

Re: [OMPI users] on cluster job slowdown near end

2011-09-24 Thread Jeff Squyres
You might want to run some profiling / timing to see what parts of the application start running slower over time. Also check for memory leaks. On Sep 22, 2011, at 5:44 PM, Tom Hilinski wrote: > Hi, A job I am running slows down as it approaches the end. I'd > appreciate any ideas you may have
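One low-tech way to do such timing is to bracket each phase of the main loop with MPI_Wtime() and watch whether a particular phase grows over the iterations. A sketch only; the phase functions below are placeholders, not from the thread.

    /* Per-iteration timing with MPI_Wtime() to spot phases that slow down.
     * compute_step() and exchange_halos() are placeholder names. */
    #include <mpi.h>
    #include <stdio.h>

    void compute_step(void)   { /* application work would go here */ }
    void exchange_halos(void) { /* application communication would go here */ }

    int main(int argc, char **argv)
    {
        int rank, iter;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        for (iter = 0; iter < 100; iter++) {
            double t0 = MPI_Wtime();
            compute_step();
            double t1 = MPI_Wtime();
            exchange_halos();
            double t2 = MPI_Wtime();

            if (rank == 0)
                printf("iter %3d: compute %.3fs, exchange %.3fs\n",
                       iter, t1 - t0, t2 - t1);
        }

        MPI_Finalize();
        return 0;
    }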

Re: [OMPI users] problems with Intel 12.x compilers and OpenMPI (1.4.3)

2011-09-24 Thread Jeff Squyres
As a pure guess, it might actually be this one: - Fix to detect and avoid overlapping memcpy(). Thanks to Francis Pellegrini for identifying the issue. We're actually very close to releasing 1.4.4 -- using the latest RC should be pretty safe. On Sep 23, 2011, at 5:51 AM, Paul Kapinos wr
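For context, the changelog entry refers to the general C rule that memcpy() on overlapping buffers is undefined behavior; memmove() is the overlap-safe call. The following illustrates the rule itself, not Open MPI's actual fix.

    /* Illustration of the general issue only -- not Open MPI's code.
     * memcpy() has undefined behavior when source and destination overlap;
     * memmove() handles overlap correctly. */
    #include <string.h>
    #include <stdio.h>

    int main(void)
    {
        char buf[] = "abcdefgh";

        /* Shift "cdefgh" two places to the left; the regions overlap. */
        memmove(buf, buf + 2, strlen(buf + 2) + 1);   /* safe */
        /* memcpy(buf, buf + 2, ...) would be undefined behavior here. */

        printf("%s\n", buf);   /* prints "cdefgh" */
        return 0;
    }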

Re: [OMPI users] Trouble compiling 1.4.3 with PGI 10.9 compilers

2011-09-24 Thread Jeff Squyres
Just out of curiosity, does Open MPI 1.5.4 build properly? We've seen problems with the PGI compiler suite before -- it *does* look like a libtool-related build issue; e.g., a switch that is too old, missing, or similar. Meaning: it looks like PGI thinks it's trying to build an appli

Re: [OMPI users] custom sparse collective non-reproducible deadlock, MPI_Sendrecv, MPI_Isend/MPI_Irecv or MPI_Send/MPI_Recv question

2011-09-24 Thread Jeff Squyres
Some random points: 1. Are your counts ever 0? In principle, method 1 should be fine. With blocking, I *think* you should also be fine, but I haven't thought hard about this -- I have a nagging feeling that there might be a possibility of deadlock in there, but I could be wrong. 2. I
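For the nonblocking variant discussed above, a common pattern is to post every MPI_Irecv and MPI_Isend up front and complete them with a single MPI_Waitall, which avoids the ordering-dependent deadlocks that blocking MPI_Send/MPI_Recv exchanges can run into. A sketch with placeholder neighbor lists and counts (a count of 0 is legal; the message simply carries no data):

    /* Sketch of a sparse neighbor exchange with nonblocking calls.
     * Neighbor lists, counts, and buffers are placeholders. */
    #include <mpi.h>
    #include <stdlib.h>

    void sparse_exchange(int nneigh, const int *neighbors,
                         const int *sendcounts, double **sendbufs,
                         const int *recvcounts, double **recvbufs,
                         MPI_Comm comm)
    {
        MPI_Request *reqs = malloc(2 * nneigh * sizeof(MPI_Request));
        int i, n = 0;

        /* Post all receives, then all sends; completion order is handled
         * by MPI_Waitall, so no rank blocks waiting on a partner that has
         * not yet reached its matching call. */
        for (i = 0; i < nneigh; i++)
            MPI_Irecv(recvbufs[i], recvcounts[i], MPI_DOUBLE,
                      neighbors[i], 0, comm, &reqs[n++]);
        for (i = 0; i < nneigh; i++)
            MPI_Isend(sendbufs[i], sendcounts[i], MPI_DOUBLE,
                      neighbors[i], 0, comm, &reqs[n++]);

        MPI_Waitall(n, reqs, MPI_STATUSES_IGNORE);
        free(reqs);
    }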

Re: [OMPI users] freezing in mpi_allreduce operation

2011-09-24 Thread Jeff Squyres
Holy crimminey, I'm totally lost in your Fortran syntax. :-) What you describe might be a bug in our MPI_IN_PLACE handling for MPI_ALLREDUCE. Could you possibly make a small test case that a) we can run, and b) uses straightforward Fortran? (avoid using terms like "assumed shape" and "assumed
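The report concerns Fortran, but for reference the MPI_IN_PLACE convention under discussion looks like this in a minimal C sketch: every rank passes MPI_IN_PLACE as the send buffer, and the receive buffer supplies both the input and the output.

    /* Minimal C sketch of MPI_IN_PLACE with MPI_Allreduce (the original
     * report involved Fortran; this only illustrates the call semantics). */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, i;
        double data[4];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        for (i = 0; i < 4; i++)
            data[i] = rank + i;   /* per-rank input */

        /* With MPI_IN_PLACE, data[] is both the input and the output buffer. */
        MPI_Allreduce(MPI_IN_PLACE, data, 4, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

        if (rank == 0)
            printf("sum[0] = %f\n", data[0]);

        MPI_Finalize();
        return 0;
    }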

Re: [OMPI users] PATH settings

2011-09-24 Thread Martin Siegert
Thanks, Jeff, for the details! On Sat, Sep 24, 2011 at 07:26:49AM -0400, Jeff Squyres wrote: > On Sep 22, 2011, at 11:06 PM, Martin Siegert wrote: > > > I am trying to figure out how openmpi (1.4.3) sets its PATH > > for executables. From the man page: > > > > Locating Files > >If no relativ