Hi Jeff,

I just tried with 1.2b4r13690 and the problem is still present. The only notable difference is that CTRL-C gave me "orterun: killing job...", but it then sat stuck there until I hit CTRL-\, in case that has any bearing on the issue. Again, the command line was:

orterun -np 11 ./perftest-1.3c/mpptest -max_run_time 1800 -bisect -size 0 4096 1 -gnuplot -fname HyperTransport/Global_bisect_0_4096_1.gpl

(the only difference being that I had 11 procs available instead of 9)
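In case it helps reproduce this on your side, here is roughly the complete sequence I am running, end to end. The make step, the output directory, and the working directory are just how my setup happens to look; the configure and orterun lines are the same ones already mentioned in this thread:

    # point the runtime linker at this Open MPI install
    # (same workaround as in the quoted thread below)
    export LD_LIBRARY_PATH="$HOME/openmpi_`uname -m`/lib"

    # build mpptest 1.3c against that install
    cd perftest-1.3c
    CC="$HOME/openmpi_`uname -m`/bin/mpicc" ./configure --with-mpi="$HOME/openmpi_`uname -m`"
    make
    cd ..

    # the run that hangs: bisection test, message sizes 0..4096 bytes,
    # with an 1800-second time budget
    mkdir -p HyperTransport
    orterun -np 11 ./perftest-1.3c/mpptest -max_run_time 1800 -bisect \
        -size 0 4096 1 -gnuplot -fname HyperTransport/Global_bisect_0_4096_1.gpl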
On Friday, February 16, 2007 at 06:50, Jeff Squyres wrote:
> Could you try one of the later nightly 1.2 tarballs? We just fixed a
> shared memory race condition, for example:
>
> http://www.open-mpi.org/nightly/v1.2/
>
> On Feb 16, 2007, at 12:12 AM, Eric Thibodeau wrote:
>
> > Hello devs,
> >
> > Thought I would let you know there seems to be a problem with
> > 1.2b3r13112 when running the "bisection" test on a Tyan VX50 machine
> > (the 8 dual-core model with 32 GB of RAM).
> >
> > Open MPI was compiled with (as seen from config.log):
> >
> > configure:116866: running /bin/sh './configure' CFLAGS="-O3 -DNDEBUG -finline-functions -fno-strict-aliasing -pthread" CPPFLAGS=" " FFLAGS="" LDFLAGS=" " --enable-shared --disable-static --prefix=/export/livia/home/parallel/eric/openmpi_x86_64 --with-mpi=open_mpi --cache-file=/dev/null --srcdir=.
> >
> > mpptest (1.3c) was compiled with:
> >
> > ./configure --with-mpi=$HOME/openmpi_`uname -m`
> >
> > ...which, for some reason, works fine on that system, which doesn't
> > have any other MPI implementation installed (i.e. no LAM-MPI, unlike
> > the machines discussed earlier in this thread).
> >
> > Then I ran a few tests, but this one ran well over its allowed time
> > (1800 seconds; it had been going for over 50 minutes) and was up to
> > 16 GB of RAM:
> >
> > orterun -np 9 ./perftest-1.3c/mpptest -max_run_time 1800 -bisect -size 0 4096 1 -gnuplot -fname HyperTransport/Global_bisect_0_4096_1.gpl
> >
> > I had to CTRL-\ the process as CTRL-C wasn't sufficient. Two mpptest
> > processes and one orterun process were using 100% CPU out of the 16
> > cores.
> >
> > If any of this is indicative of an Open MPI bug and I can help in
> > tracking it down, don't hesitate to ask for details.
> >
> > And, finally, Anthony, thanks for the MPICC and --with-mpich
> > pointers; I will try those to simplify the build process!
> >
> > Eric
> >
> > On Thursday, February 15, 2007 at 19:51, Anthony Chan wrote:
> >
> >> As long as mpicc is working, try configuring mpptest as
> >>
> >> mpptest/configure MPICC=<OpenMPI-install-dir>/bin/mpicc
> >>
> >> or
> >>
> >> mpptest/configure --with-mpich=<OpenMPI-install-dir>
> >>
> >> A.Chan
> >>
> >> On Thu, 15 Feb 2007, Eric Thibodeau wrote:
> >>
> >>> Hi Jeff,
> >>>
> >>> Thanks for your response. I eventually figured it out; here is the
> >>> only way I got mpptest to compile:
> >>>
> >>> export LD_LIBRARY_PATH="$HOME/openmpi_`uname -m`/lib"
> >>> CC="$HOME/openmpi_`uname -m`/bin/mpicc" ./configure --with-mpi="$HOME/openmpi_`uname -m`"
> >>>
> >>> And yes, I know I should use the mpicc wrapper and all (I do
> >>> RTFM :P ), but mpptest is less than cooperative and hasn't been
> >>> updated lately, AFAIK.
> >>>
> >>> I'll keep you posted as I get some results out (testing TCP/IP as
> >>> well as HyperTransport on a Tyan Beast). Up to now, LAM-MPI seems
> >>> less efficient at async communications and shows no improvement
> >>> with persistent communications under TCP/IP. Open MPI, on the
> >>> other hand, seems more efficient using persistent communications
> >>> in a HyperTransport (shmem) environment...
> >>> I know I am crossing many test boundaries, but I will post some
> >>> PNGs of my results (as well as how I got to them ;)
> >>>
> >>> Eric
> >>>
> >>> On Thu, 15 Feb 2007, Jeff Squyres wrote:
> >>>
> >>>> I think you want to add $HOME/openmpi_`uname -m`/lib to your
> >>>> LD_LIBRARY_PATH. This should allow executables created by mpicc
> >>>> (or any derivation thereof, such as extracting flags via showme)
> >>>> to find the Right shared libraries.
> >>>>
> >>>> Let us know if that works for you.
> >>>>
> >>>> FWIW, we do recommend using the wrapper compilers over extracting
> >>>> the flags via --showme whenever possible (it's just simpler and
> >>>> should do what you need).
> >>>>
> >>>> On Feb 15, 2007, at 3:38 PM, Eric Thibodeau wrote:
> >>>>
> >>>>> Hello all,
> >>>>>
> >>>>> I have been attempting to compile mpptest on my nodes, in vain.
> >>>>> Here is my current setup:
> >>>>>
> >>>>> Open MPI is in "$HOME/openmpi_`uname -m`", which translates to
> >>>>> "/export/home/eric/openmpi_i686/". I tried the following
> >>>>> approaches (you can see some of these were out of desperation):
> >>>>>
> >>>>> CFLAGS=`mpicc --showme:compile` LDFLAGS=`mpicc --showme:link` ./configure
> >>>>>
> >>>>> Configure fails on:
> >>>>>
> >>>>> checking whether the C compiler works... configure: error: cannot run C compiled programs.
> >>>>>
> >>>>> The log shows that:
> >>>>>
> >>>>> ./a.out: error while loading shared libraries: liborte.so.0: cannot open shared object file: No such file or directory
> >>>>>
> >>>>> CC="/export/home/eric/openmpi_i686/bin/mpicc" ./configure --with-mpi=$HOME/openmpi_`uname -m`
> >>>>>
> >>>>> Same problems as above...
> >>>>>
> >>>>> LDFLAGS="$HOME/openmpi_`uname -m`/lib" ./configure --with-mpi=$HOME/openmpi_`uname -m`
> >>>>>
> >>>>> Configure fails on:
> >>>>>
> >>>>> checking for C compiler default output file name... configure: error: C compiler cannot create executables
> >>>>>
> >>>>> And... finally (not that all of this was done in the presented order):
> >>>>>
> >>>>> ./configure --with-mpi=$HOME/openmpi_`uname -m`
> >>>>>
> >>>>> Which ends with:
> >>>>>
> >>>>> checking for library containing MPI_Init... no
> >>>>> configure: error: Could not find MPI library
> >>>>>
> >>>>> Can anyone help me with this one...?
> >>>>>
> >>>>> Note that LAM-MPI is also installed on these systems...
> >>>>>
> >>>>> Eric Thibodeau
> >
> > --
> > Eric Thibodeau
> > Neural Bucket Solutions Inc.
> > T. (514) 736-1436
> > C. (514) 710-0517

--
Eric Thibodeau
Neural Bucket Solutions Inc.
T. (514) 736-1436
C. (514) 710-0517
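P.S. For the record, here is the simplified mpptest build I intend to try next, based on Anthony's pointers. This is only a sketch on my side and still untested: I have substituted my own install prefix for <OpenMPI-install-dir>, and the make step is assumed:

    cd perftest-1.3c

    # option 1: hand mpptest's configure the Open MPI compiler wrapper directly
    ./configure MPICC="$HOME/openmpi_`uname -m`/bin/mpicc"
    make

    # option 2 (alternative; clean the tree before switching approaches):
    # point configure at the install tree the same way one would for MPICH
    ./configure --with-mpich="$HOME/openmpi_`uname -m`"
    make

If either of these removes the need for the CC=.../LD_LIBRARY_PATH workaround quoted above, I will mention it when I post the results.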