Try running a dynamically linked version of your application under valgrind (or another memory-checking debugger) and see if anything shows up.
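For example, something along these lines is usually enough to get started once you have a dynamically linked build of dirac.x (just a sketch -- the valgrind options shown here are only a reasonable starting point, not a prescription):

-----
# mpirun starts one valgrind (and hence one dirac.x) per rank
mpirun -np 2 valgrind --leak-check=full --track-origins=yes ./dirac.x
-----

Open MPI itself generates some noise under valgrind; if your install ships a suppressions file (typically under share/openmpi/), passing it with --suppressions= will quiet most of it.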
On Jan 30, 2012, at 2:50 PM, Ilias Miroslav wrote:

> Well,
>
> the simplest program,
>
> program main
> implicit none
> include 'mpif.h'
> integer ierr, rank, size
> call MPI_INIT(ierr)
> call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
> call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierr)
> print *, "Hello, world, I am ", rank, " of ", size
> call MPI_FINALIZE(ierr)
> end
>
> is running; maybe my larger application "dirac.x" is hurting static OpenMPI...
>
> ilias@194.160.135.47:/tmp/ilias/test/simplest/. /home/ilias/bin/ompi_ilp64_static/bin/mpif90 -fdefault-integer-8 hello.f
> ilias@194.160.135.47:/tmp/ilias/test/simplest/. mpirun -v -np 2 ./a.out
> Hello, world, I am 1 of 2
> Hello, world, I am 0 of 2
> ilias@194.160.135.47:/tmp/ilias/test/simplest/. mpirun -v -np 1 ./a.out
> Hello, world, I am 0 of 1
> ilias@194.160.135.47:/tmp/ilias/test/simplest/. mpirun -v -np 4 ./a.out
> Hello, world, I am 3 of 4
> Hello, world, I am 1 of 4
> Hello, world, I am 0 of 4
> Hello, world, I am 2 of 4
> ilias@194.160.135.47:/tmp/ilias/test/simplest/.
>
> ________________________________________
> From: Ilias Miroslav
> Sent: Monday, January 30, 2012 8:40 PM
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] pure static "mpirun" launcher (Jeff Squyres) - now testing
>
> Hi,
>
> what segfaulted? I am not sure... maybe it is an application bug showing up with static OpenMPI.
>
> I will try to compile & run the simplest MPI example and I shall let you know.
>
> In the meantime I am attaching debugger output, which should help to track this bug:
>
> Backtrace for this error:
>   + function __restore_rt (0x255B110)
>     from file sigaction.c
>
> slave (mpi processes are from #10):
> (gdb) where
> #0  0x00000000023622db in sm_fifo_read (fifo=0x7f77cc908300) at btl_sm.h:324
> #1  0x000000000236309b in mca_btl_sm_component_progress () at btl_sm_component.c:612
> #2  0x0000000002304f26 in opal_progress () at runtime/opal_progress.c:207
> #3  0x00000000023c8a77 in opal_condition_wait (c=0xf78bf80, m=0xf78c000) at ../../../../opal/threads/condition.h:100
> #4  0x00000000023c8eb7 in ompi_request_wait_completion (req=0x10602f00) at ../../../../ompi/request/request.h:378
> #5  0x00000000023ca661 in mca_pml_ob1_send (buf=0xefbb2a0, count=1000, datatype=0x2901180, dst=1, tag=-17, sendmode=MCA_PML_BASE_SEND_STANDARD, comm=0xf772b20) at pml_ob1_isend.c:125
> #6  0x000000000236e978 in ompi_coll_tuned_bcast_intra_split_bintree (buffer=0xefb9360, count=2000, datatype=0x2901180, root=0, comm=0xf772b20, module=0x1060b7f0, segsize=1024) at coll_tuned_bcast.c:590
> #7  0x0000000002370834 in ompi_coll_tuned_bcast_intra_dec_fixed (buff=0xefb9360, count=2000, datatype=0x2901180, root=0, comm=0xf772b20, module=0x1060b7f0) at coll_tuned_decision_fixed.c:262
> #8  0x0000000002371c52 in mca_coll_sync_bcast (buff=0xefb9360, count=2000, datatype=0x2901180, root=0, comm=0xf772b20, module=0x1060b590) at coll_sync_bcast.c:44
> #9  0x0000000002249662 in PMPI_Bcast (buffer=0xefb9360, count=2000, datatype=0x2901180, root=0, comm=0xf772b20) at pbcast.c:110
> #10 0x000000000221744a in mpi_bcast_f (buffer=0xefb9360 "\026\372`\031\033\336O@\005\031\001\025\216\260&@\301\343ۻ\006\375\003@\251L1\aAG\344?\301\343ۻ\006\375\003@\251L1\aAG\344?HN&n\025\233\\@\252\325WW\005^4@8\333ܘ\236`\027@\025\253\006an\267\367?Ih˹\024W\324?8\021\375\332\372\351\273?8\333ܘ\236`\027@\025\253\006an\267\367?Ih˹\024W\324?8\021\375\332\372\351\273?\301\343ۻ\006\375\003@\251L1\aAG\344?\026\372`\031\033\336O@\005\031\001\025\216\260&@\301\343ۻ\006\375\003@\251L1\aAG\344?\301\343ۻ\006\375\003@\251L1\aAG\344?8\333ܘ\236`\027@"..., count=0x2624818, datatype=0x26037a0, root=0xf73ba90, comm=0x26247a0, ierr=0x7fffbb34ec78) at pbcast_f.c:70
> #11 0x000000000041ab68 in interface_to_mpi::interface_mpi_bcast_r1 (x=<value optimized out>, ndim=2000, root_proc=0, communicator=0) at /home/ilias/qch_work/qch_software/dirac_git/dirac-git-repo/interface_mpi/interface_to_mpi.F90:446
> #12 0x0000000000e95a37 in get_primitf () at /home/ilias/qch_work/qch_software/dirac_git/dirac-git-repo/abacus/herpar.F:1464
> #13 0x0000000000e99ff7 in sdinit (dmat=..., ndmat=2, irepdm=..., ifctyp=..., itype=9, maxdif=<value optimized out>, iatom=0, nodv=.TRUE., nopv=.TRUE., nocont=.FALSE., tktime=.FALSE., retur=.FALSE., i2typ=1, icedif=3, screen=9.9999999999999998e-13, gabrao=..., dmrao=..., dmrso=...) at /home/ilias/qch_work/qch_software/dirac_git/dirac-git-repo/abacus/herpar.F:566
> #14 0x0000000000e9af35 in her_pardrv (work=..., lwork=<value optimized out>, fmat=..., dmat=..., ndmat=2, irepdm=..., ifctyp=...,
> .
> .
> .
> and master:
> (gdb) where
> #0  0x000000000058de18 in poll ()
> #1  0x0000000000496f58 in poll_dispatch ()
> #2  0x0000000000471649 in opal_libevent2013_event_base_loop ()
> #3  0x00000000004016ea in orterun (argc=4, argv=0x7fff484b6478) at orterun.c:866
> #4  0x00000000004005d4 in main (argc=4, argv=0x7fff484b6478) at main.c:13
>
> ________________________________________
> From: Ilias Miroslav
> Sent: Monday, January 30, 2012 7:24 PM
> To: us...@open-mpi.org
> Subject: Re: pure static "mpirun" launcher (Jeff Squyres) - now testing
>
> Hi Jeff,
>
> thanks for the fix;
>
> I downloaded the Open MPI trunk and built it;
>
> the (most recent) revision 25818 is giving this error and hangs:
>
> /home/ilias/bin/ompi_ilp64_static/bin/mpirun -np 2 ./dirac.x
> .
> .
> Program received signal 11 (SIGSEGV): Segmentation fault.
>
> Backtrace for this error:
>   + function __restore_rt (0x255B110)
>     from file sigaction.c
>
> The configuration:
> $ ./configure --prefix=/home/ilias/bin/ompi_ilp64_static --without-memory-manager LDFLAGS=--static --disable-shared --enable-static CXX=g++ CC=gcc F77=gfortran FC=gfortran FFLAGS=-m64 -fdefault-integer-8 FCFLAGS=-m64 -fdefault-integer-8 CFLAGS=-m64 CXXFLAGS=-m64 --enable-ltdl-convenience --no-create --no-recursion
>
> The "dirac.x" static executable was obtained with this static openmpi:
> write(lupri, '(a)') ' System                   | Linux-2.6.30-1-amd64'
> write(lupri, '(a)') ' Processor                | x86_64'
> write(lupri, '(a)') ' Internal math            | ON'
> write(lupri, '(a)') ' 64-bit integers          | ON'
> write(lupri, '(a)') ' MPI                      | ON'
> write(lupri, '(a)') ' Fortran compiler         | /home/ilias/bin/ompi_ilp64_static/bin/mpif90'
> write(lupri, '(a)') ' Fortran compiler version | GNU Fortran (Debian 4.6.2-9) 4.6.2'
> write(lupri, '(a)') ' Fortran flags            | -g -fcray-pointer -fbacktrace -DVAR_GFORTRAN -DVAR'
> write(lupri, '(a)') '                          | _MFDS -fno-range-check -static -fdefault-integer-8'
> write(lupri, '(a)') '                          | -O3 -funroll-all-loops'
> write(lupri, '(a)') ' C compiler               | /home/ilias/bin/ompi_ilp64_static/bin/mpicc'
> write(lupri, '(a)') ' C compiler version       | gcc (Debian 4.6.2-9) 4.6.2'
> write(lupri, '(a)') ' C flags                  | -g -static -fpic -O2 -Wno-unused'
> write(lupri, '(a)') ' static libraries linking | ON'
>
> ldd dirac.x
>         not a dynamic executable
>
> Any help, please? How to include MPI-debug statements?
>
> 1. Re: pure static "mpirun" launcher (Jeff Squyres)
> ----------------------------------------------------------------------
> Message: 1
> Date: Fri, 27 Jan 2012 13:44:49 -0500
> From: Jeff Squyres <jsquy...@cisco.com>
> Subject: Re: [OMPI users] pure static "mpirun" launcher
> To: Open MPI Users <us...@open-mpi.org>
> Message-ID: <be6dbe92-784c-4594-8f4a-397a19c55...@cisco.com>
> Content-Type: text/plain; charset=us-ascii
>
> Ah ha, I think I got it.  There was actually a bug about disabling the memory manager in trunk/v1.5.x/v1.4.x.  I fixed it on the trunk and scheduled it for v1.6 (since we're trying very hard to get v1.5.5 out the door) and v1.4.5.
>
> On the OMPI trunk on RHEL 5 with gcc 4.4.6, I can do this:
>
> ./configure --without-memory-manager LDFLAGS=--static --disable-shared --enable-static
>
> And get a fully static set of OMPI executables.  For example:
>
> -----
> [10:41] svbu-mpi:~ % cd $prefix/bin
> [10:41] svbu-mpi:/home/jsquyres/bogus/bin % ldd *
> mpic++:
>         not a dynamic executable
> mpicc:
>         not a dynamic executable
> mpiCC:
>         not a dynamic executable
> mpicxx:
>         not a dynamic executable
> mpiexec:
>         not a dynamic executable
> mpif77:
>         not a dynamic executable
> mpif90:
>         not a dynamic executable
> mpirun:
>         not a dynamic executable
> ompi-clean:
>         not a dynamic executable
> ompi_info:
>         not a dynamic executable
> ompi-ps:
>         not a dynamic executable
> ompi-server:
>         not a dynamic executable
> ompi-top:
>         not a dynamic executable
> opal_wrapper:
>         not a dynamic executable
> ortec++:
>         not a dynamic executable
> ortecc:
>         not a dynamic executable
> orteCC:
>         not a dynamic executable
> orte-clean:
>         not a dynamic executable
> orted:
>         not a dynamic executable
> orte-info:
>         not a dynamic executable
> orte-ps:
>         not a dynamic executable
> orterun:
>         not a dynamic executable
> orte-top:
>         not a dynamic executable
> -----
>
> So I think the answer here is: it depends on a few factors:
>
> 1. Need that bug fix that I just committed.
> 2. Libtool is stripping out -static (and/or --static?).  So you have to find some other flags to make your compiler/linker do static.
> 3. Your OS has to support static builds.  For example, RHEL6 doesn't install libc.a by default (it's apparently on the optional DVD, which I don't have).  My RHEL 5.5 install does have it, though.
>
> On Jan 27, 2012, at 11:16 AM, Jeff Squyres wrote:
>
>> I've tried a bunch of variations on this, but I'm actually getting stymied by my underlying OS not supporting static linking properly.  :-\
>>
>> I do see that Libtool is stripping out the "-static" standalone flag that you passed into LDFLAGS.  Yuck.  What's -Wl,-E?  Can you try "-Wl,-static" instead?
>>
>> On Jan 25, 2012, at 1:24 AM, Ilias Miroslav wrote:
>>
>>> Hello again,
>>>
>>> I need my own static "mpirun" for porting (together with the static executable) onto various (unknown) grid servers. In grid computing one cannot expect an OpenMPI-ILP64 installation on each computing element.
>>>
>>> Jeff: I tried LDFLAGS in configure
>>>
>>> ilias@194.160.135.47:~/bin/ompi-ilp64_full_static/openmpi-1.4.4/. ./configure --prefix=/home/ilias/bin/ompi-ilp64_full_static -without-memory-manager --without-libnuma --enable-static --disable-shared CXX=g++ CC=gcc F77=gfortran FC=gfortran FFLAGS="-m64 -fdefault-integer-8 -static" FCFLAGS="-m64 -fdefault-integer-8 -static" CFLAGS="-m64 -static" CXXFLAGS="-m64 -static" LDFLAGS="-static -Wl,-E"
>>>
>>> but still got a dynamic, not static, "mpirun":
>>> ilias@194.160.135.47:~/bin/ompi-ilp64_full_static/bin/. ldd ./mpirun
>>> linux-vdso.so.1 =>  (0x00007fff6090c000)
>>> libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fd7277cf000)
>>> libnsl.so.1 => /lib/x86_64-linux-gnu/libnsl.so.1 (0x00007fd7275b7000)
>>> libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007fd7273b3000)
>>> libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fd727131000)
>>> libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fd726f15000)
>>> libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fd726b90000)
>>> /lib64/ld-linux-x86-64.so.2 (0x00007fd7279ef000)
>>>
>>> Any help please?  config.log is here:
>>>
>>> https://docs.google.com/open?id=0B8qBHKNhZAipNTNkMzUxZDEtNjJmZi00YzY3LWI4MmYtY2RkZDVkMjhiOTM1
>>>
>>> Best, Miro
>>> ------------------------------
>>> Message: 10
>>> Date: Tue, 24 Jan 2012 11:55:21 -0500
>>> From: Jeff Squyres <jsquy...@cisco.com>
>>> Subject: Re: [OMPI users] pure static "mpirun" launcher
>>> To: Open MPI Users <us...@open-mpi.org>
>>> Message-ID: <a86d3721-9bf8-4a7d-b745-32e606521...@cisco.com>
>>> Content-Type: text/plain; charset=windows-1252
>>>
>>> Ilias: Have you simply tried building Open MPI with flags that force static linking?  E.g., something like this:
>>>
>>> ./configure --enable-static --disable-shared LDFLAGS=-Wl,-static
>>>
>>> I.e., put in LDFLAGS whatever flags your compiler/linker needs to force static linking.  These LDFLAGS will be applied to all of Open MPI's executables, including mpirun.
>>>
>>> On Jan 24, 2012, at 10:28 AM, Ralph Castain wrote:
>>>
>>>> Good point! I'm traveling this week with limited resources, but will try to address it when able.
>>>>
>>>> Sent from my iPad
>>>>
>>>> On Jan 24, 2012, at 7:07 AM, Reuti <re...@staff.uni-marburg.de> wrote:
>>>>
>>>>> On 24.01.2012, at 15:49, Ralph Castain wrote:
>>>>>
>>>>>> I'm a little confused. Building procs static makes sense, as libraries may not be available on compute nodes.
>>>>>> However, mpirun is only executed in one place, usually the head node where it was built. So there is less reason to build it purely static.
>>>>>>
>>>>>> Are you trying to move mpirun somewhere? Or is it the daemons that mpirun launches that are the real problem?
>>>>>
>>>>> This depends: if you have a queuing system, the master node of a parallel job may already be one of the slave nodes where the jobscript runs. I have my nodes uniform, but I have seen places where that wasn't the case.
>>>>>
>>>>> An option would be to have a special queue which always executes the jobscript on the headnode (i.e. without generating any load) and to use only non-local granted slots for mpirun. For this it might be necessary to have a high number of slots on the headnode for this queue, and to always request one slot on this machine in addition to the necessary ones on the computing nodes.
>>>>>
>>>>> -- Reuti
>>>>>
>>>>>> Sent from my iPad
>>>>>>
>>>>>> On Jan 24, 2012, at 5:54 AM, Ilias Miroslav <miroslav.il...@umb.sk> wrote:
>>>>>>
>>>>>>> Dear experts,
>>>>>>>
>>>>>>> following http://www.open-mpi.org/faq/?category=building#static-build I successfully built a static OpenMPI library.
>>>>>>> Using this library I succeeded in building a parallel static executable, dirac.x (ldd dirac.x - not a dynamic executable).
>>>>>>>
>>>>>>> The problem remains, however, with the mpirun (orterun) launcher. While on the local machine, where I compiled both the static OpenMPI and the static dirac.x, I am able to launch a parallel job,
>>>>>>> <OpenMPI_static>/mpirun -np 2 dirac.x ,
>>>>>>> I cannot launch it elsewhere, because "mpirun" is dynamically linked and thus machine dependent:
>>>>>>>
>>>>>>> ldd mpirun:
>>>>>>> linux-vdso.so.1 =>  (0x00007fff13792000)
>>>>>>> libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f40f8cab000)
>>>>>>> libnsl.so.1 => /lib/x86_64-linux-gnu/libnsl.so.1 (0x00007f40f8a93000)
>>>>>>> libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007f40f888f000)
>>>>>>> libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f40f860d000)
>>>>>>> libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f40f83f1000)
>>>>>>> libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f40f806c000)
>>>>>>> /lib64/ld-linux-x86-64.so.2 (0x00007f40f8ecb000)
>>>>>>>
>>>>>>> Please, how do I build a "pure" static mpirun launcher, usable (in my case together with the static dirac.x) also on other computers?
>>>>>>>
>>>>>>> Thanks, Miro
>>>>>>>
>>>>>>> --
>>>>>>> RNDr. Miroslav Iliaš, PhD.
>>>>>>>
>>>>>>> Katedra chémie
>>>>>>> Fakulta prírodných vied
>>>>>>> Univerzita Mateja Bela
>>>>>>> Tajovského 40
>>>>>>> 97400 Banská Bystrica
>>>>>>> tel: +421 48 446 7351
>>>>>>> email : miroslav.il...@umb.sk
>>>>>>>
>>>>>>> Department of Chemistry
>>>>>>> Faculty of Natural Sciences
>>>>>>> Matej Bel University
>>>>>>> Tajovského 40
>>>>>>> 97400 Banska Bystrica
>>>>>>> Slovakia
>>>>>>> tel: +421 48 446 7351
>>>>>>> email : miroslav.il...@umb.sk

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/