Luca, your email mentions openmpi 1.6.5 but gdb output points to openmpi 1.8.1.
Could the root cause be a mix of versions that does not occur with the root account? Which Open MPI version are you expecting?

You can run pmap <pid> while your binary is running and/or stopped under gdb to confirm which Open MPI library is really used.

Cheers,

Gilles

On Wed, Dec 10, 2014 at 7:21 PM, Luca Fini <lf...@arcetri.astro.it> wrote:
> I've a problem running a well tested MPI based application.
>
> The program has been used for years with no problems. Suddenly the
> executable which was run many times with no problems crashed with
> SIGSEGV. The very same executable if run with root privileges works
> OK. The same happens with other executables and across various
> recompilation attempts.
>
> We could not find any relevant difference in the O.S. since a few days
> ago when the program worked also under unprivileged user ID. Actually
> about in the same span of time we changed the GID of the user
> experiencing the fault, but we think this is not relevant because the
> same SIGSEGV happens to another user which was not modified. Moreover
> we cannot see how that change can affect the running executable (we
> checked all file permissions in the directory tree where the program
> is used).
>
> Running the program under GDB we get the trace reported below. The
> segfault happens at the very beginning during MPI initialization.
>
> We can use the program with sudo, but I'd like to find out what
> happened to go back to "normal" usage.
>
> I'd appreciate any hint on the issue.
>
> Many thanks,
>
> Luca Fini
>
> ==============================
> Here follows a few environment details:
>
> Program started with: mpirun -debug -debugger gdb -np 1
> /home/lascaux/MNH-V5-1-2/src/dir_obj-LXifortI4-MNH-V5-1-2-OMPI12X-O2/M51b2_OT_2POINT_RH_v1_mod/PREP_PGD
>
> OPEN-MPI 1.6.5
>
> Linux 2.6.32-431.29.2.2.6.32-431.29.2.el6.x86_64
>
> Intel fortran Compiler: 2011.7.256
>
> =========================
> Here follows the stack trace:
>
> Starting program:
> /home/lascaux/MNH-V5-1-2/src/dir_obj-LXifortI4-MNH-V5-1-2-OMPI12X-O2/M51b2_OT_2POINT_RH_v1_mod/PREP_PGD
> /home/lascaux/MNH-V5-1-2/src/dir_obj-LXifortI4-MNH-V5-1-2-OMPI12X-O2/M51b2_OT_2POINT_RH_v1_mod/PREP_PGD
> [Thread debugging using libthread_db enabled]
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x00002aaaaaf652c7 in mca_base_component_find (directory=0x0,
>     type=0x3b914a7fb5 "rte", static_components=0x3b916cb040,
>     requested_component_names=0x0, include_mode=128, found_components=0x1,
>     open_dso_components=16)
>     at mca_base_component_find.c:162
> 162     OBJ_CONSTRUCT(found_components, opal_list_t);
> Missing separate debuginfos, use: debuginfo-install
> glibc-2.12-1.149.el6.x86_64 libgcc-4.4.7-11.el6.x86_64
> libgfortran-4.4.7-11.el6.x86_64 libtool-ltdl-2.2.6-15.5.el6.x86_64
> openmpi-1.8.1-1.el6.x86_64
> (gdb) where
> #0  0x00002aaaaaf652c7 in mca_base_component_find (directory=0x0,
>     type=0x3b914a7fb5 "rte", static_components=0x3b916cb040,
>     requested_component_names=0x0, include_mode=128, found_components=0x1,
>     open_dso_components=16)
>     at mca_base_component_find.c:162
> #1  0x0000003b90c4870a in mca_base_framework_components_register ()
>     from /usr/lib64/openmpi/lib/libopen-pal.so.6
> #2  0x0000003b90c48c06 in mca_base_framework_register ()
>     from /usr/lib64/openmpi/lib/libopen-pal.so.6
> #3  0x0000003b90c48def in mca_base_framework_open ()
>     from /usr/lib64/openmpi/lib/libopen-pal.so.6
> #4  0x0000003b914407e7 in ompi_mpi_init ()
>     from /usr/lib64/openmpi/lib/libmpi.so.1
> #5  0x0000003b91463200 in PMPI_Init ()
>     from /usr/lib64/openmpi/lib/libmpi.so.1
> #6  0x00002aaaaacd9295 in mpi_init_f (ierr=0x7fffffffd268) at pinit_f.c:75
> #7  0x00000000005bb159 in MODE_MNH_WORLD::init_nmnh_comm_world
>     (kinfo_ll=Cannot access memory at address 0x0
>     ) at
>     /home/lascaux/MNH-V5-1-2/src/dir_obj-LXifortI4-MNH-V5-1-2-OMPI12X-O2/MASTER/spll_mode_mnh_world.f90:45
> #8  0x00000000005939d3 in MODE_IO_LL::initio_ll () at
>     /home/lascaux/MNH-V5-1-2/src/dir_obj-LXifortI4-MNH-V5-1-2-OMPI12X-O2/MASTER/spll_mode_io_ll.f90:107
> #9  0x000000000049d02f in prep_pgd () at
>     /home/lascaux/MNH-V5-1-2/src/dir_obj-LXifortI4-MNH-V5-1-2-OMPI12X-O2/MASTER/spll_prep_pgd.f90:130
> #10 0x000000000049cf8c in main ()
>
> --
> Luca Fini. INAF - Oss. Astrofisico di Arcetri
> L.go E.Fermi, 5. 50125 Firenze. Italy
> Tel: +39 055 2752 307 Fax: +39 055 2752 292
> Skype: l.fini
> Web: http://www.arcetri.inaf.it/~lfini
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2014/12/25945.php
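
PS: as a rough sketch of the library check suggested at the top of this message, the Python snippet below lists the MPI-related shared objects mapped into a running process. It reads /proc/<pid>/maps on Linux, which carries the same kind of information that pmap <pid> prints, so it is a stand-in for pmap and not something taken from the original report. The function names and the library name pattern (libmpi, libopen-pal, libopen-rte) are assumptions made for illustration.

```python
import os
import re

# Library name fragments to look for. This pattern is an assumption, chosen
# to match the names visible in the backtrace (libmpi, libopen-pal).
MPI_LIB_PATTERN = re.compile(r"lib(mpi|open-pal|open-rte)[^/]*\.so")


def mpi_libraries(maps_text):
    """Return the sorted, de-duplicated paths of MPI-related shared objects
    found in the text of a /proc/<pid>/maps file."""
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, [pathname].
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no backing file
        path = fields[5].strip()
        if MPI_LIB_PATTERN.search(os.path.basename(path)):
            paths.add(path)
    return sorted(paths)


def library_dirs(paths):
    """Return the sorted set of directories that the given paths live in."""
    return sorted({os.path.dirname(p) for p in paths})


def report(pid):
    """Print the MPI-related libraries mapped by process `pid` and warn if
    they come from more than one directory."""
    with open("/proc/%d/maps" % int(pid)) as maps_file:
        libs = mpi_libraries(maps_file.read())
    for path in libs:
        print(path)
    if len(library_dirs(libs)) > 1:
        print("warning: MPI libraries are loaded from more than one directory")
    return libs
```

For example, with the program stopped under gdb, calling report(<pid>) from a Python prompt as the same user prints one library path per line. Paths from two different directories would suggest that two Open MPI installations are being mixed; running the same check as root and as the unprivileged user would show whether the two accounts pick up different libraries.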