Siegmar -- This looks like the typical type of alignment error that we used to see when testing regularly on SPARC. :-\
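For anyone following along, here is a minimal sketch of the general class of bug, not the actual code in mca_db_hash (which has not been located yet). The failing address in the report below, ffffffff7fffd1c4, is 4-byte aligned but not 8-byte aligned, so an 8-byte load or store through a cast pointer at such an address would trap on SPARC while typically passing unnoticed on x86. The helper names load_u64, store_u64, and demo_unaligned_roundtrip are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Faulting pattern (shown only as a comment, since it is undefined
 * behavior in C and raises SIGBUS on SPARC when p is misaligned):
 *
 *     uint64_t v = *(const uint64_t *)p;
 */

/* Hypothetical helper: read a 64-bit value from any byte address. */
static uint64_t load_u64(const unsigned char *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);   /* byte-wise copy has no alignment requirement */
    return v;
}

/* Hypothetical helper: write a 64-bit value to any byte address. */
static void store_u64(unsigned char *p, uint64_t v)
{
    memcpy(p, &v, sizeof v);
}

/* Round-trip a value through an offset that is, in general, not
 * 8-byte aligned, mimicking a field packed at offset 4 in a buffer. */
static void demo_unaligned_roundtrip(void)
{
    unsigned char buf[16] = {0};

    store_u64(buf + 4, 0x1122334455667788ULL);
    assert(load_u64(buf + 4) == 0x1122334455667788ULL);
}
```

If the code under opal/mca/db/hash turns out to follow the cast pattern in the comment, copying through a properly typed local variable as above is one common way to make it safe on strict-alignment platforms.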
It looks like the error was happening in mca_db_hash.so. Could you get a stack trace / file+line number where it was failing in mca_db_hash? (i.e., the actual bad code will likely be under opal/mca/db/hash somewhere)

On Jul 25, 2014, at 2:08 AM, Siegmar Gross <siegmar.gr...@informatik.hs-fulda.de> wrote:

> Hi,
>
> I have installed openmpi-1.8.2rc2 with gcc-4.9.0 on Solaris
> 10 Sparc and I receive a bus error, if I run a small program.
>
> tyr hello_1 105 mpiexec -np 2 a.out
> [tyr:29164] *** Process received signal ***
> [tyr:29164] Signal: Bus Error (10)
> [tyr:29164] Signal code: Invalid address alignment (1)
> [tyr:29164] Failing at address: ffffffff7fffd1c4
> /export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/libopen-pal.so.6.2.0:opal_backtrace_print+0x2c
> /export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/libopen-pal.so.6.2.0:0xccfd0
> /lib/sparcv9/libc.so.1:0xd8b98
> /lib/sparcv9/libc.so.1:0xcc70c
> /lib/sparcv9/libc.so.1:0xcc918
> /export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/openmpi/mca_db_hash.so:0x3ee8
> [ Signal 10 (BUS)]
> /export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/libopen-pal.so.6.2.0:opal_db_base_store+0xc8
> /export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/libopen-rte.so.7.0.4:orte_util_decode_pidmap+0x798
> /export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/libopen-rte.so.7.0.4:orte_util_nidmap_init+0x3cc
> /export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/openmpi/mca_ess_env.so:0x226c
> /export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/libopen-rte.so.7.0.4:orte_init+0x308
> /export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/libmpi.so.1.5.2:ompi_mpi_init+0x31c
> /export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/libmpi.so.1.5.2:PMPI_Init+0x2a8
> /home/fd1026/work/skripte/master/parallel/prog/mpi/hello_1/a.out:main+0x20
> /home/fd1026/work/skripte/master/parallel/prog/mpi/hello_1/a.out:_start+0x7c
> [tyr:29164] *** End of error message ***
> ...
>
>
> I get the following output if I run the program in "dbx".
>
> ...
> RTC: Enabling Error Checking...
> RTC: Running program...
> Write to unallocated (wua) on thread 1:
> Attempting to write 1 byte at address 0xffffffff79f04000
> t@1 (l@1) stopped in _readdir at 0xffffffff55174da0
> 0xffffffff55174da0: _readdir+0x0064: call
> _PROCEDURE_LINKAGE_TABLE_+0x2380 [PLT] ! 0xffffffff55342a80
> (dbx)
>
>
> Hopefully the above output helps to fix the error. Can I provide
> anything else? Thank you very much for any help in advance.
>
>
> Kind regards
>
> Siegmar
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2014/07/24869.php

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/