This is the error you get when an invalid communicator handle is passed to an MPI function. The handle is dereferenced, so you may or may not get a SEGV from it, depending on the value you pass.

The 0x440000a0 address is an offset from 0x44000000, the value of MPI_COMM_WORLD in MPICH2. My guess is that you are picking up either an MPICH2 mpi.h or the MPICH2 mpicc.
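To illustrate, here is a minimal sketch of the kind of test program that would produce the backtrace quoted below; the actual test.cpp isn't shown in the thread, so its exact contents (and the line positions noted in the comments) are an assumption on my part:

  #include <mpi.h>    /* if this header comes from MPICH2 while libmpi.so.0
                         is Open MPI's, MPI_COMM_WORLD is the plain integer
                         constant 0x44000000 rather than a valid pointer */
  #include <cstdio>

  int main(int argc, char **argv)
  {
      MPI_Init(&argc, &argv);              /* test.cpp:17 in the trace */

      int size = 0;
      /* Open MPI dereferences the communicator handle in
         ompi_comm_invalid(); with the mismatched header it reads from
         0x44000000 plus a small field offset, hence the invalid read
         at 0x440000a0 (test.cpp:18 in the trace). */
      MPI_Comm_size(MPI_COMM_WORLD, &size);
      std::printf("size = %d\n", size);

      MPI_Finalize();
      return 0;
  }

Open MPI's wrapper compilers accept -showme, which prints the underlying compile line, so running ~/local/bin/mpicxx -showme should confirm which mpi.h and which libraries are actually being picked up.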
Ashley,

On Tue, 2009-07-07 at 11:05 +0100, Catalin David wrote:
> Hello, all!
>
> Just installed Valgrind (since this seems like a memory issue) and got
> this interesting output when running the test program:
>
> ==4616== Syscall param sched_setaffinity(mask) points to unaddressable byte(s)
> ==4616==    at 0x43656BD: syscall (in /lib/tls/libc-2.3.2.so)
> ==4616==    by 0x4236A75: opal_paffinity_linux_plpa_init (plpa_runtime.c:37)
> ==4616==    by 0x423779B: opal_paffinity_linux_plpa_have_topology_information (plpa_map.c:501)
> ==4616==    by 0x4235FEE: linux_module_init (paffinity_linux_module.c:119)
> ==4616==    by 0x447F114: opal_paffinity_base_select (paffinity_base_select.c:64)
> ==4616==    by 0x444CD71: opal_init (opal_init.c:292)
> ==4616==    by 0x43CE7E6: orte_init (orte_init.c:76)
> ==4616==    by 0x4067A50: ompi_mpi_init (ompi_mpi_init.c:342)
> ==4616==    by 0x40A3444: PMPI_Init (pinit.c:80)
> ==4616==    by 0x804875C: main (test.cpp:17)
> ==4616==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
> ==4616==
> ==4616== Invalid read of size 4
> ==4616==    at 0x4095772: ompi_comm_invalid (communicator.h:261)
> ==4616==    by 0x409581E: PMPI_Comm_size (pcomm_size.c:46)
> ==4616==    by 0x8048770: main (test.cpp:18)
> ==4616==  Address 0x440000a0 is not stack'd, malloc'd or (recently) free'd
> [denali:04616] *** Process received signal ***
> [denali:04616] Signal: Segmentation fault (11)
> [denali:04616] Signal code: Address not mapped (1)
> [denali:04616] Failing at address: 0x440000a0
> [denali:04616] [ 0] /lib/tls/libc.so.6 [0x42b4de0]
> [denali:04616] [ 1] /users/cluster/cdavid/local/lib/libmpi.so.0(MPI_Comm_size+0x6f) [0x409581f]
> [denali:04616] [ 2] ./test(__gxx_personality_v0+0x12d) [0x8048771]
> [denali:04616] [ 3] /lib/tls/libc.so.6(__libc_start_main+0xf8) [0x42a2768]
> [denali:04616] [ 4] ./test(__gxx_personality_v0+0x3d) [0x8048681]
> [denali:04616] *** End of error message ***
> ==4616==
> ==4616== Invalid read of size 4
> ==4616==    at 0x4095782: ompi_comm_invalid (communicator.h:261)
> ==4616==    by 0x409581E: PMPI_Comm_size (pcomm_size.c:46)
> ==4616==    by 0x8048770: main (test.cpp:18)
> ==4616==  Address 0x440000a0 is not stack'd, malloc'd or (recently) free'd
>
> The problem is that now I don't know where the issue comes from. Is it
> libc that is too old and incompatible with g++ 4.4/Open MPI? Is libc
> broken?
>
> Any help would be highly appreciated.
>
> Thanks,
> Catalin
>
> On Mon, Jul 6, 2009 at 3:36 PM, Catalin David<catalindavid2...@gmail.com> wrote:
> > On Mon, Jul 6, 2009 at 3:26 PM, jody<jody....@gmail.com> wrote:
> >> Hi
> >> Are you also sure that you have the same version of Open MPI
> >> on every machine of your cluster, and that it is the mpicxx of this
> >> version that is called when you run your program?
> >> I ask because you mentioned that there was an old version of Open MPI
> >> present... did you remove this?
> >>
> >> Jody
> >
> > Hi
> >
> > I have just logged in to a few other boxes, and they all mount my home
> > folder. When running `echo $LD_LIBRARY_PATH` and other commands, I get
> > what I expect to get, but this might be because I have set these
> > variables in the .bashrc file. So I tried compiling/running with
> > ~/local/bin/mpicxx [stuff] and ~/local/bin/mpirun -np 4 ray-trace,
> > but I get the same errors.
> >
> > As for the previous version, I don't have root access, therefore I was
> > not able to remove it. I was just trying to outrun it by setting the
> > $PATH variable to point first at my local installation.
> >
> > Catalin
> >
> > --
> > ******************************
> > Catalin David
> > B.Sc. Computer Science 2010
> > Jacobs University Bremen
> >
> > Phone: +49-(0)1577-49-38-667
> >
> > College Ring 4, #343
> > Bremen, 28759
> > Germany
> > ******************************

--
Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk