Hello, all!

I just installed Valgrind (since this looks like a memory issue) and got
the following output when running the test program:

==4616== Syscall param sched_setaffinity(mask) points to unaddressable byte(s)
==4616==    at 0x43656BD: syscall (in /lib/tls/libc-2.3.2.so)
==4616==    by 0x4236A75: opal_paffinity_linux_plpa_init (plpa_runtime.c:37)
==4616==    by 0x423779B: opal_paffinity_linux_plpa_have_topology_information (plpa_map.c:501)
==4616==    by 0x4235FEE: linux_module_init (paffinity_linux_module.c:119)
==4616==    by 0x447F114: opal_paffinity_base_select (paffinity_base_select.c:64)
==4616==    by 0x444CD71: opal_init (opal_init.c:292)
==4616==    by 0x43CE7E6: orte_init (orte_init.c:76)
==4616==    by 0x4067A50: ompi_mpi_init (ompi_mpi_init.c:342)
==4616==    by 0x40A3444: PMPI_Init (pinit.c:80)
==4616==    by 0x804875C: main (test.cpp:17)
==4616==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==4616==
==4616== Invalid read of size 4
==4616==    at 0x4095772: ompi_comm_invalid (communicator.h:261)
==4616==    by 0x409581E: PMPI_Comm_size (pcomm_size.c:46)
==4616==    by 0x8048770: main (test.cpp:18)
==4616==  Address 0x440000a0 is not stack'd, malloc'd or (recently) free'd
[denali:04616] *** Process received signal ***
[denali:04616] Signal: Segmentation fault (11)
[denali:04616] Signal code: Address not mapped (1)
[denali:04616] Failing at address: 0x440000a0
[denali:04616] [ 0] /lib/tls/libc.so.6 [0x42b4de0]
[denali:04616] [ 1] /users/cluster/cdavid/local/lib/libmpi.so.0(MPI_Comm_size+0x6f) [0x409581f]
[denali:04616] [ 2] ./test(__gxx_personality_v0+0x12d) [0x8048771]
[denali:04616] [ 3] /lib/tls/libc.so.6(__libc_start_main+0xf8) [0x42a2768]
[denali:04616] [ 4] ./test(__gxx_personality_v0+0x3d) [0x8048681]
[denali:04616] *** End of error message ***
==4616==
==4616== Invalid read of size 4
==4616==    at 0x4095782: ompi_comm_invalid (communicator.h:261)
==4616==    by 0x409581E: PMPI_Comm_size (pcomm_size.c:46)
==4616==    by 0x8048770: main (test.cpp:18)
==4616==  Address 0x440000a0 is not stack'd, malloc'd or (recently) free'd


The problem is that I still don't know where the issue comes from. Is
libc too old and incompatible with g++ 4.4/Open MPI? Is libc itself
broken?
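
One thing that might be worth ruling out, given the question quoted
below about a second MPI installation on the machine: whether test.cpp
picks up the mpi.h that belongs to the libmpi.so it ends up linked
against. The failing address 0x440000a0 sits just above 0x44000000,
which is the integer handle MPICH2's mpi.h uses for MPI_COMM_WORLD, so
a header from a different MPI is at least a candidate (this is an
assumption; nothing in the output above proves it). Below is a minimal
sketch of a compile-time check. It relies on Open MPI's mpi.h defining
OPEN_MPI and on MPICH-derived headers defining MPICH_VERSION, MPICH2 or
MPICH_NAME; the helper name mpi_header_flavor and the HAVE_MPI_H macro
are made up for illustration.

```cpp
#include <string>

// Skip the C++ bindings so that only the C header is pulled in and the
// check itself does not need to link against any MPI library.
#define OMPI_SKIP_MPICXX 1
#define MPICH_SKIP_MPICXX 1

// Include mpi.h only if the compiler can find one. Compilers without
// __has_include (for example older g++ releases) include it
// unconditionally, assuming mpi.h is on the include path.
#if defined(__has_include)
#  if __has_include(<mpi.h>)
#    include <mpi.h>
#    define HAVE_MPI_H 1   // hypothetical macro, local to this sketch
#  endif
#else
#  include <mpi.h>
#  define HAVE_MPI_H 1
#endif

// Hypothetical helper: names the MPI implementation whose header was
// seen at compile time, judged only by the macros that header defines.
std::string mpi_header_flavor() {
#if !defined(HAVE_MPI_H)
    return "no mpi.h found";
#elif defined(OPEN_MPI)
    return "Open MPI";
#elif defined(MPICH_VERSION) || defined(MPICH2) || defined(MPICH_NAME)
    return "MPICH";
#else
    return "unknown MPI";
#endif
}
```

Calling mpi_header_flavor() from main() just before MPI_Init and
printing the result, once built with ~/local/bin/mpicxx and once with
whatever mpicxx comes first in $PATH, would show whether the two
wrappers see different headers.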

Any help would be highly appreciated.

Thanks,
Catalin


On Mon, Jul 6, 2009 at 3:36 PM, Catalin David<catalindavid2...@gmail.com> wrote:
> On Mon, Jul 6, 2009 at 3:26 PM, jody<jody....@gmail.com> wrote:
>> Hi
>> Are you also sure that you have the same version of Open-MPI
>> on every machine of your cluster, and that it is the mpicxx of this
>> version that is called when you run your program?
>> I ask because you mentioned that there was an old version of Open-MPI
>> present... did you remove this?
>>
>> Jody
>
> Hi
>
> I have just logged in to a few other boxes and they all mount my home
> folder. When running `echo $LD_LIBRARY_PATH` and other commands, I get
> what I expect to get, but this might be because I have set these
> variables in the .bashrc file. So, I tried compiling/running like this
>  ~/local/bin/mpicxx [stuff] and ~/local/bin/mpirun -np 4 ray-trace,
> but I get the same errors.
>
> As for the previous version, I don't have root access, therefore I was
> not able to remove it. I was just trying to outrun it by setting the
> $PATH variable to point first at my local installation.
>
>
> Catalin



-- 

******************************
Catalin David
B.Sc. Computer Science 2010
Jacobs University Bremen

Phone: +49-(0)1577-49-38-667

College Ring 4, #343
Bremen, 28759
Germany
******************************
