[OMPI users] problem with groups and communicators in openmpi-1.6.4rc2

2013-01-19 Thread Siegmar Gross
Hi

I have installed openmpi-1.6.4rc2 and have the following problem.

tyr strided_vector 110 ompi_info | grep "Open MPI:"
Open MPI: 1.6.4rc2r27861
tyr strided_vector 111 mpicc -showme
gcc -I/usr/local/openmpi-1.6.4_64_gcc/include -fexceptions -pthread -m64 
-L/usr/local/openmpi-1.6.4_64_gcc/lib64 -lmpi -lm -lkstat -llgrp -lsocket -lnsl 
-lrt -lm


tyr strided_vector 112 mpiexec -np 4 data_type_4
Process 2 of 4 running on tyr.informatik.hs-fulda.de
Process 0 of 4 running on tyr.informatik.hs-fulda.de
Process 3 of 4 running on tyr.informatik.hs-fulda.de
Process 1 of 4 running on tyr.informatik.hs-fulda.de

original matrix:

     1     2     3     4     5     6     7     8     9    10
    11    12    13    14    15    16    17    18    19    20
    21    22    23    24    25    26    27    28    29    30
    31    32    33    34    35    36    37    38    39    40
    41    42    43    44    45    46    47    48    49    50
    51    52    53    54    55    56    57    58    59    60

result matrix:
  elements are squared in columns:
 0   1   2   6   7
  elements are multiplied with 2 in columns:
 3   4   5   8   9

     1     4     9     8    10    12    49    64    18    20
   121   144   169    28    30    32   289   324    38    40
   441   484   529    48    50    52   729   784    58    60
   961  1024  1089    68    70    72  1369  1444    78    80
  1681  1764  1849    88    90    92  2209  2304    98   100
  2601  2704  2809   108   110   112  3249  3364   118   120

Assertion failed: OPAL_OBJ_MAGIC_ID == ((opal_object_t *) (comm->c_remote_group))->obj_magic_id, file ../../openmpi-1.6.4rc2r27861/ompi/communicator/comm_init.c, line 412
[tyr:18578] *** Process received signal ***
[tyr:18578] Signal: Abort (6)
[tyr:18578] Signal code:  (-1)
Assertion failed: OPAL_OBJ_MAGIC_ID == ((opal_object_t *) (comm->c_remote_group))->obj_magic_id, file ../../openmpi-1.6.4rc2r27861/ompi/communicator/comm_init.c, line 412
/export2/prog/SunOS_sparc/openmpi-1.6.4_64_gcc/lib64/libmpi.so.1.0.7:opal_backtrace_print+0x20
[tyr:18580] *** Process received signal ***
/export2/prog/SunOS_sparc/openmpi-1.6.4_64_gcc/lib64/libmpi.so.1.0.7:0x2c1bc4
[tyr:18580] Signal: Abort (6)
[tyr:18580] Signal code:  (-1)
/lib/sparcv9/libc.so.1:0xd88a4
/lib/sparcv9/libc.so.1:0xcc418
/lib/sparcv9/libc.so.1:0xcc624
/lib/sparcv9/libc.so.1:__lwp_kill+0x8 [ Signal 6 (ABRT)]
/lib/sparcv9/libc.so.1:abort+0xd0
/lib/sparcv9/libc.so.1:_assert+0x74
/export2/prog/SunOS_sparc/openmpi-1.6.4_64_gcc/lib64/libmpi.so.1.0.7:0xa4c58
/export2/prog/SunOS_sparc/openmpi-1.6.4_64_gcc/lib64/libmpi.so.1.0.7:0xa2430
/export2/prog/SunOS_sparc/openmpi-1.6.4_64_gcc/lib64/libmpi.so.1.0.7:ompi_comm_finalize+0x168
/export2/prog/SunOS_sparc/openmpi-1.6.4_64_gcc/lib64/libmpi.so.1.0.7:ompi_mpi_finalize+0xa60
/export2/prog/SunOS_sparc/openmpi-1.6.4_64_gcc/lib64/libmpi.so.1.0.7:MPI_Finalize+0x90
/home/fd1026/SunOS/sparc/bin/data_type_4:main+0x588
/home/fd1026/SunOS/sparc/bin/data_type_4:_start+0x7c
[tyr:18578] *** End of error message ***
...
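For reference, the result matrix above is obtained by squaring the entries in
columns 0, 1, 2, 6, 7 of the 6x10 original matrix and doubling the entries in
columns 3, 4, 5, 8, 9. A minimal serial sketch that reproduces exactly this
output (an illustration only; it is not taken from the data_type_4 test itself):

#include <stdio.h>

#define ROWS 6
#define COLS 10

int main(void)
{
    /* columns 0,1,2,6,7 are squared, the rest (3,4,5,8,9) are doubled */
    int squared_col[COLS] = {1, 1, 1, 0, 0, 0, 1, 1, 0, 0};
    int a[ROWS][COLS];
    int i, j;

    /* original matrix: the values 1..60 in row-major order */
    for (i = 0; i < ROWS; ++i)
        for (j = 0; j < COLS; ++j)
            a[i][j] = i * COLS + j + 1;

    for (i = 0; i < ROWS; ++i) {
        for (j = 0; j < COLS; ++j) {
            a[i][j] = squared_col[j] ? a[i][j] * a[i][j] : 2 * a[i][j];
            printf("%6d", a[i][j]);
        }
        printf("\n");
    }
    return 0;
}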



Everything works fine with LAM-MPI (even in a heterogeneous environment
with little-endian and big-endian machines), so it is probably an
error in Open MPI (but you never know).


tyr strided_vector 125 mpicc -showme
gcc -I/usr/local/lam-6.5.9_64_gcc/include -L/usr/local/lam-6.5.9_64_gcc/lib 
-llamf77mpi -lmpi -llam -lsocket -lnsl 
tyr strided_vector 126 lamboot -v hosts.lam-mpi

LAM 6.5.9/MPI 2 C++ - Indiana University

Executing hboot on n0 (tyr.informatik.hs-fulda.de - 2 CPUs)...
Executing hboot on n1 (sunpc1.informatik.hs-fulda.de - 4 CPUs)...
topology done  

tyr strided_vector 127 mpirun -v app_data_type_4.lam-mpi
22894 data_type_4 running on local
22895 data_type_4 running on n0 (o)
21998 data_type_4 running on n1
22896 data_type_4 running on n0 (o)
Process 1 of 4 running on tyr.informatik.hs-fulda.de
Process 3 of 4 running on tyr.informatik.hs-fulda.de
Process 2 of 4 running on sunpc1
Process 0 of 4 running on tyr.informatik.hs-fulda.de

original matrix:

     1     2     3     4     5     6     7     8     9    10
    11    12    13    14    15    16    17    18    19    20
    21    22    23    24    25    26    27    28    29    30
    31    32    33    34    35    36    37    38    39    40
    41    42    43    44    45    46    47    48    49    50
    51    52    53    54    55    56    57    58    59    60

result matrix:
  elements are squared in columns:
 0   1   2   6   7
  elements are multiplied with 2 in columns:
 3   4   5   8   9

     1     4     9     8    10    12    49    64    18    20
   121   144   169    28    30    32   289   324    38    40
   441   484   529    48    50    52   729   784    58    60
   961  1024  1089    68    70    72  1369  1444    78    80
  1681  1764  1849    88    90    92  2209  2304    98   100
  2601  2704  2809   108   110   112  3249  3364   118   120

tyr strided_vector 128 lamhalt

LAM 6.5.9/MPI 2 C++ - Indiana University



I would be grateful if somebody could fix the problem. Thank you very
much for any help in advance.


Kind regards

Siegmar

[OMPI users] problem with rankfile in openmpi-1.6.4rc2

2013-01-19 Thread Siegmar Gross
Hi

I have installed openmpi-1.6.4rc2 and still have a problem with my
rankfile.

linpc1 rankfiles 113 ompi_info | grep "Open MPI:"
Open MPI: 1.6.4rc2r27861

linpc1 rankfiles 114 cat rf_linpc1 
rank 0=linpc1 slot=0:0-1,1:0-1

linpc1 rankfiles 115 mpiexec -report-bindings -np 1 \
  -rf rf_linpc1 hostname

We were unable to successfully process/set the requested processor
affinity settings:

Specified slot list: 0:0-1,1:0-1
Error: Error

This could mean that a non-existent processor was specified, or
that the specification had improper syntax.


mpiexec was unable to start the specified application as it
  encountered an error:

Error name: Error
Node: linpc1

when attempting to start process rank 0.



Everything works fine with the following command.

linpc1 rankfiles 116 mpiexec -report-bindings -np 1 -cpus-per-proc 4 \
  -bycore -bind-to-core hostname
[linpc1:20140] MCW rank 0 bound to socket 0[core 0-1] socket 1[core 0-1]: [B B][B B]
linpc1


I would be grateful if somebody could fix the problem. Thank you very
much for any help in advance.


Kind regards

Siegmar



Re: [OMPI users] Help: OpenMPI Compilation in Raspberry Pi

2013-01-19 Thread Lee Eric
Any heads up? Thanks.

On Fri, Jan 18, 2013 at 5:28 AM, Jeff Squyres (jsquyres)
 wrote:
> On Jan 16, 2013, at 6:41 AM, Leif Lindholm  wrote:
>
>> That isn't, technically speaking, correct for the Raspberry Pi - but it is a 
>> workaround if you know you will never actually use the asm implementations 
>> of the atomics, but only the inline C ones..
>>
>> This sort of hides the problem that the dedicated barrier instructions were 
>> not available in ARMv6 (it used "system control coprocessor operations" 
>> instead).
>>
>> If you ever executed the asm implementation, you would trigger an undefined 
>> instruction exception on the Pi.
>
> Hah; sweet.  Ok.
>
> So what's the right answer?  Would it be acceptable to use a no-op for this 
> operation on such architectures?
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
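To make the difference concrete: ARMv7 provides a dedicated data-memory-barrier
instruction (dmb), whereas ARMv6 cores such as the Raspberry Pi's ARM1176 issue
the equivalent barrier through a CP15 system-control-coprocessor operation, and
executing the ARMv7 form there raises the undefined-instruction exception Leif
mentions. A minimal sketch of the two forms (an illustration with assumed
architecture macros, not code taken from the Open MPI sources):

static inline void memory_barrier(void)
{
#if defined(__ARM_ARCH_7A__)
    /* ARMv7 and later: dedicated barrier instruction */
    __asm__ __volatile__ ("dmb" ::: "memory");
#elif defined(__ARM_ARCH_6__) || defined(__ARM_ARCH_6J__) || \
      defined(__ARM_ARCH_6ZK__) || defined(__ARM_ARCH_6K__)
    /* ARMv6: data memory barrier via the CP15 system control coprocessor */
    __asm__ __volatile__ ("mcr p15, 0, %0, c7, c10, 5" : : "r" (0) : "memory");
#else
    /* anything else: let the compiler emit a full barrier */
    __sync_synchronize();
#endif
}

int main(void)
{
    memory_barrier();
    return 0;
}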


Re: [OMPI users] problem with groups and communicators in openmpi-1.6.4rc2

2013-01-19 Thread Ralph Castain
I used your test code to confirm it also fails on our trunk - it looks like 
someone got the reference count wrong when creating/destructing groups.

Afraid I'll have to defer to the authors of that code area...


On Jan 19, 2013, at 1:27 AM, Siegmar Gross 
 wrote:

> [...]

Re: [OMPI users] problem with groups and communicators in openmpi-1.6.4rc2

2013-01-19 Thread Edgar Gabriel
I'll look into that next week.
Edgar

On 1/19/2013 8:44 AM, Ralph Castain wrote:
> I used your test code to confirm it also fails on our trunk - it looks like 
> someone got the reference count wrong when creating/destructing groups.
> 
> Afraid I'll have to defer to the authors of that code area...
> 
> 
> On Jan 19, 2013, at 1:27 AM, Siegmar Gross 
>  wrote:
> 
>> [...]

Re: [OMPI users] problem with groups and communicators in openmpi-1.6.4rc2

2013-01-19 Thread George Bosilca
On Jan 19, 2013, at 15:44 , Ralph Castain  wrote:

> I used your test code to confirm it also fails on our trunk - it looks like 
> someone got the reference count wrong when creating/destructing groups.

No, the code is not MPI compliant.

The culprit is line 254 in the test code, where Siegmar manually copied
group_comm_world into group_worker. This is correct as long as you remember
that group_worker is not an MPI-generated group and, as a result, you are not
allowed to free it.

Now if you replace the assignment

group_worker = group_comm_world

with an MPI operation that creates a copy of the original group, such as

MPI_Comm_group (MPI_COMM_WORLD, &group_worker);

your code becomes MPI-valid and works without any issue in Open MPI.

  George.
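To make this concrete, here is a minimal self-contained sketch of the two
patterns George contrasts; the group variable names follow the message above,
but the rest of the program is an illustration and not the actual data_type_4
test code:

#include <mpi.h>

int main(int argc, char *argv[])
{
    MPI_Group group_comm_world, group_worker;

    MPI_Init(&argc, &argv);
    MPI_Comm_group(MPI_COMM_WORLD, &group_comm_world);

    /* Not MPI-compliant: a plain handle copy.  group_worker then aliases
     * a group the application does not own, so freeing it later (in
     * addition to group_comm_world) corrupts the reference count and can
     * trip the assertion seen earlier in MPI_Finalize().
     *
     * group_worker = group_comm_world;
     */

    /* MPI-compliant: ask MPI for its own handle, which the application
     * is allowed (and expected) to free. */
    MPI_Comm_group(MPI_COMM_WORLD, &group_worker);

    MPI_Group_free(&group_worker);
    MPI_Group_free(&group_comm_world);
    MPI_Finalize();
    return 0;
}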


> 
> Afraid I'll have to defer to the authors of that code area...
> 
> 
> On Jan 19, 2013, at 1:27 AM, Siegmar Gross 
>  wrote:
> 
>> [...]

Re: [OMPI users] problem with groups and communicators in openmpi-1.6.4rc2

2013-01-19 Thread Ralph Castain
Ah - cool! Thanks!

On Jan 19, 2013, at 7:19 AM, George Bosilca  wrote:

> On Jan 19, 2013, at 15:44 , Ralph Castain  wrote:
> 
>> I used your test code to confirm it also fails on our trunk - it looks like 
>> someone got the reference count wrong when creating/destructing groups.
> 
> No, the code is not MPI compliant.
> 
> The culprit is line 254 in the test code where Siegmar manually copied the 
> group_comm_world into group_worker. This is correct as long as you remember 
> that group_worker is not directly an MPI generated group, and as a result you 
> are not allowed to free it.
> 
> Now if you replace the assignment
> 
> group_worker = group_comm_world
> 
> with an MPI operation that creates a copy of the original group, such as
> 
> MPI_Comm_group (MPI_COMM_WORLD, &group_worker);
> 
> your code becomes MPI-valid and works without any issue in Open MPI.
> 
>  George.
> 
> 
>> 
>> Afraid I'll have to defer to the authors of that code area...
>> 
>> 
>> On Jan 19, 2013, at 1:27 AM, Siegmar Gross 
>>  wrote:
>> 
>>> [...]

Re: [OMPI users] Help: OpenMPI Compilation in Raspberry Pi

2013-01-19 Thread Lee Eric
Hi,

I fixed the cross-compile issue. Check the following source code:

opal_config_asm.m4:897: [AC_MSG_ERROR([No atomic primitives available
for $host])])

It seems that check verifies the toolchain's tuple is one of armv7*, armv6*,
or armv5*. I have recompiled my toolchain and no such error occurred.
However, I hit another issue with Fortran while running configure.

*** Fortran 90/95 compiler
checking for armv6-rpi-linux-gnueabi-gfortran...
armv6-rpi-linux-gnueabi-gfortran
checking whether we are using the GNU Fortran compiler... yes
checking whether armv6-rpi-linux-gnueabi-gfortran accepts -g... yes
checking if Fortran 77 compiler works... links (cross compiling)
checking armv6-rpi-linux-gnueabi-gfortran external symbol
convention... single underscore
checking if C and Fortran 77 are link compatible... yes
checking to see if F77 compiler likes the C++ exception flags...
skipped (no C++ exceptions flags)
checking to see if mpif77/mpif90 compilers need additional linker flags... none
checking if Fortran 77 compiler supports CHARACTER... yes
checking size of Fortran 77 CHARACTER... configure: error: Can not
determine size of CHARACTER when cross-compiling

Any hint? Thanks.

Eric

On Sat, Jan 19, 2013 at 10:08 PM, Lee Eric  wrote:
> Any heads up? Thanks.
>
> On Fri, Jan 18, 2013 at 5:28 AM, Jeff Squyres (jsquyres)
>  wrote:
>> [...]


Re: [OMPI users] OMPI 1.6.3, InfiniBand and MTL MXM; unable to make it work!

2013-01-19 Thread Mike Dubman
Hi Francesco,
Can you please provide the complete output of the ibv_devinfo -v command?
Also, it seems that you have CentOS 5.8 with the mxm package built for
CentOS 5.7 installed; we will check whether a distro version incompatibility
may be causing this and update you.

Alina/Josh - please follow.

Regards
M

On Thu, Jan 17, 2013 at 4:09 PM, Francesco Simula <
francesco.sim...@roma1.infn.it> wrote:

> I tried building from OMPI 1.6.3 tarball with the following ./configure:
> ./configure 
> --prefix=/apotto/home1/homedirs/fsimula/Lavoro/openmpi-1.6.3/install/
> \
> --disable-mpi-io \
> --disable-io-romio \
> --enable-dependency-tracking \
> --without-slurm \
> --with-platform=optimized \
> --disable-mpi-f77 \
> --disable-mpi-f90 \
> --with-openib \
> --disable-static \
> --enable-shared \
> --disable-vt \
> --enable-pty-support \
> --enable-mca-no-build=btl-ofud,pml-bfo \
> --with-mxm=/opt/mellanox/mxm \
> --with-mxm-libdir=/opt/mellanox/mxm/lib
>
> As you can see from the last two lines, I want to enable the MXM transport
> layer on a cluster made of SuperMicro X8DTG-D boards with dual Xeons and
> Mellanox MT26428 HCAs; the OS is CentOS 5.8.
>
> I tried with two different .rpm's for MXM, either
> 'mxm-1.1.ad085ef-1.x86_64-centos5u7.rpm' taken from here:
> http://www.mellanox.com/downloads/hpc/mxm/v1.1/mxm-latest.tar
>
> and 'mxm-1.5.f583875-1.x86_64-centos5u7.rpm' taken from here:
> http://www.mellanox.com/downloads/hpc/mxm/v1.5/mxm-latest.tar
>
> With both, even if the compilation concludes successfully, a simple test
> (osu_bw from the OSU Micro-Benchmarks 3.8) fails with the sort of message
> reported below; the lines:
>
> rdma_dev.c:122  MXM DEBUG Port 1 on mlx4_0 has a link layer different from
> IB. Skipping it
> rdma_dev.c:155  MXM ERROR An active IB port on a Mellanox device, with lid
> [any] gid [any] not found
>
> make it seem like it cannot access the HW for the HCA: is that so? The
> very same test works when using '-mca pml ob1' (thus using the openib BTL).
>
> I'm quite ready to start pulling my hair out; any suggestions?
>
> The output of /usr/bin/ibv_devinfo for the two cluster nodes follows:
> [cut]
> hca_id: mlx4_0
> transport:  InfiniBand (0)
> fw_ver: 2.7.000
> node_guid:  0025:90ff:ff07:0ac4
> sys_image_guid: 0025:90ff:ff07:0ac7
> vendor_id:  0x02c9
> vendor_part_id: 26428
> hw_ver: 0xB0
> board_id:   SM_106101000
> phys_port_cnt:  1
> port:   1
> state:  PORT_ACTIVE (4)
> max_mtu:2048 (4)
> active_mtu: 2048 (4)
> sm_lid: 4
> port_lid:   6
> port_lmc:   0x00
> [/cut]
>
> [cut]
> hca_id: mlx4_0
> transport:  InfiniBand (0)
> fw_ver: 2.7.000
> node_guid:  0025:90ff:ff07:0acc
> sys_image_guid: 0025:90ff:ff07:0acf
> vendor_id:  0x02c9
> vendor_part_id: 26428
> hw_ver: 0xB0
> board_id:   SM_106101000
> phys_port_cnt:  1
> port:   1
> state:  PORT_ACTIVE (4)
> max_mtu:2048 (4)
> active_mtu: 2048 (4)
> sm_lid: 4
> port_lid:   8
> port_lmc:   0x00
> [/cut]
>
> The complete output of the failing test follows:
>
> [fsimula@agape5 osu-micro-benchmarks-3.8]$ mpirun -x MXM_LOG_LEVEL=poll
> -mca pml cm -mca mtl_mxm_np 1 -np 2 -host agape4,agape5
> install/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_bw H H
> [1358430343.266782] [agape5:8596 :0] config_parser.c:168  MXM DEBUG
> [1358430343.266815] [agape5:8596 :0] config_parser.c:168  MXM DEBUG
> default: MXM_HANDLE_ERRORS=bt
> [1358430343.266826] [agape5:8596 :0] config_parser.c:168  MXM DEBUG
> default: MXM_GDB_PATH=/usr/bin/gdb
> [1358430343.266838] [agape5:8596 :0] config_parser.c:168  MXM DEBUG
> default: MXM_DUMP_SIGNO=1
> [1358430343.266851] [agape5:8596 :0] config_parser.c:168  MXM DEBUG
> default: MXM_DUMP_LEVEL=conn
> [1358430343.266924] [agape5:8596 :0] config_parser.c:168  MXM DEBUG
> default: MXM_ASYNC_MODE=THREAD
> [1358430343.266936] [agape5:8596 :0] config_parser.c:168  MXM DEBUG
> default: MXM_TIME_ACCURACY=0.1
> [1358430343.266956] [agape5:8596 
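The MXM lines quoted above ("Port 1 on mlx4_0 has a link layer different from
IB" / "An active IB port ... not found") amount to a check of the link-layer
attribute each port reports. A minimal libibverbs sketch that prints it, for
comparison with ibv_devinfo (an illustration only, assuming a libibverbs new
enough to expose the link_layer field; it is not part of Open MPI or MXM):

#include <stdio.h>
#include <stdint.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num_devices = 0, i;
    uint8_t p;
    struct ibv_device **devs = ibv_get_device_list(&num_devices);

    for (i = 0; i < num_devices; ++i) {
        struct ibv_context *ctx = ibv_open_device(devs[i]);
        struct ibv_device_attr dev_attr;

        if (ctx == NULL)
            continue;
        if (ibv_query_device(ctx, &dev_attr) == 0) {
            for (p = 1; p <= dev_attr.phys_port_cnt; ++p) {
                struct ibv_port_attr port_attr;

                if (ibv_query_port(ctx, p, &port_attr) != 0)
                    continue;
                /* MXM wants an active port whose link layer is InfiniBand */
                printf("%s port %u: state=%d link_layer=%s\n",
                       ibv_get_device_name(devs[i]), p, (int) port_attr.state,
                       port_attr.link_layer == IBV_LINK_LAYER_ETHERNET ?
                           "Ethernet" : "InfiniBand");
            }
        }
        ibv_close_device(ctx);
    }
    if (devs != NULL)
        ibv_free_device_list(devs);
    return 0;
}

(Link with -libverbs; the file name, e.g. check_ports.c, is hypothetical.)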

Re: [OMPI users] OMPI 1.6.3, InfiniBand and MTL MXM; unable to make it work!

2013-01-19 Thread Mike Dubman
Also, what MOFED/OFED version do you have?
MXM is compiled per OFED/MOFED version; does the selected mxm rpm match the
active OFED?

On Thu, Jan 17, 2013 at 4:09 PM, Francesco Simula <
francesco.sim...@roma1.infn.it> wrote:

> [...]

Re: [OMPI users] Help: OpenMPI Compilation in Raspberry Pi

2013-01-19 Thread Lee Eric
Hi,

I just used --disable-mpif77 and --disable-mpif90 to let configure run
through. However, I know that is only a rough workaround. After configuring
successfully, I encounter the following error when running make:
Making all in config
make[1]: Entering directory `/home/huli/Projects/openmpi-1.6.3/config'
make[1]: Nothing to be done for `all'.
make[1]: Leaving directory `/home/huli/Projects/openmpi-1.6.3/config'
Making all in contrib
make[1]: Entering directory `/home/huli/Projects/openmpi-1.6.3/contrib'
make[1]: Nothing to be done for `all'.
make[1]: Leaving directory `/home/huli/Projects/openmpi-1.6.3/contrib'
Making all in opal
make[1]: Entering directory `/home/huli/Projects/openmpi-1.6.3/opal'
Making all in include
make[2]: Entering directory `/home/huli/Projects/openmpi-1.6.3/opal/include'
make  all-am
make[3]: Entering directory `/home/huli/Projects/openmpi-1.6.3/opal/include'
make[3]: Leaving directory `/home/huli/Projects/openmpi-1.6.3/opal/include'
make[2]: Leaving directory `/home/huli/Projects/openmpi-1.6.3/opal/include'
Making all in libltdl
make[2]: Entering directory `/home/huli/Projects/openmpi-1.6.3/opal/libltdl'
make  all-am
make[3]: Entering directory `/home/huli/Projects/openmpi-1.6.3/opal/libltdl'
/bin/sh ./libtool  --tag=CC   --mode=compile
armv6-rpi-linux-gnueabi-gcc -DHAVE_CONFIG_H -I.
-DLT_CONFIG_H='' -DLTDL -I. -I. -Ilibltdl -I./libltdl
-I./libltdl 
-I/home/huli/Projects/openmpi-1.6.3/opal/mca/hwloc/hwloc132/hwloc/include
  -I/usr/include/infiniband -I/usr/include/infiniband   -Ofast
-mfpu=vfp -mfloat-abi=hard -MT dlopen.lo -MD -MP -MF .deps/dlopen.Tpo
-c -o dlopen.lo `test -f 'loaders/dlopen.c' || echo
'./'`loaders/dlopen.c
/bin/sh ./libtool  --tag=CC   --mode=compile
armv6-rpi-linux-gnueabi-gcc -DHAVE_CONFIG_H -I.  -DLTDLOPEN=libltdlc
-DLT_CONFIG_H='' -DLTDL -I. -I. -Ilibltdl -I./libltdl
-I./libltdl 
-I/home/huli/Projects/openmpi-1.6.3/opal/mca/hwloc/hwloc132/hwloc/include
  -I/usr/include/infiniband -I/usr/include/infiniband   -Ofast
-mfpu=vfp -mfloat-abi=hard -MT libltdlc_la-preopen.lo -MD -MP -MF
.deps/libltdlc_la-preopen.Tpo -c -o libltdlc_la-preopen.lo `test -f
'loaders/preopen.c' || echo './'`loaders/preopen.c
/bin/sh ./libtool  --tag=CC   --mode=compile
armv6-rpi-linux-gnueabi-gcc -DHAVE_CONFIG_H -I.  -DLTDLOPEN=libltdlc
-DLT_CONFIG_H='' -DLTDL -I. -I. -Ilibltdl -I./libltdl
-I./libltdl 
-I/home/huli/Projects/openmpi-1.6.3/opal/mca/hwloc/hwloc132/hwloc/include
  -I/usr/include/infiniband -I/usr/include/infiniband   -Ofast
-mfpu=vfp -mfloat-abi=hard -MT libltdlc_la-lt__alloc.lo -MD -MP -MF
.deps/libltdlc_la-lt__alloc.Tpo -c -o libltdlc_la-lt__alloc.lo `test
-f 'lt__alloc.c' || echo './'`lt__alloc.c
/bin/sh ./libtool  --tag=CC   --mode=compile
armv6-rpi-linux-gnueabi-gcc -DHAVE_CONFIG_H -I.  -DLTDLOPEN=libltdlc
-DLT_CONFIG_H='' -DLTDL -I. -I. -Ilibltdl -I./libltdl
-I./libltdl 
-I/home/huli/Projects/openmpi-1.6.3/opal/mca/hwloc/hwloc132/hwloc/include
  -I/usr/include/infiniband -I/usr/include/infiniband   -Ofast
-mfpu=vfp -mfloat-abi=hard -MT libltdlc_la-lt_dlloader.lo -MD -MP -MF
.deps/libltdlc_la-lt_dlloader.Tpo -c -o libltdlc_la-lt_dlloader.lo
`test -f 'lt_dlloader.c' || echo './'`lt_dlloader.c
/bin/sh ./libtool  --tag=CC   --mode=compile
armv6-rpi-linux-gnueabi-gcc -DHAVE_CONFIG_H -I.  -DLTDLOPEN=libltdlc
-DLT_CONFIG_H='' -DLTDL -I. -I. -Ilibltdl -I./libltdl
-I./libltdl 
-I/home/huli/Projects/openmpi-1.6.3/opal/mca/hwloc/hwloc132/hwloc/include
  -I/usr/include/infiniband -I/usr/include/infiniband   -Ofast
-mfpu=vfp -mfloat-abi=hard -MT libltdlc_la-lt_error.lo -MD -MP -MF
.deps/libltdlc_la-lt_error.Tpo -c -o libltdlc_la-lt_error.lo `test -f
'lt_error.c' || echo './'`lt_error.c
/bin/sh ./libtool  --tag=CC   --mode=compile
armv6-rpi-linux-gnueabi-gcc -DHAVE_CONFIG_H -I.  -DLTDLOPEN=libltdlc
-DLT_CONFIG_H='' -DLTDL -I. -I. -Ilibltdl -I./libltdl
-I./libltdl 
-I/home/huli/Projects/openmpi-1.6.3/opal/mca/hwloc/hwloc132/hwloc/include
  -I/usr/include/infiniband -I/usr/include/infiniband   -Ofast
-mfpu=vfp -mfloat-abi=hard -MT libltdlc_la-ltdl.lo -MD -MP -MF
.deps/libltdlc_la-ltdl.Tpo -c -o libltdlc_la-ltdl.lo `test -f 'ltdl.c'
|| echo './'`ltdl.c
/bin/sh ./libtool  --tag=CC   --mode=compile
armv6-rpi-linux-gnueabi-gcc -DHAVE_CONFIG_H -I.  -DLTDLOPEN=libltdlc
-DLT_CONFIG_H='' -DLTDL -I. -I. -Ilibltdl -I./libltdl
-I./libltdl 
-I/home/huli/Projects/openmpi-1.6.3/opal/mca/hwloc/hwloc132/hwloc/include
  -I/usr/include/infiniband -I/usr/include/infiniband   -Ofast
-mfpu=vfp -mfloat-abi=hard -MT libltdlc_la-slist.lo -MD -MP -MF
.deps/libltdlc_la-slist.Tpo -c -o libltdlc_la-slist.lo `test -f
'slist.c' || echo './'`slist.c
/bin/sh ./libtool --tag=CC   --mode=compile
armv6-rpi-linux-gnueabi-gcc -DHAVE_CONFIG_H -I.
-DLT_CONFIG_H='' -DLTDL -I. -I. -Ilibltdl -I./libltdl
-I./libltdl 
-I/home/huli/Projects/openmpi-1.6.3/opal/mca/hwloc/hwloc132/hwloc/include
  -I/usr/include/infiniband -I/usr/include/infiniband   -Ofast
-mfpu=vfp -mfloat-abi=hard -MT lt__strl.lo -MD -MP -MF
.deps/lt__strl.Tpo -c -o lt__strl.lo lt__strl

Re: [OMPI users] Help: OpenMPI Compilation in Raspberry Pi

2013-01-19 Thread Lee Eric
Hi,

The above issue is fixed with this patch I used:
https://raw.github.com/sebhtml/patches/master/openmpi/Raspberry-Pi-openmpi-1.6.2.patch

Would it be possible for Open MPI to include this patch in the future?

Thanks.

On Sun, Jan 20, 2013 at 3:13 AM, Lee Eric  wrote:
> Hi,
>
> [...]