If you wouldn't mind, yes - let's see if it is a problem with icc. We know some 
icc versions have code-generation bugs, though that may not be the issue here.
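
For the gcc comparison, something along these lines should do - the install 
prefix below is just an example, following the naming of your intel13 build:

  # rebuild Open MPI 1.8.1 with gcc instead of icc
  # (prefix is an example path - adjust to your layout)
  ./configure CC=gcc CXX=g++ FC=gfortran --prefix=/softs/openmpi-1.8.1-gcc
  make -j4 all && make install

  # recompile the testcase with the gcc build and rerun it under mpirun
  /softs/openmpi-1.8.1-gcc/bin/mpicc ./test.c
  mpirun -n 1 ./a.out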

On May 26, 2014, at 7:39 AM, Alain Miniussi <alain.miniu...@oca.eu> wrote:

> 
> Hi,
> 
> Did that too, with the same result:
> 
> [alainm@tagir mpi]$ mpirun -n 1 ./a.out
> [tagir:05123] *** Process received signal ***
> [tagir:05123] Signal: Floating point exception (8)
> [tagir:05123] Signal code: Integer divide-by-zero (1)
> [tagir:05123] Failing at address: 0x2adb507b3d9f
> [tagir:05123] [ 0] /lib64/libpthread.so.0[0x30f920f710]
> [tagir:05123] [ 1] 
> /softs/openmpi-1.8.1-intel13/lib/openmpi/mca_btl_openib.so(mca_btl_openib_add_procs+0xe9f)[0x2adb507b3d9f]
> [tagir:05123] [ 2] 
> /softs/openmpi-1.8.1-intel13/lib/openmpi/mca_bml_r2.so(+0x1481)[0x2adb505a7481]
> [tagir:05123] [ 3] 
> /softs/openmpi-1.8.1-intel13/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_add_procs+0xa8)[0x2adb51af02f8]
> [tagir:05123] [ 4] 
> /softs/openmpi-1.8.1-intel13/lib/libmpi.so.1(ompi_mpi_init+0x9f6)[0x2adb4b78b236]
> [tagir:05123] [ 5] 
> /softs/openmpi-1.8.1-intel13/lib/libmpi.so.1(MPI_Init+0xef)[0x2adb4b7ad74f]
> [tagir:05123] [ 6] ./a.out[0x400dd1]
> [tagir:05123] [ 7] /lib64/libc.so.6(__libc_start_main+0xfd)[0x30f8a1ed1d]
> [tagir:05123] [ 8] ./a.out[0x400cc9]
> [tagir:05123] *** End of error message ***
> --------------------------------------------------------------------------
> mpirun noticed that process rank 0 with PID 5123 on node tagir exited on 
> signal 13 (Broken pipe).
> --------------------------------------------------------------------------
> [alainm@tagir mpi]$
> 
> 
> Do you want me to try a gcc build?
> 
> Alain
> 
> On 26/05/2014 16:09, Ralph Castain wrote:
>> Strange - I note that you are running these as singletons. Can you try 
>> running it under mpirun?
>> 
>> mpirun -n 1 ./a.out
>> 
>> just to see if it is the singleton that is causing the problem, or something 
>> in the openib btl itself.
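>> 
>> If mpirun shows the same crash, another quick data point would be to take 
>> the openib btl out of the picture via MCA and see whether the failure 
>> disappears - e.g. something like:
>> 
>>    mpirun -n 1 --mca btl ^openib ./a.out
>> 
>> (the ^ tells Open MPI to use every btl except openib)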
>> 
>> 
>> On May 26, 2014, at 6:59 AM, Alain Miniussi <alain.miniu...@oca.eu> wrote:
>> 
>>> Hi,
>>> 
>>> I have a failure with the following minimalistic testcase:
>>> $: more ./test.c
>>> #include "mpi.h"
>>> 
>>> int main(int argc, char* argv[]) {
>>>    MPI_Init(&argc,&argv);
>>>    MPI_Finalize();
>>>    return 0;
>>> }
>>> $: mpicc -v
>>> icc version 13.1.1 (gcc version 4.4.7 compatibility)
>>> $: mpicc ./test.c
>>> $: ./a.out
>>> [tagir:02855] *** Process received signal ***
>>> [tagir:02855] Signal: Floating point exception (8)
>>> [tagir:02855] Signal code: Integer divide-by-zero (1)
>>> [tagir:02855] Failing at address: 0x2aef6e5b2d9f
>>> [tagir:02855] [ 0] /lib64/libpthread.so.0[0x30f920f710]
>>> [tagir:02855] [ 1] 
>>> /softs/openmpi-1.8.1-intel13/lib/openmpi/mca_btl_openib.so(mca_btl_openib_add_procs+0xe9f)[0x2aef6e5b2d9f]
>>> [tagir:02855] [ 2] 
>>> /softs/openmpi-1.8.1-intel13/lib/openmpi/mca_bml_r2.so(+0x1481)[0x2aef6e3a6481]
>>> [tagir:02855] [ 3] 
>>> /softs/openmpi-1.8.1-intel13/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_add_procs+0xa8)[0x2aef6f8ef2f8]
>>> [tagir:02855] [ 4] 
>>> /softs/openmpi-1.8.1-intel13/lib/libmpi.so.1(ompi_mpi_init+0x9f6)[0x2aef69572236]
>>> [tagir:02855] [ 5] 
>>> /softs/openmpi-1.8.1-intel13/lib/libmpi.so.1(MPI_Init+0xef)[0x2aef6959474f]
>>> [tagir:02855] [ 6] ./a.out[0x400dd1]
>>> [tagir:02855] [ 7] /lib64/libc.so.6(__libc_start_main+0xfd)[0x30f8a1ed1d]
>>> [tagir:02855] [ 8] ./a.out[0x400cc9]
>>> [tagir:02855] *** End of error message ***
>>> $:
>>> 
>>> Versions info:
>>> $: mpicc -v
>>> icc version 13.1.1 (gcc version 4.4.7 compatibility)
>>> $: ldd ./a.out
>>>    linux-vdso.so.1 =>  (0x00007fffbb197000)
>>>    libmpi.so.1 => /softs/openmpi-1.8.1-intel13/lib/libmpi.so.1 
>>> (0x00002b20262ee000)
>>>    libm.so.6 => /lib64/libm.so.6 (0x00000030f8e00000)
>>>    libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00000030ff200000)
>>>    libpthread.so.0 => /lib64/libpthread.so.0 (0x00000030f9200000)
>>>    libc.so.6 => /lib64/libc.so.6 (0x00000030f8a00000)
>>>    libdl.so.2 => /lib64/libdl.so.2 (0x00000030f9600000)
>>>    libopen-rte.so.7 => /softs/openmpi-1.8.1-intel13/lib/libopen-rte.so.7 
>>> (0x00002b202660d000)
>>>    libopen-pal.so.6 => /softs/openmpi-1.8.1-intel13/lib/libopen-pal.so.6 
>>> (0x00002b20268a1000)
>>>    libnuma.so.1 => /usr/lib64/libnuma.so.1 (0x00002b2026ba6000)
>>>    librt.so.1 => /lib64/librt.so.1 (0x00000030f9e00000)
>>>    libnsl.so.1 => /lib64/libnsl.so.1 (0x0000003109800000)
>>>    libutil.so.1 => /lib64/libutil.so.1 (0x000000310aa00000)
>>>    libimf.so => 
>>> /softs/intel/composer_xe_2013.3.163/compiler/lib/intel64/libimf.so 
>>> (0x00002b2026db0000)
>>>    libsvml.so => 
>>> /softs/intel/composer_xe_2013.3.163/compiler/lib/intel64/libsvml.so 
>>> (0x00002b202726d000)
>>>    libirng.so => 
>>> /softs/intel/composer_xe_2013.3.163/compiler/lib/intel64/libirng.so 
>>> (0x00002b2027c37000)
>>>    libintlc.so.5 => 
>>> /softs/intel/composer_xe_2013.3.163/compiler/lib/intel64/libintlc.so.5 
>>> (0x00002b2027e3e000)
>>>    /lib64/ld-linux-x86-64.so.2 (0x00000030f8600000)
>>> $:
>>> 
>>> I tried to google the issue and saw something about an old vectorization 
>>> bug in the Intel compiler, but that was a long time ago and seemed to be 
>>> fixed as of 1.6.x.
>>> Also, "make check" ran fine, oddly enough.
>>> 
>>> Any ideas?
>>> 
>>> Cheers
>>> 
>>> -- 
>>> ---
>>> Alain
>>> 
> 
> 
> -- 
> ---
> Alain
> 
