If you wouldn't mind, yes - let's see if it is a problem with icc. We know some versions have bugs, though this may not be the issue here.
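A gcc comparison build is probably the quickest way to check that. Something along these lines should do it - the source directory, install prefix, and -j level below are just placeholders, adjust them to your layout:

$: cd openmpi-1.8.1
$: ./configure CC=gcc CXX=g++ FC=gfortran --prefix=/softs/openmpi-1.8.1-gcc
$: make -j4 all && make install

Then rebuild the testcase with that mpicc and run it again, both directly and under mpirun, so we can compare against the icc build.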
On May 26, 2014, at 7:39 AM, Alain Miniussi <alain.miniu...@oca.eu> wrote:

> Hi,
>
> Did that too, with the same result:
>
> [alainm@tagir mpi]$ mpirun -n 1 ./a.out
> [tagir:05123] *** Process received signal ***
> [tagir:05123] Signal: Floating point exception (8)
> [tagir:05123] Signal code: Integer divide-by-zero (1)
> [tagir:05123] Failing at address: 0x2adb507b3d9f
> [tagir:05123] [ 0] /lib64/libpthread.so.0[0x30f920f710]
> [tagir:05123] [ 1] /softs/openmpi-1.8.1-intel13/lib/openmpi/mca_btl_openib.so(mca_btl_openib_add_procs+0xe9f)[0x2adb507b3d9f]
> [tagir:05123] [ 2] /softs/openmpi-1.8.1-intel13/lib/openmpi/mca_bml_r2.so(+0x1481)[0x2adb505a7481]
> [tagir:05123] [ 3] /softs/openmpi-1.8.1-intel13/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_add_procs+0xa8)[0x2adb51af02f8]
> [tagir:05123] [ 4] /softs/openmpi-1.8.1-intel13/lib/libmpi.so.1(ompi_mpi_init+0x9f6)[0x2adb4b78b236]
> [tagir:05123] [ 5] /softs/openmpi-1.8.1-intel13/lib/libmpi.so.1(MPI_Init+0xef)[0x2adb4b7ad74f]
> [tagir:05123] [ 6] ./a.out[0x400dd1]
> [tagir:05123] [ 7] /lib64/libc.so.6(__libc_start_main+0xfd)[0x30f8a1ed1d]
> [tagir:05123] [ 8] ./a.out[0x400cc9]
> [tagir:05123] *** End of error message ***
> --------------------------------------------------------------------------
> mpirun noticed that process rank 0 with PID 5123 on node tagir exited on
> signal 13 (Broken pipe).
> --------------------------------------------------------------------------
> [alainm@tagir mpi]$
>
> Do you want me to try a gcc build?
>
> Alain
>
> On 26/05/2014 16:09, Ralph Castain wrote:
>> Strange - I note that you are running these as singletons. Can you try
>> running it under mpirun?
>>
>> mpirun -n 1 ./a.out
>>
>> just to see if it is the singleton that is causing the problem, or
>> something in the openib btl itself.
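Since both backtraces die inside mca_btl_openib_add_procs, one more quick test that does not require rebuilding anything: run the same binary with the openib BTL excluded and see whether MPI_Init then completes. Something like this should work (untested here, adjust as needed):

$: mpirun -n 1 --mca btl ^openib ./a.out

or, forcing only the local transports:

$: mpirun -n 1 --mca btl self,sm ./a.out

If those run clean, the divide-by-zero is confined to the openib component rather than to MPI_Init in general.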
>> On May 26, 2014, at 6:59 AM, Alain Miniussi <alain.miniu...@oca.eu> wrote:
>>
>>> Hi,
>>>
>>> I have a failure with the following minimalistic testcase:
>>> $: more ./test.c
>>> #include "mpi.h"
>>>
>>> int main(int argc, char* argv[]) {
>>>     MPI_Init(&argc, &argv);
>>>     MPI_Finalize();
>>>     return 0;
>>> }
>>> $: mpicc -v
>>> icc version 13.1.1 (gcc version 4.4.7 compatibility)
>>> $: mpicc ./test.c
>>> $: ./a.out
>>> [tagir:02855] *** Process received signal ***
>>> [tagir:02855] Signal: Floating point exception (8)
>>> [tagir:02855] Signal code: Integer divide-by-zero (1)
>>> [tagir:02855] Failing at address: 0x2aef6e5b2d9f
>>> [tagir:02855] [ 0] /lib64/libpthread.so.0[0x30f920f710]
>>> [tagir:02855] [ 1] /softs/openmpi-1.8.1-intel13/lib/openmpi/mca_btl_openib.so(mca_btl_openib_add_procs+0xe9f)[0x2aef6e5b2d9f]
>>> [tagir:02855] [ 2] /softs/openmpi-1.8.1-intel13/lib/openmpi/mca_bml_r2.so(+0x1481)[0x2aef6e3a6481]
>>> [tagir:02855] [ 3] /softs/openmpi-1.8.1-intel13/lib/openmpi/mca_pml_ob1.so(mca_pml_ob1_add_procs+0xa8)[0x2aef6f8ef2f8]
>>> [tagir:02855] [ 4] /softs/openmpi-1.8.1-intel13/lib/libmpi.so.1(ompi_mpi_init+0x9f6)[0x2aef69572236]
>>> [tagir:02855] [ 5] /softs/openmpi-1.8.1-intel13/lib/libmpi.so.1(MPI_Init+0xef)[0x2aef6959474f]
>>> [tagir:02855] [ 6] ./a.out[0x400dd1]
>>> [tagir:02855] [ 7] /lib64/libc.so.6(__libc_start_main+0xfd)[0x30f8a1ed1d]
>>> [tagir:02855] [ 8] ./a.out[0x400cc9]
>>> [tagir:02855] *** End of error message ***
>>> $:
>>>
>>> Version info:
>>> $: mpicc -v
>>> icc version 13.1.1 (gcc version 4.4.7 compatibility)
>>> $: ldd ./a.out
>>>     linux-vdso.so.1 => (0x00007fffbb197000)
>>>     libmpi.so.1 => /softs/openmpi-1.8.1-intel13/lib/libmpi.so.1 (0x00002b20262ee000)
>>>     libm.so.6 => /lib64/libm.so.6 (0x00000030f8e00000)
>>>     libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00000030ff200000)
>>>     libpthread.so.0 => /lib64/libpthread.so.0 (0x00000030f9200000)
>>>     libc.so.6 => /lib64/libc.so.6 (0x00000030f8a00000)
>>>     libdl.so.2 => /lib64/libdl.so.2 (0x00000030f9600000)
>>>     libopen-rte.so.7 => /softs/openmpi-1.8.1-intel13/lib/libopen-rte.so.7 (0x00002b202660d000)
>>>     libopen-pal.so.6 => /softs/openmpi-1.8.1-intel13/lib/libopen-pal.so.6 (0x00002b20268a1000)
>>>     libnuma.so.1 => /usr/lib64/libnuma.so.1 (0x00002b2026ba6000)
>>>     librt.so.1 => /lib64/librt.so.1 (0x00000030f9e00000)
>>>     libnsl.so.1 => /lib64/libnsl.so.1 (0x0000003109800000)
>>>     libutil.so.1 => /lib64/libutil.so.1 (0x000000310aa00000)
>>>     libimf.so => /softs/intel/composer_xe_2013.3.163/compiler/lib/intel64/libimf.so (0x00002b2026db0000)
>>>     libsvml.so => /softs/intel/composer_xe_2013.3.163/compiler/lib/intel64/libsvml.so (0x00002b202726d000)
>>>     libirng.so => /softs/intel/composer_xe_2013.3.163/compiler/lib/intel64/libirng.so (0x00002b2027c37000)
>>>     libintlc.so.5 => /softs/intel/composer_xe_2013.3.163/compiler/lib/intel64/libintlc.so.5 (0x00002b2027e3e000)
>>>     /lib64/ld-linux-x86-64.so.2 (0x00000030f8600000)
>>> $:
>>>
>>> I tried to google the issue, and saw something regarding an old
>>> vectorization bug with the intel compiler, but that was a long time ago
>>> and seemed to be fixed for 1.6.x.
>>> Also, "make check" went fine???
>>>
>>> Any idea?
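If it does turn out to be that old vectorizer problem resurfacing, one more thing worth trying alongside the gcc build is an icc build of Open MPI with vectorization turned down. A rough sketch only - the flags and prefix are examples, not a tested recipe:

$: ./configure CC=icc CXX=icpc FC=ifort CFLAGS="-O1 -no-vec" --prefix=/softs/openmpi-1.8.1-intel13-novec
$: make -j4 all && make install

If that build stops crashing while the default icc build still does, that would point fairly strongly at the compiler rather than at Open MPI.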
>>> Cheers
>>>
>>> --
>>> ---
>>> Alain
>
> --
> ---
> Alain