Hi OpenMPI Users,

Has anyone successfully tested OpenMPI 1.10.6 with PGI 17.1.0 on POWER8 with
the LSF scheduler (--with-lsf=..)?

I am getting the error below when the code reaches MPI_Finalize. It causes the
job to abort (i.e. it exits the LSF session) when I am running interactively.

Are there any materials we can supply to aid debugging/problem isolation?

[white23:58788] *** Process received signal ***
[white23:58788] Signal: Segmentation fault (11)
[white23:58788] Signal code: Invalid permissions (2)
[white23:58788] Failing at address: 0x1000008e0810
[white23:58788] [ 0] [0x100000050478]
[white23:58788] [ 1] [0x0]
[white23:58788] [ 2] /home/projects/pwr8-rhel73-lsf/openmpi/1.10.6/pgi/17.1.0/cuda/none/lib/libopen-rte.so.12(+0x1b6b0)[0x10000071b6b0]
[white23:58788] [ 3] /home/projects/pwr8-rhel73-lsf/openmpi/1.10.6/pgi/17.1.0/cuda/none/lib/libopen-rte.so.12(orte_finalize+0x70)[0x10000071b5b8]
[white23:58788] [ 4] /home/projects/pwr8-rhel73-lsf/openmpi/1.10.6/pgi/17.1.0/cuda/none/lib/libmpi.so.12(ompi_mpi_finalize+0x760)[0x100000121dc8]
[white23:58788] [ 5] /home/projects/pwr8-rhel73-lsf/openmpi/1.10.6/pgi/17.1.0/cuda/none/lib/libmpi.so.12(PMPI_Finalize+0x6c)[0x100000153154]
[white23:58788] [ 6] ./IMB-MPI1[0x100028dc]
[white23:58788] [ 7] /lib64/libc.so.6(+0x24700)[0x1000004b4700]
[white23:58788] [ 8] /lib64/libc.so.6(__libc_start_main+0xc4)[0x1000004b48f4]
[white23:58788] *** End of error message ***
[white22:73620] *** Process received signal ***
[white22:73620] Signal: Segmentation fault (11)
[white22:73620] Signal code: Invalid permissions (2)
[white22:73620] Failing at address: 0x1000008e0810


Thanks,

S.

--

Si Hammond
Scalable Computer Architectures
Sandia National Laboratories, NM, USA

[Sent from Remote Connection, Please excuse typos]



