Hi,

I tried running the osu_benchmarks-3.1.1 suite with openmpi-1.4.3. Of the 14 tests in the suite, 10 run to completion. The remaining four (osu_put_bibw, osu_put_bw, osu_get_bw, osu_latency_mt) hang at some message size. The output of osu_put_bibw and osu_put_bw is shown below.

[root@test2 ~]# mpirun --prefix /usr/local/ -np 2 --mca btl openib,self,sm -H 192.168.0.175,192.168.0.174 --mca orte_base_help_aggregate 0 /root/ramu/ofed_pkgs/osu_benchmarks-3.1.1/osu_put_bibw
failed to create doorbell file /dev/plx2_char_dev
--------------------------------------------------------------------------
WARNING: No preset parameters were found for the device that Open MPI
detected:

  Local host:            test1
  Device name:           plx2_0
  Device vendor ID:      0x10b5
  Device vendor part ID: 4277

Default device parameters will be used, which may result in lower
performance.  You can edit any of the files specified by the
btl_openib_device_param_files MCA parameter to set values for your
device.

NOTE: You can turn off this warning by setting the MCA parameter
      btl_openib_warn_no_device_params_found to 0.
--------------------------------------------------------------------------
failed to create doorbell file /dev/plx2_char_dev
--------------------------------------------------------------------------
WARNING: No preset parameters were found for the device that Open MPI
detected:

  Local host:            test2
  Device name:           plx2_0
  Device vendor ID:      0x10b5
  Device vendor part ID: 4277

Default device parameters will be used, which may result in lower
performance.  You can edit any of the files specified by the
btl_openib_device_param_files MCA parameter to set values for your
device.

NOTE: You can turn off this warning by setting the MCA parameter
      btl_openib_warn_no_device_params_found to 0.
--------------------------------------------------------------------------
alloc_srq max: 512 wqe_shift: 5
alloc_srq max: 512 wqe_shift: 5
alloc_srq max: 512 wqe_shift: 5
alloc_srq max: 512 wqe_shift: 5
alloc_srq max: 512 wqe_shift: 5
alloc_srq max: 512 wqe_shift: 5
# OSU One Sided MPI_Put Bi-directional Bandwidth Test v3.1.1
# Size     Bi-Bandwidth (MB/s)
plx2_create_qp line: 415
plx2_create_qp line: 415
plx2_create_qp line: 415
plx2_create_qp line: 415
1                         0.00
2                         0.00
4                         0.01
8                         0.03
16                        0.07
32                        0.15
64                        0.11
128                       0.21
256                       0.43
512                       0.88
1024                      2.10
2048                      4.21
4096                      8.10
8192                     16.19
16384                     8.46
32768                    20.34
65536                    39.85
131072                   84.22
262144                  142.23
524288                  234.83
mpirun: killing job...

--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 7305 on node test2 exited on signal 0 (Unknown signal 0).
--------------------------------------------------------------------------
2 total processes killed (some possibly by mpirun during cleanup)
mpirun: clean termination accomplished

[root@test2 ~]# mpirun --prefix /usr/local/ -np 2 --mca btl openib,self,sm -H 192.168.0.175,192.168.0.174 --mca orte_base_help_aggregate 0 /root/ramu/ofed_pkgs/osu_benchmarks-3.1.1/osu_put_bw
failed to create doorbell file /dev/plx2_char_dev
--------------------------------------------------------------------------
WARNING: No preset parameters were found for the device that Open MPI
detected:

  Local host:            test1
  Device name:           plx2_0
  Device vendor ID:      0x10b5
  Device vendor part ID: 4277

Default device parameters will be used, which may result in lower
performance.  You can edit any of the files specified by the
btl_openib_device_param_files MCA parameter to set values for your
device.

NOTE: You can turn off this warning by setting the MCA parameter
      btl_openib_warn_no_device_params_found to 0.
--------------------------------------------------------------------------
failed to create doorbell file /dev/plx2_char_dev
--------------------------------------------------------------------------
WARNING: No preset parameters were found for the device that Open MPI
detected:

  Local host:            test2
  Device name:           plx2_0
  Device vendor ID:      0x10b5
  Device vendor part ID: 4277

Default device parameters will be used, which may result in lower
performance.  You can edit any of the files specified by the
btl_openib_device_param_files MCA parameter to set values for your
device.

NOTE: You can turn off this warning by setting the MCA parameter
      btl_openib_warn_no_device_params_found to 0.
--------------------------------------------------------------------------
alloc_srq max: 512 wqe_shift: 5
alloc_srq max: 512 wqe_shift: 5
alloc_srq max: 512 wqe_shift: 5
alloc_srq max: 512 wqe_shift: 5
alloc_srq max: 512 wqe_shift: 5
alloc_srq max: 512 wqe_shift: 5
# OSU One Sided MPI_Put Bandwidth Test v3.1.1
# Size        Bandwidth (MB/s)
plx2_create_qp line: 415
plx2_create_qp line: 415
plx2_create_qp line: 415
plx2_create_qp line: 415
1                         0.02
2                         0.05
4                         0.10
8                         0.19
16                        0.39
32                        0.77
64                        1.53
128                       2.57
256                       4.16
512                       8.30
1024                     16.62
2048                     33.22
4096                     66.51
8192                     42.45
16384                    11.99
32768                    18.20
65536                    76.04
131072                   98.64
262144                  407.66
524288                  489.84
mpirun: killing job...

--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 7314 on node test2 exited on signal 0 (Unknown signal 0).
--------------------------------------------------------------------------
2 total processes killed (some possibly by mpirun during cleanup)
mpirun: clean termination accomplished

I also checked the logs but could not see any errors. Could you suggest a way to overcome or debug this issue?

Thanks in advance for your reply.

--
Thanks & Regards,
D.Venkateswara Rao,
Software Engineer, One Convergence Devices Pvt Ltd.,
Jubilee Hills, Hyderabad.