Sorry, I forgot to introduce the system. Ours is a customized OFED stack implemented to work on specific hardware. We tested the stack with qperf and the Intel MPI Benchmarks (IMB-3.2.2), and both ran fine. We now want to run the osu_benchmarks-3.1.1 suite on our OFED stack.
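To help pin down where the runs misbehave, a minimal Python sketch along these lines could scan the quoted benchmark output for message sizes at which the reported bandwidth falls instead of rising. The function names `parse_osu_rows` and `find_bandwidth_drops` are illustrative assumptions, not part of the OSU suite or Open MPI, and the sample is only an excerpt of the osu_put_bw output quoted in this message.

```python
import re


def parse_osu_rows(text):
    """Return (size, bandwidth) pairs from OSU output lines such as '8192 42.45'."""
    rows = []
    for line in text.splitlines():
        # Drop email quote markers and surrounding whitespace before matching.
        line = line.lstrip("> ").strip()
        match = re.fullmatch(r"(\d+)\s+(\d+(?:\.\d+)?)", line)
        if match:
            rows.append((int(match.group(1)), float(match.group(2))))
    return rows


def find_bandwidth_drops(rows):
    """Return (size, previous_bw, bw) for each size slower than the size before it."""
    drops = []
    for (_, prev_bw), (size, bw) in zip(rows, rows[1:]):
        if bw < prev_bw:
            drops.append((size, prev_bw, bw))
    return drops


# Excerpt of the osu_put_bw output quoted in this message.
sample = """\
> 2048 33.22
> 4096 66.51
> 8192 42.45
> 16384 11.99
> 32768 18.20
"""

if __name__ == "__main__":
    for size, prev_bw, bw in find_bandwidth_drops(parse_osu_rows(sample)):
        print(f"{size}: {prev_bw} -> {bw} MB/s")
```

On the excerpt above this reports the 8192 and 16384 byte rows. A drop in reported bandwidth is only a hint about where to look; it does not by itself explain the hang after 524288 bytes.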

On Wed, Feb 29, 2012 at 9:57 PM, Venkateswara Rao Dokku <dvrao....@gmail.com> wrote:

> Hi,
> I tried running the osu_benchmarks-3.1.1 suite with openmpi-1.4.3. I
> could run 10 of the 14 benchmark tests in the suite (all except
> osu_put_bibw, osu_put_bw, osu_get_bw, osu_latency_mt). The remaining
> tests hang at some message size. The output is shown below.
>
> [root@test2 ~]# mpirun --prefix /usr/local/ -np 2 --mca btl openib,self,sm -H 192.168.0.175,192.168.0.174 --mca orte_base_help_aggregate 0 /root/ramu/ofed_pkgs/osu_benchmarks-3.1.1/osu_put_bibw
> failed to create doorbell file /dev/plx2_char_dev
> --------------------------------------------------------------------------
> WARNING: No preset parameters were found for the device that Open MPI
> detected:
>
>   Local host:            test1
>   Device name:           plx2_0
>   Device vendor ID:      0x10b5
>   Device vendor part ID: 4277
>
> Default device parameters will be used, which may result in lower
> performance. You can edit any of the files specified by the
> btl_openib_device_param_files MCA parameter to set values for your
> device.
>
> NOTE: You can turn off this warning by setting the MCA parameter
> btl_openib_warn_no_device_params_found to 0.
> --------------------------------------------------------------------------
> failed to create doorbell file /dev/plx2_char_dev
> --------------------------------------------------------------------------
> WARNING: No preset parameters were found for the device that Open MPI
> detected:
>
>   Local host:            test2
>   Device name:           plx2_0
>   Device vendor ID:      0x10b5
>   Device vendor part ID: 4277
>
> Default device parameters will be used, which may result in lower
> performance. You can edit any of the files specified by the
> btl_openib_device_param_files MCA parameter to set values for your
> device.
>
> NOTE: You can turn off this warning by setting the MCA parameter
> btl_openib_warn_no_device_params_found to 0.
> --------------------------------------------------------------------------
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> # OSU One Sided MPI_Put Bi-directional Bandwidth Test v3.1.1
> # Size      Bi-Bandwidth (MB/s)
> plx2_create_qp line: 415
> plx2_create_qp line: 415
> plx2_create_qp line: 415
> plx2_create_qp line: 415
> 1           0.00
> 2           0.00
> 4           0.01
> 8           0.03
> 16          0.07
> 32          0.15
> 64          0.11
> 128         0.21
> 256         0.43
> 512         0.88
> 1024        2.10
> 2048        4.21
> 4096        8.10
> 8192        16.19
> 16384       8.46
> 32768       20.34
> 65536       39.85
> 131072      84.22
> 262144      142.23
> 524288      234.83
> mpirun: killing job...
>
> --------------------------------------------------------------------------
> mpirun noticed that process rank 0 with PID 7305 on node test2 exited on
> signal 0 (Unknown signal 0).
> --------------------------------------------------------------------------
> 2 total processes killed (some possibly by mpirun during cleanup)
> mpirun: clean termination accomplished
>
> [root@test2 ~]# mpirun --prefix /usr/local/ -np 2 --mca btl openib,self,sm -H 192.168.0.175,192.168.0.174 --mca orte_base_help_aggregate 0 /root/ramu/ofed_pkgs/osu_benchmarks-3.1.1/osu_put_bw
> failed to create doorbell file /dev/plx2_char_dev
> --------------------------------------------------------------------------
> WARNING: No preset parameters were found for the device that Open MPI
> detected:
>
>   Local host:            test1
>   Device name:           plx2_0
>   Device vendor ID:      0x10b5
>   Device vendor part ID: 4277
>
> Default device parameters will be used, which may result in lower
> performance. You can edit any of the files specified by the
> btl_openib_device_param_files MCA parameter to set values for your
> device.
>
> NOTE: You can turn off this warning by setting the MCA parameter
> btl_openib_warn_no_device_params_found to 0.
> --------------------------------------------------------------------------
> failed to create doorbell file /dev/plx2_char_dev
> --------------------------------------------------------------------------
> WARNING: No preset parameters were found for the device that Open MPI
> detected:
>
>   Local host:            test2
>   Device name:           plx2_0
>   Device vendor ID:      0x10b5
>   Device vendor part ID: 4277
>
> Default device parameters will be used, which may result in lower
> performance. You can edit any of the files specified by the
> btl_openib_device_param_files MCA parameter to set values for your
> device.
>
> NOTE: You can turn off this warning by setting the MCA parameter
> btl_openib_warn_no_device_params_found to 0.
> --------------------------------------------------------------------------
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> # OSU One Sided MPI_Put Bandwidth Test v3.1.1
> # Size      Bandwidth (MB/s)
> plx2_create_qp line: 415
> plx2_create_qp line: 415
> plx2_create_qp line: 415
> plx2_create_qp line: 415
> 1           0.02
> 2           0.05
> 4           0.10
> 8           0.19
> 16          0.39
> 32          0.77
> 64          1.53
> 128         2.57
> 256         4.16
> 512         8.30
> 1024        16.62
> 2048        33.22
> 4096        66.51
> 8192        42.45
> 16384       11.99
> 32768       18.20
> 65536       76.04
> 131072      98.64
> 262144      407.66
> 524288      489.84
> mpirun: killing job...
>
> --------------------------------------------------------------------------
> mpirun noticed that process rank 0 with PID 7314 on node test2 exited on
> signal 0 (Unknown signal 0).
> --------------------------------------------------------------------------
> 2 total processes killed (some possibly by mpirun during cleanup)
> mpirun: clean termination accomplished
>
> I checked the logs as well, but I couldn't see any errors.
> Could you suggest a way to overcome or debug this issue?
>
> Thanks for the kind reply.
>
> --
> Thanks & Regards,
> D.Venkateswara Rao,
> Software Engineer, One Convergence Devices Pvt Ltd.,
> Jubille Hills, Hyderabad.

--
Thanks & Regards,
D.Venkateswara Rao,
Software Engineer, One Convergence Devices Pvt Ltd.,
Jubille Hills, Hyderabad.