Sorry, I forgot to introduce the system. Ours is a customized OFED stack
implemented to work on specific hardware. We tested the stack with qperf
and the Intel MPI Benchmarks (IMB-3.2.2), and both ran fine. We now want to
run the osu_benchmarks-3.1.1 suite on our OFED stack.

On Wed, Feb 29, 2012 at 9:57 PM, Venkateswara Rao Dokku <dvrao....@gmail.com> wrote:

> Hi,
> I tried running the osu_benchmarks-3.1.1 suite with openmpi-1.4.3. I
> could run 10 of the 14 tests in the benchmark suite. The remaining four
> (osu_put_bibw, osu_put_bw, osu_get_bw, osu_latency_mt) hang at some
> message size. The output of two of them is shown below; in both runs
> nothing is printed after the 524288-byte result.
>
> [root@test2 ~]# mpirun --prefix /usr/local/ -np 2 --mca btl openib,self,sm \
>     -H 192.168.0.175,192.168.0.174 --mca orte_base_help_aggregate 0 \
>     /root/ramu/ofed_pkgs/osu_benchmarks-3.1.1/osu_put_bibw
> failed to create doorbell file /dev/plx2_char_dev
> --------------------------------------------------------------------------
> WARNING: No preset parameters were found for the device that Open MPI
> detected:
>
>   Local host:            test1
>   Device name:           plx2_0
>   Device vendor ID:      0x10b5
>   Device vendor part ID: 4277
>
> Default device parameters will be used, which may result in lower
> performance.  You can edit any of the files specified by the
> btl_openib_device_param_files MCA parameter to set values for your
> device.
>
> NOTE: You can turn off this warning by setting the MCA parameter
>       btl_openib_warn_no_device_params_found to 0.
> --------------------------------------------------------------------------
> failed to create doorbell file /dev/plx2_char_dev
> --------------------------------------------------------------------------
> WARNING: No preset parameters were found for the device that Open MPI
> detected:
>
>   Local host:            test2
>   Device name:           plx2_0
>   Device vendor ID:      0x10b5
>   Device vendor part ID: 4277
>
> Default device parameters will be used, which may result in lower
> performance.  You can edit any of the files specified by the
> btl_openib_device_param_files MCA parameter to set values for your
> device.
>
> NOTE: You can turn off this warning by setting the MCA parameter
>       btl_openib_warn_no_device_params_found to 0.
> --------------------------------------------------------------------------
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> # OSU One Sided MPI_Put Bi-directional Bandwidth Test v3.1.1
> # Size     Bi-Bandwidth (MB/s)
> plx2_create_qp line: 415
> plx2_create_qp line: 415
> plx2_create_qp line: 415
>  plx2_create_qp line: 415
> 1                         0.00
> 2                         0.00
> 4                         0.01
> 8                         0.03
> 16                        0.07
> 32                        0.15
> 64                        0.11
> 128                       0.21
> 256                       0.43
> 512                       0.88
> 1024                      2.10
> 2048                      4.21
> 4096                      8.10
> 8192                     16.19
> 16384                     8.46
> 32768                    20.34
> 65536                    39.85
> 131072                   84.22
> 262144                  142.23
> 524288                  234.83
> mpirun: killing job...
>
> --------------------------------------------------------------------------
> mpirun noticed that process rank 0 with PID 7305 on node test2 exited on
> signal 0 (Unknown signal 0).
> --------------------------------------------------------------------------
> 2 total processes killed (some possibly by mpirun during cleanup)
> mpirun: clean termination accomplished
>
> [root@test2 ~]# mpirun --prefix /usr/local/ -np 2 --mca btl openib,self,sm \
>     -H 192.168.0.175,192.168.0.174 --mca orte_base_help_aggregate 0 \
>     /root/ramu/ofed_pkgs/osu_benchmarks-3.1.1/osu_put_bw
> failed to create doorbell file /dev/plx2_char_dev
> --------------------------------------------------------------------------
> WARNING: No preset parameters were found for the device that Open MPI
> detected:
>
>   Local host:            test1
>   Device name:           plx2_0
>   Device vendor ID:      0x10b5
>   Device vendor part ID: 4277
>
> Default device parameters will be used, which may result in lower
> performance.  You can edit any of the files specified by the
> btl_openib_device_param_files MCA parameter to set values for your
> device.
>
> NOTE: You can turn off this warning by setting the MCA parameter
>       btl_openib_warn_no_device_params_found to 0.
> --------------------------------------------------------------------------
> failed to create doorbell file /dev/plx2_char_dev
> --------------------------------------------------------------------------
> WARNING: No preset parameters were found for the device that Open MPI
> detected:
>
>   Local host:            test2
>   Device name:           plx2_0
>   Device vendor ID:      0x10b5
>   Device vendor part ID: 4277
>
> Default device parameters will be used, which may result in lower
> performance.  You can edit any of the files specified by the
> btl_openib_device_param_files MCA parameter to set values for your
> device.
>
> NOTE: You can turn off this warning by setting the MCA parameter
>       btl_openib_warn_no_device_params_found to 0.
> --------------------------------------------------------------------------
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> alloc_srq max: 512 wqe_shift: 5
> # OSU One Sided MPI_Put Bandwidth Test v3.1.1
> # Size        Bandwidth (MB/s)
> plx2_create_qp line: 415
> plx2_create_qp line: 415
> plx2_create_qp line: 415
> plx2_create_qp line: 415
> 1                         0.02
> 2                         0.05
> 4                         0.10
> 8                         0.19
> 16                        0.39
> 32                        0.77
> 64                        1.53
> 128                       2.57
> 256                       4.16
> 512                       8.30
> 1024                     16.62
> 2048                     33.22
> 4096                     66.51
> 8192                     42.45
> 16384                    11.99
> 32768                    18.20
> 65536                    76.04
> 131072                   98.64
> 262144                  407.66
> 524288                  489.84
> mpirun: killing job...
>
> --------------------------------------------------------------------------
> mpirun noticed that process rank 0 with PID 7314 on node test2 exited on
> signal 0 (Unknown signal 0).
> --------------------------------------------------------------------------
> 2 total processes killed (some possibly by mpirun during cleanup)
> mpirun: clean termination accomplished
>
> I also checked the logs but could not see any errors.
> Could you suggest a way to overcome or debug this issue?
>
> Thanks for the kind reply.
>
>
> --
> Thanks & Regards,
> D.Venkateswara Rao,
> Software Engineer, One Convergence Devices Pvt Ltd.,
> Jubilee Hills, Hyderabad.
>
>


-- 
Thanks & Regards,
D.Venkateswara Rao,
Software Engineer, One Convergence Devices Pvt Ltd.,
Jubilee Hills, Hyderabad.
