I'm pretty sure that they are correct.  Our one-sided implementation is
buggier than I'd like (indeed, I'm in the process of rewriting most of it
as part of Open MPI's support for MPI-3's revised RMA), so it's likely
that the bugs are in Open MPI's one-sided support.  Can you try a more
recent release (something from the 1.5 tree) and see if the problem
persists?

Thanks,

Brian

On 2/29/12 10:56 AM, "Jeffrey Squyres" <jsquy...@cisco.com> wrote:

>FWIW, I'm immediately suspicious of *any* MPI application that uses the
>MPI one-sided operations (i.e., MPI_PUT and MPI_GET).  It looks like
>these two OSU benchmarks are using those operations.
>
>Is it known that these two benchmarks are correct?
>
>
>
>On Feb 29, 2012, at 11:33 AM, Venkateswara Rao Dokku wrote:
>
>> Sorry, I forgot to introduce the system. Ours is a customized OFED
>>stack implemented to work on specific hardware. We tested the stack
>>with qperf and the Intel benchmarks (IMB-3.2.2), and they ran fine. We
>>want to run the osu_benchmarks-3.1.1 suite on our OFED stack.
>> 
>> On Wed, Feb 29, 2012 at 9:57 PM, Venkateswara Rao Dokku
>><dvrao....@gmail.com> wrote:
>> Hi,
>> I tried running the osu_benchmarks-3.1.1 suite with openmpi-1.4.3.
>>I could run 10 of the 14 benchmark tests in the suite (all except
>>osu_put_bibw, osu_put_bw, osu_get_bw, and osu_latency_mt); the
>>remaining tests hang at some message size. The output is
>>shown below.
>> 
>> [root@test2 ~]# mpirun --prefix /usr/local/ -np 2 --mca btl
>>openib,self,sm -H 192.168.0.175,192.168.0.174 --mca
>>orte_base_help_aggregate 0
>>/root/ramu/ofed_pkgs/osu_benchmarks-3.1.1/osu_put_bibw
>> failed to create doorbell file /dev/plx2_char_dev
>> 
>>--------------------------------------------------------------------------
>> WARNING: No preset parameters were found for the device that Open MPI
>> detected:
>> 
>>   Local host:            test1
>>   Device name:           plx2_0
>>   Device vendor ID:      0x10b5
>>   Device vendor part ID: 4277
>> 
>> Default device parameters will be used, which may result in lower
>> performance.  You can edit any of the files specified by the
>> btl_openib_device_param_files MCA parameter to set values for your
>> device.
>> 
>> NOTE: You can turn off this warning by setting the MCA parameter
>>       btl_openib_warn_no_device_params_found to 0.
>> 
>>--------------------------------------------------------------------------
>> failed to create doorbell file /dev/plx2_char_dev
>> 
>>--------------------------------------------------------------------------
>> WARNING: No preset parameters were found for the device that Open MPI
>> detected:
>> 
>>   Local host:            test2
>>   Device name:           plx2_0
>>   Device vendor ID:      0x10b5
>>   Device vendor part ID: 4277
>> 
>> Default device parameters will be used, which may result in lower
>> performance.  You can edit any of the files specified by the
>> btl_openib_device_param_files MCA parameter to set values for your
>> device.
>> 
>> NOTE: You can turn off this warning by setting the MCA parameter
>>       btl_openib_warn_no_device_params_found to 0.
>> 
>>--------------------------------------------------------------------------
>> alloc_srq max: 512 wqe_shift: 5
>> alloc_srq max: 512 wqe_shift: 5
>> alloc_srq max: 512 wqe_shift: 5
>> alloc_srq max: 512 wqe_shift: 5
>> alloc_srq max: 512 wqe_shift: 5
>> alloc_srq max: 512 wqe_shift: 5
>> # OSU One Sided MPI_Put Bi-directional Bandwidth Test v3.1.1
>> # Size     Bi-Bandwidth (MB/s)
>> plx2_create_qp line: 415
>> plx2_create_qp line: 415
>> plx2_create_qp line: 415
>> plx2_create_qp line: 415
>> 1                         0.00
>> 2                         0.00
>> 4                         0.01
>> 8                         0.03
>> 16                        0.07
>> 32                        0.15
>> 64                        0.11
>> 128                       0.21
>> 256                       0.43
>> 512                       0.88
>> 1024                      2.10
>> 2048                      4.21
>> 4096                      8.10
>> 8192                     16.19
>> 16384                     8.46
>> 32768                    20.34
>> 65536                    39.85
>> 131072                   84.22
>> 262144                  142.23
>> 524288                  234.83
>> mpirun: killing job...
>> 
>> 
>>--------------------------------------------------------------------------
>> mpirun noticed that process rank 0 with PID 7305 on node test2 exited
>>on signal 0 (Unknown signal 0).
>> 
>>--------------------------------------------------------------------------
>> 2 total processes killed (some possibly by mpirun during cleanup)
>> mpirun: clean termination accomplished
>> 
>> [root@test2 ~]# mpirun --prefix /usr/local/ -np 2 --mca btl
>>openib,self,sm -H 192.168.0.175,192.168.0.174 --mca
>>orte_base_help_aggregate 0
>>/root/ramu/ofed_pkgs/osu_benchmarks-3.1.1/osu_put_bw
>> failed to create doorbell file /dev/plx2_char_dev
>> 
>>--------------------------------------------------------------------------
>> WARNING: No preset parameters were found for the device that Open MPI
>> detected:
>> 
>>   Local host:            test1
>>   Device name:           plx2_0
>>   Device vendor ID:      0x10b5
>>   Device vendor part ID: 4277
>> 
>> Default device parameters will be used, which may result in lower
>> performance.  You can edit any of the files specified by the
>> btl_openib_device_param_files MCA parameter to set values for your
>> device.
>> 
>> NOTE: You can turn off this warning by setting the MCA parameter
>>       btl_openib_warn_no_device_params_found to 0.
>> 
>>--------------------------------------------------------------------------
>> failed to create doorbell file /dev/plx2_char_dev
>> 
>>--------------------------------------------------------------------------
>> WARNING: No preset parameters were found for the device that Open MPI
>> detected:
>> 
>>   Local host:            test2
>>   Device name:           plx2_0
>>   Device vendor ID:      0x10b5
>>   Device vendor part ID: 4277
>> 
>> Default device parameters will be used, which may result in lower
>> performance.  You can edit any of the files specified by the
>> btl_openib_device_param_files MCA parameter to set values for your
>> device.
>> 
>> NOTE: You can turn off this warning by setting the MCA parameter
>>       btl_openib_warn_no_device_params_found to 0.
>> 
>>--------------------------------------------------------------------------
>> alloc_srq max: 512 wqe_shift: 5
>> alloc_srq max: 512 wqe_shift: 5
>> alloc_srq max: 512 wqe_shift: 5
>> alloc_srq max: 512 wqe_shift: 5
>> alloc_srq max: 512 wqe_shift: 5
>> alloc_srq max: 512 wqe_shift: 5
>> # OSU One Sided MPI_Put Bandwidth Test v3.1.1
>> # Size        Bandwidth (MB/s)
>> plx2_create_qp line: 415
>> plx2_create_qp line: 415
>> plx2_create_qp line: 415
>> plx2_create_qp line: 415
>> 1                         0.02
>> 2                         0.05
>> 4                         0.10
>> 8                         0.19
>> 16                        0.39
>> 32                        0.77
>> 64                        1.53
>> 128                       2.57
>> 256                       4.16
>> 512                       8.30
>> 1024                     16.62
>> 2048                     33.22
>> 4096                     66.51
>> 8192                     42.45
>> 16384                    11.99
>> 32768                    18.20
>> 65536                    76.04
>> 131072                   98.64
>> 262144                  407.66
>> 524288                  489.84
>> mpirun: killing job...
>> 
>> 
>>--------------------------------------------------------------------------
>> mpirun noticed that process rank 0 with PID 7314 on node test2 exited
>>on signal 0 (Unknown signal 0).
>> 
>>--------------------------------------------------------------------------
>> 2 total processes killed (some possibly by mpirun during cleanup)
>> mpirun: clean termination accomplished
>> 
>> I even checked the logs, but I couldn't see any errors.
>> Could you suggest a way to overcome or debug this issue?
>> 
>> Thanks for the kind reply.
>> 
>> 
>> -- 
>> Thanks & Regards,
>> D.Venkateswara Rao,
>> Software Engineer,One Convergence Devices Pvt Ltd.,
>> Jubille Hills,Hyderabad.
>> 
>> 
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
>-- 
>Jeff Squyres
>jsquy...@cisco.com
>For corporate legal information go to:
>http://www.cisco.com/web/about/doing_business/legal/cri/
>


-- 
  Brian W. Barrett
  Dept. 1423: Scalable System Software
  Sandia National Laboratories





