Hi,
I'm a system administrator trying to help users resolve hangs in the
Gadget-2 code during MPI_Sendrecv (similar to
http://www.open-mpi.org/community/lists/users/2010/05/13057.php).
I'm trying to determine appropriate values for mpool_rdma_rcache_size_limit
for our hardware, and to make sure the RDMA settings are appropriate and do
not lead to data corruption (see
http://www.open-mpi.org/faq/?category=openfabrics#setting-mpi-leave-pinned-1.3.2).
The Gadget code ran fine under Open MPI 1.2.9; the hangs showed up in 1.4.3
(and also in 1.3.2).

The code runs using TCP (-mca btl tcp,self,sm).

The code hangs using InfiniBand.

The code runs using InfiniBand with "-mca btl_openib_flags 1" and
"-mca mpool_rdma_rcache_size_limit 209715200" (as suggested by a poster in
the thread linked above).
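
For concreteness, the three cases above correspond roughly to the command lines below. This is only a sketch: the process count, binary name and parameter file are placeholders, and the BTL list for the InfiniBand runs (openib,self,sm) is an assumption, since only the TCP list is given above. As far as I understand the BTL flag bits, 1 means SEND only, so the third case keeps the openib BTL but avoids RDMA put/get. The script prints the commands rather than launching anything.

```shell
#!/bin/sh
# Placeholders, not taken from the actual runs.
NP=16
APP="./Gadget2 run.param"

# 209715200 bytes is exactly 200 MiB.
RCACHE_LIMIT=$((200 * 1024 * 1024))

# Case 1 (runs): TCP, self and shared memory only.
CMD_TCP="mpirun -np $NP -mca btl tcp,self,sm $APP"

# Case 2 (hangs): InfiniBand; the explicit BTL list is an assumption.
CMD_IB="mpirun -np $NP -mca btl openib,self,sm $APP"

# Case 3 (runs): InfiniBand with send/recv only and a capped registration cache.
CMD_IB_WORKAROUND="mpirun -np $NP -mca btl openib,self,sm -mca btl_openib_flags 1 -mca mpool_rdma_rcache_size_limit $RCACHE_LIMIT $APP"

# Dry run: show the commands instead of executing them.
printf '%s\n' "$CMD_TCP" "$CMD_IB" "$CMD_IB_WORKAROUND"
```
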

Any suggestions would be appreciated.
Regards,
Gretchen
0. Open MPI 1.4.3 (ompi_info output attached; config.log is missing, but may
not be needed since this is a more general usage/settings question)
1. OFED 1.4.2 from git.openfabrics.org
2. Debian 5.0, kernel 2.6.26-2-amd64
3. opensm-3.2.6
4. ibv_devinfo
hca_id:    mlx4_0
    fw_ver:                2.6.000
    node_guid:            0002:c903:0002:848c
    sys_image_guid:            0002:c903:0002:848f
    vendor_id:            0x02c9
    vendor_part_id:            25408
    hw_ver:                0xA0
    board_id:            MT_04A0130005
    phys_port_cnt:            2
        port:    1
            state:            PORT_ACTIVE (4)
            max_mtu:        2048 (4)
            active_mtu:        2048 (4)
            sm_lid:            30
            port_lid:        99
            port_lmc:        0x00

5. ifconfig
ib0       Link encap:UNSPEC  HWaddr 80-00-00-48-FE-80-00-00-00-00-00-00-00-00-00-00
          inet addr:10.16.10.20  Bcast:10.16.10.255  Mask:255.255.255.0
          inet6 addr: fe80::202:c903:2:848d/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
          RX packets:1936 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:5 overruns:0 carrier:0
          collisions:0 txqueuelen:256
          RX bytes:189055 (184.6 KiB)  TX bytes:0 (0.0 B)
6. ulimit -l
unlimited

ompi_info output:
                 Package: Open MPI xxx@xxx Distribution
                Open MPI: 1.4.3
   Open MPI SVN revision: r23834
   Open MPI release date: Oct 05, 2010
                Open RTE: 1.4.3
   Open RTE SVN revision: r23834
   Open RTE release date: Oct 05, 2010
                    OPAL: 1.4.3
       OPAL SVN revision: r23834
       OPAL release date: Oct 05, 2010
            Ident string: 1.4.3
                  Prefix: /usr/local/openmpi-1.4.3
 Configured architecture: x86_64-unknown-linux-gnu
          Configure host: xxx
           Configured by: xxx
           Configured on: Tue Nov 30 16:24:27 EST 2010
          Configure host: xxx
                Built by: xxx
                Built on: Tue Nov 30 16:31:33 EST 2010
              Built host: xxx
              C bindings: yes
            C++ bindings: yes
      Fortran77 bindings: yes (all)
      Fortran90 bindings: yes
 Fortran90 bindings size: small
              C compiler: gcc
     C compiler absolute: /usr/bin/gcc
            C++ compiler: g++
   C++ compiler absolute: /usr/bin/g++
      Fortran77 compiler: gfortran
  Fortran77 compiler abs: /usr/bin/gfortran
      Fortran90 compiler: gfortran
  Fortran90 compiler abs: /usr/bin/gfortran
             C profiling: yes
           C++ profiling: yes
     Fortran77 profiling: yes
     Fortran90 profiling: yes
          C++ exceptions: yes
          Thread support: posix (mpi: no, progress: no)
           Sparse Groups: no
  Internal debug support: no
     MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
         libltdl support: yes
   Heterogeneous support: no
 mpirun default --prefix: no
         MPI I/O support: yes
       MPI_WTIME support: gettimeofday
Symbol visibility support: yes
   FT Checkpoint support: no  (checkpoint thread: no)
           MCA backtrace: execinfo (MCA v2.0, API v2.0, Component v1.4.3)
              MCA memory: ptmalloc2 (MCA v2.0, API v2.0, Component v1.4.3)
           MCA paffinity: linux (MCA v2.0, API v2.0, Component v1.4.3)
               MCA carto: auto_detect (MCA v2.0, API v2.0, Component v1.4.3)
               MCA carto: file (MCA v2.0, API v2.0, Component v1.4.3)
           MCA maffinity: first_use (MCA v2.0, API v2.0, Component v1.4.3)
           MCA maffinity: libnuma (MCA v2.0, API v2.0, Component v1.4.3)
               MCA timer: linux (MCA v2.0, API v2.0, Component v1.4.3)
         MCA installdirs: env (MCA v2.0, API v2.0, Component v1.4.3)
         MCA installdirs: config (MCA v2.0, API v2.0, Component v1.4.3)
                 MCA dpm: orte (MCA v2.0, API v2.0, Component v1.4.3)
              MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.4.3)
           MCA allocator: basic (MCA v2.0, API v2.0, Component v1.4.3)
           MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.4.3)
                MCA coll: basic (MCA v2.0, API v2.0, Component v1.4.3)
                MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.4.3)
                MCA coll: inter (MCA v2.0, API v2.0, Component v1.4.3)
                MCA coll: self (MCA v2.0, API v2.0, Component v1.4.3)
                MCA coll: sm (MCA v2.0, API v2.0, Component v1.4.3)
                MCA coll: sync (MCA v2.0, API v2.0, Component v1.4.3)
                MCA coll: tuned (MCA v2.0, API v2.0, Component v1.4.3)
                  MCA io: romio (MCA v2.0, API v2.0, Component v1.4.3)
               MCA mpool: fake (MCA v2.0, API v2.0, Component v1.4.3)
               MCA mpool: rdma (MCA v2.0, API v2.0, Component v1.4.3)
               MCA mpool: sm (MCA v2.0, API v2.0, Component v1.4.3)
                 MCA pml: cm (MCA v2.0, API v2.0, Component v1.4.3)
                 MCA pml: csum (MCA v2.0, API v2.0, Component v1.4.3)
                 MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.4.3)
                 MCA pml: v (MCA v2.0, API v2.0, Component v1.4.3)
                 MCA bml: r2 (MCA v2.0, API v2.0, Component v1.4.3)
              MCA rcache: vma (MCA v2.0, API v2.0, Component v1.4.3)
                 MCA btl: ofud (MCA v2.0, API v2.0, Component v1.4.3)
                 MCA btl: openib (MCA v2.0, API v2.0, Component v1.4.3)
                 MCA btl: self (MCA v2.0, API v2.0, Component v1.4.3)
                 MCA btl: sm (MCA v2.0, API v2.0, Component v1.4.3)
                 MCA btl: tcp (MCA v2.0, API v2.0, Component v1.4.3)
                MCA topo: unity (MCA v2.0, API v2.0, Component v1.4.3)
                 MCA osc: pt2pt (MCA v2.0, API v2.0, Component v1.4.3)
                 MCA osc: rdma (MCA v2.0, API v2.0, Component v1.4.3)
                 MCA iof: hnp (MCA v2.0, API v2.0, Component v1.4.3)
                 MCA iof: orted (MCA v2.0, API v2.0, Component v1.4.3)
                 MCA iof: tool (MCA v2.0, API v2.0, Component v1.4.3)
                 MCA oob: tcp (MCA v2.0, API v2.0, Component v1.4.3)
                MCA odls: default (MCA v2.0, API v2.0, Component v1.4.3)
                 MCA ras: slurm (MCA v2.0, API v2.0, Component v1.4.3)
                 MCA ras: tm (MCA v2.0, API v2.0, Component v1.4.3)
               MCA rmaps: load_balance (MCA v2.0, API v2.0, Component v1.4.3)
               MCA rmaps: rank_file (MCA v2.0, API v2.0, Component v1.4.3)
               MCA rmaps: round_robin (MCA v2.0, API v2.0, Component v1.4.3)
               MCA rmaps: seq (MCA v2.0, API v2.0, Component v1.4.3)
                 MCA rml: oob (MCA v2.0, API v2.0, Component v1.4.3)
              MCA routed: binomial (MCA v2.0, API v2.0, Component v1.4.3)
              MCA routed: direct (MCA v2.0, API v2.0, Component v1.4.3)
              MCA routed: linear (MCA v2.0, API v2.0, Component v1.4.3)
                 MCA plm: rsh (MCA v2.0, API v2.0, Component v1.4.3)
                 MCA plm: slurm (MCA v2.0, API v2.0, Component v1.4.3)
                 MCA plm: tm (MCA v2.0, API v2.0, Component v1.4.3)
               MCA filem: rsh (MCA v2.0, API v2.0, Component v1.4.3)
              MCA errmgr: default (MCA v2.0, API v2.0, Component v1.4.3)
                 MCA ess: env (MCA v2.0, API v2.0, Component v1.4.3)
                 MCA ess: hnp (MCA v2.0, API v2.0, Component v1.4.3)
                 MCA ess: singleton (MCA v2.0, API v2.0, Component v1.4.3)
                 MCA ess: slurm (MCA v2.0, API v2.0, Component v1.4.3)
                 MCA ess: tool (MCA v2.0, API v2.0, Component v1.4.3)
             MCA grpcomm: bad (MCA v2.0, API v2.0, Component v1.4.3)
             MCA grpcomm: basic (MCA v2.0, API v2.0, Component v1.4.3)
