As far as the nightly builds go, I'm still seeing what I believe to be
this problem in both r10670 and r10652.  It happens on both Linux and
OS X.  Below are the systems and ompi_info output for the newest
revision, r10670.

As an example, when running HPL over Myrinet (the gm BTL) I get the
residual-check failures below.  Over tcp everything is fine and I see
the results I'd expect.  (The mpirun invocations I'm comparing are
sketched just after the HPL output.)
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) = 42820214496954887558164928727596662784.0000000 ...... FAILED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) = 156556068835.2711182 ...... FAILED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 1156439380.5172558 ...... FAILED
||Ax-b||_oo  . . . . . . . . . . . . . . . . . = 272683853978565028754868928512.000000
||A||_oo . . . . . . . . . . . . . . . . . . . =        3822.884181
||A||_1  . . . . . . . . . . . . . . . . . . . =        3823.922627
||x||_oo . . . . . . . . . . . . . . . . . . . = 37037692483529688659798261760.000000
||x||_1  . . . . . . . . . . . . . . . . . . . = 4102704048669982798475494948864.000000
===================================================

Finished      1 tests with the following results:
             0 tests completed and passed residual checks,
             1 tests completed and failed residual checks,
             0 tests skipped because of illegal input values.
----------------------------------------------------------------------------
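
For completeness, the only difference between the two runs is the BTL
selection passed to mpirun.  A rough sketch of the command lines (the
process count, hostfile name, and xhpl path here are placeholders, not
the exact values from my runs):

   # Myrinet/GM run -- fails the residual checks as shown above
   mpirun --mca btl gm,sm,self -np 8 --hostfile ./hosts ./xhpl

   # TCP run -- passes
   mpirun --mca btl tcp,sm,self -np 8 --hostfile ./hosts ./xhpl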

Linux node41 2.6.16.19 #1 SMP Wed Jun 21 17:22:01 EDT 2006 ppc64 PPC970FX, altivec supported GNU/Linux
jbronder@node41 ~ $ /usr/local/ompi-gnu-1.1.1a/bin/ompi_info
               Open MPI: 1.1.1a1r10670
  Open MPI SVN revision: r10670
               Open RTE: 1.1.1a1r10670
  Open RTE SVN revision: r10670
                   OPAL: 1.1.1a1r10670
      OPAL SVN revision: r10670
                 Prefix: /usr/local/ompi-gnu-1.1.1a
Configured architecture: powerpc64-unknown-linux-gnu
          Configured by: root
          Configured on: Thu Jul  6 10:15:37 EDT 2006
         Configure host: node41
               Built by: root
               Built on: Thu Jul  6 10:28:14 EDT 2006
             Built host: node41
             C bindings: yes
           C++ bindings: yes
     Fortran77 bindings: yes (all)
     Fortran90 bindings: yes
Fortran90 bindings size: small
             C compiler: gcc
    C compiler absolute: /usr/bin/gcc
           C++ compiler: g++
  C++ compiler absolute: /usr/bin/g++
     Fortran77 compiler: gfortran
 Fortran77 compiler abs: /usr/powerpc64-unknown-linux-gnu/gcc-bin/4.1.0/gfortran
     Fortran90 compiler: gfortran
 Fortran90 compiler abs: /usr/powerpc64-unknown-linux-gnu/gcc-bin/4.1.0/gfortran
            C profiling: yes
          C++ profiling: yes
    Fortran77 profiling: yes
    Fortran90 profiling: yes
         C++ exceptions: no
         Thread support: posix (mpi: no, progress: no)
 Internal debug support: no
    MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
        libltdl support: yes
             MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.1.1)
          MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.1.1)
          MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.1.1)
              MCA timer: linux (MCA v1.0, API v1.0, Component v1.1.1)
          MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
          MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
               MCA coll: basic (MCA v1.0, API v1.0, Component v1.1.1)
                MCA coll: hierarch (MCA v1.0, API v1.0, Component v1.1.1)
               MCA coll: self (MCA v1.0, API v1.0, Component v1.1.1)
               MCA coll: sm (MCA v1.0, API v1.0, Component v1.1.1)
               MCA coll: tuned (MCA v1.0, API v1.0, Component v1.1.1)
                 MCA io: romio (MCA v1.0, API v1.0, Component v1.1.1)
              MCA mpool: gm (MCA v1.0, API v1.0, Component v1.1.1)
              MCA mpool: sm (MCA v1.0, API v1.0, Component v1.1.1)
                MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.1.1)
                MCA bml: r2 (MCA v1.0, API v1.0, Component v1.1.1)
             MCA rcache: rb (MCA v1.0, API v1.0, Component v1.1.1)
                MCA btl: gm (MCA v1.0, API v1.0, Component v1.1.1)
                MCA btl: self (MCA v1.0, API v1.0, Component v1.1.1)
                MCA btl: sm (MCA v1.0, API v1.0, Component v1.1.1)
                MCA btl: tcp (MCA v1.0, API v1.0, Component v1.0)
               MCA topo: unity (MCA v1.0, API v1.0, Component v1.1.1)
                MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.0)
                MCA gpr: null (MCA v1.0, API v1.0, Component v1.1.1)
                MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.1.1)
                MCA gpr: replica (MCA v1.0, API v1.0, Component v1.1.1)
                MCA iof: proxy (MCA v1.0, API v1.0, Component v1.1.1)
                MCA iof: svc (MCA v1.0, API v1.0, Component v1.1.1)
                 MCA ns: proxy (MCA v1.0, API v1.0, Component v1.1.1)
                 MCA ns: replica (MCA v1.0, API v1.0, Component v1.1.1)
                MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
                MCA ras: dash_host (MCA v1.0, API v1.0, Component v1.1.1)
                MCA ras: hostfile (MCA v1.0, API v1.0, Component v1.1.1)
                MCA ras: localhost (MCA v1.0, API v1.0, Component v1.1.1)
                MCA ras: tm (MCA v1.0, API v1.0, Component v1.1.1)
                MCA rds: hostfile (MCA v1.0, API v1.0, Component v1.1.1)
                 MCA rds: resfile (MCA v1.0, API v1.0, Component v1.1.1)
              MCA rmaps: round_robin (MCA v1.0, API v1.0, Component v1.1.1)
               MCA rmgr: proxy (MCA v1.0, API v1.0, Component v1.1.1)
               MCA rmgr: urm (MCA v1.0, API v1.0, Component v1.1.1)
                MCA rml: oob (MCA v1.0, API v1.0, Component v1.1.1)
                MCA pls: fork (MCA v1.0, API v1.0, Component v1.1.1)
                MCA pls: rsh (MCA v1.0, API v1.0, Component v1.1.1)
                MCA pls: tm (MCA v1.0, API v1.0, Component v1.1.1)
                MCA sds: env (MCA v1.0, API v1.0, Component v1.1.1)
                MCA sds: pipe (MCA v1.0, API v1.0, Component v1.1.1)
                MCA sds: seed (MCA v1.0, API v1.0, Component v1.1.1)
                MCA sds: singleton (MCA v1.0, API v1.0, Component v1.1.1)
Configured as:
./configure \
   --prefix=$PREFIX \
   --enable-mpi-f77 \
   --enable-mpi-f90 \
   --enable-mpi-profile \
   --enable-mpi-cxx \
   --enable-pty-support \
   --enable-shared \
   --enable-smp-locks \
   --enable-io-romio \
   --with-tm=/usr/local/pbs \
   --without-xgrid \
   --without-slurm \
   --with-gm=/opt/gm

Darwin node90.meldrew.clusters.umaine.edu 8.6.0 Darwin Kernel Version 8.6.0: Tue Mar  7 16:58:48 PST 2006; root:xnu-792.6.70.obj~1/RELEASE_PPC Power Macintosh powerpc
node90:~/src/hpl jbronder$ /usr/local/ompi-xl/bin/ompi_info
               Open MPI: 1.1.1a1r10670
  Open MPI SVN revision: r10670
               Open RTE: 1.1.1a1r10670
  Open RTE SVN revision: r10670
                   OPAL: 1.1.1a1r10670
      OPAL SVN revision: r10670
                 Prefix: /usr/local/ompi-xl
Configured architecture: powerpc-apple-darwin8.6.0
          Configured by:
          Configured on: Thu Jul  6 10:05:20 EDT 2006
         Configure host: node90.meldrew.clusters.umaine.edu
               Built by: root
               Built on: Thu Jul  6 10:37:40 EDT 2006
             Built host: node90.meldrew.clusters.umaine.edu
             C bindings: yes
           C++ bindings: yes
     Fortran77 bindings: yes (lower case)
     Fortran90 bindings: yes
Fortran90 bindings size: small
             C compiler: /opt/ibmcmp/vac/6.0/bin/xlc
    C compiler absolute: /opt/ibmcmp/vac/6.0/bin/xlc
           C++ compiler: /opt/ibmcmp/vacpp/6.0/bin/xlc++
  C++ compiler absolute: /opt/ibmcmp/vacpp/6.0/bin/xlc++
     Fortran77 compiler: /opt/ibmcmp/xlf/8.1/bin/xlf_r
 Fortran77 compiler abs: /opt/ibmcmp/xlf/8.1/bin/xlf_r
     Fortran90 compiler: /opt/ibmcmp/xlf/8.1/bin/xlf90_r
 Fortran90 compiler abs: /opt/ibmcmp/xlf/8.1/bin/xlf90_r
            C profiling: yes
          C++ profiling: yes
    Fortran77 profiling: yes
    Fortran90 profiling: yes
         C++ exceptions: no
         Thread support: posix (mpi: no, progress: no)
 Internal debug support: no
    MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
        libltdl support: yes
             MCA memory: darwin (MCA v1.0, API v1.0, Component v1.1.1)
          MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.1.1)
              MCA timer: darwin (MCA v1.0, API v1.0, Component v1.1.1)
          MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
          MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
                MCA coll: basic (MCA v1.0, API v1.0, Component v1.1.1)
               MCA coll: hierarch (MCA v1.0, API v1.0, Component v1.1.1)
               MCA coll: self (MCA v1.0, API v1.0, Component v1.1.1)
               MCA coll: sm (MCA v1.0, API v1.0, Component v1.1.1)
               MCA coll: tuned (MCA v1.0, API v1.0, Component v1.1.1)
                 MCA io: romio (MCA v1.0, API v1.0, Component v1.1.1)
              MCA mpool: sm (MCA v1.0, API v1.0, Component v1.1.1)
              MCA mpool: gm (MCA v1.0, API v1.0, Component v1.1.1)
                MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.1.1)
                MCA bml: r2 (MCA v1.0, API v1.0, Component v1.1.1)
             MCA rcache: rb (MCA v1.0, API v1.0, Component v1.1.1)
                MCA btl: self (MCA v1.0, API v1.0, Component v1.1.1)
                MCA btl: sm (MCA v1.0, API v1.0, Component v1.1.1)
                MCA btl: gm (MCA v1.0, API v1.0, Component v1.1.1)
                MCA btl: tcp (MCA v1.0, API v1.0, Component v1.0)
               MCA topo: unity (MCA v1.0, API v1.0, Component v1.1.1)
                MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.0)
                MCA gpr: null (MCA v1.0, API v1.0, Component v1.1.1)
                MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.1.1)
                MCA gpr: replica (MCA v1.0, API v1.0, Component v1.1.1)
                MCA iof: proxy (MCA v1.0, API v1.0, Component v1.1.1)
                MCA iof: svc (MCA v1.0, API v1.0, Component v1.1.1)
                 MCA ns: proxy (MCA v1.0, API v1.0, Component v1.1.1)
                 MCA ns: replica (MCA v1.0, API v1.0, Component v1.1.1)
                MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
                MCA ras: dash_host (MCA v1.0, API v1.0, Component v1.1.1)
                MCA ras: hostfile (MCA v1.0, API v1.0, Component v1.1.1)
                MCA ras: localhost (MCA v1.0, API v1.0, Component v1.1.1)
                MCA ras: tm (MCA v1.0, API v1.0, Component v1.1.1)
                 MCA rds: hostfile (MCA v1.0, API v1.0, Component v1.1.1)
                 MCA rds: resfile (MCA v1.0, API v1.0, Component v1.1.1)
              MCA rmaps: round_robin (MCA v1.0, API v1.0, Component v1.1.1)
               MCA rmgr: proxy (MCA v1.0, API v1.0, Component v1.1.1)
               MCA rmgr: urm (MCA v1.0, API v1.0, Component v1.1.1)
                MCA rml: oob (MCA v1.0, API v1.0, Component v1.1.1)
                MCA pls: fork (MCA v1.0, API v1.0, Component v1.1.1)
                MCA pls: rsh (MCA v1.0, API v1.0, Component v1.1.1)
                MCA pls: tm (MCA v1.0, API v1.0, Component v1.1.1)
                MCA sds: env (MCA v1.0, API v1.0, Component v1.1.1)
                MCA sds: seed (MCA v1.0, API v1.0, Component v1.1.1)
                MCA sds: singleton (MCA v1.0, API v1.0, Component v1.1.1)
                MCA sds: pipe (MCA v1.0, API v1.0, Component v1.1.1)
Configured as:
./configure \
   --prefix=$PREFIX \
   --with-tm=/usr/local/pbs/ \
   --with-gm=/opt/gm \
   --enable-static \
   --disable-cxx

On 7/3/06, George Bosilca <bosi...@cs.utk.edu> wrote:

Bernard,

A bug in the Open MPI GM driver was discovered after the 1.1 release.
A patch for 1.1 is on the way, but I don't know whether it will be
available before 1.1.1.  Meanwhile, you can use a nightly build or a
fresh checkout from the SVN repository; both have the GM bug
corrected.
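
Roughly, that looks like the following (the URLs are from memory, so
please double-check them on www.open-mpi.org): the nightly tarballs
are posted under http://www.open-mpi.org/nightly/ , and a trunk
checkout would be along the lines of

   svn co http://svn.open-mpi.org/svn/ompi/trunk ompi-trunk
   cd ompi-trunk
   # autogen.sh is only needed for SVN checkouts, not the tarballs,
   # and it requires reasonably recent GNU autotools
   ./autogen.sh && ./configure --prefix=$PREFIX && make all install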

   Sorry for the troubles,
     george.

On Jul 3, 2006, at 12:58 PM, Borenstein, Bernard S wrote:

> I've built and successfully run the NASA Overflow 2.0aa program with
> Open MPI 1.0.2.  I'm running on an Opteron Linux cluster with SLES 9
> and GM 2.0.24.  I built Open MPI 1.1 with the Intel 9 compilers, and
> when I try to run Overflow 2.0aa over Myrinet I get what looks like a
> data corruption error and the program dies quickly.  There are no MPI
> errors at all.  If I run over GigE (--mca btl self,tcp), the program
> runs to completion correctly.  Here is my ompi_info output:
>
> bsb3227@mahler:~/openmpi_1.1/bin> ./ompi_info
>                 Open MPI: 1.1
>    Open MPI SVN revision: r10477
>                 Open RTE: 1.1
>    Open RTE SVN revision: r10477
>                     OPAL: 1.1
>        OPAL SVN revision: r10477
>                   Prefix: /home/bsb3227/openmpi_1.1
>  Configured architecture: x86_64-unknown-linux-gnu
>            Configured by: bsb3227
>            Configured on: Fri Jun 30 07:08:54 PDT 2006
>           Configure host: mahler
>                 Built by: bsb3227
>                 Built on: Fri Jun 30 07:54:46 PDT 2006
>               Built host: mahler
>               C bindings: yes
>             C++ bindings: yes
>       Fortran77 bindings: yes (all)
>       Fortran90 bindings: yes
>  Fortran90 bindings size: small
>               C compiler: icc
>      C compiler absolute: /opt/intel/cce/9.0.25/bin/icc
>             C++ compiler: icpc
>    C++ compiler absolute: /opt/intel/cce/9.0.25/bin/icpc
>       Fortran77 compiler: ifort
>   Fortran77 compiler abs: /opt/intel/fce/9.0.25/bin/ifort
>       Fortran90 compiler: /opt/intel/fce/9.0.25/bin/ifort
>   Fortran90 compiler abs: /opt/intel/fce/9.0.25/bin/ifort
>              C profiling: yes
>            C++ profiling: yes
>      Fortran77 profiling: yes
>      Fortran90 profiling: yes
>           C++ exceptions: no
>           Thread support: posix (mpi: no, progress: no)
>   Internal debug support: no
>      MPI parameter check: runtime
> Memory profiling support: no
> Memory debugging support: no
>          libltdl support: yes
>               MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.1)
>            MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.1)
>            MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.1)
>            MCA maffinity: libnuma (MCA v1.0, API v1.0, Component v1.1)
>                MCA timer: linux (MCA v1.0, API v1.0, Component v1.1)
>            MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
>            MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
>                 MCA coll: basic (MCA v1.0, API v1.0, Component v1.1)
>                 MCA coll: hierarch (MCA v1.0, API v1.0, Component v1.1)
>                 MCA coll: self (MCA v1.0, API v1.0, Component v1.1)
>                 MCA coll: sm (MCA v1.0, API v1.0, Component v1.1)
>                 MCA coll: tuned (MCA v1.0, API v1.0, Component v1.1)
>                   MCA io: romio (MCA v1.0, API v1.0, Component v1.1)
>                MCA mpool: sm (MCA v1.0, API v1.0, Component v1.1)
>                MCA mpool: gm (MCA v1.0, API v1.0, Component v1.1)
>                  MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.1)
>                  MCA bml: r2 (MCA v1.0, API v1.0, Component v1.1)
>               MCA rcache: rb (MCA v1.0, API v1.0, Component v1.1)
>                  MCA btl: self (MCA v1.0, API v1.0, Component v1.1)
>                  MCA btl: sm (MCA v1.0, API v1.0, Component v1.1)
>                  MCA btl: gm (MCA v1.0, API v1.0, Component v1.1)
>                  MCA btl: tcp (MCA v1.0, API v1.0, Component v1.0)
>                 MCA topo: unity (MCA v1.0, API v1.0, Component v1.1)
>                  MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.0)
>                  MCA gpr: null (MCA v1.0, API v1.0, Component v1.1)
>                  MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.1)
>                  MCA gpr: replica (MCA v1.0, API v1.0, Component v1.1)
>                  MCA iof: proxy (MCA v1.0, API v1.0, Component v1.1)
>                  MCA iof: svc (MCA v1.0, API v1.0, Component v1.1)
>                   MCA ns: proxy (MCA v1.0, API v1.0, Component v1.1)
>                   MCA ns: replica (MCA v1.0, API v1.0, Component v1.1)
>                  MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
>                  MCA ras: dash_host (MCA v1.0, API v1.0, Component v1.1)
>                  MCA ras: hostfile (MCA v1.0, API v1.0, Component v1.1)
>                  MCA ras: localhost (MCA v1.0, API v1.0, Component v1.1)
>                  MCA ras: slurm (MCA v1.0, API v1.0, Component v1.1)
>                  MCA ras: tm (MCA v1.0, API v1.0, Component v1.1)
>                  MCA rds: hostfile (MCA v1.0, API v1.0, Component v1.1)
>                  MCA rds: resfile (MCA v1.0, API v1.0, Component v1.1)
>                MCA rmaps: round_robin (MCA v1.0, API v1.0, Component v1.1)
>                 MCA rmgr: proxy (MCA v1.0, API v1.0, Component v1.1)
>                 MCA rmgr: urm (MCA v1.0, API v1.0, Component v1.1)
>                  MCA rml: oob (MCA v1.0, API v1.0, Component v1.1)
>                  MCA pls: fork (MCA v1.0, API v1.0, Component v1.1)
>                  MCA pls: rsh (MCA v1.0, API v1.0, Component v1.1)
>                  MCA pls: slurm (MCA v1.0, API v1.0, Component v1.1)
>                  MCA pls: tm (MCA v1.0, API v1.0, Component v1.1)
>                  MCA sds: env (MCA v1.0, API v1.0, Component v1.1)
>                  MCA sds: seed (MCA v1.0, API v1.0, Component v1.1)
>                  MCA sds: singleton (MCA v1.0, API v1.0, Component v1.1)
>                  MCA sds: pipe (MCA v1.0, API v1.0, Component v1.1)
>                  MCA sds: slurm (MCA v1.0, API v1.0, Component v1.1)
>
> Here is the ifconfig for one of the nodes :
>
> bsb3227@m045:~> /sbin/ifconfig
> eth0      Link encap:Ethernet  HWaddr 00:50:45:5D:CD:FE
>           inet addr:10.241.194.45  Bcast:10.241.195.255  Mask:255.255.254.0
>           inet6 addr: fe80::250:45ff:fe5d:cdfe/64 Scope:Link
>           UP BROADCAST NOTRAILERS RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:39913407 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:48794587 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:31847343907 (30371.9 Mb)  TX bytes:48231713866 (45997.3 Mb)
>           Interrupt:19
>
> eth1      Link encap:Ethernet  HWaddr 00:50:45:5D:CD:FF
>           inet6 addr: fe80::250:45ff:fe5d:cdff/64 Scope:Link
>           UP BROADCAST MULTICAST  MTU:1500  Metric:1
>           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
>           Interrupt:19
>
> lo        Link encap:Local Loopback
>           inet addr:127.0.0.1  Mask:255.0.0.0
>           inet6 addr: ::1/128 Scope:Host
>           UP LOOPBACK RUNNING  MTU:16436  Metric:1
>           RX packets:23141 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:23141 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:20145689 (19.2 Mb)  TX bytes:20145689 (19.2 Mb)
>
> I hope someone can give me some guidance on how to debug this problem.
> Thanx in advance for any help
> that can be provided.
>
> Bernie Borenstein
> The Boeing Company
> <config.log.gz>

"Half of what I say is meaningless; but I say it so that the other
half may reach you"
                                   Kahlil Gibran

