1.) Compiling without XL will take a little while, but I have the setup for the other questions ready now. I figured I'd answer them right away.
2.) TCP works fine, and is quite quick compared to mpich-1.2.7p1 by the way. I just re-verified this:

WR11C2R4        5000   160     1     2              10.10          8.253e+00
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        0.0412956 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0272613 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0053214 ...... PASSED

3.) Exactly the same setup, using mpichgm-1.2.6..14b:

WR11C2R4        5000   160     1     2              10.43          7.994e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        0.0353693 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0233491 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0045577 ...... PASSED

It also worked with mpichgm-1.2.6..15 (I believe that is the version; I don't have a node up with it at the moment). Obviously mpich-1.2.7p1 works as well over ethernet.

Anyway, I'll begin the build with the standard gcc compilers that are included with OS X, i.e. powerpc-apple-darwin8-gcc-4.0.1.

Thanks,
Justin.

Jeff Squyres (jsquyres) wrote:
> Justin --
>
> Can we eliminate some variables so that we can figure out where the
> error is originating?
>
> - Can you try compiling without the XL compilers?
> - Can you try running with just TCP (and not Myrinet)?
> - With the same support library installation (such as BLAS, etc.,
>   assumedly also compiled with XL), can you try another MPI (e.g., LAM,
>   MPICH-gm, whatever)?
>
> Let us know what you find. Thanks!
>
> ------------------------------------------------------------------------
> *From:* users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org]
> *On Behalf Of* Justin Bronder
> *Sent:* Thursday, July 06, 2006 3:16 PM
> *To:* Open MPI Users
> *Subject:* Re: [OMPI users] Problem with Openmpi 1.1
>
> With 1.0.3a1r10670 the same problem is occurring. Again, the same
> configure arguments as before. For clarity, the Myrinet driver we are
> using is 2.0.21.
>
> node90:~/src/hpl/bin/ompi-xl-1.0.3 jbronder$ gm_board_info
> GM build ID is "2.0.21_MacOSX_rc20050429075134PDT
> r...@node96.meldrew.clusters.umaine.edu:/usr/src/gm-2.0.21_MacOSX
> Fri Jun 16 14:39:45 EDT 2006."
>
> node90:~/src/hpl/bin/ompi-xl-1.0.3 jbronder$ /usr/local/ompi-xl-1.0.3/bin/mpirun -np 2 xhpl
> This succeeds.
> ||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        0.1196787 ...... PASSED
> ||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0283195 ...... PASSED
> ||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0063300 ...... PASSED
>
> node90:~/src/hpl/bin/ompi-xl-1.0.3 jbronder$ /usr/local/ompi-xl-1.0.3/bin/mpirun -mca btl gm -np 2 xhpl
> This fails.
> ||Ax-b||_oo / ( eps * ||A||_1  * N        ) = 717370209518881444284334080.0000000 ...... FAILED
> ||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) = 226686309135.4274597 ...... FAILED
> ||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 2386641249.6518722 ...... FAILED
> ||Ax-b||_oo  . . . . . . . . . . . . . . . . . = 2037398812542965504.000000
> ||A||_oo . . . . . . . . . . . . . . . . . . . = 2561.554752
> ||A||_1  . . . . . . . . . . . . . . . . . . . = 2558.129237
> ||x||_oo . . . . . . . . . . . . . . . . . . . = 300175355203841216.000000
> ||x||_1  . . . . . . . . . . . . . . . . . . . = 31645943341479366656.000000
>
> Does anyone have a working system with OS X and Myrinet (GM)? If so,
> I'd love to hear the configure arguments and various versions you are
> using. Bonus points if you are using the IBM XL compilers.
>
> Thanks,
> Justin.
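Since the only thing that changes between the passing and failing runs quoted above is the transport selection, here is roughly the pattern, spelled out in one place. This is a sketch only: the prefix is from my 1.0.3 install, the btl names are the components listed by ompi_info further down, and the self,tcp combination is the same one Bernard uses below, so adjust to your own paths.

    # no explicit BTL selection (whatever Open MPI picks by default) -- passes:
    /usr/local/ompi-xl-1.0.3/bin/mpirun -np 2 xhpl

    # restrict to loopback + TCP, i.e. no Myrinet -- also passes:
    /usr/local/ompi-xl-1.0.3/bin/mpirun -mca btl self,tcp -np 2 xhpl

    # force the GM BTL -- this is the case that fails the residual checks:
    /usr/local/ompi-xl-1.0.3/bin/mpirun -mca btl gm -np 2 xhpl

The same -mca settings could also be given through environment variables or an MCA parameter file, as Galen notes below, but for these tests everything is on the command line.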
> On 7/6/06, *Justin Bronder* <jsbron...@gmail.com> wrote:
>
> Yes, that output was actually cut and pasted from an OS X run. I'm about
> to test against 1.0.3a1r10670.
>
> Justin.
>
> On 7/6/06, *Galen M. Shipman* <gship...@lanl.gov> wrote:
>
> Justin,
>
> Is the OS X run showing the same residual failure?
>
> - Galen
>
> On Jul 6, 2006, at 10:49 AM, Justin Bronder wrote:
>
> Disregard the failure on Linux; a rebuild from scratch of HPL and Open MPI
> seems to have resolved the issue. At least I'm not getting the errors
> during the residual checks.
>
> However, this is persisting under OS X.
>
> Thanks,
> Justin.
>
> On 7/6/06, *Justin Bronder* <jsbron...@gmail.com> wrote:
>
> For OS X:
> /usr/local/ompi-xl/bin/mpirun -mca btl gm -np 4 ./xhpl
>
> For Linux:
> ARCH=ompi-gnu-1.1.1a
> /usr/local/$ARCH/bin/mpiexec -mca btl gm -np 2 -path /usr/local/$ARCH/bin ./xhpl
>
> Thanks for the speedy response,
> Justin.
>
> On 7/6/06, *Galen M. Shipman* <gship...@lanl.gov> wrote:
>
> Hey Justin,
>
> Please provide us your mca parameters (if any); these could be in a
> config file, environment variables, or on the command line.
>
> Thanks,
>
> Galen
>
> On Jul 6, 2006, at 9:22 AM, Justin Bronder wrote:
>
> As far as the nightly builds go, I'm still seeing what I believe to be
> this problem in both r10670 and r10652. This is happening with both
> Linux and OS X. Below are the systems and ompi_info for the newest
> revision, r10670.
>
> As an example of the error, when running HPL with Myrinet I get the
> following error. Using tcp everything is fine and I see the results I'd
> expect.
>
> ----------------------------------------------------------------------------
> ||Ax-b||_oo / ( eps * ||A||_1  * N        ) = 42820214496954887558164928727596662784.0000000 ...... FAILED
> ||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) = 156556068835.2711182 ...... FAILED
> ||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 1156439380.5172558 ...... FAILED
> ||Ax-b||_oo  . . . . . . . . . . . . . . . . . = 272683853978565028754868928512.000000
> ||A||_oo . . . . . . . . . . . . . . . . . . . = 3822.884181
> ||A||_1  . . . . . . . . . . . . . . . . . . . = 3823.922627
> ||x||_oo . . . . . . . . . . . . . . . . . . . = 37037692483529688659798261760.000000
> ||x||_1  . . . . . . . . . . . . . . . . . . . = 4102704048669982798475494948864.000000
> ===================================================
>
> Finished 1 tests with the following results:
>     0 tests completed and passed residual checks,
>     1 tests completed and failed residual checks,
>     0 tests skipped because of illegal input values.
> ----------------------------------------------------------------------------
>
> Linux node41 2.6.16.19 #1 SMP Wed Jun 21 17:22:01 EDT 2006 ppc64 PPC970FX, altivec supported GNU/Linux
>
> jbronder@node41 ~ $ /usr/local/ompi-gnu-1.1.1a/bin/ompi_info
> Open MPI: 1.1.1a1r10670
> Open MPI SVN revision: r10670
> Open RTE: 1.1.1a1r10670
> Open RTE SVN revision: r10670
> OPAL: 1.1.1a1r10670
> OPAL SVN revision: r10670
> Prefix: /usr/local/ompi-gnu-1.1.1a
> Configured architecture: powerpc64-unknown-linux-gnu
> Configured by: root
> Configured on: Thu Jul 6 10:15:37 EDT 2006
> Configure host: node41
> Built by: root
> Built on: Thu Jul 6 10:28:14 EDT 2006
> Built host: node41
> C bindings: yes
> C++ bindings: yes
> Fortran77 bindings: yes (all)
> Fortran90 bindings: yes
> Fortran90 bindings size: small
> C compiler: gcc
> C compiler absolute: /usr/bin/gcc
> C++ compiler: g++
> C++ compiler absolute: /usr/bin/g++
> Fortran77 compiler: gfortran
> Fortran77 compiler abs: /usr/powerpc64-unknown-linux-gnu/gcc-bin/4.1.0/gfortran
> Fortran90 compiler: gfortran
> Fortran90 compiler abs: /usr/powerpc64-unknown-linux-gnu/gcc-bin/4.1.0/gfortran
> C profiling: yes
> C++ profiling: yes
> Fortran77 profiling: yes
> Fortran90 profiling: yes
> C++ exceptions: no
> Thread support: posix (mpi: no, progress: no)
> Internal debug support: no
> MPI parameter check: runtime
> Memory profiling support: no
> Memory debugging support: no
> libltdl support: yes
> MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.1.1)
> MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.1.1)
> MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.1.1)
> MCA timer: linux (MCA v1.0, API v1.0, Component v1.1.1)
> MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
> MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
> MCA coll: basic (MCA v1.0, API v1.0, Component v1.1.1)
> MCA coll: hierarch (MCA v1.0, API v1.0, Component v1.1.1)
> MCA coll: self (MCA v1.0, API v1.0, Component v1.1.1)
> MCA coll: sm (MCA v1.0, API v1.0, Component v1.1.1)
> MCA coll: tuned (MCA v1.0, API v1.0, Component v1.1.1)
> MCA io: romio (MCA v1.0, API v1.0, Component v1.1.1)
> MCA mpool: gm (MCA v1.0, API v1.0, Component v1.1.1)
> MCA mpool: sm (MCA v1.0, API v1.0, Component v1.1.1)
> MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.1.1)
> MCA bml: r2 (MCA v1.0, API v1.0, Component v1.1.1)
> MCA rcache: rb (MCA v1.0, API v1.0, Component v1.1.1)
> MCA btl: gm (MCA v1.0, API v1.0, Component v1.1.1)
> MCA btl: self (MCA v1.0, API v1.0, Component v1.1.1)
> MCA btl: sm (MCA v1.0, API v1.0, Component v1.1.1)
> MCA btl: tcp (MCA v1.0, API v1.0, Component v1.0)
> MCA topo: unity (MCA v1.0, API v1.0, Component v1.1.1)
> MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.0)
> MCA gpr: null (MCA v1.0, API v1.0, Component v1.1.1)
> MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.1.1)
> MCA gpr: replica (MCA v1.0, API v1.0, Component v1.1.1)
> MCA iof: proxy (MCA v1.0, API v1.0, Component v1.1.1)
> MCA iof: svc (MCA v1.0, API v1.0, Component v1.1.1)
> MCA ns: proxy (MCA v1.0, API v1.0, Component v1.1.1)
> MCA ns: replica (MCA v1.0, API v1.0, Component v1.1.1)
> MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
> MCA ras: dash_host (MCA v1.0, API v1.0, Component v1.1.1)
> MCA ras: hostfile (MCA v1.0, API v1.0, Component v1.1.1)
> MCA ras: localhost (MCA v1.0, API v1.0, Component v1.1.1)
> MCA ras: tm (MCA v1.0, API v1.0, Component v1.1.1)
> MCA rds: hostfile (MCA v1.0, API v1.0, Component v1.1.1)
> MCA rds: resfile (MCA v1.0, API v1.0, Component v1.1.1)
> MCA rmaps: round_robin (MCA v1.0, API v1.0, Component v1.1.1)
> MCA rmgr: proxy (MCA v1.0, API v1.0, Component v1.1.1)
> MCA rmgr: urm (MCA v1.0, API v1.0, Component v1.1.1)
> MCA rml: oob (MCA v1.0, API v1.0, Component v1.1.1)
> MCA pls: fork (MCA v1.0, API v1.0, Component v1.1.1)
> MCA pls: rsh (MCA v1.0, API v1.0, Component v1.1.1)
> MCA pls: tm (MCA v1.0, API v1.0, Component v1.1.1)
> MCA sds: env (MCA v1.0, API v1.0, Component v1.1.1)
> MCA sds: pipe (MCA v1.0, API v1.0, Component v1.1.1)
> MCA sds: seed (MCA v1.0, API v1.0, Component v1.1.1)
> MCA sds: singleton (MCA v1.0, API v1.0, Component v1.1.1)
>
> Configured as:
> ./configure \
>   --prefix=$PREFIX \
>   --enable-mpi-f77 \
>   --enable-mpi-f90 \
>   --enable-mpi-profile \
>   --enable-mpi-cxx \
>   --enable-pty-support \
>   --enable-shared \
>   --enable-smp-locks \
>   --enable-io-romio \
>   --with-tm=/usr/local/pbs \
>   --without-xgrid \
>   --without-slurm \
>   --with-gm=/opt/gm
>
> Darwin node90.meldrew.clusters.umaine.edu 8.6.0 Darwin Kernel Version 8.6.0:
> Tue Mar 7 16:58:48 PST 2006; root:xnu-792.6.70.obj~1/RELEASE_PPC Power Macintosh powerpc
>
> node90:~/src/hpl jbronder$ /usr/local/ompi-xl/bin/ompi_info
> Open MPI: 1.1.1a1r10670
> Open MPI SVN revision: r10670
> Open RTE: 1.1.1a1r10670
> Open RTE SVN revision: r10670
> OPAL: 1.1.1a1r10670
> OPAL SVN revision: r10670
> Prefix: /usr/local/ompi-xl
> Configured architecture: powerpc-apple-darwin8.6.0
> Configured by:
> Configured on: Thu Jul 6 10:05:20 EDT 2006
> Configure host: node90.meldrew.clusters.umaine.edu
> Built by: root
> Built on: Thu Jul 6 10:37:40 EDT 2006
> Built host: node90.meldrew.clusters.umaine.edu
> C bindings: yes
> C++ bindings: yes
> Fortran77 bindings: yes (lower case)
> Fortran90 bindings: yes
> Fortran90 bindings size: small
> C compiler: /opt/ibmcmp/vac/6.0/bin/xlc
> C compiler absolute: /opt/ibmcmp/vac/6.0/bin/xlc
> C++ compiler: /opt/ibmcmp/vacpp/6.0/bin/xlc++
> C++ compiler absolute: /opt/ibmcmp/vacpp/6.0/bin/xlc++
> Fortran77 compiler: /opt/ibmcmp/xlf/8.1/bin/xlf_r
> Fortran77 compiler abs: /opt/ibmcmp/xlf/8.1/bin/xlf_r
> Fortran90 compiler: /opt/ibmcmp/xlf/8.1/bin/xlf90_r
> Fortran90 compiler abs: /opt/ibmcmp/xlf/8.1/bin/xlf90_r
> C profiling: yes
> C++ profiling: yes
> Fortran77 profiling: yes
> Fortran90 profiling: yes
> C++ exceptions: no
> Thread support: posix (mpi: no, progress: no)
> Internal debug support: no
> MPI parameter check: runtime
> Memory profiling support: no
> Memory debugging support: no
> libltdl support: yes
> MCA memory: darwin (MCA v1.0, API v1.0, Component v1.1.1)
> MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.1.1)
> MCA timer: darwin (MCA v1.0, API v1.0, Component v1.1.1)
> MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
> MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
> MCA coll: basic (MCA v1.0, API v1.0, Component v1.1.1)
> MCA coll: hierarch (MCA v1.0, API v1.0, Component v1.1.1)
> MCA coll: self (MCA v1.0, API v1.0, Component v1.1.1)
> MCA coll: sm (MCA v1.0, API v1.0, Component v1.1.1)
> MCA coll: tuned (MCA v1.0, API v1.0, Component v1.1.1)
> MCA io: romio (MCA v1.0, API v1.0, Component v1.1.1)
> MCA mpool: sm (MCA v1.0, API v1.0, Component v1.1.1)
> MCA mpool: gm (MCA v1.0, API v1.0, Component v1.1.1)
> MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.1.1)
> MCA bml: r2 (MCA v1.0, API v1.0, Component v1.1.1)
> MCA rcache: rb (MCA v1.0, API v1.0, Component v1.1.1)
> MCA btl: self (MCA v1.0, API v1.0, Component v1.1.1)
> MCA btl: sm (MCA v1.0, API v1.0, Component v1.1.1)
> MCA btl: gm (MCA v1.0, API v1.0, Component v1.1.1)
> MCA btl: tcp (MCA v1.0, API v1.0, Component v1.0)
> MCA topo: unity (MCA v1.0, API v1.0, Component v1.1.1)
> MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.0)
> MCA gpr: null (MCA v1.0, API v1.0, Component v1.1.1)
> MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.1.1)
> MCA gpr: replica (MCA v1.0, API v1.0, Component v1.1.1)
> MCA iof: proxy (MCA v1.0, API v1.0, Component v1.1.1)
> MCA iof: svc (MCA v1.0, API v1.0, Component v1.1.1)
> MCA ns: proxy (MCA v1.0, API v1.0, Component v1.1.1)
> MCA ns: replica (MCA v1.0, API v1.0, Component v1.1.1)
> MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
> MCA ras: dash_host (MCA v1.0, API v1.0, Component v1.1.1)
> MCA ras: hostfile (MCA v1.0, API v1.0, Component v1.1.1)
> MCA ras: localhost (MCA v1.0, API v1.0, Component v1.1.1)
> MCA ras: tm (MCA v1.0, API v1.0, Component v1.1.1)
> MCA rds: hostfile (MCA v1.0, API v1.0, Component v1.1.1)
> MCA rds: resfile (MCA v1.0, API v1.0, Component v1.1.1)
> MCA rmaps: round_robin (MCA v1.0, API v1.0, Component v1.1.1)
> MCA rmgr: proxy (MCA v1.0, API v1.0, Component v1.1.1)
> MCA rmgr: urm (MCA v1.0, API v1.0, Component v1.1.1)
> MCA rml: oob (MCA v1.0, API v1.0, Component v1.1.1)
> MCA pls: fork (MCA v1.0, API v1.0, Component v1.1.1)
> MCA pls: rsh (MCA v1.0, API v1.0, Component v1.1.1)
> MCA pls: tm (MCA v1.0, API v1.0, Component v1.1.1)
> MCA sds: env (MCA v1.0, API v1.0, Component v1.1.1)
> MCA sds: seed (MCA v1.0, API v1.0, Component v1.1.1)
> MCA sds: singleton (MCA v1.0, API v1.0, Component v1.1.1)
> MCA sds: pipe (MCA v1.0, API v1.0, Component v1.1.1)
>
> Configured as:
> ./configure \
>   --prefix=$PREFIX \
>   --with-tm=/usr/local/pbs/ \
>   --with-gm=/opt/gm \
>   --enable-static \
>   --disable-cxx
>
> On 7/3/06, *George Bosilca* <bosi...@cs.utk.edu> wrote:
>
> Bernard,
>
> A bug in the Open MPI GM driver was discovered after the 1.1 release.
> A patch for 1.1 is on the way. However, I don't know if it will be
> available before 1.1.1. Meanwhile, you can use the nightly build
> version or a fresh check-out from the SVN repository. Both of them
> have the GM bug corrected.
>
> Sorry for the troubles,
>   george.
>
> On Jul 3, 2006, at 12:58 PM, Borenstein, Bernard S wrote:
>
> > I've built and successfully run the NASA Overflow 2.0aa program with
> > Open MPI 1.0.2. I'm running on an Opteron Linux cluster running SLES 9
> > and GM 2.0.24. I built Open MPI 1.1 with the Intel 9 compilers, and
> > when I try to run Overflow 2.0aa with Myrinet, I get what looks like a
> > data corruption error and the program dies quickly. There are no MPI
> > errors at all. If I run using GigE (--mca btl self,tcp), the program
> > runs to completion correctly.
> > Here is my ompi_info output:
> >
> > bsb3227@mahler:~/openmpi_1.1/bin> ./ompi_info
> > Open MPI: 1.1
> > Open MPI SVN revision: r10477
> > Open RTE: 1.1
> > Open RTE SVN revision: r10477
> > OPAL: 1.1
> > OPAL SVN revision: r10477
> > Prefix: /home/bsb3227/openmpi_1.1
> > Configured architecture: x86_64-unknown-linux-gnu
> > Configured by: bsb3227
> > Configured on: Fri Jun 30 07:08:54 PDT 2006
> > Configure host: mahler
> > Built by: bsb3227
> > Built on: Fri Jun 30 07:54:46 PDT 2006
> > Built host: mahler
> > C bindings: yes
> > C++ bindings: yes
> > Fortran77 bindings: yes (all)
> > Fortran90 bindings: yes
> > Fortran90 bindings size: small
> > C compiler: icc
> > C compiler absolute: /opt/intel/cce/9.0.25/bin/icc
> > C++ compiler: icpc
> > C++ compiler absolute: /opt/intel/cce/9.0.25/bin/icpc
> > Fortran77 compiler: ifort
> > Fortran77 compiler abs: /opt/intel/fce/9.0.25/bin/ifort
> > Fortran90 compiler: /opt/intel/fce/9.0.25/bin/ifort
> > Fortran90 compiler abs: /opt/intel/fce/9.0.25/bin/ifort
> > C profiling: yes
> > C++ profiling: yes
> > Fortran77 profiling: yes
> > Fortran90 profiling: yes
> > C++ exceptions: no
> > Thread support: posix (mpi: no, progress: no)
> > Internal debug support: no
> > MPI parameter check: runtime
> > Memory profiling support: no
> > Memory debugging support: no
> > libltdl support: yes
> > MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.1)
> > MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.1)
> > MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.1)
> > MCA maffinity: libnuma (MCA v1.0, API v1.0, Component v1.1)
> > MCA timer: linux (MCA v1.0, API v1.0, Component v1.1)
> > MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
> > MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
> > MCA coll: basic (MCA v1.0, API v1.0, Component v1.1)
> > MCA coll: hierarch (MCA v1.0, API v1.0, Component v1.1)
> > MCA coll: self (MCA v1.0, API v1.0, Component v1.1)
> > MCA coll: sm (MCA v1.0, API v1.0, Component v1.1)
> > MCA coll: tuned (MCA v1.0, API v1.0, Component v1.1)
> > MCA io: romio (MCA v1.0, API v1.0, Component v1.1)
> > MCA mpool: sm (MCA v1.0, API v1.0, Component v1.1)
> > MCA mpool: gm (MCA v1.0, API v1.0, Component v1.1)
> > MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.1)
> > MCA bml: r2 (MCA v1.0, API v1.0, Component v1.1)
> > MCA rcache: rb (MCA v1.0, API v1.0, Component v1.1)
> > MCA btl: self (MCA v1.0, API v1.0, Component v1.1)
> > MCA btl: sm (MCA v1.0, API v1.0, Component v1.1)
> > MCA btl: gm (MCA v1.0, API v1.0, Component v1.1)
> > MCA btl: tcp (MCA v1.0, API v1.0, Component v1.0)
> > MCA topo: unity (MCA v1.0, API v1.0, Component v1.1)
> > MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.0)
> > MCA gpr: null (MCA v1.0, API v1.0, Component v1.1)
> > MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.1)
> > MCA gpr: replica (MCA v1.0, API v1.0, Component v1.1)
> > MCA iof: proxy (MCA v1.0, API v1.0, Component v1.1)
> > MCA iof: svc (MCA v1.0, API v1.0, Component v1.1)
> > MCA ns: proxy (MCA v1.0, API v1.0, Component v1.1)
> > MCA ns: replica (MCA v1.0, API v1.0, Component v1.1)
> > MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
> > MCA ras: dash_host (MCA v1.0, API v1.0, Component v1.1)
> > MCA ras: hostfile (MCA v1.0, API v1.0, Component v1.1)
> > MCA ras: localhost (MCA v1.0, API v1.0, Component v1.1)
> > MCA ras: slurm (MCA v1.0, API v1.0, Component v1.1)
> > MCA ras: tm (MCA v1.0, API v1.0, Component v1.1)
> > MCA rds: hostfile (MCA v1.0, API v1.0, Component v1.1)
> > MCA rds: resfile (MCA v1.0, API v1.0, Component v1.1)
> > MCA rmaps: round_robin (MCA v1.0, API v1.0, Component v1.1)
> > MCA rmgr: proxy (MCA v1.0, API v1.0, Component v1.1)
> > MCA rmgr: urm (MCA v1.0, API v1.0, Component v1.1)
> > MCA rml: oob (MCA v1.0, API v1.0, Component v1.1)
> > MCA pls: fork (MCA v1.0, API v1.0, Component v1.1)
> > MCA pls: rsh (MCA v1.0, API v1.0, Component v1.1)
> > MCA pls: slurm (MCA v1.0, API v1.0, Component v1.1)
> > MCA pls: tm (MCA v1.0, API v1.0, Component v1.1)
> > MCA sds: env (MCA v1.0, API v1.0, Component v1.1)
> > MCA sds: seed (MCA v1.0, API v1.0, Component v1.1)
> > MCA sds: singleton (MCA v1.0, API v1.0, Component v1.1)
> > MCA sds: pipe (MCA v1.0, API v1.0, Component v1.1)
> > MCA sds: slurm (MCA v1.0, API v1.0, Component v1.1)
> >
> > Here is the ifconfig for one of the nodes:
> >
> > bsb3227@m045:~> /sbin/ifconfig
> > eth0      Link encap:Ethernet  HWaddr 00:50:45:5D:CD:FE
> >           inet addr:10.241.194.45  Bcast:10.241.195.255  Mask:255.255.254.0
> >           inet6 addr: fe80::250:45ff:fe5d:cdfe/64 Scope:Link
> >           UP BROADCAST NOTRAILERS RUNNING MULTICAST  MTU:1500  Metric:1
> >           RX packets:39913407 errors:0 dropped:0 overruns:0 frame:0
> >           TX packets:48794587 errors:0 dropped:0 overruns:0 carrier:0
> >           collisions:0 txqueuelen:1000
> >           RX bytes:31847343907 (30371.9 Mb)  TX bytes:48231713866 (45997.3 Mb)
> >           Interrupt:19
> >
> > eth1      Link encap:Ethernet  HWaddr 00:50:45:5D:CD:FF
> >           inet6 addr: fe80::250:45ff:fe5d:cdff/64 Scope:Link
> >           UP BROADCAST MULTICAST  MTU:1500  Metric:1
> >           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> >           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> >           collisions:0 txqueuelen:1000
> >           RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
> >           Interrupt:19
> >
> > lo        Link encap:Local Loopback
> >           inet addr:127.0.0.1  Mask:255.0.0.0
> >           inet6 addr: ::1/128 Scope:Host
> >           UP LOOPBACK RUNNING  MTU:16436  Metric:1
> >           RX packets:23141 errors:0 dropped:0 overruns:0 frame:0
> >           TX packets:23141 errors:0 dropped:0 overruns:0 carrier:0
> >           collisions:0 txqueuelen:0
> >           RX bytes:20145689 (19.2 Mb)  TX bytes:20145689 (19.2 Mb)
> >
> > I hope someone can give me some guidance on how to debug this problem.
> > Thanx in advance for any help that can be provided.
> > Bernie Borenstein
> > The Boeing Company
> > <config.log.gz>
>
> "Half of what I say is meaningless; but I say it so that the other
> half may reach you"
>   Kahlil Gibran
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

--
Justin Bronder
University of Maine, Orono
Advanced Computing Research Lab
20 Godfrey Dr
Orono, ME 04473
www.clusters.umaine.edu

Mathematics Department
425 Neville Hall
Orono, ME 04469