Hi Terry,

Unfortunately, I haven't got a stack trace.

OS: Mac OS X 10.4.7 Server on the Xgrid server and Mac OS X 10.4.7 (client) on every node (G4 and G5). For testing purposes I installed OpenMPI 1.1 on a dual-G4 node and on a dual-G5 node, with the Xgrid consisting of only the dual-G4 node or only the dual-G5 node. No matter which configuration, I ran into the bus error.
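
If it helps, I could try to capture a backtrace on the next crash. A rough sketch of what I have in mind (untested here; the core-file location, the csh limit syntax, and whether a core is written at all under Xgrid control are assumptions for a stock 10.4 install, and <pid> / <crashed-binary> are placeholders):

# allow core dumps in the csh/tcsh session that runs mpirun
# (for sh/bash the equivalent would be: ulimit -c unlimited)
limit coredumpsize unlimited

# reproduce the crash, then look for the core file
# (Mac OS X normally writes cores to /cores/core.<pid>)
mpirun -d -np 2 ./vhone
ls /cores

# load the core of whichever binary crashed (mpirun or the application) into
# gdb and print the backtrace
gdb <crashed-binary> /cores/core.<pid>
(gdb) bt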

Yours,
Frank

users-requ...@open-mpi.org wrote:

----------------------------------------------------------------------

Message: 1
Date: Wed, 28 Jun 2006 07:26:46 -0400
From: "Terry D. Dontje" <terry.don...@sun.com>
Subject: Re: [OMPI users] OpenMPI 1.1: Signal:10
        info.si_errno:0(Unknown error: 0) si_code:1(BUS_ADRALN)
To: us...@open-mpi.org
Message-ID: <44a26776.2000...@sun.com>
Content-Type: text/plain; format=flowed; charset=ISO-8859-1

Frank,

Do you possibly have a stack trace of the program that failed? Also,
what OS and platform are you running on?
--td

----------------------------------------------------------------------

Message: 1
Date: Wed, 28 Jun 2006 12:53:14 +0200
From: Frank <openmpi-u...@fraka-mp.de>
Subject: [OMPI users] OpenMPI 1.1: Signal:10 info.si_errno:0(Unknown
        error:  0) si_code:1(BUS_ADRALN)
To: us...@open-mpi.org
Message-ID: <44a25f9a.3070...@fraka-mp.de>
Content-Type: text/plain; charset="iso-8859-1"

Hi!

I've recently updated to OpenMPI 1.1 on a few nodes and am running into a problem that wasn't there with OpenMPI 1.0.2.

Submitting a job to Xgrid with OpenMPI 1.1 yields a bus error that does not occur when the job is not submitted to Xgrid:

[g5dual:/Network/CFD/MVH-1.0] motte% mpirun -d -np 2 ./vhone
[g5dual.3-net:08794] [0,0,0] setting up session dir with
[g5dual.3-net:08794]    universe default-universe
[g5dual.3-net:08794]    user motte
[g5dual.3-net:08794]    host g5dual.3-net
[g5dual.3-net:08794]    jobid 0
[g5dual.3-net:08794]    procid 0
[g5dual.3-net:08794] procdir: /tmp/openmpi-sessions-motte@g5dual.3-net_0/default-universe/0/0
[g5dual.3-net:08794] jobdir: /tmp/openmpi-sessions-motte@g5dual.3-net_0/default-universe/0
[g5dual.3-net:08794] unidir: /tmp/openmpi-sessions-motte@g5dual.3-net_0/default-universe
[g5dual.3-net:08794] top: openmpi-sessions-motte@g5dual.3-net_0
[g5dual.3-net:08794] tmp: /tmp
[g5dual.3-net:08794] [0,0,0] contact_file /tmp/openmpi-sessions-motte@g5dual.3-net_0/default-universe/universe-setup.txt
[g5dual.3-net:08794] [0,0,0] wrote setup file
Signal:10 info.si_errno:0(Unknown error: 0) si_code:1(BUS_ADRALN)
Failing at addr:0x10
*** End of error message ***
Bus error
[g5dual:/Network/CFD/MVH-1.0] motte%
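
One isolation test I have not tried yet, but which might narrow things down (assuming the usual "-mca <framework> <component>" selection syntax of the 1.1 series and the pls components listed by ompi_info further below), would be to force the rsh launcher from the Xgrid controller and see whether the bus error follows the xgrid starter or the job itself:

# hypothetical check, not taken from an actual run: bypass the xgrid pls
# component and launch via rsh/ssh instead
mpirun -d -np 2 -mca pls rsh ./vhone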

When the job is not submitted via Xgrid, OpenMPI 1.1 works just fine:

[g5dual:/Network/CFD/MVH-1.0] motte% mpirun -d -np 2 ./vhone
[g5dual.3-net:08957] procdir: (null)
[g5dual.3-net:08957] jobdir: (null)
[g5dual.3-net:08957] unidir: /tmp/openmpi-sessions-motte@g5dual.3-net_0/default-universe
[g5dual.3-net:08957] top: openmpi-sessions-motte@g5dual.3-net_0
[g5dual.3-net:08957] tmp: /tmp
[g5dual.3-net:08957] connect_uni: contact info read
[g5dual.3-net:08957] connect_uni: connection not allowed
[g5dual.3-net:08957] [0,0,0] setting up session dir with
[g5dual.3-net:08957]    tmpdir /tmp
[g5dual.3-net:08957]    universe default-universe-8957
[g5dual.3-net:08957]    user motte
[g5dual.3-net:08957]    host g5dual.3-net
[g5dual.3-net:08957]    jobid 0
[g5dual.3-net:08957]    procid 0
[g5dual.3-net:08957] procdir: /tmp/openmpi-sessions-motte@g5dual.3-net_0/default-universe-8957/0/0
[g5dual.3-net:08957] jobdir: /tmp/openmpi-sessions-motte@g5dual.3-net_0/default-universe-8957/0
[g5dual.3-net:08957] unidir: /tmp/openmpi-sessions-motte@g5dual.3-net_0/default-universe-8957
[g5dual.3-net:08957] top: openmpi-sessions-motte@g5dual.3-net_0
[g5dual.3-net:08957] tmp: /tmp
[g5dual.3-net:08957] [0,0,0] contact_file /tmp/openmpi-sessions-motte@g5dual.3-net_0/default-universe-8957/universe-setup.txt
[g5dual.3-net:08957] [0,0,0] wrote setup file
[g5dual.3-net:08957] pls:rsh: local csh: 1, local bash: 0
[g5dual.3-net:08957] pls:rsh: assuming same remote shell as local shell
[g5dual.3-net:08957] pls:rsh: remote csh: 1, remote bash: 0
[g5dual.3-net:08957] pls:rsh: final template argv:
[g5dual.3-net:08957] pls:rsh: /usr/bin/ssh <template> orted --debug --bootproxy 1 --name <template> --num_procs 2 --vpid_start 0 --nodename <template> --universe motte@g5dual.3-net:default-universe-8957 --nsreplica "0.0.0;tcp://192.168.3.2:49281" --gprreplica "0.0.0;tcp://192.168.3.2:49281" --mpi-call-yield 0
[g5dual.3-net:08957] pls:rsh: launching on node localhost
[g5dual.3-net:08957] pls:rsh: oversubscribed -- setting mpi_yield_when_idle to 1 (1 2)
[g5dual.3-net:08957] pls:rsh: localhost is a LOCAL node
[g5dual.3-net:08957] pls:rsh: changing to directory /Users/motte
[g5dual.3-net:08957] pls:rsh: executing: orted --debug --bootproxy 1 --name 0.0.1 --num_procs 2 --vpid_start 0 --nodename localhost --universe motte@g5dual.3-net:default-universe-8957 --nsreplica "0.0.0;tcp://192.168.3.2:49281" --gprreplica "0.0.0;tcp://192.168.3.2:49281" --mpi-call-yield 1
[g5dual.3-net:08958] [0,0,1] setting up session dir with
[g5dual.3-net:08958]    universe default-universe-8957
[g5dual.3-net:08958]    user motte
[g5dual.3-net:08958]    host localhost
[g5dual.3-net:08958]    jobid 0
[g5dual.3-net:08958]    procid 1
[g5dual.3-net:08958] procdir: /tmp/openmpi-sessions-motte@localhost_0/default-universe-8957/0/1
[g5dual.3-net:08958] jobdir: /tmp/openmpi-sessions-motte@localhost_0/default-universe-8957/0
[g5dual.3-net:08958] unidir: /tmp/openmpi-sessions-motte@localhost_0/default-universe-8957
[g5dual.3-net:08958] top: openmpi-sessions-motte@localhost_0
[g5dual.3-net:08958] tmp: /tmp
[g5dual.3-net:08962] [0,1,1] setting up session dir with
[g5dual.3-net:08962]    universe default-universe-8957
[g5dual.3-net:08962]    user motte
[g5dual.3-net:08962]    host localhost
[g5dual.3-net:08962]    jobid 1
[g5dual.3-net:08962]    procid 1
[g5dual.3-net:08962] procdir: /tmp/openmpi-sessions-motte@localhost_0/default-universe-8957/1/1
[g5dual.3-net:08962] jobdir: /tmp/openmpi-sessions-motte@localhost_0/default-universe-8957/1
[g5dual.3-net:08962] unidir: /tmp/openmpi-sessions-motte@localhost_0/default-universe-8957
[g5dual.3-net:08962] top: openmpi-sessions-motte@localhost_0
[g5dual.3-net:08962] tmp: /tmp
[g5dual.3-net:08960] [0,1,0] setting up session dir with
[g5dual.3-net:08960]    universe default-universe-8957
[g5dual.3-net:08960]    user motte
[g5dual.3-net:08960]    host localhost
[g5dual.3-net:08960]    jobid 1
[g5dual.3-net:08960]    procid 0
[g5dual.3-net:08960] procdir: /tmp/openmpi-sessions-motte@localhost_0/default-universe-8957/1/0
[g5dual.3-net:08960] jobdir: /tmp/openmpi-sessions-motte@localhost_0/default-universe-8957/1
[g5dual.3-net:08960] unidir: /tmp/openmpi-sessions-motte@localhost_0/default-universe-8957
[g5dual.3-net:08960] top: openmpi-sessions-motte@localhost_0
[g5dual.3-net:08960] tmp: /tmp
[g5dual.3-net:08957] spawn: in job_state_callback(jobid = 1, state = 0x4)
[g5dual.3-net:08957] Info: Setting up debugger process table for applications
 MPIR_being_debugged = 0
 MPIR_debug_gate = 0
 MPIR_debug_state = 1
 MPIR_acquired_pre_main = 0
 MPIR_i_am_starter = 0
 MPIR_proctable_size = 2
 MPIR_proctable:
   (i, host, exe, pid) = (0, localhost, ./vhone, 8960)
   (i, host, exe, pid) = (1, localhost, ./vhone, 8962)
[g5dual.3-net:08960] [0,1,0] ompi_mpi_init completed
[g5dual.3-net:08962] [0,1,1] ompi_mpi_init completed

(Rest omitted)

When submitting the job via Xgrid with OpenMPI 1.0.2, there is no bus error either:

[g5dual:/Network/CFD/MVH-1.0] motte% mpirun -d -np 2 ./vhone
[g5dual.3-net:00497] [0,0,0] setting up session dir with
[g5dual.3-net:00497]    universe default-universe
[g5dual.3-net:00497]    user motte
[g5dual.3-net:00497]    host g5dual.3-net
[g5dual.3-net:00497]    jobid 0
[g5dual.3-net:00497]    procid 0
[g5dual.3-net:00497] procdir: /tmp/openmpi-sessions-motte@g5dual.3-net_0/default-universe/0/0
[g5dual.3-net:00497] jobdir: /tmp/openmpi-sessions-motte@g5dual.3-net_0/default-universe/0
[g5dual.3-net:00497] unidir: /tmp/openmpi-sessions-motte@g5dual.3-net_0/default-universe
[g5dual.3-net:00497] top: openmpi-sessions-motte@g5dual.3-net_0
[g5dual.3-net:00497] tmp: /tmp
[g5dual.3-net:00497] [0,0,0] contact_file /tmp/openmpi-sessions-motte@g5dual.3-net_0/default-universe/universe-setup.txt
[g5dual.3-net:00497] [0,0,0] wrote setup file
[g5dual.3-net:00497] spawn: in job_state_callback(jobid = 1, state = 0x1)
[g5dual.3-net:00507] [0,1,1] setting up session dir with
[g5dual.3-net:00507]    universe default-universe
[g5dual.3-net:00507]    user nobody
[g5dual.3-net:00507]    host xgrid-node-1
[g5dual.3-net:00507]    jobid 1
[g5dual.3-net:00507]    procid 1
[g5dual.3-net:00507] procdir: /tmp/openmpi-sessions-nobody@xgrid-node-1_0/default-universe/1/1
[g5dual.3-net:00507] jobdir: /tmp/openmpi-sessions-nobody@xgrid-node-1_0/default-universe/1
[g5dual.3-net:00507] unidir: /tmp/openmpi-sessions-nobody@xgrid-node-1_0/default-universe
[g5dual.3-net:00507] top: openmpi-sessions-nobody@xgrid-node-1_0
[g5dual.3-net:00507] tmp: /tmp
[g5dual.3-net:00509] [0,1,0] setting up session dir with
[g5dual.3-net:00509]    universe default-universe
[g5dual.3-net:00509]    user nobody
[g5dual.3-net:00509]    host xgrid-node-0
[g5dual.3-net:00509]    jobid 1
[g5dual.3-net:00509]    procid 0
[g5dual.3-net:00509] procdir: /tmp/openmpi-sessions-nobody@xgrid-node-0_0/default-universe/1/0
[g5dual.3-net:00509] jobdir: /tmp/openmpi-sessions-nobody@xgrid-node-0_0/default-universe/1
[g5dual.3-net:00509] unidir: /tmp/openmpi-sessions-nobody@xgrid-node-0_0/default-universe
[g5dual.3-net:00509] top: openmpi-sessions-nobody@xgrid-node-0_0
[g5dual.3-net:00509] tmp: /tmp
[g5dual.3-net:00497] spawn: in job_state_callback(jobid = 1, state = 0x3)
[g5dual.3-net:00497] Info: Setting up debugger process table for applications
 MPIR_being_debugged = 0
 MPIR_debug_gate = 0
 MPIR_debug_state = 1
 MPIR_acquired_pre_main = 0
 MPIR_i_am_starter = 0
 MPIR_proctable_size = 2
 MPIR_proctable:
   (i, host, exe, pid) = (0, xgrid-node-1, ./vhone, 507)
   (i, host, exe, pid) = (1, xgrid-node-0, ./vhone, 509)
[g5dual.3-net:00497] spawn: in job_state_callback(jobid = 1, state = 0x4)
[g5dual.3-net:00507] [0,1,1] ompi_mpi_init completed
[g5dual.3-net:00509] [0,1,0] ompi_mpi_init completed

(Rest omitted)

These are the steps I followed to upgrade to OpenMPI 1.1 (put together as a shell sketch after the list):

- sudo make uninstall (from within the OpenMPI 1.0.2 directory tree)
- cd into the OpenMPI 1.1 directory tree
- ./configure --prefix=/usr --with-xgrid
- sudo make all install
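
The same sequence as plain commands (the directory names here are placeholders, not the exact paths used on this machine):

# remove the old installation from its own source tree
cd openmpi-1.0.2
sudo make uninstall

# configure, build, and install the new version into /usr with Xgrid support
cd ../openmpi-1.1
./configure --prefix=/usr --with-xgrid
sudo make all install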

Maybe the ompi_info output is of some help:

[g5dual:/Network/CFD/MVH-1.0] motte% ompi_info
               Open MPI: 1.1
  Open MPI SVN revision: r10477
               Open RTE: 1.1
  Open RTE SVN revision: r10477
                   OPAL: 1.1
      OPAL SVN revision: r10477
                 Prefix: /usr
Configured architecture: powerpc-apple-darwin8.7.0
          Configured by: admin
          Configured on: Wed Jun 28 10:55:12 CEST 2006
         Configure host: G5-Dual.local
               Built by: root
               Built on: Wed Jun 28 11:32:04 CEST 2006
             Built host: G5-Dual.local
             C bindings: yes
           C++ bindings: yes
     Fortran77 bindings: yes (single underscore)
     Fortran90 bindings: yes
Fortran90 bindings size: small
             C compiler: gcc
    C compiler absolute: /usr/bin/gcc
           C++ compiler: g++
  C++ compiler absolute: /usr/bin/g++
     Fortran77 compiler: gfortran
 Fortran77 compiler abs: /usr/local/bin/gfortran
     Fortran90 compiler: gfortran
 Fortran90 compiler abs: /usr/local/bin/gfortran
            C profiling: yes
          C++ profiling: yes
    Fortran77 profiling: yes
    Fortran90 profiling: yes
         C++ exceptions: no
         Thread support: posix (mpi: no, progress: no)
 Internal debug support: no
    MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
        libltdl support: yes
             MCA memory: darwin (MCA v1.0, API v1.0, Component v1.1)
          MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.1)
              MCA timer: darwin (MCA v1.0, API v1.0, Component v1.1)
          MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
          MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
               MCA coll: basic (MCA v1.0, API v1.0, Component v1.1)
               MCA coll: hierarch (MCA v1.0, API v1.0, Component v1.1)
               MCA coll: self (MCA v1.0, API v1.0, Component v1.1)
               MCA coll: sm (MCA v1.0, API v1.0, Component v1.1)
               MCA coll: tuned (MCA v1.0, API v1.0, Component v1.1)
                 MCA io: romio (MCA v1.0, API v1.0, Component v1.1)
              MCA mpool: sm (MCA v1.0, API v1.0, Component v1.1)
                MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.1)
                MCA bml: r2 (MCA v1.0, API v1.0, Component v1.1)
             MCA rcache: rb (MCA v1.0, API v1.0, Component v1.1)
                MCA btl: self (MCA v1.0, API v1.0, Component v1.1)
                MCA btl: sm (MCA v1.0, API v1.0, Component v1.1)
                MCA btl: tcp (MCA v1.0, API v1.0, Component v1.0)
               MCA topo: unity (MCA v1.0, API v1.0, Component v1.1)
                MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.0)
                MCA gpr: null (MCA v1.0, API v1.0, Component v1.1)
                MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.1)
                MCA gpr: replica (MCA v1.0, API v1.0, Component v1.1)
                MCA iof: proxy (MCA v1.0, API v1.0, Component v1.1)
                MCA iof: svc (MCA v1.0, API v1.0, Component v1.1)
                 MCA ns: proxy (MCA v1.0, API v1.0, Component v1.1)
                 MCA ns: replica (MCA v1.0, API v1.0, Component v1.1)
                MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
                MCA ras: dash_host (MCA v1.0, API v1.0, Component v1.1)
                MCA ras: hostfile (MCA v1.0, API v1.0, Component v1.1)
                MCA ras: localhost (MCA v1.0, API v1.0, Component v1.1)
                MCA ras: xgrid (MCA v1.0, API v1.0, Component v1.1)
                MCA rds: hostfile (MCA v1.0, API v1.0, Component v1.1)
                MCA rds: resfile (MCA v1.0, API v1.0, Component v1.1)
              MCA rmaps: round_robin (MCA v1.0, API v1.0, Component v1.1)
               MCA rmgr: proxy (MCA v1.0, API v1.0, Component v1.1)
               MCA rmgr: urm (MCA v1.0, API v1.0, Component v1.1)
                MCA rml: oob (MCA v1.0, API v1.0, Component v1.1)
                MCA pls: fork (MCA v1.0, API v1.0, Component v1.1)
                MCA pls: rsh (MCA v1.0, API v1.0, Component v1.1)
                MCA pls: xgrid (MCA v1.0, API v1.0, Component v1.1)
                MCA sds: env (MCA v1.0, API v1.0, Component v1.1)
                MCA sds: pipe (MCA v1.0, API v1.0, Component v1.1)
                MCA sds: seed (MCA v1.0, API v1.0, Component v1.1)
                MCA sds: singleton (MCA v1.0, API v1.0, Component v1.1)

Enclosed you'll find the config.log.

Yours,
Frank


-------------- next part --------------
Attachment scrubbed by the list archive: config.log
URL: http://www.open-mpi.org/MailArchives/users/attachments/20060628/a640acf1/config.pl




------------------------------

Message: 2
Date: Wed, 28 Jun 2006 13:56:32 +0200
From: Frank <openmpi-u...@fraka-mp.de>
Subject: Re: [OMPI users] OpenMPI 1.1: Signal:10
        info.si_errno:0(Unknown error: 0) si_code:1(BUS_ADRALN) (Frank)
To: us...@open-mpi.org
Message-ID: <44a26e70.8030...@fraka-mp.de>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Hi!

The very same error occurred with openmpi-1.1rc2r10468, too.

Yours,
Frank








