Re: [OMPI users] tcsh: orted: Not Found

2006-03-02 Thread Brian Barrett
On Mar 1, 2006, at 5:26 PM, Xiaoning (David) Yang wrote: I installed Open MPI 1.0.1 on two Mac G5s (one with two cpus and the other with 4 cpus.). I set up ssh on both machines according to the FAQ. My mpi jobs work fine if I run the jobs on only one computer. But when I ran a job across t

Re: [OMPI users] OpenMPI 1.0.x and PGI pgf90

2006-03-02 Thread Jeff Squyres
On Mar 1, 2006, at 1:55 PM, Bjoern Nachtwey wrote: I tried to compile OpenMPI using the Portland Group compiler suite, but the configure script tells me my Fortran compiler cannot compile .f or .f90 files. I'm sure it can ;-) [snipped] PS: Full script and logfiles can be found at http://w

Re: [OMPI users] OpenMPI 1.0.x and PGI pgf90

2006-03-02 Thread Jeff Squyres
On Mar 1, 2006, at 5:14 PM, Troy Telford wrote: That being said, I have been unable to get OpenMPI to compile with PGI 6.1 (but it does finish ./configure; it breaks during 'make'). Troy -- Can you provide some details on what is going wrong? We currently only have PGI 5.2 and 6.0 to tes

[OMPI users] Building OpenMPI with Lahey Fortran 95

2006-03-02 Thread Adams Samuel D Contr AFRL/HEDR
I am trying to build OpenMPI using Lahey Fortran 95 6.2 on a Fedora Core 3 box. The configure script runs OK, but the problem occurs when I run make. It appears to bomb out while building the Fortran libraries. It seems to me that OpenMPI is naming its modules with .ompi_mod in

[OMPI users] Spawn and distribution of slaves

2006-03-02 Thread Jean Latour
Hello, Testing the MPI_Comm_Spawn function of Open MPI version 1.0.1, I have an example that works OK, except that it shows that the spawned processes do not follow the "machinefile" setting of processors. In this example a master process spawns first 2 processes, then disconnects from them an

Re: [OMPI users] tcsh: orted: Not Found

2006-03-02 Thread Xiaoning (David) Yang
Brian, Thank you for the help. I did include path to orted in my .tcshrc file on mac2, but I put the path at the end of the file. It is interesting that when I logged into mac with ssh, the path was included and orted was in my path. But when I ran "ssh mac2 which orted", orted was not found. It f

Re: [OMPI users] Spawn and distribution of slaves

2006-03-02 Thread Edgar Gabriel
As far as I know, Open MPI should follow the machinefile for spawn operations, although every spawn starts again at the beginning of the machinefile. An info object such as 'lam_sched_round_robin' is currently not available/implemented. Let me look into this... Jean Latour wrote: Hello,

Re: [OMPI users] tcsh: orted: Not Found

2006-03-02 Thread Brian Barrett
On Mar 2, 2006, at 11:34 AM, Xiaoning (David) Yang wrote: Thank you for the help. I did include path to orted in my .tcshrc file on mac2, but I put the path at the end of the file. It is interesting that when I logged into mac with ssh, the path was included and orted was in my path. But wh

Re: [OMPI users] Spawn and Disconnect

2006-03-02 Thread Edgar Gabriel
Open MPI currently does not fully support a proper disconnection of parent and child processes. Thus, if a child dies/aborts, the parents will abort as well, despite calling MPI_Comm_disconnect. (The new RTE will have better support for these operations; Ralph/Jeff can probably give a better

Re: [OMPI users] cannot mak a simple ping-pong

2006-03-02 Thread Jose Pedro Garcia Mahedero
In the end it was a network problem. I had to disable one network interface in the master node of the cluster by setting btl_tcp_if_include = eth1 in the file /usr/local/etc/openmpi-mca-params.conf. Thank you all for your help. Jose Pedro On 3/1/06, Jose Pedro Garcia Mahedero wrote: > > OK, it ALMOST w
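
[Editor's sketch] The fix described in this message would look roughly like the fragment below. The file path and the parameter come from the message itself; the comments are added for illustration. Note that the parameter only restricts which interface Open MPI's TCP transport uses. It does not take the other interface down at the operating-system level.

```
# /usr/local/etc/openmpi-mca-params.conf
# Use only eth1 for Open MPI's TCP traffic; other interfaces are ignored.
btl_tcp_if_include = eth1
```

The same parameter can also be passed for a single run on the command line, e.g. `mpirun --mca btl_tcp_if_include eth1 -np 2 ./pingpong`, where `./pingpong` is a hypothetical program name standing in for the poster's test.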

Re: [OMPI users] Spawn and Disconnect

2006-03-02 Thread Ralph Castain
We expect to have much better support for the entire comm_spawn process in the next incarnation of the RTE. I don't expect that to be included in a release, however, until 1.1 (Jeff may be able to give you an estimate for when that will happen). Jeff et al may be able to give you access to an

Re: [OMPI users] tcsh: orted: Not Found

2006-03-02 Thread Xiaoning (David) Yang
Yes, that's it! I do have an if statement for interactive shells. Now I know. Thanks. David * Correspondence * > From: Brian Barrett > Reply-To: Open MPI Users > Date: Thu, 2 Mar 2006 12:09:18 -0500 > To: Open MPI Users > Subject: Re: [OMPI users] tcsh: orted: Not Found > > On Mar
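
[Editor's sketch] The problem resolved in this thread is a common one: a command run over ssh (such as "ssh mac2 which orted") starts a non-interactive tcsh, and any path setting that sits after an interactive-shell test in .tcshrc is never reached. A minimal sketch of the usual arrangement follows. It assumes the test in question is based on $prompt, and /opt/openmpi/bin is a hypothetical install location, not one taken from the thread.

```tcsh
# ~/.tcshrc on the remote node (mac2 in this thread)

# Set the path BEFORE any interactive-shell test, so that non-interactive
# shells started by ssh also find orted.
# /opt/openmpi/bin is a hypothetical location; substitute your own prefix.
set path = ( /opt/openmpi/bin $path )

# Settings below this point apply to interactive shells only.
if ( $?prompt ) then
    set prompt = "%m:%~> "
endif
```

With this ordering, "ssh mac2 which orted" should print the full path to orted instead of reporting that it is not found.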

Re: [OMPI users] Spawn and distribution of slaves

2006-03-02 Thread Edgar Gabriel
In my tests, Open MPI did follow the machinefile (see output further below); however, for each spawn operation it starts from the very beginning of the machinefile... The following example spawns 5 child processes (with a single MPI_Comm_spawn), and each child prints its rank and the hostname

Re: [OMPI users] cannot mak a simple ping-pong

2006-03-02 Thread Jeff Squyres
Jose -- This sounds like a problem that we just recently fixed in the 1.0.x branch -- there were some situations where the "wrong" ethernet device could have been picked by Open MPI (e.g., if you have a cluster with all private IP addresses, and you run an MPI job that spans the head node

Re: [OMPI users] OpenMPI 1.0.x and PGI pgf90 ==> Problem solved

2006-03-02 Thread Bjoern Nachtwey
Dear Folks, I had to add the "--with-gnu-ld" flag and call my variables F77 and FC (not FC and F90). now it works :-) Thanks! Bjørn you wrote: > I've used > > ./configure --with-gnu-ld F77=pgf77 FFLAGS=-fastsse FC=pgf90 > FCFLAGS=-fastsse > > and that worked for me. Email direct if you have

[OMPI users] Problem running open mpi across nodes.

2006-03-02 Thread Xiaoning (David) Yang
I installed Open MPI on two Mac G5s, one with 2 cpus and the other with 4 cpus. I can run jobs on either of the machines fine. But when I ran a job on machine one across the two nodes, all the processes I requested would start, but they then seemed to hang and I got the error message: [0,1,1][btl_

Re: [OMPI users] Problem running open mpi across nodes.

2006-03-02 Thread Brian Barrett
On Mar 2, 2006, at 3:56 PM, Xiaoning (David) Yang wrote: I installed Open MPI on two Mac G5s, one with 2 cpus and the other with 4 cpus. I can run jobs on either of the machines fine. But when I ran a job on machine one across the two nodes, all the processes I requested would start, but t

[OMPI users] C++ bool type reduction failing

2006-03-02 Thread Andy Selle
I am trying to do a reduction using a bool type using the C++ bindings. I am using this sample program to test: - #include #include int main(int argc,char *argv[]) { MPI::Init(); int rank=MPI::COMM_WORLD.Get_rank(); {bool test=true;

Re: [OMPI users] Problem running open mpi across nodes.

2006-03-02 Thread Xiaoning (David) Yang
Brian, My G5s only have one ethernet card each and are connected to the network through those cards. I upgraded to Open MPI 1.0.2. The problem remains the same. A somewhat detailed description of the problem is like this. When I run jobs from the 4-cpu machine, specifying 6 processes, orted, orte

Re: [OMPI users] Problem running open mpi across nodes.

2006-03-02 Thread Brian Barrett
On Mar 2, 2006, at 8:19 PM, Xiaoning (David) Yang wrote: My G5s only have one ethernet card each and are connected to the network through those cards. I upgraded to Open MPI 1.0.2. The problem remains the same. A somewhat detailed description of the problem is like this. When I run jobs