[OMPI users] mca_oob_tcp_peer_try_connect error when checkpoint and restart.

2007-10-01 Thread Hiep Bui Hoang
Hi, I had set up Open MPI "trunk_16171" on 3 computers with a LAN connection, set environment parameters, and configured ssh without typing a password for each node. I use Red Hat Enterprise Linux 5. The program I tried is 'send_recv'. I ran my 'send_recv' program successfully on those 3 nodes. And checkpoint/resta

Re: [OMPI users] OpenMPI Giving problems when using -mca btl mx, sm, self

2007-10-01 Thread Tim Prins
Hi, On Monday 01 October 2007 03:08:04 am Hammad Siddiqi wrote: > One more thing to add -mca mtl mx uses ethernet and IP emulation of > Myrinet to my knowledge. I want to use Myrinet(not its IP Emulation) > and shared memory simultaneously. This is not true (as far as I know...). Open MPI has 2 di

Re: [OMPI users] mpirun ERROR: The daemon exited unexpectedly with status 255.

2007-10-01 Thread Tim Prins
Hi, On Monday 01 October 2007 03:56:16 pm Dino Rossegger wrote: > Hi again, > > Yes the error output is the same: > root@sun:~# mpirun --hostfile hostfile main > [sun:23748] [0,0,0] ORTE_ERROR_LOG: Timeout in file > base/pls_base_orted_cmds.c at line 275 > [sun:23748] [0,0,0] ORTE_ERROR_LOG: Timeo

Re: [OMPI users] init_thread + spawn error

2007-10-01 Thread Tim Prins
Hi Joao, Unfortunately Comm_spawn is a bit broken right now on the Open MPI trunk. We are currently working on some major changes to the runtime system, so I would rather not dig into this until these changes have made it onto the trunk. I do not know of a timeline for when these changes w

Re: [OMPI users] mca_oob_tcp_peer_try_connect: messages

2007-10-01 Thread Mostyn Lewis
More information. Sorry about the length of this. I switched on -mca oob_tcp_debug 1000 and the result is below. Later on there's an "ifconfig -a" as the trace seems to show you are trying connections to all 3 interfaces in oob - 5.* is InfiniBand IPoIB - 7.* is a private ethernet with no connecti
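For readers hitting the same symptom, the set of interfaces the TCP out-of-band channel tries can be narrowed with MCA parameters. A sketch, with assumptions: the interface name eth0 and the program name are placeholders and should match your own "ifconfig -a" output and binary:

```shell
# Restrict both the out-of-band channel and the TCP BTL to a single
# interface, so connections are not attempted over IPoIB or the
# private ethernet ("eth0" here is an example name):
mpirun --mca oob_tcp_include eth0 \
       --mca btl_tcp_if_include eth0 \
       -np 2 --hostfile hostfile ./your_program
```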

[OMPI users] Tool communication

2007-10-01 Thread Oleg Morajko
Hello, In the context of my PhD research, I have been developing a run-time performance analyzer for MPI-based applications. My tool provides a controller process for each MPI task. In particular, when an MPI job is started, a special wrapper script is generated that first starts my controller proc

Re: [OMPI users] mpirun ERROR: The daemon exited unexpectedly with status 255.

2007-10-01 Thread Dino Rossegger
Hi again, Yes the error output is the same: root@sun:~# mpirun --hostfile hostfile main [sun:23748] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 275 [sun:23748] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1164 [sun:23748] [0,0,0] ORTE_ERROR_LOG: T

[OMPI users] init_thread + spawn error

2007-10-01 Thread Joao Vicente Lima
Hi all! I'm getting an error calling MPI_Init_thread and MPI_Comm_spawn. Am I missing something? The attachments contain my ompi_info and source ... thanks! Joao char *arg[]= {"spawn1", (char *)0}; MPI_Init_thread (&argc, &argv, MPI_THREAD_MULTIPLE, &provided); MPI_Comm_spawn ("./spa
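The truncated snippet suggests code along these lines. A minimal self-contained sketch of the pattern, with assumptions: "./spawn1" is the child binary named in the message; everything else is illustrative, not the poster's full program:

```c
/* Sketch of the Init_thread + Comm_spawn pattern from the snippet.
 * "./spawn1" is the child binary named in the message; the rest is
 * an illustration, not the poster's actual code. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided;
    MPI_Comm intercomm;

    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE)
        fprintf(stderr, "asked for MPI_THREAD_MULTIPLE, got level %d\n",
                provided);

    /* Spawn one child process; MPI_ARGV_NULL passes no extra argv. */
    MPI_Comm_spawn("./spawn1", MPI_ARGV_NULL, 1, MPI_INFO_NULL, 0,
                   MPI_COMM_SELF, &intercomm, MPI_ERRCODES_IGNORE);

    MPI_Comm_disconnect(&intercomm);
    MPI_Finalize();
    return 0;
}
```

Note Tim Prins's caveat in the reply above: on the trunk of this era, Comm_spawn itself was broken, so even correct code along these lines could fail.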

Re: [OMPI users] mpirun ERROR: The daemon exited unexpectedly with status 255.

2007-10-01 Thread jody
Now that the PATHs seem to be set correctly for ssh I don't know what the problem could be. Is the error message still the same as in the first mail? Did you do the environment/sshd_config changes on both machines? What shell are you using? One other test you could make is to start your application wit

Re: [OMPI users] mpirun ERROR: The daemon exited unexpectedly with status 255.

2007-10-01 Thread Dino Rossegger
Hi Jody, did the steps as you said, but it didn't work for me. I set LD_LIBRARY_PATH in /etc/environment and ~/.ssh/environment and made the changes to sshd_config. But this all didn't solve my problem, although the paths seemed to be set correctly (judging from what ssh saturn `printenv >> test` says)
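For reference, the setup this thread is discussing looks roughly like the sketch below. Assumptions: the library path is an example and should point at the lib directory of your actual Open MPI install:

```shell
# ~/.ssh/environment on each node (example path):
LD_LIBRARY_PATH=/usr/local/lib

# /etc/ssh/sshd_config on each node (then restart sshd),
# otherwise ~/.ssh/environment is silently ignored:
PermitUserEnvironment yes

# Check that a non-interactive remote shell really sees it:
ssh saturn printenv LD_LIBRARY_PATH
```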

Re: [OMPI users] problem with 'orted'

2007-10-01 Thread Si Hammond
Amit Kumar Saha wrote: > hello, > > I am using Open MPI 1.2.3 to run a task on 4 hosts as follows: > > amit@ubuntu-desktop-1:~/mpi-exec$ mpirun --np 4 --hostfile > mpi-host-file ParallelSearch > bash: /usr/local/bin/orted: No such file or directory > > The problem is that 'orted' is not found on

Re: [OMPI users] Rank to host mapping

2007-10-01 Thread Tim Prins
Just so you know, this is something that we are working on for the next major release of Open MPI (v1.3). More details on some of the discussion can be found here: https://svn.open-mpi.org/trac/ompi/ticket/1023 Tim Torje Henriksen wrote: Specifying nodes several times in the hostfile or with the --

[OMPI users] problem with 'orted'

2007-10-01 Thread Amit Kumar Saha
hello, I am using Open MPI 1.2.3 to run a task on 4 hosts as follows: amit@ubuntu-desktop-1:~/mpi-exec$ mpirun --np 4 --hostfile mpi-host-file ParallelSearch bash: /usr/local/bin/orted: No such file or directory The problem is that 'orted' is not found on one of the 4 hosts. I investigated the p
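One common way around this error, assuming Open MPI really is installed on every host but not on the default remote PATH, is to tell mpirun the installation prefix:

```shell
# --prefix prepends <prefix>/bin and <prefix>/lib on the remote
# side, so orted can be found without touching shell dotfiles
# (/usr/local is the prefix implied by the error message):
mpirun --prefix /usr/local -np 4 --hostfile mpi-host-file ParallelSearch
```

Alternatively, install Open MPI under the same prefix on every node, or export PATH and LD_LIBRARY_PATH in the remote shell's startup files.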

Re: [OMPI users] Rank to host mapping

2007-10-01 Thread Torje Henriksen
Specifying nodes several times in the hostfile or with the --host parameter seems to just add up the number of slots available for the given node. It doesn't seem to affect the mapping of the ranks. I think this is due to how the hostfile is read into the structure that holds this information

Re: [OMPI users] Rank to host mapping

2007-10-01 Thread Christian Bell
How about a hostfile such as % cat -n ~/tmp/hostfile 1 node0 2 node0 3 node1 4 node0 5 node1 6 node1 Looks like the function to express the mapping is not anything simple. If it's an expressible function but too complicated for open mpi, you'll have to make yo

Re: [OMPI users] Rank to host mapping

2007-10-01 Thread Torje Henriksen
Oh man, sorry about that, and thanks for the fast response. Let me try again, please :) I want to manually specify what ranks should run on what node. Here is an example of a mapping that I can't seem to be able to do, since it isn't a round-robin type of mapping. hosts ranks === nod
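The limitation under discussion can be illustrated with a toy model (not Open MPI source): repeating a host in the hostfile only raises its slot count, and ranks are then dealt out either by slot or round-robin by node, so an arbitrary mapping like the one wanted here cannot fall out:

```python
# Toy model of how a 1.2-era hostfile becomes a rank-to-host map:
# duplicate lines only add slots, then ranks are dealt out in one
# of two fixed orders. Host names are from the thread's example.
from collections import OrderedDict

def slots_from_hostfile(lines):
    """Count slots per host; repeated lines just increment the count."""
    slots = OrderedDict()
    for host in lines:
        slots[host] = slots.get(host, 0) + 1
    return slots

def map_by_slot(slots):
    """Fill each host's slots before moving on (mpirun's default)."""
    mapping, rank = {}, 0
    for host, n in slots.items():
        for _ in range(n):
            mapping[rank] = host
            rank += 1
    return mapping

def map_by_node(slots):
    """Deal ranks out one per host in turn (mpirun --bynode)."""
    mapping, rank = {}, 0
    remaining = dict(slots)
    while remaining:
        for host in list(remaining):
            mapping[rank] = host
            rank += 1
            remaining[host] -= 1
            if remaining[host] == 0:
                del remaining[host]
    return mapping

hostfile = ["node0", "node0", "node1", "node0", "node1", "node1"]
slots = slots_from_hostfile(hostfile)  # node0: 3 slots, node1: 3 slots
print(map_by_slot(slots))
# {0: 'node0', 1: 'node0', 2: 'node0', 3: 'node1', 4: 'node1', 5: 'node1'}
print(map_by_node(slots))
# {0: 'node0', 1: 'node1', 2: 'node0', 3: 'node1', 4: 'node0', 5: 'node1'}
```

Neither order can place, say, ranks 0, 1, 4 on node0 and 2, 3, 5 on node1, which is why this thread's request needs the mapping work slated for v1.3.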

Re: [OMPI users] Multiple threads

2007-10-01 Thread Gleb Natapov
On Mon, Oct 01, 2007 at 10:39:12AM +0200, Olivier DUBUISSON wrote: > Hello, > > I compile openmpi 1.2.3 with options ./configure --with-threads=posix > --enable-mpi-thread --enable-progress-threads --enable-smp-locks. > > My program has 2 threads (the main thread and another). When I run it, I > ca

[OMPI users] libnbc compilation

2007-10-01 Thread Neeraj Chourasia
Hello Everyone, I was checking the development version from svn and found that support for libnbc is going to come in the next release. I thought of compiling it, but failed to do so. Could someone suggest how to get it compiled? When I made changes to the configure script (basically added some flags)

[OMPI users] Multiple threads

2007-10-01 Thread Olivier DUBUISSON
Hello, I compiled openmpi 1.2.3 with the options ./configure --with-threads=posix --enable-mpi-thread --enable-progress-threads --enable-smp-locks. My program has 2 threads (the main thread and another). When I run it, I can see 4 threads. I think that two threads are the progress threads, is that right?

Re: [OMPI users] Rank to host mapping

2007-10-01 Thread jody
> hosts ranks > === > node0 1,2,4 > node1 3,4,6 I guess there must be a typo: you can't assign one rank (4) to two nodes, and ranks start from 0, not from 1. Check this site, http://www.open-mpi.org/faq/?category=running#mpirun-host there might be some info regarding your problem. Jody

Re: [OMPI users] OpenMPI Giving problems when using -mca btl mx, sm, self

2007-10-01 Thread Hammad Siddiqi
One more thing to add: -mca mtl mx uses ethernet and IP emulation of Myrinet to my knowledge. I want to use Myrinet (not its IP emulation) and shared memory simultaneously. Thanks Regards, Hammad Hammad Siddiqi wrote: Dear Tim, Your and Tim Mattox's suggestion yielded the following results, *1. /

Re: [OMPI users] OpenMPI Giving problems when using -mca btl mx, sm, self

2007-10-01 Thread Hammad Siddiqi
Dear Tim, Your and Tim Mattox's suggestion yielded the following results, *1. /opt/SUNWhpc/HPC7.0/bin/mpirun -np 2 -mca btl mx,sm,self -host "indus1,indus2" -mca btl_base_debug 1000 ./hello* /opt/SUNWhpc/HPC7.0/bin/mpirun -np 4 -mca btl mx,sm,self -host "indus1,indus2,indus3,indus4" -mca btl_b