Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-05 Thread Jeff Squyres (jsquyres) via users
t here on the mailing list. -- Jeff Squyres jsquy...@cisco.com From: users on behalf of Jeff Squyres (jsquyres) via users Sent: Thursday, May 5, 2022 3:31 PM To: George Bosilca; Open MPI Users Cc: Jeff Squyres (jsquyres) Subject: Re: [OMPI users] mpirun hangs on m

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-05 Thread Jeff Squyres (jsquyres) via users
2022 3:19 PM To: Open MPI Users Cc: Jeff Squyres (jsquyres); Scott Sayres Subject: Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3 That is weird, but maybe it is not a deadlock, just very slow progress. In the child, can you print fdmax and i in the do_child frame? George. On Thu,

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-05 Thread George Bosilca via users
That is weird, but maybe it is not a deadlock, just very slow progress. In the child, can you print fdmax and i in the do_child frame? George. On Thu, May 5, 2022 at 11:50 AM Scott Sayres via users < users@lists.open-mpi.org> wrote: > Jeff, thanks. > from 1: > > (lldb) process attach --pid 9

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-05 Thread Scott Sayres via users
Jeff, thanks. from 1: (lldb) process attach --pid 95083 Process 95083 stopped * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP frame #0: 0x0001bde25628 libsystem_kernel.dylib`close + 8 libsystem_kernel.dylib`close: -> 0x1bde25628 <+8>: b.lo 0x1bde25648

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-05 Thread Jeff Squyres (jsquyres) via users
You can use "lldb -p PID" to attach to a running process. -- Jeff Squyres jsquy...@cisco.com From: Scott Sayres Sent: Thursday, May 5, 2022 11:22 AM To: Jeff Squyres (jsquyres) Cc: Open MPI Users Subject: Re: [OMPI users] mpirun hangs on m1 mac

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-05 Thread Scott Sayres via users
Jeff, It does launch two mpirun processes (when hung from another terminal window) scottsayres 95083 99.0 0.0 408918416 1472 s002 R 8:20AM 0:04.48 mpirun -np 4 foo.sh scottsayres 95085 0.0 0.0 408628368 1632 s006 S+ 8:20AM 0:00.00 egrep mpirun|foo.sh scottsayres

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-05 Thread Bennet Fauber via users
happens immediately after forking the > child process... which is weird). > > -- > Jeff Squyres > jsquy...@cisco.com > > ____ > From: Scott Sayres > Sent: Wednesday, May 4, 2022 4:02 PM > To: Jeff Squyres (jsquyres) > Cc: Open MPI Users > Subject: Re: [OMPI users] mp

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-05 Thread Jeff Squyres (jsquyres) via users
ild process... which is weird). -- Jeff Squyres jsquy...@cisco.com From: Scott Sayres Sent: Wednesday, May 4, 2022 4:02 PM To: Jeff Squyres (jsquyres) Cc: Open MPI Users Subject: Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3 foo.sh is executabl

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-05 Thread Gilles Gouaillardet via users
it via: >> >> mpirun -np 1 foo.sh >> >> If you start seeing output, good! If it completes, better! >> >> If it hangs, and/or if you don't see any output at all, do this: >> >> ps auxwww | egrep 'mpirun|foo.sh' >> >> It should show mp

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-04 Thread Scott Sayres via users
ut at all, do this: > > ps auxwww | egrep 'mpirun|foo.sh' > > It should show mpirun and 2 copies of foo.sh (and probably a grep). Does > it? > > -- > Jeff Squyres > jsquy...@cisco.com > > ________ > From: Scott Sayres >

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-04 Thread Jeff Squyres (jsquyres) via users
uy...@cisco.com From: Scott Sayres Sent: Wednesday, May 4, 2022 2:47 PM To: Open MPI Users Cc: Jeff Squyres (jsquyres) Subject: Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3 Following Jeff's advice, I have rebuilt open-mpi by hand using the -g option. This shows more

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-04 Thread Scott Sayres via users
Following Jeff's advice, I have rebuilt open-mpi by hand using the -g option. This shows more information as below. I am attempting George's advice of how to track the child but notice that gdb does not support arm64. attempting to update lldb. scottsayres@scotts-mbp openmpi-4.1.3 % lldb mpir

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-04 Thread Jeff Squyres (jsquyres) via users
Sent: Wednesday, May 4, 2022 12:35 PM To: Open MPI Users Cc: George Bosilca Subject: Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3 I compiled a fresh copy of the 4.1.3 branch on my M1 laptop, and I can run both MPI and non-MPI apps without any issues. Try running `lldb mpirun -- -np 1 hos

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-04 Thread George Bosilca via users
/ >>> >>> [scotts-mbp.3500.dhcp.###:05469] [[48286,0],0] Releasing job data for >>> [INVALID] >>> >>> Can you recommend a way to find where mpirun gets stuck? >>> Thanks! >>> Scott >>> >>> On Wed, May 4, 2022 at 6:06 AM

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-04 Thread Scott Sayres via users
> >> On Wed, May 4, 2022 at 6:06 AM Jeff Squyres (jsquyres) < >> jsquy...@cisco.com> wrote: >> >>> Are you able to use mpirun to launch a non-MPI application? E.g.: >>> >>> mpirun -np 2 hostname >>> >>> And if that works, can you r

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-04 Thread George Bosilca via users
"hello world" and >> "ring" programs)? E.g.: >> >> cd examples >> make >> mpirun -np 4 hello_c >> mpirun -np 4 ring_c >> >> -- >> Jeff Squyres >> jsquy...@cisco.com >> >> >

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-04 Thread Scott Sayres via users
> > cd examples > make > mpirun -np 4 hello_c > mpirun -np 4 ring_c > > -- > Jeff Squyres > jsquy...@cisco.com > > > From: users on behalf of Scott Sayres > via users > Sent: Tuesday, May 3, 2022 1:07 PM >

Re: [OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-04 Thread Jeff Squyres (jsquyres) via users
ples make mpirun -np 4 hello_c mpirun -np 4 ring_c -- Jeff Squyres jsquy...@cisco.com From: users on behalf of Scott Sayres via users Sent: Tuesday, May 3, 2022 1:07 PM To: users@lists.open-mpi.org Cc: Scott Sayres Subject: [OMPI users] mpirun hangs on m1 ma

[OMPI users] mpirun hangs on m1 mac w openmpi-4.1.3

2022-05-03 Thread Scott Sayres via users
Hello, I am new to openmpi, but would like to use it for ORCA calculations, and plan to run codes on the 10 processors of my macbook pro. I installed this manually and also through homebrew with similar results. I am able to compile codes with mpicc and run them as native codes, but everything th

Re: [OMPI users] mpirun hangs

2018-08-15 Thread Jeff Squyres (jsquyres) via users
There can be lots of reasons that this happens. Can you send all the information listed here? https://www.open-mpi.org/community/help/ > On Aug 15, 2018, at 10:55 AM, Mota, Thyago wrote: > > Hello. > > I have openmpi 2.0.4 installed on a Cent OS 7. When I try to run "mpirun" it > hang

[OMPI users] mpirun hangs

2018-08-15 Thread Mota, Thyago
Hello. I have openmpi 2.0.4 installed on a Cent OS 7. When I try to run "mpirun" it *hangs*. Below is the output I get using the debug option: $ mpirun -d [elm:07778] procdir: /tmp/openmpi-sessions-551034197@elm_0/12011/0/0 [elm:07778] jobdir: /tmp/openmpi-sessions-551034197@elm_0/12011/0 [elm

Re: [OMPI users] mpirun hangs without internet connection

2015-01-21 Thread Klara Hornisova
Thank you for help. Now it works. Klara Hornisova On Thu, Jan 15, 2015 at 5:54 PM, Marco Atzeri wrote: > > > On 1/15/2015 5:39 PM, Klara Hornisova wrote: > >> I have installed OpenMPI 1.6.5 under cygwin. When trying test example >> >> $mpirun hello >> > > current cygwin package is 1.8.4-1, coul

Re: [OMPI users] mpirun hangs without internet connection

2015-01-15 Thread Marco Atzeri
On 1/15/2015 5:39 PM, Klara Hornisova wrote: I have installed OpenMPI 1.6.5 under cygwin. When trying test example $mpirun hello current cygwin package is 1.8.4-1, could you test it ? or, e.g., more complex examples from scalapack, such as $mpirun -np 4 xslu everything works fine when t

[OMPI users] mpirun hangs without internet connection

2015-01-15 Thread Klara Hornisova
I have installed OpenMPI 1.6.5 under cygwin. When trying test example $mpirun hello or, e.g., more complex examples from scalapack, such as $mpirun -np 4 xslu everything works fine when there is an internet connection. However, when the cable is disconnected, mpirun hangs without any error mess

Re: [OMPI users] mpirun hangs: "hello" test in single machine

2013-04-12 Thread Rodrigo Gómez Vázquez
I solved the issue by accepting the input traffic of data packages through the TCP Ports as long as they are sent "from" and "to" the local machine. Here is the line I added to the iptables: /sbin/iptables -A INPUT --source --destination --protocol tcp -j ACCEPT Just an observation, I

Re: [OMPI users] mpirun hangs: "hello" test in single machine

2013-04-11 Thread Ralph Castain
FWIW: I'm working on a rewrite of our out-of-band comm system (it does the wireup that is hanging on your system) that will include a shared memory module. Once that is in place, this problem will go away when running on a single node (still need sockets for multi-node, of course). On Apr 11,

Re: [OMPI users] mpirun hangs: "hello" test in single machine

2013-04-11 Thread Rodrigo Gómez Vázquez
You were right, Ralph. I ran a short test with the firewall turned off and MPI ran as predicted. I am taking a look at the firewall rules to figure out how to set them up properly, so that they do not interfere with OpenMPI's functionality. I will post the required changes in those settings as so

Re: [OMPI users] mpirun hangs: "hello" test in single machine

2013-04-10 Thread Rodrigo Gómez Vázquez
In fact we should have restrictive firewall settings, as long as I remember. I will check the rules again tomorrow morning. That's very interesting, I would expect such kind of problem if I were working with a cluster, but I haven't thought that it might lead also to problems for the internal c

Re: [OMPI users] mpirun hangs: "hello" test in single machine

2013-04-10 Thread Ralph Castain
Best guess is that there is some issue with getting TCP sockets on the system - once the procs are launched, they need to open a TCP socket and communicate back to mpirun. If the socket is "stuck" waiting to complete the open, things will hang. You might check to ensure there isn't some securit
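The socket check described above can be approximated without MPI at all. The following is a minimal sketch, not Open MPI's actual wireup code: it assumes Python 3 is available on the machine, and the helper name loopback_tcp_ok is hypothetical. It opens a listening TCP socket on 127.0.0.1, connects to it, and passes a few bytes across; if a local security policy blocks or stalls loopback TCP, the call should time out and return False.

```python
import socket


def loopback_tcp_ok(timeout: float = 2.0) -> bool:
    """Return True if a TCP connection over 127.0.0.1 can be opened and used.

    Hypothetical diagnostic helper; it only mimics the general pattern of
    a launched process connecting back to its launcher over TCP.
    """
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
            server.settimeout(timeout)
            server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
            server.listen(1)
            port = server.getsockname()[1]

            with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
                client.settimeout(timeout)
                client.connect(("127.0.0.1", port))  # stalls here if packets are dropped
                conn, _ = server.accept()
                with conn:
                    conn.settimeout(timeout)
                    client.sendall(b"ping")
                    return conn.recv(4) == b"ping"
    except OSError:
        # Covers timeouts, refused connections, and missing loopback interfaces.
        return False


if __name__ == "__main__":
    print("loopback TCP usable:", loopback_tcp_ok())
```

A False result would only suggest that local TCP traffic is being filtered; it does not identify which rule is responsible.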

[OMPI users] mpirun hangs: "hello" test in single machine

2013-04-10 Thread Rodrigo Gómez Vázquez
Hi, I am having troubles with the program in a simulation server. The system consists of several processors but all in the same node (more information of the specs. is in the attachments). The system is quite new (few months) and a user reported me that it was not possible to run simulations on

Re: [OMPI users] mpirun hangs when used on more than 2 CPUs ( mpirun compiled without thread support )

2012-01-19 Thread Jeff Squyres
On Jan 18, 2012, at 4:15 AM, Theiner, Andre wrote: > I also have requested the user to run the following adaptation to his original > command "mpirun -np 9 interFoam -parallel". I hoped to get a kind of debug > output > which points me in the right direction. The new command did not work and I am a >

Re: [OMPI users] mpirun hangs when used on more than 2 CPUs ( mpirun compiled without thread support )

2012-01-18 Thread Theiner, Andre
o:users-boun...@open-mpi.org] On Behalf Of Jeff Squyres Sent: Dienstag, 17. Januar 2012 22:53 To: Open MPI Users Subject: Re: [OMPI users] mpirun hangs when used on more than 2 CPUs You should probably also run the ompi_info command; it tells you details about your installation, and how it was

Re: [OMPI users] mpirun hangs when used on more than 2 CPUs

2012-01-17 Thread Jeff Squyres
multiple processors? >>> Is there a special flag which tells the compiler to care for multiple CPUs? >>> >>> Andre >>> >>> >>> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On >>> Behalf Of devendra rai >

Re: [OMPI users] mpirun hangs when used on more than 2 CPUs

2012-01-17 Thread Ralph Castain
a special flag which tells the compiler to care for multiple CPUs? >> >> Andre >> >> >> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On >> Behalf Of devendra rai >> Sent: Montag, 16. Januar 2012 13:25 >> To: Open MPI Users >>

Re: [OMPI users] mpirun hangs when used on more than 2 CPUs

2012-01-17 Thread John Hearns
cial flag which tells the compiler to care for multiple CPUs? > > Andre > > > From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On > Behalf Of devendra rai > Sent: Montag, 16. Januar 2012 13:25 > To: Open MPI Users > Subject: Re: [OMPI users] mpirun ha

Re: [OMPI users] mpirun hangs when used on more than 2 CPUs

2012-01-17 Thread Theiner, Andre
rai Sent: Montag, 16. Januar 2012 13:25 To: Open MPI Users Subject: Re: [OMPI users] mpirun hangs when used on more than 2 CPUs Hello Andre, It may be possible that your openmpi does not support threaded MPI-calls (if these are happening). I had a similar problem, and it was traced to this cause

Re: [OMPI users] mpirun hangs when used on more than 2 CPUs

2012-01-16 Thread devendra rai
: "us...@open-mpi.org" Sent: Monday, 16 January 2012, 11:55 Subject: [OMPI users] mpirun hangs when used on more than 2 CPUs   Hi everyone, may I have your help on a strange problem? High performance computing is new to me and I have not much idea about OpenMPI and OpenFoam (OF

[OMPI users] mpirun hangs when used on more than 2 CPUs

2012-01-16 Thread Theiner, Andre
Hi everyone, may I have your help on a strange problem? High performance computing is new to me and I have not much idea about OpenMPI and OpenFoam (OF) which uses the "mpirun" command. I have to support the OF application in my company and have been trying to find the problem since about 1 week

Re: [OMPI users] mpirun hangs during runtime on Intel quad-core

2010-08-15 Thread Ralph Castain
Cryptic enough :-) Best I can tell, your TCP comm isn't working. All your procs are failing because they can't talk to each other. I'm also seeing something I don't understand: *** The MPI_Init() function was called before MPI_INIT was invoked. *** This is disallowed by the MPI standard. You

[OMPI users] mpirun hangs during runtime on Intel quad-core

2010-08-15 Thread Manik Mayur
Hi All, I am getting a runtime error with mpirun, the details are attached in error.log. Please let me know what is the problem. Open-mpi version:1.4.2 $ uname -a Linux bingo 2.6.34-gentoo-r1 #7 SMP Fri Aug 13 10:18:23 IST 2010 i686 Intel(R) Core(TM)2 Quad CPU Q8200 @ 2.33GHz GenuineIntel GNU/L

Re: [OMPI users] mpirun hangs with multiple nodes

2010-01-06 Thread Ralph Castain
There is a bug in that tarball which was fixed as of yesterday. However, the patch that you need was the cause of the bug, so the fix for your problem is no longer in the 1.4 branch. As you probably recall, I had cautioned that the fix might not make it to the 1.4 series. At the time, I was con

[OMPI users] mpirun hangs with multiple nodes

2010-01-06 Thread Marcia Cristina Cera
Hi, I am using the OpenMPI v1.4a1r22335 to run an MPI application that creates dynamically processes. The application behavior is like explained in a previous e-mail http://www.open-mpi.org/community/lists/users/2009/12/11540.php The application is launched by a command line such as: $ mpirun

Re: [OMPI users] mpirun hangs when launching job on remote node

2009-03-18 Thread Raymond Wan
Hi Bogdan, Thanks for the information and looking forward to the new OpenMPI feature of port restriction... About Debian, I was wondering about that...I've had no problems with it and I was thinking everything was just done for me; of course, another possibility is that there was no firewall

Re: [OMPI users] mpirun hangs when launching job on remote node

2009-03-18 Thread Bogdan Costescu
On Wed, 18 Mar 2009, Raymond Wan wrote: Perhaps it has something to do with RH's defaults for the firewall settings? If your sysadmin uses kickstart to configure the systems, (s)he has to add 'firewall --disabled'; similar for SELinux which seems to have caused problems to another person on

Re: [OMPI users] mpirun hangs when launching job on remote node

2009-03-18 Thread Raymond Wan
Hi Ron, Ron Babich wrote: Thanks for your response. I had noticed your thread, which is why I'm embarrassed (but happy) to say that it looks like my problem was the same as yours. I mentioned in my original email that there was no firewall running, which it turns out was a lie. I think th

Re: [OMPI users] mpirun hangs when launching job on remote node

2009-03-17 Thread Ron Babich
Hi Ray, Thanks for your response. I had noticed your thread, which is why I'm embarrassed (but happy) to say that it looks like my problem was the same as yours. I mentioned in my original email that there was no firewall running, which it turns out was a lie. I think that when I checked b

Re: [OMPI users] mpirun hangs when launching job on remote node

2009-03-17 Thread Raymond Wan
Hi Ron, Ron Babich wrote: Hi Everyone, I'm having a very basic problem getting an MPI job to run on multiple nodes. My setup consists of two identically configured nodes, called node01 and node02, connected via ethernet and infiniband. They are running CentOS 5.2 and the bundled OMPI, ver

[OMPI users] mpirun hangs when launching job on remote node

2009-03-17 Thread Ron Babich
Hi Everyone, I'm having a very basic problem getting an MPI job to run on multiple nodes. My setup consists of two identically configured nodes, called node01 and node02, connected via ethernet and infiniband. They are running CentOS 5.2 and the bundled OMPI, version 1.2.5. I've attached the

Re: [OMPI users] mpirun hangs

2009-01-06 Thread Maciej Kazulak
2009/1/6 Ralph Castain > > On Jan 5, 2009, at 5:19 PM, Jeff Squyres wrote: > > On Jan 5, 2009, at 5:01 PM, Maciej Kazulak wrote: >> >> Interesting though. I thought in such a simple scenario shared memory >>> would be used for IPC (or whatever's fastest) . But nope. Even with one >>> process st

Re: [OMPI users] mpirun hangs

2009-01-06 Thread Ralph Castain
On Jan 5, 2009, at 5:19 PM, Jeff Squyres wrote: On Jan 5, 2009, at 5:01 PM, Maciej Kazulak wrote: Interesting though. I thought in such a simple scenario shared memory would be used for IPC (or whatever's fastest) . But nope. Even with one process still it wants to use TCP/IP to communicat

Re: [OMPI users] mpirun hangs

2009-01-05 Thread Jeff Squyres
On Jan 5, 2009, at 5:01 PM, Maciej Kazulak wrote: Interesting though. I thought in such a simple scenario shared memory would be used for IPC (or whatever's fastest) . But nope. Even with one process still it wants to use TCP/IP to communicate between mpirun and orted. Correct -- we only

Re: [OMPI users] mpirun hangs

2009-01-05 Thread Maciej Kazulak
2009/1/3 Maciej Kazulak > Hi, > > I have a weird problem. After a fresh install mpirun refuses to work: > > box% ./hello > Process 0 on box out of 1 > box% mpirun -np 1 ./hello > # hangs here, no output, nothing at all; on another terminal: > box% ps axl | egrep 'mpirun|orted' > 0 1000 24162 76

[OMPI users] mpirun hangs

2009-01-03 Thread Maciej Kazulak
Hi, I have a weird problem. After a fresh install mpirun refuses to work: box% ./hello Process 0 on box out of 1 box% mpirun -np 1 ./hello # hangs here, no output, nothing at all; on another terminal: box% ps axl | egrep 'mpirun|orted' 0 1000 24162 7687 20 0 86704 2744 - Sl+ pts/2

Re: [OMPI users] mpirun hangs

2007-08-16 Thread Jeff Squyres
On Aug 16, 2007, at 5:34 AM, jody wrote: Just a quick update about my ssh/LD_LIBRARY_PATH problem. Apparently on my System the sshd was configured not to permit user defined environment variables (security reasons?). To fix that i had to change the file /etc/ssh/sshd_config By changing the en

Re: [OMPI users] mpirun hangs

2007-08-16 Thread jody
Hi Tim Just a quick update about my ssh/LD_LIBRARY_PATH problem. Apparently on my System the sshd was configured not to permit user defined environment variables (security reasons?). To fix that i had to change the file /etc/ssh/sshd_config By changing the entry #PermitUserEnvironment no to

Re: [OMPI users] mpirun hangs

2007-08-14 Thread Tim Prins
Jody, jody wrote: Hi Tim, thanks for the suggestions. I now set both paths in .zshenv but it seems that LD_LIBRARY_PATH still does not get set. The ldd experiment shows that none of the openmpi libraries are found, and indeed the printenv shows that PATH is there but LD_LIBRARY_PATH is not. Are you

Re: [OMPI users] mpirun hangs

2007-08-14 Thread jody
Hi Tim, thanks for the suggestions. I now set both paths in .zshenv but it seems that LD_LIBRARY_PATH still does not get set. The ldd experiment shows that none of the openmpi libraries are found, and indeed the printenv shows that PATH is there but LD_LIBRARY_PATH is not. It is rather unclear why thi

Re: [OMPI users] mpirun hangs

2007-08-14 Thread Tim Prins
Hi Jody, jody wrote: Hi I installed openmpi 1.2.2 on a quad core intel machine running fedora 6 (hostname plankton) I set PATH and LD_LIBRARY in the .zshrc file: Note that .zshrc is only used for interactive logins. You need to setup your system so the LD_LIBRARY_PATH and PATH is also set for

[OMPI users] mpirun hangs

2007-08-14 Thread jody
Hi I installed openmpi 1.2.2 on a quad core intel machine running fedora 6 (hostname plankton) I set PATH and LD_LIBRARY in the .zshrc file: $ echo $PATH /opt/openmpi/bin:/usr/kerberos/bin:/usr/local/bin:/usr/bin:/bin:/usr/X11R6/bin:/home/jody/bin $ echo $LD_LIBRARY_PATH /opt/openmpi/lib: When i r

[OMPI users] mpirun hangs on remote nodes -- how to find where and why?

2007-07-16 Thread Bill Johnstone
Hello. I'm trying to use Open MPI 1.2.3 on a cluster of dual-processor AMD64 nodes. These nodes are all connected via gigabit ethernet on a private, self-contained IP network. The OS is GNU/Linux, gcc 4.1.2, kernel 2.6.21 . Open MPI was configured with --prefix=/usr/local and installed via make

[OMPI users] mpirun hangs??

2006-05-27 Thread imran shaik
Hi Jeff, Thanks, I have installed openMPI with the threads-enabled option as per the readme file of openmpi (alpha 7 of openmpi1.1). I have a problem with mpirun. 1) When there is no relevant executable present on the remote node where I want to launch the MPI process, mpirun just hangs

Re: [OMPI users] mpirun hangs

2006-02-24 Thread Brian Barrett
On Feb 24, 2006, at 8:23 AM, Emanuel Ziegler wrote: So, the question from the mpirun_debug.out-file is, what IP- addresses do node01 and node02 have, is the local 10.0.0.1 node01, while 10.1.0.1 is node02? Maybe the route on node01 is not correct to node02? Ok, I figured out the problem, bu

Re: [OMPI users] mpirun hangs

2006-02-24 Thread Emanuel Ziegler
> So, the question from the mpirun_debug.out-file is, what IP-addresses do > node01 and node02 have, is the local 10.0.0.1 node01, while 10.1.0.1 is > node02? > Maybe the route on node01 is not correct to node02? Ok, I figured out the problem, but didn't solve it completely. node01 and node02 b

Re: [OMPI users] mpirun hangs

2006-02-24 Thread Bogdan Costescu
On Fri, 24 Feb 2006, Emanuel Ziegler wrote: So "No route to host" means that the TCP packet could not be sent (usually host down, broken routing table, network interface down, ...). But it's 'ping'able and even rsh works fine. ... or some packet filtering is enabled. Check with 'iptables -L -

Re: [OMPI users] mpirun hangs

2006-02-24 Thread Rainer Keller
Hello Emanuel, can you actually log in using rsh without entering a password? I would rather use ssh-based login with public keys. This is definitely more secure, but in your first mail you said ssh wouldn't work either? So, the question from the mpirun_debug.out file is, what I

Re: [OMPI users] mpirun hangs

2006-02-24 Thread Emanuel Ziegler
> From /usr/include/asm/errno.h: > > #define EHOSTUNREACH 113 /* No route to host */ Ah, I thought it was an internal openMPI error number and 'grep'ed the source code without success. So "No route to host" means that the TCP packet could not be sent (usually host down, broken routi

Re: [OMPI users] mpirun hangs

2006-02-24 Thread Bogdan Costescu
On Thu, 23 Feb 2006, Emanuel Ziegler wrote: Unfortunately, I don't know what errno=113 means, but obviously it's a TCP problem. From /usr/include/asm/errno.h: #define EHOSTUNREACH 113 /* No route to host */ -- Bogdan Costescu IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliche
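The same lookup can be done without reading the header file. A minimal sketch, assuming Python 3 is installed; the helper name describe_errno is hypothetical. Note that errno numbers are platform-specific: the value 113 quoted above comes from a Linux header, and other systems may assign EHOSTUNREACH a different number.

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Map a numeric errno to its symbolic name and the OS's message text."""
    name = errno.errorcode.get(code, "UNKNOWN")  # e.g. "EHOSTUNREACH"
    return f"{code} = {name}: {os.strerror(code)}"


if __name__ == "__main__":
    # Look up the local value of EHOSTUNREACH rather than hard-coding 113.
    print(describe_errno(errno.EHOSTUNREACH))
```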

[OMPI users] mpirun hangs

2006-02-23 Thread Emanuel Ziegler
Hi! I finally installed OpenMPI 1.0.2-a7 with libibverbs-1.0-rc5 and libmthca-1.0-rc5 on Debian sarge with kernel 2.6.15 (from www.backports.org) in order to use InfiniBand. While InfiniBand seems to be working (ping with IPoIB works perfectly), the mpirun/orterun command causes trouble using rsh