Re: [OMPI users] Can't start program across network -- solved!

2009-03-17 Thread Raymond Wan
Hi Prentice/Jeff, Prentice Bisbal wrote: In an earlier e-mail in this thread, I theorized that this might be a problem with your name service. This latest information seems to support that theory. Thank you very much for the suggestions and help! After discussing with our system administra

[OMPI users] open mpi on non standard ssh port

2009-03-17 Thread Bernhard Knapp
Hi, I want to start a GROMACS simulation on a small cluster where non-standard ports are used for ssh. If I just use a "normal" machinelist file (with the IPs of the nodes), the following error comes up: ssh: connect to host 192.168.0.103 port 22: Connection refused I guess tha

[OMPI users] open-mpi error: unable to create listen socket

2009-03-17 Thread -andria-
Dear all, I am still learning how to create a parallel program with Open MPI. I tried to run an mpihello program on my cluster, but it gives an error when it is executed as an ordinary (public) user. However, it gives the correct result when it is run by the root user. Why does this happen? How can it be solved?

Re: [OMPI users] open mpi on non standard ssh port

2009-03-17 Thread Gilbert Grosdidier
Hi Bernhard, You may want to use the .ssh/config file, where you will be able to specify, on a machine-by-machine basis, the port you want to use through the 'Port' directive. Have a look at the 'man ssh_config' page. Hope this helps, Gilbert. On Tue, 17 Mar 2009, Bernhard Knapp wrote: > Hi

Re: [OMPI users] open-mpi error: unable to create listen socket

2009-03-17 Thread Ralph Castain
Hi Andria The problem is a permissions one - your system has been set up so that only root has permission to open a TCP socket. I don't know what system you are running - you might want to talk to your system admin or someone knowledgeable on that operating system to ask them how to revise

Re: [OMPI users] open mpi on non standard ssh port

2009-03-17 Thread Jeff Squyres
We don't have an easy way to specify using different ports for each host (this is a fairly uncommon configuration), but you can set it up in your $HOME/.ssh/config file, perhaps something like this: Host 192.168.0.101 Port 5101 Host 192.168.0.102 Port 5102 ...and so on. Then "ssh 19
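The per-host port mapping suggested above can be generated instead of typed by hand. The following is a minimal sketch only: the helper name `ssh_config_stanzas` and the `HOST_PORTS` list are hypothetical, the two host/port pairs are the ones quoted in the message, and the output still has to be placed in `$HOME/.ssh/config` manually.

```python
# Sketch: build ssh_config "Host"/"Port" stanzas for nodes that listen on
# non-standard ssh ports. Names below are illustrative, not part of Open MPI.

def ssh_config_stanzas(host_ports):
    """Return ssh_config text with one Host/Port stanza per (host, port) pair."""
    stanzas = []
    for host, port in host_ports:
        # TCP ports are limited to 1..65535; reject anything else early.
        if not 1 <= port <= 65535:
            raise ValueError(f"invalid port for {host}: {port}")
        stanzas.append(f"Host {host}\n    Port {port}\n")
    # Separate stanzas with a blank line for readability.
    return "\n".join(stanzas)


# Hypothetical mapping, taken from the example addresses in the message.
HOST_PORTS = [
    ("192.168.0.101", 5101),
    ("192.168.0.102", 5102),
]

if __name__ == "__main__":
    print(ssh_config_stanzas(HOST_PORTS), end="")
```

Once the stanzas are in the config file, a plain `ssh 192.168.0.101` should pick up the configured port without any extra options, which is what the launcher relies on.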

[OMPI users] WRF Slow Down

2009-03-17 Thread Philip Hayes
Hi, I am running WRF simulations on multiple nodes and am running into problems where the simulation will randomly slow down. The model still works, but slows down tremendously. I looked at each node and found that 1 node will only be using 25% of the CPU, while the others are using 100%. I

Re: [OMPI users] Run-time problem

2009-03-17 Thread Jeff Squyres
On Mar 16, 2009, at 7:23 PM, justin oppenheim wrote: I managed to run it just recently... It turns out that some libraries libib* were missing, as well as some others. I learned this by trying to install an old version of openmpi that was in the repository of my Suse Linux. The "software ma

Re: [OMPI users] WRF Slow Down

2009-03-17 Thread Elvedin Trnjanin
Have you switched versions of OMPI and this behavior surfaced with that? Which version are you running and which version(s) do you know work? What about system specs - multiple cores, processors? I have experience with versions 1.2.5 and 1.2.8 running WRF with 4x DDR Infiniband working without

Re: [OMPI users] compile crash with pathscale and openmpi-1.3

2009-03-17 Thread Ethan Mallove
On Mon, Jan/26/2009 12:16:47PM, Jeff Squyres wrote: > Yowza! Bummer. Please let us know what Pathscale says. I encountered the same issue and here is Pathscale's response: "C++ OpenMP is not fully supported in the GCC3-based front-end, that your compilation is using. This old front-end is

[OMPI users] mpirun hangs when launching job on remote node

2009-03-17 Thread Ron Babich
Hi Everyone, I'm having a very basic problem getting an MPI job to run on multiple nodes. My setup consists of two identically configured nodes, called node01 and node02, connected via ethernet and infiniband. They are running CentOS 5.2 and the bundled OMPI, version 1.2.5. I've attached the

Re: [OMPI users] mpirun hangs when launching job on remote node

2009-03-17 Thread Raymond Wan
Hi Ron, Ron Babich wrote: Hi Everyone, I'm having a very basic problem getting an MPI job to run on multiple nodes. My setup consists of two identically configured nodes, called node01 and node02, connected via ethernet and infiniband. They are running CentOS 5.2 and the bundled OMPI, ver

Re: [OMPI users] PGI 8.0-4 doesn't like ompi/mca/op/op.h

2009-03-17 Thread Jeff Squyres
We tracked this down further -- it appears that the culprit was an out-of-date Autoconf installation. Specifically, somewhere between Autoconf 2.61 and 2.63, they changed the order of looking for the various "restrict" keywords. AC 2.63 looks at "__restrict" *first* (i.e., before "restrict

Re: [OMPI users] mpirun hangs when launching job on remote node

2009-03-17 Thread Ron Babich
Hi Ray, Thanks for your response. I had noticed your thread, which is why I'm embarrassed (but happy) to say that it looks like my problem was the same as yours. I mentioned in my original email that there was no firewall running, which it turns out was a lie. I think that when I checked b

Re: [OMPI users] open-mpi error: unable to create listen socket

2009-03-17 Thread -andria-
Thank you Ralph, I found the problem. It was because I had wrongly configured the second node's SELinux property (which was set to enforcing). After it was disabled, the parallel-hello works fine. Regards, -andria On Tue, Mar 17, 2009 at 8:08 PM, Ralph Castain wrote: > Hi Andria > > The problem is

[OMPI users] openmpi 1.3 and gridengine tight integration problem

2009-03-17 Thread Salmon, Rene
Hi, I have looked through the list archives and Google but could not find anything related to what I am seeing. I am simply trying to run the basic cpi.c code using SGE and tight integration. If run outside SGE, I can run my jobs just fine: hpcp7781(salmr0)132:mpiexec -np 2 --machinefile x a.ou