Re: [OMPI users] Tmpdir work for first process only

2007-11-15 Thread Aurelien Bouteiller
Hi Clement, First, if you run 400 jobs on 16 nodes, you will end up with around 32 processes on each node. Depending on the memory footprint of the application, it will fail because of memory exhaustion. Usually I am able to oversubscribe up to 64 NAS class B processes on 2GB, and less tha
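A back-of-the-envelope sketch of the oversubscription arithmetic in this message, in Python. It assumes the processes are spread evenly across the nodes and that "2GB" means 2048 MB per node; the function names are made up for illustration and are not part of Open MPI or PBS.

```python
import math


def processes_per_node(total_processes, nodes):
    """Largest process count any one node gets under an even split (assumption)."""
    return math.ceil(total_processes / nodes)


def memory_per_process_mb(node_memory_mb, processes_on_node):
    """Average memory budget per process if the node's memory is shared equally."""
    return node_memory_mb / processes_on_node


# 400 processes spread evenly over 16 nodes.
per_node = processes_per_node(400, 16)

# 64 processes sharing a node with 2048 MB (assumed reading of "2GB").
budget_mb = memory_per_process_mb(2048, 64)

print(per_node, budget_mb)
```

Under those assumptions an even split puts 25 processes on each node, and 64 processes on a 2 GB node leaves 32 MB each on average. Real placement depends on how the scheduler maps processes to nodes.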

[OMPI users] MPI daemons suspend running job

2007-11-15 Thread Murat Knecht
Hi, I am encountering the problem that a working child process is frozen in the middle of its work, and continues only when its parent process (which spawned it earlier on) calls some MPI function. The issue here is that in order to accept client socket communication the parent process is, at th

Re: [OMPI users] Error on running large number of processes

2007-11-15 Thread Pak Lui
I am assuming all the processes are running on a single SMP? Not sure if you have tried it but you may want to set the mpool_sm_max_size to something other than the default 512MB, since you seem to be using shared memory? Jeff Squyres wrote: My guess is that this is similar to the last post: y
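As a rough illustration of the suggestion above, the sketch below builds an mpirun command line that sets mpool_sm_max_size through the --mca option. The executable name ./my_app, the process count, and the 1 GiB value are placeholders, and treating the value as a byte count is an assumption; running "ompi_info --param mpool sm" on the installation in question should confirm the parameter and its units.

```python
import shlex


def mpirun_command(num_procs, executable, sm_max_size_bytes):
    """Build an mpirun argument list that overrides mpool_sm_max_size.

    The helper itself is hypothetical; only mpirun, --mca and -np are
    real Open MPI command-line pieces.
    """
    return [
        "mpirun",
        "--mca", "mpool_sm_max_size", str(sm_max_size_bytes),
        "-np", str(num_procs),
        executable,
    ]


# Placeholder values: 200 processes, 1 GiB shared-memory pool (assumed bytes).
cmd = mpirun_command(200, "./my_app", 1024 * 1024 * 1024)
print(shlex.join(cmd))
```

Printing the command instead of executing it makes it easy to review or paste into a PBS job script.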

Re: [OMPI users] Suggestions on multi-compiler/multi-mpi build?

2007-11-15 Thread Brock Palen
Really, modules is the only way to go; we use it to maintain no fewer than 12 versions of openmpi compiled with pgi, gcc, nag, and intel. Yes, as the other reply said, just set LD_LIBRARY_PATH; you can use "ldd executable" to see which libraries the executable is going to use under the current en
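To make the ldd check concrete, here is a small sketch that parses ldd-style output and reports which file each shared library resolves to. The sample text and the /opt/openmpi-gcc path are invented for illustration; in practice the input would be the output of running ldd on the real executable with the intended module loaded.

```python
def parse_ldd(output):
    """Map library names to resolved paths from ldd-style output.

    Lines without "=>" (for example the dynamic loader itself) are skipped.
    """
    libs = {}
    for line in output.splitlines():
        if "=>" not in line:
            continue
        name, _, rest = line.partition("=>")
        # Drop the trailing load address, e.g. " (0x00002b1c00000000)".
        target = rest.strip().split(" (")[0].strip()
        libs[name.strip()] = target
    return libs


# Hypothetical sample, shaped like typical ldd output on Linux.
SAMPLE = (
    "\tlibmpi.so.0 => /opt/openmpi-gcc/lib/libmpi.so.0 (0x00002b1c00000000)\n"
    "\tlibc.so.6 => /lib64/libc.so.6 (0x00002b1c00400000)\n"
    "\t/lib64/ld-linux-x86-64.so.2 (0x00002b1c00800000)\n"
)

resolved = parse_ldd(SAMPLE)
mpi_libs = {name: path for name, path in resolved.items() if "mpi" in name}
print(mpi_libs)
```

If the path shown for libmpi does not belong to the build you meant to use, LD_LIBRARY_PATH is probably pointing at a different installation.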

Re: [OMPI users] Suggestions on multi-compiler/multi-mpi build?

2007-11-15 Thread Katherine Holcomb
We have almost exactly the situation you describe with our clusters. I'm not the system administrator but I am the one who actually writes the module scripts. It is best to compile OpenMPI (and any other MPI) with each compiler separately; this is especially necessary for the Fortran and C++

[OMPI users] Suggestions on multi-compiler/multi-mpi build?

2007-11-15 Thread Jim Kusznir
Hi all: I'm trying to set up a cluster for a group of users with very different needs. So far, it looks like I need gcc, pgi, and intel to work with openmpi and mpich, with each user able to control what combination they get. This is turning out to be much more difficult than I expected. Someon

Re: [OMPI users] OpenMPI - compilation

2007-11-15 Thread Tim Prins
I have seen situations where after installing Open MPI, the wrapper compilers did not create any executables, and seemed to do nothing. I was never able to figure out why the wrappers were broken, and reinstalling Open MPI always seemed to make it work. If I recall correctly, when this happen

Re: [OMPI users] Tmpdir work for first process only

2007-11-15 Thread Clement Kam Man Chu
Jeff Squyres wrote: Thanks for your reply. I am using the pbs job scheduler and I requested 16 cpus to run 400 processes, but I don't know how many processes are allocated on each cpu. Do you think it is a problem? Clement Are you running all of these processes on the same machine, or multiple dif

Re: [OMPI users] Error on running large number of processes

2007-11-15 Thread Jeff Squyres
My guess is that this is similar to the last post: you are oversubscribing the nodes so heavily that the OS is running out of some resources (perhaps regular or registered memory?) such that Open MPI is unable to set up its network transport layers properly. On Nov 15, 2007, at 6:35 AM, Cle

Re: [OMPI users] Tmpdir work for first process only

2007-11-15 Thread Jeff Squyres
Are you running all of these processes on the same machine, or multiple different machines? If you're running 400 processes on the same machine, it may well be that you are simply running out of memory or other OS resources. In particular, I've never seen iof fail that way before (iof is o

[OMPI users] Error on running large number of processes

2007-11-15 Thread Clement Kam Man Chu
Hi, I am using openmpi 1.2.3 on an ia64 machine with the pbs job scheduler. I can successfully run 100 processes on 16 cpus, but I get an error if I run 200 processes on the same number of cpus. The error is : PML add procs failed --> Returned "Temporarily out of resource" (-3) instead o

Re: [OMPI users] OpenMPI - compilation

2007-11-15 Thread Jeff Squyres
On Nov 14, 2007, at 10:48 PM, Sajjad wrote: No, I didn't find any executable after I issued the command "mpicc mpitest1.c -o mpitest1" If you're not finding the executable at all, then something else is very wrong. The "mpicc" command is just a "wrapper" compiler, meaning that it takes y

[OMPI users] OpenMPI - compilation

2007-11-15 Thread Sajjad
Hello Jeff, No, I didn't find any executable after I issued the command "mpicc mpitest1.c -o mpitest1" And sorry for dumping such an irrelevant chunk of data to the mailing list. Sajjad

Re: [OMPI users] OpenMPI - compilation

2007-11-15 Thread Jeff Squyres
You didn't answer my question as to whether the "mpitest1" executable was available on all nodes or not. :-) That is the real problem here. But unrelated to that, you are running an ancient version of Open MPI (v1.1). Is there any chance that you can upgrade to the latest stable release

[OMPI users] OpenMPI - compilation

2007-11-15 Thread Sajjad
Hello Jeff, I thought that the following information would be helpful to track down the issue. *** sajjad@sajjad:~$ ompi_info Open MPI: 1.1 Open MPI SVN revision: r10477 Open RTE: 1.1 Open RTE SVN revision: r10477

Re: [OMPI users] Tmpdir work for first process only

2007-11-15 Thread Clement Kam Man Chu
Hi, I have figured out why the tmpdir parameter works only for the first process. I hit another problem when I try to run 400 processes (no problem with fewer than 400 processes). I got an error "ORTE_ERROR_LOG: Out of resource in file base/iof_base_setup.c at line 106". I attached the message as belo