[OMPI users] Simple question on GRID

2012-03-01 Thread Shaandar Nyamtulga
Hi, I have two Beowulf clusters (both Ubuntu 10.10; one runs OpenMPI, the other MPICH2). They run separately in their local network environments. I know there is a way to integrate them through the Internet, presumably with Grid software. Is there any tutorial on how to do this?

Re: [OMPI users] [EXTERNAL] Re: Question regarding osu-benchamarks 3.1.1

2012-03-01 Thread Venkateswara Rao Dokku
Hi, I tried executing those tests with other devices such as tcp instead of ib, with the same Open MPI 1.4.3. It went fine, but it took time to execute. When I tried to execute the same test on the customized OFED, the tests hang at the same message size. Can you please tell me what could

Re: [OMPI users] [EXTERNAL] Re: Question regarding osu-benchamarks 3.1.1

2012-03-01 Thread Jingcha Joba
Aah... So when openMPI is compile with OFED, and run on a Infiniband/RoCE devices, I would use the mpi would simply direct to ofed to do point to point calls in the ofed way? > > More specifically: all things being equal, you don't care which is used. > You just want your message to get to the re

Re: [OMPI users] Simple question on GRID

2012-03-01 Thread Alexander Beck-Ratzka
Hi Shaandar, this is not a simple question! If you want to bring your cluster into the Grid, you first have to decide which Grid, because the different Grids use different Grid software. Having taken this decision, I would recommend looking at the web page of this Grid community, usually yo

Re: [OMPI users] [EXTERNAL] Re: Question regarding osu-benchamarks 3.1.1

2012-03-01 Thread Jingcha Joba
Well, as Jeff says, looks like it's to do with the one-sided comm. But the reason why I said was because of what I experienced a couple of months ago: When I had a Myri-10G and an Intel gigabit ethernet card lying around, I wanted to test the kernel bypass using open-mx stack and I ran the osu benchm

Re: [OMPI users] Could not execute the executable "/home/MET/hrm/bin/hostlist": Exec format error

2012-03-01 Thread PukkiMonkey
What Jeff means is that because you didn't have echo "mpirun...>>outfile" but echo mpirun>>outfile, you were piping the output to the outfile instead of stdout. Sent from my iPhone On Feb 29, 2012, at 8:44 PM, Syed Ahsan Ali wrote: > Sorry Jeff I couldn't get you point. > > On Wed, Feb 2

Re: [OMPI users] Very slow MPI_GATHER

2012-03-01 Thread Pinero, Pedro_jose
Thank you for your fast response. I am launching 200 light processes on two computers with 8 cores each (Intel i7 processor). They are dedicated and are interconnected through a point-to-point Gigabit Ethernet link. I read about oversubscribing nodes in the open-mpi documentation, and f

Re: [OMPI users] compilation error with pgcc Unknown switch

2012-03-01 Thread Abhinav Sarje
Hi Nathan, I tried building on an internal login node, and it did not fail at the previous point. But, after compiling for a very long time, it failed while building libmpi.la, with a multiple definition error: -- ... CC mpiext/mpiext.lo CC mpi/f77/base/mpi_f77_base_libmpi_f77_

Re: [OMPI users] compilation error with pgcc Unknown switch

2012-03-01 Thread Ralph Castain
You need to update your source code - this was identified and fixed on Wed. Unfortunately, our trunk is a developer's environment. While we try hard to keep it fully functional, bugs do occasionally work their way into the code. On Mar 1, 2012, at 1:37 AM, Abhinav Sarje wrote: > Hi Nathan, > >

Re: [OMPI users] Very slow MPI_GATHER

2012-03-01 Thread Ralph Castain
Wow - with that heavy an oversubscription, your performance experience certainly is reasonable. Not much you can do about it except reduce the oversubscription, either by increasing the number of computers or reducing the number of processes. On Mar 1, 2012, at 1:33 AM, Pinero, Pedro_jose wrot
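
As a rough illustration of the load described in this thread (200 processes on two computers with 8 cores each), the oversubscription factor can be worked out directly. This is only a back-of-the-envelope sketch: the function name is made up for illustration, and it assumes every one of the 8 cores per computer counts as one usable slot.

```python
def oversubscription_factor(num_procs, num_nodes, cores_per_node):
    """Return the average number of MPI processes competing for each core."""
    total_cores = num_nodes * cores_per_node
    return num_procs / total_cores


# Figures taken from the thread: 200 processes, 2 computers, 8 cores each.
factor = oversubscription_factor(num_procs=200, num_nodes=2, cores_per_node=8)
print(f"{factor:.1f} processes per core")
```

A factor well above 1 means the processes have to time-share the cores, so a collective such as MPI_GATHER can only make progress as fast as the operating system schedules each participant.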

Re: [OMPI users] Could not execute the executable "/home/MET/hrm/bin/hostlist": Exec format error

2012-03-01 Thread Syed Ahsan Ali
I am able to run the application with LSF now; it is strange, because I wasn't able to trace any error. On Thu, Mar 1, 2012 at 11:34 AM, PukkiMonkey wrote: > What Jeff means is that because u didn't have echo "mpirun...>>outfile" > but > echo mpirun>>outfile , > you were piping the output to the

Re: [OMPI users] Very slow MPI_GATHER

2012-03-01 Thread Jeffrey Squyres
On Mar 1, 2012, at 3:33 AM, Pinero, Pedro_jose wrote: > I am launching 200 light processes in two computers with 8 cores each one > (Intel i7 processor). They are dedicated and are interconnected through a > point-to-point Gigabit Ethernet link. > > I read about oversubscribing nodes in the op

Re: [OMPI users] [EXTERNAL] Re: Question regarding osu-benchamarks 3.1.1

2012-03-01 Thread Jeffrey Squyres
I would just ignore these tests: 1. The use of MPI one-sided functionality is extremely rare out in the real world. 2. Brian said there were probably bugs in Open MPI's implementation of the MPI one-sided functionality itself, and he's in the middle of re-writing the one-sided functionality any

Re: [OMPI users] [EXTERNAL] Re: Question regarding osu-benchamarks 3.1.1

2012-03-01 Thread Jeffrey Squyres
On Mar 1, 2012, at 1:17 AM, Jingcha Joba wrote: > Aah... > So when openMPI is compile with OFED, and run on a Infiniband/RoCE devices, I > would use the mpi would simply direct to ofed to do point to point calls in > the ofed way? I'm not quite sure how to parse that. :-) The openib BTL uses

Re: [OMPI users] Simple question on GRID

2012-03-01 Thread Mohamed Adel
You can use CyberIntegrator (http://isda.ncsa.uiuc.edu/cyberintegrator/) developed by NCSA, or UNICORE (http://www.unicore.eu/) developed by Julich to integrate resources. best, madel From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Shaandar Nyamtulga Sent: Thu

Re: [OMPI users] compilation error with pgcc Unknown switch

2012-03-01 Thread Abhinav Sarje
Thanks Ralph. That did help, but only till the next hurdle. Now the build fails at the following point with an 'undefined reference': --- Making all in tools/ompi_info make[2]: Entering directory `/global/u1/a/asarje/hopper/openmpi-dev-trunk/build/ompi/tools/ompi_info' CC ompi_info.o

Re: [OMPI users] compilation error with pgcc Unknown switch

2012-03-01 Thread Jeffrey Squyres
Did you do a full autogen / configure / make clean / make all ? On Mar 1, 2012, at 8:53 AM, Abhinav Sarje wrote: > Thanks Ralph. That did help, but only till the next hurdle. Now the > build fails at the following point with an 'undefined reference': > --- > Making all in tools/ompi_info

[OMPI users] Redefine proc in cartesian topologies

2012-03-01 Thread Claudio Pastorino
Dear all, I apologize in advance if this is not the right list to post this. I am a newcomer, so please let me know if I should be sending this to another list. I use MPI to write parallel HPC programs. In particular, I wrote a parallel code for molecular dynamics simulations. The program s

Re: [OMPI users] Redefine proc in cartesian topologies

2012-03-01 Thread Ralph Castain
Is it really the rank that matters, or where the rank is located? For example, you could leave the ranks as assigned by the cartesian topology, but then map them so that ranks 0 and 2 share a node, 1 and 3 share a node, etc. Is that what you are trying to achieve? On Mar 1, 2012, at 11:57 AM,

Re: [OMPI users] Redefine proc in cartesian topologies

2012-03-01 Thread Jingcha Joba
mpirun -np 4 --host node1,node2,node1,node2 ./app Is this what you want? On Thu, Mar 1, 2012 at 10:57 AM, Claudio Pastorino < claudio.pastor...@gmail.com> wrote: > Dear all, > I apologize in advance if this is not the right list to post this. I > am a newcomer and please let me know if I should

Re: [OMPI users] Redefine proc in cartesian topologies

2012-03-01 Thread Claudio Pastorino
Hi, thanks for the answer. You are right: it is not the rank that matters, but how I arrange the physical procs in the cartesian topology. I don't care about the label. So, how do I achieve that? Regards, Claudio 2012/3/1, Ralph Castain : > Is it really the rank that matters, or where the rank i

Re: [OMPI users] Redefine proc in cartesian topologies

2012-03-01 Thread Claudio Pastorino
Probably yes. Is there a more systematic way? Thanks Claudio 2012/3/1, Jingcha Joba : > mpirun -np 4 --host node1,node2,node1,node2 ./app > > Is this what you want? > > On Thu, Mar 1, 2012 at 10:57 AM, Claudio Pastorino < > claudio.pastor...@gmail.com> wrote: > >> Dear all, >> I apologize in adv

Re: [OMPI users] Redefine proc in cartesian topologies

2012-03-01 Thread Gustavo Correa
Hi Claudio Check 'man mpirun'. You will find examples of the '-byslot', '-bynode', '-loadbalance', and rankfile options, which allow some control of how ranks are mapped into processors/cores. I hope this helps, Gus Correa On Mar 1, 2012, at 2:34 PM, Claudio Pastorino wrote: > Hi, thanks for

Re: [OMPI users] Redefine proc in cartesian topologies

2012-03-01 Thread Ralph Castain
Also the sequential mapper may be of help - allows you to specify the node each rank is to be placed on, one line/rank. On Mar 1, 2012, at 12:40 PM, Gustavo Correa wrote: > Hi Claudio > > Check 'man mpirun'. > You will find examples of the > '-byslot', '-bynode', '-loadbalance', and rankfile
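
The "one line per rank" idea mentioned in this thread can be sketched with a small generator for such a host list. This is a minimal sketch, not an Open MPI tool: the function name is hypothetical, and the node names node1 and node2 are simply the ones used in the mpirun example earlier in the thread. It produces the layout Ralph describes, with ranks 0 and 2 on one node and ranks 1 and 3 on the other.

```python
from itertools import cycle, islice


def seq_hostfile_lines(num_ranks, nodes):
    """Return one host name per rank, cycling through the nodes in order."""
    return list(islice(cycle(nodes), num_ranks))


# Rank i is placed on the node named on line i of the resulting file.
lines = seq_hostfile_lines(4, ["node1", "node2"])
print("\n".join(lines))
```

The resulting lines would be saved to a file and handed to mpirun as a hostfile together with the option that selects the sequential mapper. The exact option spelling depends on the Open MPI version, so it is best taken from 'man mpirun' rather than from this sketch.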

Re: [OMPI users] orted daemon not found! --- environment not passed to slave nodes

2012-03-01 Thread Yiguang Yan
Hi Jeff, Here I made a developer build, and then got the following message with plm_base_verbose: >>> [gulftown:28340] mca: base: components_open: Looking for plm components [gulftown:28340] mca: base: components_open: opening plm components [gulftown:28340] mca: base: components_open: found l

Re: [OMPI users] orted daemon not found! --- environment not passed to slave nodes

2012-03-01 Thread Ralph Castain
What did this command line look like? Can you provide the configure line as well? On Mar 1, 2012, at 12:46 PM, Yiguang Yan wrote: > Hi Jeff, > > Here I made a developer build, and then got the following message > with plm_base_verbose: > > [gulftown:28340] mca: base: components_open: Lo

Re: [OMPI users] orted daemon not found! --- environment not passed to slave nodes

2012-03-01 Thread Yiguang Yan
Hi Ralph, Thanks, here is what I did as suggested by Jeff: > What did this command line look like? Can you provide the configure line as > well? As in my previous post, the script is as follows: (1) debug messages: >>> yiguang@gulftown testdmp]$ ./test.bash [gulftown:28340] mca: base: componen

Re: [OMPI users] orted daemon not found! --- environment not passed to slave nodes

2012-03-01 Thread Jeffrey Squyres
I see the problem. It looks like the use of the app context file is triggering different behavior, and that behavior is erasing the use of --prefix. If I replace the app context file with a complete command line, it works and the --prefix behavior is observed. Specifically: $mpirunfile $mcap

Re: [OMPI users] orted daemon not found! --- environment not passed to slave nodes

2012-03-01 Thread Jeffrey Squyres
Actually, I should say that I discovered that if you put --prefix on each line of the app context file, then the first case (running the app context file) works fine; it adheres to the --prefix behavior. Ralph: is this intended behavior? (I don't know if I have an opinion either way) On Mar

[OMPI users] run orterun with more than 200 processes

2012-03-01 Thread Jianzhang He
Hi, I am not sure if this is the right place to post this question. If you know where it is appropriate, please let me know. I need to run an application that launches 200 processes with the command: 1) orterun --prefix ./ -np 200 -wd ./ -host hostname1.domain.com,1,2,3,4,5,6,7,8,9,.,196,

Re: [OMPI users] orted daemon not found! --- environment not passed to slave nodes

2012-03-01 Thread Yiguang Yan
> Actually, I should say that I discovered that if you put --prefix on each > line of the app context file, then the first > case (running the app context file) works fine; it adheres to the --prefix > behavior. Yes, I confirmed this on our cluster. It works with --prefix on each line of the
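
The workaround confirmed in this thread (putting --prefix on each line of the app context file) is easy to apply mechanically. The helper below is a hedged sketch only: its name, the install path /opt/openmpi, and the sample app file contents are all invented for illustration, and it assumes that lines starting with # are comments.

```python
def add_prefix_to_appfile(text, prefix):
    """Prepend '--prefix <prefix>' to every command line of an app file."""
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        # Leave blank lines, comments, and already-prefixed lines untouched.
        if (not stripped or stripped.startswith("#")
                or "--prefix" in stripped.split()):
            out.append(line)
        else:
            out.append(f"--prefix {prefix} {stripped}")
    return "\n".join(out)


# Hypothetical two-line app context file.
appfile = "-np 2 ./a.out\n-np 2 ./b.out"
print(add_prefix_to_appfile(appfile, "/opt/openmpi"))
```

Running the helper twice on the same text leaves it unchanged, so it can be applied safely to a file that is already partly prefixed.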

Re: [OMPI users] orted daemon not found! --- environment not passed to slave nodes

2012-03-01 Thread Ralph Castain
I don't know - I didn't write the app file code, and I've never seen anything defining its behavior. So I guess you could say it is intended - or not! :-/ On Mar 1, 2012, at 2:53 PM, Jeffrey Squyres wrote: > Actually, I should say that I discovered that if you put --prefix on each > line of th

Re: [OMPI users] run orterun with more than 200 processes

2012-03-01 Thread Ralph Castain
You might try putting that list of hosts in a hostfile instead of on the cmd line - you may be hitting some limits there. I also don't believe that you can add an orted in that manner - orterun will have no idea how it got there and is likely to abort. On Mar 1, 2012, at 3:20 PM, Jianzhang He w
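
Moving a long -host list into a hostfile, as suggested in this thread, can be sketched as follows. This is an illustration under stated assumptions: the function name and the host names hostA and hostB are hypothetical, and repeated hosts are collapsed into the "slots=N" form of the Open MPI hostfile syntax. Collapsing the list can change which rank lands on which host, so it only suits cases where the per-host process count is what matters.

```python
from collections import Counter


def hostfile_from_host_list(host_list):
    """Collapse a comma-separated -host list into 'host slots=N' lines."""
    hosts = [h.strip() for h in host_list.split(",") if h.strip()]
    counts = Counter(hosts)
    # Counter keeps first-seen order (dicts are ordered in Python 3.7+).
    return [f"{host} slots={n}" for host, n in counts.items()]


# Hypothetical host list standing in for the long one on the command line.
example = "hostA,hostA,hostB,hostA,hostB"
for line in hostfile_from_host_list(example):
    print(line)
```

The printed lines would be saved to a file and passed to orterun or mpirun with the hostfile option in place of the long -host argument.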

Re: [OMPI users] compilation error with pgcc Unknown switch

2012-03-01 Thread Abhinav Sarje
yes, I did a full autogen, configure, make clean and make all On Thu, Mar 1, 2012 at 10:03 PM, Jeffrey Squyres wrote: > Did you do a full autogen / configure / make clean / make all ? > > > On Mar 1, 2012, at 8:53 AM, Abhinav Sarje wrote: > >> Thanks Ralph. That did help, but only till the next