Re: [OMPI users] Could not execute the executable "/home/MET/hrm/bin/hostlist": Exec format error

2012-02-29 Thread Syed Ahsan Ali
Sorry Jeff, I couldn't get your point. On Wed, Feb 29, 2012 at 4:27 PM, Jeffrey Squyres wrote: > On Feb 29, 2012, at 2:17 AM, Syed Ahsan Ali wrote: > > > [pmdtest@pmd02 d00_dayfiles]$ echo ${MPIRUN} -np ${NPROC} -hostfile > $i{ABSDIR}/hostlist -mca btl sm,openib,self --mca btl_openib_use_srq 1 > .

Re: [OMPI users] Hybrid OpenMPI / OpenMP programming

2012-02-29 Thread Ralph Castain
It sounds like you are running into an issue with the Linux scheduler. I have an item to add an API "bind-this-thread-to-", but that won't be available until sometime in the future. A couple of things you could try in the meantime. First, use the --cpus-per-rank option to separate the ranks from
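A hedged sketch of what that suggestion might look like on the mpirun command line; the application name, rank count, and thread count are placeholders, and the exact option spelling may differ between Open MPI releases:

```shell
# Give each rank its own block of 4 cores so its OpenMP threads don't
# fight over one core (names and counts are illustrative, not from the thread)
export OMP_NUM_THREADS=4
mpirun -np 2 --cpus-per-rank 4 --bind-to-core ./hybrid_app
```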

Re: [OMPI users] InfiniBand path migration not working

2012-02-29 Thread Jeremy
Hi Pasha, >On Wed, Feb 29, 2012 at 11:02 AM, Shamis, Pavel wrote: > > I would like to see all the file. > 28MB is it the size after compression ? > > I think gmail supports up to 25Mb. > You may try to create gzip file and then slice it using "split" command. See attached. At about line 151311 i
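Pavel's gzip-then-split suggestion can be sketched as below; the log file name is a stand-in for the real 28 MB file, and the chunk size would be picked to fit the 25 MB mail limit:

```shell
# Work in a scratch directory; the log name is a stand-in for the real file
cd "$(mktemp -d)"
seq 1 200000 > btl_debug.log
gzip -c btl_debug.log > btl_debug.log.gz   # compress first
split -b 100k btl_debug.log.gz part-       # then slice into mail-sized chunks
# Receiving side: concatenate the pieces in order, then decompress
cat part-* > reassembled.gz
gunzip -c reassembled.gz > reassembled.log
cmp btl_debug.log reassembled.log && echo "round-trip OK"
```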

Re: [OMPI users] ssh between nodes

2012-02-29 Thread Martin Siegert
Hi, On Wed, Feb 29, 2012 at 09:09:27PM +, Denver Smith wrote: > >Hello, >On my cluster running moab and torque, I cannot ssh without a password >between compute nodes. I can however request multiple node jobs fine. I >was wondering if passwordless ssh keys need to be set up be

Re: [OMPI users] ssh between nodes

2012-02-29 Thread Lloyd Brown
It really depends. You certainly CAN have mpirun/mpiexec use ssh to launch the remote processes. If you're using Torque, though, I strongly recommend using the hooks in OpenMPI, into the Torque TM-API (see http://www.open-mpi.org/faq/?category=building#build-rte-tm). That will use the pbs_mom's
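The TM hooks Lloyd points to are enabled when Open MPI is configured against Torque; a minimal build sketch, assuming Torque is installed under /opt/torque (both paths are assumptions, not from the thread):

```shell
# Build Open MPI with Torque TM support (paths are assumptions)
./configure --prefix=/opt/openmpi --with-tm=/opt/torque
make -j4 all install
# Inside a Torque job script, mpirun then launches ranks through the
# pbs_mom daemons, so no inter-node ssh is required:
mpirun ./my_mpi_app
```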

Re: [OMPI users] ssh between nodes

2012-02-29 Thread Randall Svancara
Depends on which launcher you are using. My understanding is that you can use torque to launch the MPI processes on remote nodes, but you must compile this support into OpenMPI. Please, someone correct me if I am wrong. For most clusters I work with and manage, we use passwordless keys. The rea

[OMPI users] ssh between nodes

2012-02-29 Thread Denver Smith
Hello, On my cluster running moab and torque, I cannot ssh without a password between compute nodes. I can however request multiple node jobs fine. I was wondering if passwordless ssh keys need to be set up between compute nodes in order for mpi applications to run correctly. Thanks

Re: [OMPI users] [EXTERNAL] Re: Question regarding osu-benchamarks 3.1.1

2012-02-29 Thread Jeffrey Squyres
On Feb 29, 2012, at 2:57 PM, Jingcha Joba wrote: > So if I understand correctly, if a message size is smaller than it will use > the MPI way (non-RDMA, 2 way communication), if its larger, then it would use > the Open Fabrics, by using the ibverbs (and ofed stack) instead of using the > MPI's s

Re: [OMPI users] [EXTERNAL] Re: Question regarding osu-benchamarks 3.1.1

2012-02-29 Thread Jingcha Joba
So if I understand correctly, if a message size is smaller than it will use the MPI way (non-RDMA, 2 way communication), if it's larger, then it would use the Open Fabrics, by using the ibverbs (and ofed stack) instead of using the MPI's stack? If so, could that be the reason why the MPI_Put "hangs
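For context, the crossover size being discussed is exposed as an MCA parameter of the openib BTL; a sketch for inspecting it and experimentally overriding it (the value shown is illustrative only, not a recommendation):

```shell
# Show the current eager/rendezvous crossover for the openib BTL
ompi_info --param btl openib | grep eager_limit
# Re-run a benchmark with a larger eager limit to see if behaviour changes
mpirun --mca btl_openib_eager_limit 65536 -np 2 ./osu_bw
```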

Re: [OMPI users] [EXTERNAL] Re: Question regarding osu-benchamarks 3.1.1

2012-02-29 Thread Jeffrey Squyres
On Feb 29, 2012, at 2:30 PM, Jingcha Joba wrote: > Squyres, > I thought RDMA read and write are implemented as one side communication using > get and put respectively.. > Is it not so? Yes and no. Keep in mind the difference between two things here: - An an underlying transport's one-sided ca

Re: [OMPI users] orted daemon no found! --- environment not passed to slave nodes

2012-02-29 Thread Jeffrey Squyres
Gah. I didn't realize that my 1.4.x build was a *developer* build. *Developer* builds give a *lot* more detail with plm_base_verbose=100 (including the specific rsh command being used). You obviously didn't get that output because you don't have a developer build. :-\ Just for reference, he

Re: [OMPI users] [EXTERNAL] Re: Question regarding osu-benchamarks 3.1.1

2012-02-29 Thread Jingcha Joba
Squyres, I thought RDMA read and write are implemented as one side communication using get and put respectively.. Is it not so? On Wed, Feb 29, 2012 at 10:49 AM, Jeffrey Squyres wrote: > FWIW, if Brian says that our one-sided stuff is a bit buggy, I believe him > (because he wrote it). :-) > > T

Re: [OMPI users] [EXTERNAL] Re: Question regarding osu-benchamarks 3.1.1

2012-02-29 Thread Jeffrey Squyres
FWIW, if Brian says that our one-sided stuff is a bit buggy, I believe him (because he wrote it). :-) The fact is that the MPI-2 one-sided stuff is extremely complicated and somewhat open to interpretation. In practice, I haven't seen the MPI-2 one-sided stuff used much in the wild. The MPI-

Re: [OMPI users] Very slow MPI_GATHER

2012-02-29 Thread Jingcha Joba
two things: 1. Too many mpi processes on one node leading to processes pre-empting each other 2. Contention in your network. On Wed, Feb 29, 2012 at 8:01 AM, Pinero, Pedro_jose < pedro_jose.pin...@atmel.com> wrote: > Hi, > > ** ** > > I am using OMPI v.1.5.5 to communicate 200 Processes in a

Re: [OMPI users] [EXTERNAL] Re: Question regarding osu-benchamarks 3.1.1

2012-02-29 Thread Jingcha Joba
When I ran my osu tests , I was able to get the numbers out of all the tests except latency_mt (which was obvious, as I didn't compile open-mpi with multi threaded support). A good way to know if the problem is with openmpi or with your custom OFED stack would be to use some other device like tcp in

[OMPI users] Newbi question about MPI_wait vs MPI_wait any

2012-02-29 Thread Eric Chamberland
Hi, I would like to know which of "waitone" vs "waitany" is optimal and of course, will never produce deadlocks. Let's say we have "lNp" processes and they want to send an array of int of length "lNbInt" to process "0" in a non-blocking MPI_Isend (instead of MPI_Gather). Let's say the order

Re: [OMPI users] archlinux segmentation fault error

2012-02-29 Thread Jeffrey Squyres
On Feb 29, 2012, at 9:39 AM, Stefano Dal Pont wrote: > I'm a newbie with openMPI so the problem is probably me :) > I'm using a Fortran 90 code developed under Ubuntu 10.04. I've recently > installed the same code on my Archlinux machine but I have some issues > concerning openMPI. > A simple

Re: [OMPI users] [EXTERNAL] Re: Question regarding osu-benchamarks 3.1.1

2012-02-29 Thread Barrett, Brian W
I'm pretty sure that they are correct. Our one-sided implementation is buggier than I'd like (indeed, I'm in the process of rewriting most of it as part of Open MPI's support for MPI-3's revised RDMA), so it's likely that the bugs are in Open MPI's onesided support. Can you try a more recent rele

Re: [OMPI users] Very slow MPI_GATHER

2012-02-29 Thread Jeffrey Squyres
On Feb 29, 2012, at 11:01 AM, Pinero, Pedro_jose wrote: > I am using OMPI v.1.5.5 to communicate 200 Processes in a 2-Computers cluster > connected through Ethernet, obtaining a very poor performance. Let me make sure I'm parsing this statement properly: are you launching 200 MPI processes on

Re: [OMPI users] Question regarding osu-benchamarks 3.1.1

2012-02-29 Thread Jeffrey Squyres
FWIW, I'm immediately suspicious of *any* MPI application that uses the MPI one-sided operations (i.e., MPI_PUT and MPI_GET). It looks like these two OSU benchmarks are using those operations. Is it known that these two benchmarks are correct? On Feb 29, 2012, at 11:33 AM, Venkateswara Rao D

Re: [OMPI users] mpirun fails with no allocated resources

2012-02-29 Thread Muhammad Wahaj Sethi
Thanks a lot. - Original Message - From: "Ralph Castain" To: "Open MPI Users" Sent: Wednesday, February 29, 2012 5:56:23 PM Subject: Re: [OMPI users] mpirun fails with no allocated resources Fixed with r26071 On Feb 29, 2012, at 4:55 AM, Jeffrey Squyres wrote: > Just to put this up fro

Re: [OMPI users] mpirun fails with no allocated resources

2012-02-29 Thread Ralph Castain
Fixed with r26071 On Feb 29, 2012, at 4:55 AM, Jeffrey Squyres wrote: > Just to put this up front: using the trunk is subject to have these kinds of > problems. It is the head of development, after all -- things sometimes > break. :-) > > Ralph: FWIW, I can replicate this problem on my Mac (O

Re: [OMPI users] Question regarding osu-benchamarks 3.1.1

2012-02-29 Thread Venkateswara Rao Dokku
Sorry, I forgot to introduce the system. Ours is the customized OFED stack implemented to work on the specific hardware. We tested the stack with q-perf and the Intel Benchmarks (IMB-3.2.2); they went fine. We want to execute the osu_benchmark 3.1.1 suite on our OFED. On Wed, Feb 29, 2012 at 9

[OMPI users] Question regarding osu-benchamarks 3.1.1

2012-02-29 Thread Venkateswara Rao Dokku
Hi, I tried executing the osu_benchmarks-3.1.1 suite with openmpi-1.4.3. I could run 10 bench-mark tests (except osu_put_bibw, osu_put_bw, osu_ get_bw, osu_latency_mt) out of 14 tests in the bench-mark suite, and the remaining tests are hanging at some message size. The output is shown below

Re: [OMPI users] InfiniBand path migration not working

2012-02-29 Thread Shamis, Pavel
> >> On Tue, Feb 28, 2012 at 11:34 AM, Shamis, Pavel wrote: >> I reviewed the code and it seems to be ok :) The error should be reported if >> the port migration is already happened once (port 1 to port 2), and now you >> are trying to shutdown port 2 and MPI reports that it can't migrate anymo

[OMPI users] Very slow MPI_GATHER

2012-02-29 Thread Pinero, Pedro_jose
Hi, I am using OMPI v.1.5.5 to communicate 200 Processes in a 2-Computers cluster connected through Ethernet, obtaining a very poor performance. I have measured each operation time and I have realised that the MPI_Gather operation takes about 1 second in each synchronization (only an integer is

Re: [OMPI users] orted daemon no found! --- environment not passed to slave nodes

2012-02-29 Thread Yiguang Yan
Hi Jeff, Thanks. I tried as what you suggested. Here are the output: >>> yiguang@gulftown testdmp]$ ./test.bash [gulftown:25052] mca: base: components_open: Looking for plm components [gulftown:25052] mca: base: components_open: opening plm components [gulftown:25052] mca: base: components_ope

[OMPI users] archlinux segmentation fault error

2012-02-29 Thread Stefano Dal Pont
Hi, I'm a newbie with openMPI so the problem is probably me :) I'm using a Fortran 90 code developed under Ubuntu 10.04. I've recently installed the same code on my Archlinux machine but I have some issues concerning openMPI. A simple example-code works fine on both machines while the "big" code g

Re: [OMPI users] mpirun fails with no allocated resources

2012-02-29 Thread Jeffrey Squyres
Just to put this up front: using the trunk is subject to have these kinds of problems. It is the head of development, after all -- things sometimes break. :-) Ralph: FWIW, I can replicate this problem on my Mac (OS X Lion) with the SVN trunk HEAD (svnversion tells me I have 26070M): - [6:

Re: [OMPI users] IMB-OpenMPI on Centos 6

2012-02-29 Thread Jeffrey Squyres
I haven't followed OFED development for a long time, so I don't know if there is a buggy OFED in RHEL 5.4. If you're doing development with the internals of Open MPI (or if it'll be necessary to dive into the internals for debugging a custom device/driver), you might want to move this discussion t

Re: [OMPI users] Drastic OpenMPI performance reduction when message exeeds 128 KB

2012-02-29 Thread Jeffrey Squyres
On Feb 29, 2012, at 5:39 AM, adrian sabou wrote: > I am experiencing a rather unpleasant issue with a simple OpenMPI app. I have > 4 nodes communicating with a central node. Performance is good and the > application behaves as it should. (i.e. performance steadily decreases as I > increase the

Re: [OMPI users] Could not execute the executable "/home/MET/hrm/bin/hostlist": Exec format error

2012-02-29 Thread Jeffrey Squyres
On Feb 29, 2012, at 2:17 AM, Syed Ahsan Ali wrote: > [pmdtest@pmd02 d00_dayfiles]$ echo ${MPIRUN} -np ${NPROC} -hostfile > $i{ABSDIR}/hostlist -mca btl sm,openib,self --mca btl_openib_use_srq 1 ./hrm > >> ${OUTFILE}_hrm 2>&1 > [pmdtest@pmd02 d00_dayfiles]$ Because you used >> and 2>&1, the out
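The effect can be reproduced in isolation: the trailing `>>` and `2>&1` apply to `echo` itself, so the echoed command line goes into the output file instead of appearing on the terminal (the path below is a placeholder):

```shell
OUTFILE=/tmp/demo_out            # placeholder for the script's ${OUTFILE}
rm -f ${OUTFILE}_hrm
# Looks like it should print the command, but the redirection captures it:
echo mpirun -np 4 ./hrm >> ${OUTFILE}_hrm 2>&1
cat ${OUTFILE}_hrm               # the echoed line is in the file, not on screen
```

Quoting the whole command line (echo "... >> file 2>&1") would keep the redirection as text and print it instead.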

Re: [OMPI users] Could not execute the executable"/home/MET/hrm/bin/hostlist": Exec format error

2012-02-29 Thread Jeffrey Squyres
FWIW: Ralph committed a change to mpirun the other day that will now check if you're missing integer command line arguments. This will appear in Open MPI v1.7. It'll look something like this: % mpirun -np hostname --- Open MPI has detected

[OMPI users] Drastic OpenMPI performance reduction when message exeeds 128 KB

2012-02-29 Thread adrian sabou
Hi all, I am experiencing a rather unpleasant issue with a simple OpenMPI app. I have 4 nodes communicating with a central node. Performance is good and the application behaves as it should. (i.e. performance steadily decreases as I increase the work size). My problem is that immediately after

[OMPI users] Hybrid OpenMPI / OpenMP programming

2012-02-29 Thread Auclair Francis
Dear Open-MPI users, Our code is currently running Open-MPI (1.5.4) with SLURM on a NUMA machine (2 sockets per node and 4 cores per socket) with basically two levels of implementation for Open-MPI: - at lower level n "Master" MPI-processes (one per socket) are run simultaneously by dividing c

Re: [OMPI users] mpirun fails with no allocated resources

2012-02-29 Thread Muhammad Wahaj Sethi
Snapshot of my hosts file is present below. localhost is present here. 127.0.0.1 localhost 127.0.1.1 wahaj-ThinkPad-T510 10.42.43.1 node0 10.42.43.2 node1 Everything works fine if I don't specify host names. This problem is only specific to Open MPI version 1.7. Open MPI

Re: [OMPI users] Could not execute the executable "/home/MET/hrm/bin/hostlist": Exec format error

2012-02-29 Thread Jingcha Joba
Well, it should be echo "mpirun ...". I just noticed that you have $i{ABSDIR}. I think it should be ${ABSDIR}. On Tue, Feb 28, 2012 at 11:17 PM, Syed Ahsan Ali wrote: > I tried to echo but it returns nothing. > > [pmdtest@pmd02 d00_dayfiles]$ echo ${MPIRUN} -np ${NPROC} -hostfile > $i{ABSDI
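The distinction matters because the shell reads `$i{ABSDIR}` as the (unset) variable `i` followed by the literal text `{ABSDIR}`; a minimal demonstration, reusing the path from the thread as a made-up value:

```shell
unset i
ABSDIR=/home/MET/hrm             # made-up value for the demonstration
echo "$i{ABSDIR}/hostlist"       # expands to {ABSDIR}/hostlist -- wrong path
echo "${ABSDIR}/hostlist"        # expands to /home/MET/hrm/hostlist -- intended
```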

Re: [OMPI users] Could not execute the executable "/home/MET/hrm/bin/hostlist": Exec format error

2012-02-29 Thread Syed Ahsan Ali
I tried to echo but it returns nothing. [pmdtest@pmd02 d00_dayfiles]$ echo ${MPIRUN} -np ${NPROC} -hostfile $i{ABSDIR}/hostlist -mca btl sm,openib,self --mca btl_openib_use_srq 1 ./hrm >> ${OUTFILE}_hrm 2>&1 [pmdtest@pmd02 d00_dayfiles]$ On Wed, Feb 29, 2012 at 12:01 PM, Jingcha Joba wrote: > J

Re: [OMPI users] Could not execute the executable "/home/MET/hrm/bin/hostlist": Exec format error

2012-02-29 Thread Jingcha Joba
Just to be sure, can you try echo "${MPIRUN} -np ${NPROC} -hostfile ${ABSDIR}/hostlist -mca btl sm,openib,self --mca btl_openib_use_srq 1 ./hrm >> ${OUTFILE}_hrm 2>&1" and check if you are indeed getting the correct argument. If that looks fine, can you add --mca btl_openib_verbose 1 to the mpirun arg

Re: [OMPI users] Could not execute the executable "/home/MET/hrm/bin/hostlist": Exec format error

2012-02-29 Thread Syed Ahsan Ali
After creating a new hostlist and making the scripts again it is working now and picking up the hostlist, as you can see: ${MPIRUN} -np ${NPROC} -hostfile ${ABSDIR}/hostlist -mca btl sm,openib,self --mca btl_openib_use_srq 1 ./hrm >> ${OUTFILE}_hrm 2>&1 (The above command is used to submit the job) [p