Sorry Jeff, I couldn't get your point.
On Wed, Feb 29, 2012 at 4:27 PM, Jeffrey Squyres wrote:
> On Feb 29, 2012, at 2:17 AM, Syed Ahsan Ali wrote:
>
> > [pmdtest@pmd02 d00_dayfiles]$ echo ${MPIRUN} -np ${NPROC} -hostfile
> $i{ABSDIR}/hostlist -mca btl sm,openib,self --mca btl_openib_use_srq 1
> .
It sounds like you are running into an issue with the Linux scheduler. I have
an item to add an API "bind-this-thread-to-", but that won't be
available until sometime in the future.
A couple of things you could try in the meantime. First, use the --cpus-per-rank
option to separate the ranks from
Hi Pasha,
>On Wed, Feb 29, 2012 at 11:02 AM, Shamis, Pavel wrote:
>
> I would like to see the whole file.
> Is 28MB the size after compression?
>
> I think gmail supports up to 25 MB.
> You may try to create a gzip file and then slice it using the "split" command.
See attached. At about line 151311 i
Hi,
On Wed, Feb 29, 2012 at 09:09:27PM +, Denver Smith wrote:
>
>Hello,
>On my cluster running moab and torque, I cannot ssh without a password
>between compute nodes. I can, however, request multiple node jobs fine. I
>was wondering if passwordless ssh keys need to be set up be
It really depends. You certainly CAN have mpirun/mpiexec use ssh to
launch the remote processes. If you're using Torque, though, I strongly
recommend using the hooks in OpenMPI into the Torque TM API (see
http://www.open-mpi.org/faq/?category=building#build-rte-tm). That will
use the pbs_mom's
Depends on which launcher you are using. My understanding is that you can
use torque to launch the MPI processes on remote nodes, but you must
compile this support into OpenMPI. Please, someone correct me if I am
wrong.
For most clusters I work with and manage, we use passwordless keys. The
rea
Hello,
On my cluster running moab and torque, I cannot ssh without a password between
compute nodes. I can, however, request multiple node jobs fine. I was wondering
if passwordless ssh keys need to be set up between compute nodes in order for
MPI applications to run correctly.
Thanks
On Feb 29, 2012, at 2:57 PM, Jingcha Joba wrote:
> So if I understand correctly, if a message size is smaller, then it will use
> the MPI way (non-RDMA, two-way communication); if it's larger, then it would
> use the OpenFabrics path, by using ibverbs (and the OFED stack) instead of
> using MPI's stack?
So if I understand correctly, if a message size is smaller, then it will use
the MPI way (non-RDMA, two-way communication); if it's larger, then it would
use the OpenFabrics path, by using ibverbs (and the OFED stack) instead of
using MPI's stack?
If so, could that be the reason why the MPI_Put "hangs
On Feb 29, 2012, at 2:30 PM, Jingcha Joba wrote:
> Squyres,
> I thought RDMA read and write are implemented as one-sided communication using
> get and put, respectively.
> Is it not so?
Yes and no.
Keep in mind the difference between two things here:
- An underlying transport's one-sided ca
Gah. I didn't realize that my 1.4.x build was a *developer* build.
*Developer* builds give a *lot* more detail with plm_base_verbose=100
(including the specific rsh command being used). You obviously didn't get that
output because you don't have a developer build. :-\
Just for reference, he
Squyres,
I thought RDMA read and write are implemented as one-sided communication
using get and put, respectively.
Is it not so?
On Wed, Feb 29, 2012 at 10:49 AM, Jeffrey Squyres wrote:
> FWIW, if Brian says that our one-sided stuff is a bit buggy, I believe him
> (because he wrote it). :-)
>
> T
FWIW, if Brian says that our one-sided stuff is a bit buggy, I believe him
(because he wrote it). :-)
The fact is that the MPI-2 one-sided stuff is extremely complicated and
somewhat open to interpretation. In practice, I haven't seen the MPI-2
one-sided stuff used much in the wild. The MPI-
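To make the terminology concrete, here is a minimal MPI-2 one-sided sketch (purely
illustrative, not taken from the OSU benchmarks discussed in this thread): rank 0
uses MPI_Put to deposit one int into rank 1's window, with MPI_Win_fence providing
active-target synchronization. It needs at least two ranks to do anything.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, target_buf = 0, value = 42;
    MPI_Win win;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Every rank exposes one int through a window. */
    MPI_Win_create(&target_buf, sizeof(int), sizeof(int),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    MPI_Win_fence(0, win);                 /* open the access/exposure epoch */
    if (rank == 0 && size > 1)
        MPI_Put(&value, 1, MPI_INT, 1 /* target rank */,
                0 /* displacement */, 1, MPI_INT, win);
    MPI_Win_fence(0, win);                 /* the put is complete everywhere */

    if (rank == 1)
        printf("rank 1 received %d via MPI_Put\n", target_buf);

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}

Whether the data movement underneath is done with RDMA or with ordinary send/receive
is up to the implementation; the MPI-level semantics are only defined by the epochs.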
two things:
1. Too many MPI processes on one node, leading to processes pre-empting each
other
2. Contention in your network.
On Wed, Feb 29, 2012 at 8:01 AM, Pinero, Pedro_jose <
pedro_jose.pin...@atmel.com> wrote:
> Hi,
>
>
> I am using OMPI v.1.5.5 to communicate 200 processes in a
When I ran my osu tests, I was able to get the numbers out of all the
tests except latency_mt (which was obvious, as I didn't compile open-mpi
with multithreaded support).
A good way to know if the problem is with openmpi or with your custom OFED
stack would be to use some other device like tcp in
Hi,
I would like to know which of "waitone" vs. "waitany" is optimal and, of
course, will never produce deadlocks.
Let's say we have "lNp" processes and they want to send an array of int
of length "lNbInt" to process "0" in a non-blocking MPI_Isend (instead
of MPI_Gather). Let's say the order
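A minimal sketch of the pattern being described (the names lNp and lNbInt are taken
from the question; the buffer contents, tag, and sizes are made up): each non-root
rank posts one MPI_Isend of its lNbInt ints to rank 0, and rank 0 pre-posts matching
MPI_Irecvs and drains them with MPI_Waitany in completion order.

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, lNp;
    const int lNbInt = 1024;              /* illustrative length only */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &lNp);

    if (rank == 0) {
        int *buf = malloc((size_t)(lNp - 1) * lNbInt * sizeof(int));
        MPI_Request *reqs = malloc((size_t)(lNp - 1) * sizeof(MPI_Request));
        for (int i = 1; i < lNp; i++)     /* pre-post one receive per sender */
            MPI_Irecv(buf + (i - 1) * lNbInt, lNbInt, MPI_INT, i, 0,
                      MPI_COMM_WORLD, &reqs[i - 1]);
        for (int done = 0; done < lNp - 1; done++) {
            int idx;
            MPI_Waitany(lNp - 1, reqs, &idx, MPI_STATUS_IGNORE);
            /* the message from rank idx+1 is now complete; process it here */
        }
        free(reqs);
        free(buf);
    } else {
        int *data = calloc(lNbInt, sizeof(int));
        MPI_Request req;
        MPI_Isend(data, lNbInt, MPI_INT, 0, 0, MPI_COMM_WORLD, &req);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
        free(data);
    }

    MPI_Finalize();
    return 0;
}

Because every receive is pre-posted before any wait, MPI_Waitall would be equally
deadlock-free here; MPI_Waitany just lets rank 0 start processing messages in the
order they actually arrive.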
On Feb 29, 2012, at 9:39 AM, Stefano Dal Pont wrote:
> I'm a newbie with openMPI, so the problem is probably me :)
> I'm using a Fortran 90 code developed under Ubuntu 10.04. I've recently
> installed the same code on my Archlinux machine but I have some issues
> concerning openMPI.
> A simple
I'm pretty sure that they are correct. Our one-sided implementation is
buggier than I'd like (indeed, I'm in the process of rewriting most of it
as part of Open MPI's support for MPI-3's revised RDMA), so it's likely
that the bugs are in Open MPI's one-sided support. Can you try a more
recent rele
On Feb 29, 2012, at 11:01 AM, Pinero, Pedro_jose wrote:
> I am using OMPI v.1.5.5 to communicate 200 processes in a 2-computer cluster
> connected through Ethernet, obtaining very poor performance.
Let me make sure I'm parsing this statement properly: are you launching 200
MPI processes on
FWIW, I'm immediately suspicious of *any* MPI application that uses the MPI
one-sided operations (i.e., MPI_PUT and MPI_GET). It looks like these two OSU
benchmarks are using those operations.
Is it known that these two benchmarks are correct?
On Feb 29, 2012, at 11:33 AM, Venkateswara Rao D
Thanks a lot.
- Original Message -
From: "Ralph Castain"
To: "Open MPI Users"
Sent: Wednesday, February 29, 2012 5:56:23 PM
Subject: Re: [OMPI users] mpirun fails with no allocated resources
Fixed with r26071
On Feb 29, 2012, at 4:55 AM, Jeffrey Squyres wrote:
> Just to put this up fro
Fixed with r26071
On Feb 29, 2012, at 4:55 AM, Jeffrey Squyres wrote:
> Just to put this up front: using the trunk means you're subject to these kinds of
> problems. It is the head of development, after all -- things sometimes
> break. :-)
>
> Ralph: FWIW, I can replicate this problem on my Mac (O
Sorry, I forgot to introduce the system. Ours is a customized OFED stack
implemented to work on specific hardware. We tested the stack with
qperf and the Intel Benchmarks (IMB-3.2.2); they went fine. We want to
execute the osu_benchmarks-3.1.1 suite on our OFED.
On Wed, Feb 29, 2012 at 9
Hiii,
I tried executing the osu_benchmarks-3.1.1 suite with openmpi-1.4.3... I
could run 10 benchmark tests (except osu_put_bibw, osu_put_bw,
osu_get_bw, osu_latency_mt) out of 14 tests in the benchmark suite... and the
remaining tests are hanging at some message size. The output is shown below
>
>> On Tue, Feb 28, 2012 at 11:34 AM, Shamis, Pavel wrote:
>> I reviewed the code and it seems to be ok :) The error should be reported if
>> the port migration has already happened once (port 1 to port 2), and now you
>> are trying to shut down port 2 and MPI reports that it can't migrate anymo
Hi,
I am using OMPI v.1.5.5 to communicate 200 processes in a 2-computer
cluster connected through Ethernet, obtaining very poor performance. I
have measured each operation's time and I have realised that the
MPI_Gather operation takes about 1 second in each synchronization (only
an integer is
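For what it's worth, my reading of the pattern being described is something like the
sketch below (hypothetical, not the poster's actual code): every one of the 200 ranks
contributes a single int and rank 0 gathers them. Each such call is a full collective,
so over the TCP btl it is dominated by per-message latency rather than bandwidth, and
doing it at every synchronization point adds up quickly.

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, size, my_flag;
    int *all_flags = NULL;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0)                         /* only the root needs the buffer */
        all_flags = malloc((size_t)size * sizeof(int));

    my_flag = rank;                        /* a single integer per process */
    MPI_Gather(&my_flag, 1, MPI_INT,
               all_flags, 1, MPI_INT, 0, MPI_COMM_WORLD);

    if (rank == 0)
        free(all_flags);

    MPI_Finalize();
    return 0;
}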
Hi Jeff,
Thanks.
I tried what you suggested. Here is the output:
>>>
[yiguang@gulftown testdmp]$ ./test.bash
[gulftown:25052] mca: base: components_open: Looking for plm
components
[gulftown:25052] mca: base: components_open: opening plm
components
[gulftown:25052] mca: base: components_ope
Hi,
I'm a newbie with openMPI, so the problem is probably me :)
I'm using a Fortran 90 code developed under Ubuntu 10.04. I've recently
installed the same code on my Archlinux machine but I have some issues
concerning openMPI.
A simple example code works fine on both machines, while the "big" code g
Just to put this up front: using the trunk means you're subject to these kinds of
problems. It is the head of development, after all -- things sometimes break.
:-)
Ralph: FWIW, I can replicate this problem on my Mac (OS X Lion) with the SVN
trunk HEAD (svnversion tells me I have 26070M):
-
[6:
I haven't followed OFED development for a long time, so I don't know if there
is a buggy OFED in RHEL 5.4.
If you're doing development with the internals of Open MPI (or if it'll be
necessary to dive into the internals for debugging a custom device/driver), you
might want to move this discussion t
On Feb 29, 2012, at 5:39 AM, adrian sabou wrote:
> I am experiencing a rather unpleasant issue with a simple OpenMPI app. I have
> 4 nodes communicating with a central node. Performance is good and the
> application behaves as it should (i.e., performance steadily decreases as I
> increase the
On Feb 29, 2012, at 2:17 AM, Syed Ahsan Ali wrote:
> [pmdtest@pmd02 d00_dayfiles]$ echo ${MPIRUN} -np ${NPROC} -hostfile
> $i{ABSDIR}/hostlist -mca btl sm,openib,self --mca btl_openib_use_srq 1 ./hrm
> >> ${OUTFILE}_hrm 2>&1
> [pmdtest@pmd02 d00_dayfiles]$
Because you used >> and 2>&1, the out
FWIW: Ralph committed a change to mpirun the other day that will now check if
you're missing integer command line arguments. This will appear in Open MPI
v1.7. It'll look something like this:
% mpirun -np hostname
---
Open MPI has detected
Hi all,
I am experiencing a rather unpleasant issue with a simple OpenMPI app. I have 4
nodes communicating with a central node. Performance is good and the
application behaves as it should (i.e., performance steadily decreases as I
increase the work size). My problem is that immediately after
Dear Open-MPI users,
Our code is currently running Open-MPI (1.5.4) with SLURM on a NUMA
machine (2 sockets per node and 4 cores per socket) with basically two
levels of implementation for Open-MPI:
- at the lower level, n "Master" MPI processes (one per socket) are
simultaneously run by dividing c
A snapshot of my hosts file is shown below; localhost is present here.
127.0.0.1 localhost
127.0.1.1 wahaj-ThinkPad-T510
10.42.43.1 node0
10.42.43.2 node1
Everything works fine if I don't specify host names.
This problem is only specific to Open MPI version 1.7.
Open MPI
Well, it should be
echo *"*mpirun.* ", *
I just noticed that you have $i{ABSDIR}. I think it should be ${ABSDIR}.
On Tue, Feb 28, 2012 at 11:17 PM, Syed Ahsan Ali wrote:
> I tried to echo but it returns nothing.
>
> [pmdtest@pmd02 d00_dayfiles]$ echo ${MPIRUN} -np ${NPROC} -hostfile
> $i{ABSDI
I tried to echo but it returns nothing.
[pmdtest@pmd02 d00_dayfiles]$ echo ${MPIRUN} -np ${NPROC} -hostfile
$i{ABSDIR}/hostlist -mca btl sm,openib,self --mca btl_openib_use_srq 1
./hrm >> ${OUTFILE}_hrm 2>&1
[pmdtest@pmd02 d00_dayfiles]$
On Wed, Feb 29, 2012 at 12:01 PM, Jingcha Joba wrote:
> J
Just to be sure, can you try
echo "${MPIRUN} -np ${NPROC} -hostfile ${ABSDIR}/hostlist -mca btl
sm,openib,self --mca btl_openib_use_srq 1 ./hrm >> ${OUTFILE}_hrm 2>&1"
and check if you are indeed getting the correct argument.
If that looks fine, can you add --mca btl_openib_verbose 1 to the mpirun
arg
After creating a new hostlist and making the scripts again, it is working now
and picking up the hostlist, as you can see:
${MPIRUN} -np ${NPROC} -hostfile ${ABSDIR}/hostlist -mca btl
sm,openib,self --mca btl_openib_use_srq 1 ./hrm >> ${OUTFILE}_hrm 2>&1
(The above command is used to submit the job)
[p