Okay - thanks!
First, be assured we run 64-bit ifort code under Torque at large scale
all the time here at LANL, so this is likely to be something trivial
in your environment.
A few things to consider/try:
1. The most likely culprit is that your LD_LIBRARY_PATH is pointing to the
32-bit libraries.
[sims@raritan openmpi]$ mpirun -V
mpirun (Open MPI) 1.3.1rc4
From: users-boun...@open-mpi.org [users-boun...@open-mpi.org] On Behalf Of
Ralph Castain [r...@open-mpi.org]
Sent: Thursday, July 23, 2009 5:44 PM
To: Open MPI Users
Subject: Re: [OMPI users] Ope
On Thu, Jul 23, 2009 at 5:47 PM, Ralph Castain wrote:
> I doubt those two would work together - however, a combination of 1.3.2 and
> 1.3.3 should.
>
> You might look at the ABI compatibility discussion threads (there have been
> several) on this list for the reasons. Basically, binary compatibility
> is supported starting with 1.3.2 and above.
Hi Ralph,
With the flag -mca btl gm,sm,self, running the job manually works and gives
better performance, as you said!
However, it still hangs there when it goes through the PBS scheduler.
Here is my PBS script:
#!/bin/sh
#PBS -l nodes=2:ppn=2
#PBS -l walltime=00:02:00
#PBS -k eo
cd ~kaisong/tes
I doubt those two would work together - however, a combination of
1.3.2 and 1.3.3 should.
You might look at the ABI compatibility discussion threads (there have
been several) on this list for the reasons. Basically, binary
compatibility is supported starting with 1.3.2 and above.
On Jul 2
What OMPI version are you using?
On Jul 23, 2009, at 3:00 PM, Sims, James S. Dr. wrote:
I have an OpenMPI program compiled with a version of OpenMPI built
using the ifort 10.1
compiler. I can compile and run this code with no problem, using the
32-bit
version of ifort. And I can also submit batch jobs using torque with this
32-bit code.
Is OpenMPI backwards compatible? I.e., if I am running 1.3.1 on one
machine and 1.3.3 on the rest, is it supposed to work? Or do they all
need exactly the same version?
When I add this wrong-version machine to the machine list, with a
simple "hello world from each process" type program, I see no output
No - it is not guaranteed. (it is highly probable though)
The return from the MPI_Send only guarantees that the data is safely held
somewhere other than the send buffer so you are free to modify the send
buffer. The MPI standard does not say where the data is to be held. It only
says that once th
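As a rough illustration (buffer contents, tag and ranks below are made up, not from the code under discussion), once MPI_Send returns the sender is free to overwrite its buffer, even though the receiver may not have the data yet:

/* Illustrative only: buffer contents, tag and ranks are made up. */
#include <mpi.h>
#include <string.h>

int main(int argc, char **argv)
{
    int rank, buf[4] = {1, 2, 3, 4};
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        MPI_Send(buf, 4, MPI_INT, 1, 0, MPI_COMM_WORLD);
        /* Legal: MPI_Send has returned, so the send buffer may be reused.
           Where the data is held (and whether rank 1 has it yet) is up to
           the implementation. */
        memset(buf, 0, sizeof(buf));
    } else if (rank == 1) {
        MPI_Recv(buf, 4, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }
    MPI_Finalize();
    return 0;
}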
Shaun Jackman wrote:
Eugene Loh wrote:
Shaun Jackman wrote:
For my MPI application, each process reads a file and for each line
sends a message (MPI_Send) to one of the other processes determined
by the contents of that line. Each process posts a single MPI_Irecv
and uses MPI_Request_get_status to test for a received message.
Hi,
Two processes run the following program:
request = MPI_Irecv
MPI_Send (to the other process)
MPI_Barrier
flag = MPI_Test(request)
Without the barrier, there's a race and MPI_Test may or may not return
true, indicating whether the message has been received. With the
barrier, is it guaranteed that MPI_Test will return true?
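For reference, a compilable version of that sketch might look like the following (the tag, buffer values and the two-process assumption are mine); the question is whether flag must be 1 after the barrier:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, other, sendval = 42, recvval = 0, flag = 0;
    MPI_Request request;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    other = 1 - rank;                 /* assumes exactly two processes */
    MPI_Irecv(&recvval, 1, MPI_INT, other, 0, MPI_COMM_WORLD, &request);
    MPI_Send(&sendval, 1, MPI_INT, other, 0, MPI_COMM_WORLD);
    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Test(&request, &flag, MPI_STATUS_IGNORE);
    printf("rank %d: flag = %d\n", rank, flag);  /* not guaranteed to be 1 */
    MPI_Finalize();
    return 0;
}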
I have an OpenMPI program compiled with a version of OpenMPI built using the
ifort 10.1
compiler. I can compile and run this code with no problem, using the 32 bit
version of ifort. And I can also submit batch jobs using torque with this
32-bit code.
However, compiling the same code to produce a
Eugene Loh wrote:
Shaun Jackman wrote:
For my MPI application, each process reads a file and for each line
sends a message (MPI_Send) to one of the other processes determined by
the contents of that line. Each process posts a single MPI_Irecv and
uses MPI_Request_get_status to test for a received message.
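A rough sketch of that pattern (not the poster's actual code; the file name "input.txt" and the routing rule are made up) might look like this:

/* Rough sketch: one MPI_Irecv is posted up front, MPI_Request_get_status
   peeks at it, and it is re-posted after each arrival. */
#include <mpi.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    char line[256], inbuf[256];
    int size, flag;
    MPI_Request req;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    MPI_Irecv(inbuf, sizeof(inbuf), MPI_CHAR, MPI_ANY_SOURCE, 0,
              MPI_COMM_WORLD, &req);

    FILE *fp = fopen("input.txt", "r");
    while (fp != NULL && fgets(line, sizeof(line), fp) != NULL) {
        int dest = (unsigned char)line[0] % size;       /* made-up routing */
        MPI_Send(line, (int)strlen(line) + 1, MPI_CHAR, dest, 0,
                 MPI_COMM_WORLD);

        /* Peek at the pending receive without deallocating the request. */
        MPI_Request_get_status(req, &flag, &status);
        if (flag) {
            MPI_Wait(&req, MPI_STATUS_IGNORE);          /* release it */
            MPI_Irecv(inbuf, sizeof(inbuf), MPI_CHAR, MPI_ANY_SOURCE, 0,
                      MPI_COMM_WORLD, &req);
        }
    }
    if (fp != NULL)
        fclose(fp);

    /* Real code needs a shutdown protocol; here the pending receive is
       simply cancelled. */
    MPI_Cancel(&req);
    MPI_Wait(&req, MPI_STATUS_IGNORE);
    MPI_Finalize();
    return 0;
}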
My apologies - I had missed that -mca btl flag. That is the source of
the trouble. IIRC, GM doesn't have a loopback method in it. OMPI
requires that -every- proc be able to reach -every- proc, including
itself.
So you must include the "self" btl at a minimum. Also, if you want
more perfor
Hi Ralph,
Thanks for the fast reply! I put the --display-allocation and --display-map
flags on and it looks like the node allocation is just fine, but the job still
hangs.
The output looks like this:
/home/kaisong/test
node0001
node0001
node
node
Starting parallel job
===
Hello all,
(this _might_ be related to https://svn.open-mpi.org/trac/ompi/ticket/1505)
I just compiled and installed 1.3.3 in a CentOS 5 environment and we
noticed the
processes would deadlock as soon as they would start using TCP communications.
The
test program is one that has been run
Nifty Tom Mitchell wrote:
On Thu, Jun 25, 2009 at 08:37:21PM -0400, Jeff Squyres wrote:
Subject: Re: [OMPI users] 50% performance reduction due to OpenMPI v1.3.2 forcing
all MPI traffic over Ethernet instead of using Infiniband
While the previous thread on "performance r
Rolf Vandevaart wrote:
> I think what you are looking for is this:
>
> --mca plm_rsh_disable_qrsh 1
>
> This means we will disable the use of qrsh and use rsh or ssh instead.
>
> The --mca pls ^sge does not work anymore for two reasons. First, the
> "pls" framework was renamed "plm". Secondly,
> You don't specify and based on your description I infer that you are not
> using a batch/queueing system, but just a rsh/ssh based start-up mechanism.
You are absolutely correct. I am using rsh/ssh based start-up mechanism.
A batch/queueing system might be able to tell you whether a remote co
I think what you are looking for is this:
--mca plm_rsh_disable_qrsh 1
This means we will disable the use of qrsh and use rsh or ssh instead.
The --mca pls ^sge does not work anymore for two reasons. First, the
"pls" framework was renamed "plm". Secondly, the gridgengine plm was
folded into
The 'system' command will fork a separate process to run. If I
remember correctly, forking within MPI can lead to undefined behavior.
Can someone on the Open MPI development team clarify?
What I don't understand is: why is your TCP network so unstable that
you are worried about reachability? For MPI to
I have built OpenMPI 1.3.3 without support for SGE.
I just want to launch jobs with loose integration right
now.
Here is how I configured it:
./configure CC=pgcc CXX=pgCC F77=pgf90 F90=pgf90 FC=pgf90
--prefix=/opt/openmpi/1.3.3-pgi --without-sge
--enable-io-romio --with-openib=/opt/hjet/ofed/1
Thank you all Jeff, Jody, Prentice and Bogdan for your invaluable
clarifications, solutions and suggestions,
Open MPI should return a failure if TCP connectivity is lost, even with a
> non-blocking point-to-point operation. The failure should be returned in
> the call to MPI_TEST (and friends).
ev
On Thu, 23 Jul 2009, vipin kumar wrote:
1: Is the slave machine reachable or not? (How will I do that? Given: I
have the IP address and host name of the slave machine.)
2: If reachable, check whether the programs (orted and "slaveprocess") are alive
or not.
You don't specify and based on your description
Jeff Squyres wrote:
> On Jul 22, 2009, at 10:05 AM, vipin kumar wrote:
>
>> Actually requirement is how a C/C++ program running in "master" node
>> should find out whether "slave" node is reachable (as we check this
>> using "ping" command) or not ? Because IP address may change at any
>> time, t
Maybe you could make a system call to ping the other machine.
char sCommand[512];
// build the command string
sprintf(sCommand, "ping -c %d -q %s > /dev/null", numPings, sHostName);
// execute the command
int iResult = system(sCommand);
If the ping was successful, iResult will hold 0 (the exit status of a
successful ping).
On Jul 22, 2009, at 3:17 AM, Alexey Sokolov wrote:
from /home/user/NetBeansProjects/Correlation_orig/Correlation/Correlation.cpp:2:
/usr/include/openmpi/1.2.4-gcc/openmpi/ompi/mpi/cxx/request_inln.h:347: warning:
declaration ‘struct MPI::Grequest_intercept_t’ does not declare anything
Hi Gus,
I played with collectives a few months ago. Details are here
http://www.cse.scitech.ac.uk/disco/publications/WorkingNotes.ConnectX.pdf
That was in the context of 1.2.6
You can get available tuning options by doing
ompi_info -all -mca coll_tuned_use_dynamic_rules 1 | grep alltoall
and similar.
On Jul 23, 2009, at 7:36 AM, vipin kumar wrote:
I can't use blocking communication routines in my main program
("masterprocess") because any type of network failure (maybe due
to physical connectivity or TCP connectivity or the MPI connection, as
you told) may occur. So I am using non-blocking routines.
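One hedged sketch of how such a failure could surface through a non-blocking receive, as discussed in this thread: the helper name, buffer and peer are illustrative, and it assumes the communicator's error handler was set to MPI_ERRORS_RETURN so errors are returned to the caller instead of aborting the job.

#include <mpi.h>
#include <stdio.h>

/* Sketch: test a pending non-blocking receive and treat a non-MPI_SUCCESS
   return from MPI_Test as a possible loss of connectivity.  Assumes
   MPI_Comm_set_errhandler(comm, MPI_ERRORS_RETURN) was called earlier. */
int check_slave(MPI_Request *req)
{
    int flag = 0;
    int rc = MPI_Test(req, &flag, MPI_STATUS_IGNORE);
    if (rc != MPI_SUCCESS) {
        fprintf(stderr, "MPI_Test returned an error; peer may be unreachable\n");
        return -1;
    }
    return flag;   /* 1 if the message has arrived, 0 if still pending */
}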
FWIW, for the Fortran MPI programmers out there, the MPI Forum is hard
at work on a new Fortran 03 set of bindings for MPI-3. We have a
prototype in a side branch of Open MPI that is "mostly" working. We
(the MPI Forum) expect to release a short document describing the new
features and th
On Jul 23, 2009, at 6:39 AM, Dave Love wrote:
> The MPI ABI has not changed since 1.3.2.
Good, thanks. I hadn't had time to investigate the items in the release
notes that looked suspicious. Are there actually any known ABI
incompatibilities between 1.3.0 and 1.3.2? We haven't noticed any as
far as I know.
On Thu, Jul 23, 2009 at 3:03 PM, Ralph Castain wrote:
> It depends on which network fails. If you lose all TCP connectivity, Open
> MPI should abort the job as the out-of-band system will detect the loss of
> connection. If you only lose the MPI connection (whether TCP or some other
> interconnec
Jeff Squyres writes:
> I *think* that there are compiler flags that you can use with ifort to
> make it behave similarly to gfortran in terms of sizes and constant
> values, etc.
At a slight tangent, if there are flags that might be helpful to add to
gfortran for compatibility (e.g. logical cons
Jeff Squyres writes:
> See https://svn.open-mpi.org/source/xref/ompi_1.3/README#257.
Ah, neat. I'd never thought of that, possibly due to ELF not being
relevant when I first started worrying about that sort of thing.
> Indeed. In OMPI, we tried to make this as simple as possible. But
> unles
Jeff Squyres writes:
> The MPI ABI has not changed since 1.3.2.
Good, thanks. I hadn't had time to investigate the items in the release
notes that looked suspicious. Are there actually any known ABI
incompatibilities between 1.3.0 and 1.3.2? We haven't noticed any as
far as I know.
> Note th
It depends on which network fails. If you lose all TCP connectivity,
Open MPI should abort the job as the out-of-band system will detect
the loss of connection. If you only lose the MPI connection (whether
TCP or some other interconnect), then I believe the system will
eventually generate a
Hi Martin,
I have a question about your solution below:
In step 2, "move the Fortran module to the directory ...",
what is the "Fortran module"?
And in step 3, we don't need to install Open MPI?
Thanks
- Original Message -
From: "Martin Siegert"
To: "Open MPI Users"
Sent: Monday, July 20, 2009 1:47:35
>
> Are you asking to find out this information before issuing "mpirun"? Open
> MPI does assume that the nodes you are trying to use are reachable.
>
>
No.
The scenario is that a pair of processes are running, one on the "master" node, say
"masterprocess", and one on the "slave" node, say "slaveprocess". When
"maste