Okay - thanks!
First, be assured we run 64-bit ifort code under Torque at large scale
all the time here at LANL, so this is likely to be something trivial
in your environment.
A few things to consider/try:
1. The most likely culprit is that your LD_LIBRARY_PATH is pointing to the
32-bit libraries.
[sims@raritan openmpi]$ mpirun -V
mpirun (Open MPI) 1.3.1rc4
From: users-boun...@open-mpi.org [users-boun...@open-mpi.org] On Behalf Of
Ralph Castain [r...@open-mpi.org]
Sent: Thursday, July 23, 2009 5:44 PM
To: Open MPI Users
Subject: Re: [OMPI users] Ope
On Thu, Jul 23, 2009 at 5:47 PM, Ralph Castain wrote:
> I doubt those two would work together - however, a combination of 1.3.2 and
> 1.3.3 should.
>
> You might look at the ABI compatibility discussion threads (there have been
> several) on this list for the reasons. Basically, binary compatibility
> is supported starting with 1.3.2 and above.
Hi Ralph,
With the flag -mca btl gm,sm,self, running the job manually works and gives
better performance, as you said!
However, it still hangs there when it goes through the PBS scheduler.
Here is my PBS script:
#!/bin/sh
#PBS -l nodes=2:ppn=2
#PBS -l walltime=00:02:00
#PBS -k eo
cd ~kaisong/tes
I doubt those two would work together - however, a combination of
1.3.2 and 1.3.3 should.
You might look at the ABI compatibility discussion threads (there have
been several) on this list for the reasons. Basically, binary
compatibility is supported starting with 1.3.2 and above.
On Jul 2
What OMPI version are you using?
On Jul 23, 2009, at 3:00 PM, Sims, James S. Dr. wrote:
I have an OpenMPI program compiled with a version of OpenMPI built
using the ifort 10.1
compiler. I can compile and run this code with no problem, using the
32-bit
version of ifort. And I can also submit batch jobs using torque with this
32-bit code.
Is OpenMPI backwards compatible? I.e., if I am running 1.3.1 on one
machine and 1.3.3 on the rest, is it supposed to work? Or do they all
need exactly the same version?
When I add this wrong-version machine to the machine list, with a
simple "hello world from each process" type program, I see no output
No - it is not guaranteed. (it is highly probable though)
The return from the MPI_Send only guarantees that the data is safely held
somewhere other than the send buffer so you are free to modify the send
buffer. The MPI standard does not say where the data is to be held. It only
says that once th
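As a rough illustration (buffer contents, tag and ranks below are made up, not from the code under discussion), once MPI_Send returns the sender is free to overwrite its buffer, even though the receiver may not have the data yet:

/* Illustrative only: buffer contents, tag and ranks are made up. */
#include <mpi.h>
#include <string.h>

int main(int argc, char **argv)
{
    int rank, buf[4] = {1, 2, 3, 4};
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        MPI_Send(buf, 4, MPI_INT, 1, 0, MPI_COMM_WORLD);
        /* Legal: MPI_Send has returned, so the send buffer may be reused.
           Where the data is held (and whether rank 1 has it yet) is up to
           the implementation. */
        memset(buf, 0, sizeof(buf));
    } else if (rank == 1) {
        MPI_Recv(buf, 4, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }
    MPI_Finalize();
    return 0;
}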
Shaun Jackman wrote:
Eugene Loh wrote:
Shaun Jackman wrote:
For my MPI application, each process reads a file and for each line
sends a message (MPI_Send) to one of the other processes determined
by the contents of that line. Each process posts a single MPI_Irecv
and uses MPI_Request_get_status to test for a received message.
Hi,
Two processes run the following program:
request = MPI_Irecv
MPI_Send (to the other process)
MPI_Barrier
flag = MPI_Test(request)
Without the barrier, there's a race and MPI_Test may or may not return
true, indicating whether the message has been received. With the
barrier, is it guaranteed that MPI_Test will return true?
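For reference, a compilable version of that sketch might look like the following (the tag, buffer values and the two-process assumption are mine); the question is whether flag must be 1 after the barrier:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, other, sendval = 42, recvval = 0, flag = 0;
    MPI_Request request;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    other = 1 - rank;                 /* assumes exactly two processes */
    MPI_Irecv(&recvval, 1, MPI_INT, other, 0, MPI_COMM_WORLD, &request);
    MPI_Send(&sendval, 1, MPI_INT, other, 0, MPI_COMM_WORLD);
    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Test(&request, &flag, MPI_STATUS_IGNORE);
    printf("rank %d: flag = %d\n", rank, flag);  /* not guaranteed to be 1 */
    MPI_Finalize();
    return 0;
}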
I have an OpenMPI program compiled with a version of OpenMPI built using the
ifort 10.1
compiler. I can compile and run this code with no problem, using the 32 bit
version of ifort. And I can also submit batch jobs using torque with this
32-bit code.
However, compiling the same code to produce a
Eugene Loh wrote:
Shaun Jackman wrote:
For my MPI application, each process reads a file and for each line
sends a message (MPI_Send) to one of the other processes determined by
the contents of that line. Each process posts a single MPI_Irecv and
uses MPI_Request_get_status to test for a received message.
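A rough sketch of that pattern (not the poster's actual code; the file name "input.txt" and the routing rule are made up) might look like this:

/* Rough sketch: one MPI_Irecv is posted up front, MPI_Request_get_status
   peeks at it, and it is re-posted after each arrival. */
#include <mpi.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    char line[256], inbuf[256];
    int size, flag;
    MPI_Request req;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    MPI_Irecv(inbuf, sizeof(inbuf), MPI_CHAR, MPI_ANY_SOURCE, 0,
              MPI_COMM_WORLD, &req);

    FILE *fp = fopen("input.txt", "r");
    while (fp != NULL && fgets(line, sizeof(line), fp) != NULL) {
        int dest = (unsigned char)line[0] % size;       /* made-up routing */
        MPI_Send(line, (int)strlen(line) + 1, MPI_CHAR, dest, 0,
                 MPI_COMM_WORLD);

        /* Peek at the pending receive without deallocating the request. */
        MPI_Request_get_status(req, &flag, &status);
        if (flag) {
            MPI_Wait(&req, MPI_STATUS_IGNORE);          /* release it */
            MPI_Irecv(inbuf, sizeof(inbuf), MPI_CHAR, MPI_ANY_SOURCE, 0,
                      MPI_COMM_WORLD, &req);
        }
    }
    if (fp != NULL)
        fclose(fp);

    /* Real code needs a shutdown protocol; here the pending receive is
       simply cancelled. */
    MPI_Cancel(&req);
    MPI_Wait(&req, MPI_STATUS_IGNORE);
    MPI_Finalize();
    return 0;
}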
My apologies - I had missed that -mca btl flag. That is the source of
the trouble. IIRC, GM doesn't have a loopback method in it. OMPI
requires that -every- proc be able to reach -every- proc, including
itself.
So you must include the "self" btl at a minimum. Also, if you want
more perfor
Hi Ralph,
Thanks for the fast reply! I put the --display-allocation and --display-map
flags on and it looks like the node allocation is just fine, but the job still
hangs.
The output looks like this:
/home/kaisong/test
node0001
node0001
node
node
Starting parallel job
===
Hello all,
(this _might_ be related to https://svn.open-mpi.org/trac/ompi/ticket/1505)
I just compiled and installed 1.3.3 in a CentOS 5 environment and we
noticed the
processes would deadlock as soon as they would start using TCP communications.
The
test program is one that has been run
Nifty Tom Mitchell wrote:
On Thu, Jun 25, 2009 at 08:37:21PM -0400, Jeff Squyres wrote:
Subject: Re: [OMPI users] 50% performance reduction due to OpenMPI v1.3.2 forcing
all MPI traffic over Ethernet instead of using Infiniband
While the previous thread on "performance r
Rolf Vandevaart wrote:
> I think what you are looking for is this:
>
> --mca plm_rsh_disable_qrsh 1
>
> This means we will disable the use of qrsh and use rsh or ssh instead.
>
> The --mca pls ^sge does not work anymore for two reasons. First, the
> "pls" framework was renamed "plm". Secondly,
> You don't specify and based on your description I infer that you are not
> using a batch/queueing system, but just a rsh/ssh based start-up mechanism.
You are absolutely correct. I am using rsh/ssh based start-up mechanism.
A batch/queueing system might be able to tell you whether a remote co
I think what you are looking for is this:
--mca plm_rsh_disable_qrsh 1
This means we will disable the use of qrsh and use rsh or ssh instead.
The --mca pls ^sge does not work anymore for two reasons. First, the
"pls" framework was renamed "plm". Secondly, the gridgengine plm was
folded into
The 'system' command will fork a separate process to run. If I
remember correctly, forking within MPI can lead to undefined behavior.
Can someone on the Open MPI development team clarify?
What I don't understand is: why is your TCP network so unstable that
you are worried about reachability? For MPI to
I have built OpenMPI 1.3.3 without support for SGE.
I just want to launch jobs with loose integration right
now.
Here is how I configured it:
./configure CC=pgcc CXX=pgCC F77=pgf90 F90=pgf90 FC=pgf90
--prefix=/opt/openmpi/1.3.3-pgi --without-sge
--enable-io-romio --with-openib=/opt/hjet/ofed/1
Thank you all Jeff, Jody, Prentice and Bogdan for your invaluable
clarifications, solutions and suggestions,
Open MPI should return a failure if TCP connectivity is lost, even with a
> non-blocking point-to-point operation. The failure should be returned in
> the call to MPI_TEST (and friends).
ev
On Thu, 23 Jul 2009, vipin kumar wrote:
1: Is the slave machine reachable or not? (How will I do that? Given: I
have the IP address and host name of the slave machine.)
2: If reachable, check whether the programs (orted and "slaveprocess") are alive
or not.
You don't specify and based on your description
Jeff Squyres wrote:
> On Jul 22, 2009, at 10:05 AM, vipin kumar wrote:
>
>> Actually requirement is how a C/C++ program running in "master" node
>> should find out whether "slave" node is reachable (as we check this
>> using "ping" command) or not ? Because IP address may change at any
>> time, t
Maybe you could make a system call to ping the other machine.
char sCommand[512];
// build the command string
sprintf(sCommand, "ping -c %d -q %s > /dev/null", numPings, sHostName);
// execute the command
int iResult = system(sCommand);
If the ping was successful, iResult will hold 0 (the exit status of a
successful ping).
On Jul 22, 2009, at 3:17 AM, Alexey Sokolov wrote:
from /home/user/NetBeansProjects/Correlation_orig/Correlation/Correlation.cpp:2:
/usr/include/openmpi/1.2.4-gcc/openmpi/ompi/mpi/cxx/request_inln.h:347: warning:
declaration ‘struct MPI::Grequest_intercept_t’ does not declare anything
Hi Gus,
I played with collectives a few months ago. Details are here
http://www.cse.scitech.ac.uk/disco/publications/WorkingNotes.ConnectX.pdf
That was in the context of 1.2.6
You can get available tuning options by doing
ompi_info -all -mca coll_tuned_use_dynamic_rules 1 | grep alltoall
and similar.
On Jul 23, 2009, at 7:36 AM, vipin kumar wrote:
I can't use blocking communication routines in my main program
("masterprocess") because any type of network failure (maybe due
to physical connectivity or TCP connectivity or the MPI connection, as
you told) may occur. So I am using non-blocking routines.
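One hedged sketch of how such a failure could surface through a non-blocking receive, as discussed in this thread: the helper name, buffer and peer are illustrative, and it assumes the communicator's error handler was set to MPI_ERRORS_RETURN so errors are returned to the caller instead of aborting the job.

#include <mpi.h>
#include <stdio.h>

/* Sketch: test a pending non-blocking receive and treat a non-MPI_SUCCESS
   return from MPI_Test as a possible loss of connectivity.  Assumes
   MPI_Comm_set_errhandler(comm, MPI_ERRORS_RETURN) was called earlier. */
int check_slave(MPI_Request *req)
{
    int flag = 0;
    int rc = MPI_Test(req, &flag, MPI_STATUS_IGNORE);
    if (rc != MPI_SUCCESS) {
        fprintf(stderr, "MPI_Test returned an error; peer may be unreachable\n");
        return -1;
    }
    return flag;   /* 1 if the message has arrived, 0 if still pending */
}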
FWIW, for the Fortran MPI programmers out there, the MPI Forum is hard
at work on a new Fortran 03 set of bindings for MPI-3. We have a
prototype in a side branch of Open MPI that is "mostly" working. We
(the MPI Forum) expect to release a short document describing the new
features and th
On Jul 23, 2009, at 6:39 AM, Dave Love wrote:
> The MPI ABI has not changed since 1.3.2.
Good, thanks. I hadn't had time to investigate the items in the release
notes that looked suspicious. Are there actually any known ABI
incompatibilities between 1.3.0 and 1.3.2? We haven't noticed any as
far as I know.
On Thu, Jul 23, 2009 at 3:03 PM, Ralph Castain wrote:
> It depends on which network fails. If you lose all TCP connectivity, Open
> MPI should abort the job as the out-of-band system will detect the loss of
> connection. If you only lose the MPI connection (whether TCP or some other
> interconnec
Jeff Squyres writes:
> I *think* that there are compiler flags that you can use with ifort to
> make it behave similarly to gfortran in terms of sizes and constant
> values, etc.
At a slight tangent, if there are flags that might be helpful to add to
gfortran for compatibility (e.g. logical cons
Jeff Squyres writes:
> See https://svn.open-mpi.org/source/xref/ompi_1.3/README#257.
Ah, neat. I'd never thought of that, possibly due to ELF not being
relevant when I first started worrying about that sort of thing.
> Indeed. In OMPI, we tried to make this as simple as possible. But
> unles
Jeff Squyres writes:
> The MPI ABI has not changed since 1.3.2.
Good, thanks. I hadn't had time to investigate the items in the release
notes that looked suspicious. Are there actually any known ABI
incompatibilities between 1.3.0 and 1.3.2? We haven't noticed any as
far as I know.
> Note th
It depends on which network fails. If you lose all TCP connectivity,
Open MPI should abort the job as the out-of-band system will detect
the loss of connection. If you only lose the MPI connection (whether
TCP or some other interconnect), then I believe the system will
eventually generate a
Hi Martin,
I have a question about your solution below:
In step 2, "move the Fortran module to the directory ...",
what is the "Fortran module"?
And in step 3, we don't need to install Open MPI?
Thanks
- Original Message -
From: "Martin Siegert"
To: "Open MPI Users"
Sent: Monday, July 20, 2009 1:47:35
>
> Are you asking to find out this information before issuing "mpirun"? Open
> MPI does assume that the nodes you are trying to use are reachable.
>
>
No.
The scenario is that a pair of processes are running, one on the "master" node, say
"masterprocess", and one on the "slave" node, say "slaveprocess". When
"maste