[OMPI users] compile crash with pathscale and openmpi-1.3

2009-01-23 Thread Alain Miniussi
FYI: I get the following problem when compiling openmpi-1.3 at -O2 and beyond:
    [alainm@rossini vtfilter]$ pwd
    /misc/nice1/alainm/openmpi-1.3/ompi/contrib/vt/vt/tools/vtfilter
    [alainm@rossini vtfilter]$ make CXXFLAGS=-O2
    pathCC -DHAVE_CONFIG_H -I. -I../.. -I../../extlib/otf/otflib -I../../extlib/o

[OMPI users] Cannot compile on Linux Itanium system

2009-01-23 Thread Iannetti, Anthony C. (GRC-RTB0)
Dear OpenMPI Users: I cannot compile OpenMPI 1.3 on my Itanium 2 system. Attached is the ompi-output.tar.gz file. Briefly, my Intel compiler cannot compile the assembler code. Thanks, Tony Anthony C. Iannetti, P.E. NASA Glenn Research Center Aeropropulsion Division, Combustion Branch

[OMPI users] OpenMPI-1.3 and XGrid

2009-01-23 Thread Frank Kahle
I'm running OpenMPI on OS X 10.4.11. After upgrading to OpenMPI-1.3 I get the following error when submitting a job via XGrid:
    dyld: lazy symbol binding failed: Symbol not found: _orte_pointer_array_add
    Referenced from: /usr/local/mpi/lib/openmpi/mca_plm_xgrid.so
    Expected in: flat namespace

Re: [OMPI users] Asynchronous behaviour of MPI Collectives

2009-01-23 Thread Jeff Squyres
FWIW, OMPI v1.3 is much better about registered memory usage than the 1.2 series. We introduced some new things, to include being able to specify exactly what receive queues you want. See: ...gaaah! It's not on our FAQ yet. :-( The main idea is that there is a new MCA parameter for the op

[OMPI users] Newbie needs help! MPI_Wait/MPI_Start/MPI_Issend

2009-01-23 Thread Hartzman, Leslie D (MS)
I'm trying to modify some code that is involved in point-to-point communications. Process A has a one-way mode of communication with Process B. 'A' checks to see if its rank is zero and, if so, will send a "command" to 'B' (MPI_Issend) about what kind of data is going to be coming next. After sending
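A minimal sketch of the pattern being described, assuming rank 0 is 'A' and rank 1 is 'B' (the command value, tag, and variable names are illustrative, not from the original code):

    #include <mpi.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0) {                  /* 'A': announce what kind of data comes next */
            int command = 1;              /* hypothetical command code */
            MPI_Request req;
            /* Synchronous-mode nonblocking send: MPI_Wait completes only once
               'B' has started receiving, so the wait doubles as a handshake. */
            MPI_Issend(&command, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
            MPI_Wait(&req, MPI_STATUS_IGNORE);
        } else if (rank == 1) {           /* 'B': learn what is coming next */
            int command;
            MPI_Recv(&command, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }
        MPI_Finalize();
        return 0;
    }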

Re: [OMPI users] Handling output of processes

2009-01-23 Thread Gijsbert Wiesenekker
jody wrote: Hi, I have a small cluster consisting of 9 computers (8x2 CPUs, 1x4 CPUs). I would like to be able to observe the output of the processes separately during an mpirun. What I currently do is apply mpirun to a shell script which opens an xterm for each process, which then starts t

Re: [OMPI users] dead lock in MPI_Finalize

2009-01-23 Thread George Bosilca
I was somehow confused when I wrote my last email and I mixed up the MPI versions (thanks to Dick Treumann for gently pointing me to the truth). Before MPI 2.1, the MPI Standard was unclear about how MPI_Finalize should behave in the context of spawned or joined worlds, which make the disconn

Re: [OMPI users] Asynchronous behaviour of MPI Collectives

2009-01-23 Thread George Bosilca
On Jan 23, 2009, at 11:24, Eugene Loh wrote: Jeff Squyres wrote: As you have noted, MPI_Barrier is the *only* collective operation that MPI guarantees to have any synchronization properties (and it's a fairly weak guarantee at that; no process will exit the barrier until every process

Re: [OMPI users] 1.3 hangs running 2 exes with different names (Ralph Castain)

2009-01-23 Thread Geoffroy Pignot
Hi Ralph, Thanks for taking the time to look into my problem. As you can see, it happens when I don't have both executables available on both nodes. When I do (test3), it works. I don't know if my particular libdir causes the problem or not, but I'll try on Monday with a more classical setup. I'll

Re: [OMPI users] Asynchronous behaviour of MPI Collectives

2009-01-23 Thread Eugene Loh
Jeff Squyres wrote: As you have noted, MPI_Barrier is the *only* collective operation that MPI guarantees to have any synchronization properties (and it's a fairly weak guarantee at that; no process will exit the barrier until every process has entered the barrier -- but there's no guarantee

Re: [OMPI users] 1.3 hangs running 2 exes with different names (Ralph Castain)

2009-01-23 Thread Ralph Castain
Hi Geoffroy, Hmmm... well, I redid my tests to mirror yours, and still cannot replicate this problem. I tried it with both slurm and ssh environments - no difference in the results.
    % make hello
    % cp hello hello2
    % ls hello hello2
    % mpirun -n 1 -host odin038 ./hello : -n 1 -host odin039 .

Re: [OMPI users] Cluster with IB hosts and Ethernet hosts

2009-01-23 Thread Sangamesh B
Any solution for the following problem? On Fri, Jan 23, 2009 at 7:58 PM, Sangamesh B wrote: > On Fri, Jan 23, 2009 at 5:41 PM, Jeff Squyres wrote: >> On Jan 22, 2009, at 11:26 PM, Sangamesh B wrote: >> >>> We've a cluster with 23 nodes connected to IB switch and 8 nodes >>> have connected to

Re: [OMPI users] 1.3 and --preload-files and --preload-binary

2009-01-23 Thread Josh Hursey
The preload-binary problem had to do with how we were resolving relative path names before moving files. While fixing these bugs I also cleaned up some error reporting mechanisms. I believe that I have fixed both the --preload-binary and --preload-files options in the trunk (r20331). If you

Re: [OMPI users] dead lock in MPI_Finalize

2009-01-23 Thread George Bosilca
I don't know what your program is doing, but I kind of guess what the problem is. If you use MPI-2 dynamics to spawn or connect two MPI_COMM_WORLDs, you have to disconnect them before calling MPI_Finalize. The reason is that MPI_Finalize does the opposite of MPI_Init, so it is MPI_COMM_WOR
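A minimal sketch of the rule George describes, shown for the parent side of an MPI_Comm_spawn (the child executable name and process count are placeholders):

    #include <mpi.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        MPI_Comm children;
        MPI_Comm_spawn("./child", MPI_ARGV_NULL, 2, MPI_INFO_NULL,
                       0, MPI_COMM_WORLD, &children, MPI_ERRCODES_IGNORE);
        /* ... exchange data with the spawned world over 'children' ... */
        /* Disconnect the spawned world before finalizing; otherwise
           MPI_Finalize can block waiting on the still-connected peers. */
        MPI_Comm_disconnect(&children);
        MPI_Finalize();
        return 0;
    }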

Re: [OMPI users] Openmpi 1.3 problems with libtool-ltdl on CentOS 4 and 5

2009-01-23 Thread Jeff Squyres
Ew. Yes, I can see this being a problem. I'm guessing that the real issue is that OMPI embeds the libltdl from LT 2.2.6a inside libopen_pal (one of the internal OMPI libraries). Waving my hands a bit, but it's not hard to imagine some sort of clash is going on between the -lltdl you added

Re: [OMPI users] Cluster with IB hosts and Ethernet hosts

2009-01-23 Thread Sangamesh B
On Fri, Jan 23, 2009 at 5:41 PM, Jeff Squyres wrote: > On Jan 22, 2009, at 11:26 PM, Sangamesh B wrote: > >> We've a cluster with 23 nodes connected to IB switch and 8 nodes >> have connected to ethernet switch. Master node is also connected to IB >> switch. SGE(with tight integration, -pe orte

Re: [OMPI users] Asynchronous behaviour of MPI Collectives

2009-01-23 Thread Igor Kozin
Hi Gabriele, it might be that your message size is too large for the available memory per node. I had a problem with IMB when I was not able to run Alltoall to completion on N=128, ppn=8 on our cluster with 16 GB per node. You'd think 16 GB is quite a lot, but when you do the maths: 2 * 4 MB * 128 procs
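Spelling out that estimate (the per-process figure follows Igor's numbers; the per-node total assumes ppn=8 as in the failing run, counting one send plus one receive buffer per peer and ignoring MPI-internal buffering):

    2 buffers * 4 MB * 128 procs = 1 GB of user buffers per process
    1 GB * 8 procs per node      = 8 GB per node

so user buffers alone consume half of the 16 GB, and Alltoall's internal buffering can exhaust the rest.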

Re: [OMPI users] Asynchronous behaviour of MPI Collectives

2009-01-23 Thread Gabriele Fatigati
Thanks Jeff, I'll try this flag. Regards. 2009/1/23 Jeff Squyres : > This is with the 1.2 series, right? > > Have you tried using what is described here: > > > http://www.open-mpi.org/faq/?category=openfabrics#v1.2-use-early-completion > > I don't know if you can try OMPI v1.3 or not, but the is

[OMPI users] Openmpi 1.3 problems with libtool-ltdl on CentOS 4 and 5

2009-01-23 Thread Roy Dragseth
Hi, all. I do not know if this is to be considered a real bug or not; I'm just reporting it here so people can find it if they google around for the error message this produces. There is a backtrace at the end of this mail. Problem description: Openmpi 1.3 seems to be nonfunctional when used

Re: [OMPI users] Asynchronous behaviour of MPI Collectives

2009-01-23 Thread Jeff Squyres
This is with the 1.2 series, right? Have you tried using what is described here: http://www.open-mpi.org/faq/?category=openfabrics#v1.2-use-early-completion I don't know if you can try OMPI v1.3 or not, but the issue described in the above FAQ item is fixed properly in the OMPI v1.3 s

Re: [OMPI users] Error compiling v1.3 with icc 10.1.021: PATH_MAX not defined

2009-01-23 Thread Jeff Squyres
On Jan 23, 2009, at 7:19 AM, Andrea Iob wrote: What files did you change? The files I changed are:
    openmpi-1.3/ompi/contrib/vt/vt/vtlib/vt_otf_gen.c
    openmpi-1.3/ompi/contrib/vt/vt/vtlib/vt_thrd.c
    openmpi-1.3/opal/util/path.c
    openmpi-1.3/orte/mca/plm/rsh/plm_rsh_component.c
    openmpi-1.3/orte/to

Re: [OMPI users] Asynchronous behaviour of MPI Collectives

2009-01-23 Thread Gabriele Fatigati
Hi Igor, my message size is 4096 KB and I have 4 procs per core. There isn't any difference using different algorithms. 2009/1/23 Igor Kozin : > what is your message size and the number of cores per node? > is there any difference using different algorithms? > > 2009/1/23 Gabriele Fatigati >> >>

Re: [OMPI users] Asynchronous behaviour of MPI Collectives

2009-01-23 Thread Igor Kozin
what is your message size and the number of cores per node? is there any difference using different algorithms? 2009/1/23 Gabriele Fatigati > Hi Jeff, > I would like to understand why, if I run over 512 procs or more, my > code stops in an MPI collective, even with a small send buffer. All > proce

Re: [OMPI users] Asynchronous behaviour of MPI Collectives

2009-01-23 Thread Gabriele Fatigati
Hi Jeff, I would like to understand why, if I run over 512 procs or more, my code stops in an MPI collective, even with a small send buffer. All processors are locked in the call, doing nothing. But if I add an MPI_Barrier after the MPI collective, it works! I run over an Infiniband net. I know many people wi
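A minimal sketch of the workaround being described (the collective and buffer contents are placeholders):

    #include <mpi.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank, sum = 0;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        /* The collective itself makes no synchronization promise ... */
        MPI_Reduce(&rank, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
        /* ... so an explicit barrier keeps fast ranks from racing ahead
           and flooding slower ones with unexpected messages. */
        MPI_Barrier(MPI_COMM_WORLD);
        MPI_Finalize();
        return 0;
    }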

Re: [OMPI users] Handling output of processes

2009-01-23 Thread Allen Barnett
On Thu, 2009-01-22 at 06:33 -0700, Ralph Castain wrote: > If you need to do this with a prior release... well, I'm afraid it > won't work. :-) As a quick hack for 1.2.x, I sometimes use this script to wrap my executable:
    ---
    #!/bin/sh
    #

Re: [OMPI users] dead lock in MPI_Finalize

2009-01-23 Thread jody
Without knowing the internal details of the program, it's difficult to tell what could be going wrong. Sorry I can't give you more help here. Perhaps you should try to incrementally reduce the functionality of your program while keeping the error. That way you may reach a state where it may be ea

Re: [OMPI users] Error compiling v1.3 with icc 10.1.021: PATH_MAX not defined

2009-01-23 Thread Andrea Iob
> > What files did you change? > The files I changed are:
    openmpi-1.3/ompi/contrib/vt/vt/vtlib/vt_otf_gen.c
    openmpi-1.3/ompi/contrib/vt/vt/vtlib/vt_thrd.c
    openmpi-1.3/opal/util/path.c
    openmpi-1.3/orte/mca/plm/rsh/plm_rsh_component.c
    openmpi-1.3/orte/tools/orterun/debuggers.c
I've attached a p

Re: [OMPI users] Cluster with IB hosts and Ethernet hosts

2009-01-23 Thread Jeff Squyres
On Jan 22, 2009, at 11:26 PM, Sangamesh B wrote: We've a cluster with 23 nodes connected to an IB switch and 8 nodes connected to an ethernet switch. The master node is also connected to the IB switch. SGE (with tight integration, -pe orte) is used for parallel/serial job submission. Open MPI-1.3 is

Re: [OMPI users] Asynchronous behaviour of MPI Collectives

2009-01-23 Thread Ashley Pittman
On Fri, 2009-01-23 at 06:51 -0500, Jeff Squyres wrote: > > This behaviour sometimes can cause some problems with a lot of > > processors in the jobs. > Can you describe what exactly you mean? The MPI spec specifically > allows this behavior; OMPI made specific design choices and > optimizatio

Re: [OMPI users] Asynchronous behaviour of MPI Collectives

2009-01-23 Thread Jeff Squyres
On Jan 23, 2009, at 6:32 AM, Gabriele Fatigati wrote: I've noted that OpenMPI has an asynchronous behaviour in the collective calls. The processors don't wait for the other procs to arrive in the call. That is correct. This behaviour can sometimes cause problems with a lot of processors

Re: [OMPI users] MCA base component file not found

2009-01-23 Thread Jeff Squyres
This sounds like you installed OMPI v1.3 over your existing OMPI v1.2 installation. Note that there are several plugins in the v1.2 series that disappeared in the v1.3 series (i.e., those plugins are either no longer supported or transmogrified into new plugins in v1.3; the latter is the

Re: [OMPI users] Error compiling v1.3 with icc 10.1.021: PATH_MAX not defined

2009-01-23 Thread Jeff Squyres
On Jan 23, 2009, at 4:54 AM, Andrea Iob wrote: It looks like icc 10.1.021 does not define PATH_MAX (version 10.1.013 works without problems). Well that's fun; gotta love when compilers change their behavior. :-) As a workaround I've included <limits.h> in those files where PATH_MAX is used. What f

[OMPI users] Asynchronous behaviour of MPI Collectives

2009-01-23 Thread Gabriele Fatigati
Dear OpenMPI developers, I've noted that OpenMPI has an asynchronous behaviour in the collective calls: the processors don't wait for the other procs to arrive in the call. This behaviour can sometimes cause problems with a lot of processors in the job. Is there an OpenMPI parameter to lock al

Re: [OMPI users] dead lock in MPI_Finalize

2009-01-23 Thread Bernard Secher - SFME/LGLS
No, I didn't run this program with Open-MPI 1.2.X, because I was told there were many changes between the 1.2.X and 1.3 versions regarding MPI_Publish_name and MPI_Lookup_name (new ompi-server, ...), and that it was better to use the 1.3 version. Yes, I am sure all processes reach the MPI_Finalize() function

[OMPI users] MCA base component file not found

2009-01-23 Thread Yongqi Sun
Hello, Open MPI 1.3 is giving me the following warnings no matter what executables I launch, even a helloworld program:
    openmpi-1.3/examples$ mpiexec -np 2 hello
    [MyPC:06046] mca: base: component_find: unable to open /usr/local/lib/openmpi/mca_ra

Re: [OMPI users] dead lock in MPI_Finalize

2009-01-23 Thread jody
Hi Bernard, the structure looks fine as far as I can see. Did it run OK on Open-MPI 1.2.X? Are you sure all processes reach the MPI_Finalize command? Usually MPI_Finalize only completes when all processes reach it. I think you should also make sure that all MPI_Sends are matched by corresponding MPI_

Re: [OMPI users] dead lock in MPI_Finalize

2009-01-23 Thread Bernard Secher - SFME/LGLS
Thanks Jody for your answer. I launch 2 instances of my program, with 2 processes each, on the same machine. I use MPI_Publish_name and MPI_Lookup_name to create a global communicator across the 4 processes. Then the 4 processes exchange data. The main program is a CORBA server. I send you th
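A minimal sketch of the rendezvous Bernard describes (the service name and role selection are illustrative; under 1.3, resolving names across separately launched jobs typically goes through the new ompi-server):

    #include <mpi.h>
    #include <string.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        MPI_Comm inter, global;
        char port[MPI_MAX_PORT_NAME];
        int server = (argc > 1 && strcmp(argv[1], "server") == 0);
        if (server) {
            MPI_Open_port(MPI_INFO_NULL, port);
            MPI_Publish_name("my_service", MPI_INFO_NULL, port);
            MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &inter);
        } else {
            MPI_Lookup_name("my_service", MPI_INFO_NULL, port);
            MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &inter);
        }
        /* Merge the two worlds into one global intracommunicator. */
        MPI_Intercomm_merge(inter, server ? 0 : 1, &global);
        /* ... all 4 processes exchange data over 'global' ... */
        MPI_Comm_free(&global);
        /* Disconnect before MPI_Finalize (see the deadlock discussion above). */
        MPI_Comm_disconnect(&inter);
        if (server) MPI_Unpublish_name("my_service", MPI_INFO_NULL, port);
        MPI_Finalize();
        return 0;
    }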

[OMPI users] Error compiling v1.3 with icc 10.1.021: PATH_MAX not defined

2009-01-23 Thread Andrea Iob
It looks like icc 10.1.021 does not define PATH_MAX (version 10.1.013 works without problems). As a workaround I've included <limits.h> in those files where PATH_MAX is used. Hope it helps. Andrea
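The workaround amounts to something like the following at the top of each affected file (the fallback definition is an illustrative assumption, not necessarily what the attached patch does):

    #include <limits.h>     /* defines PATH_MAX on most Linux systems */
    #ifndef PATH_MAX
    #define PATH_MAX 4096   /* conservative fallback if the header still lacks it */
    #endif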

Re: [OMPI users] dead lock in MPI_Finalize

2009-01-23 Thread jody
For instance:
- how many processes on how many machines,
- what kind of computation,
- perhaps minimal code which reproduces the failure,
- configuration settings, etc.
See: http://www.open-mpi.org/community/help/ Without any information except "it doesn't work", nobody can give you any help wh

Re: [OMPI users] dead lock in MPI_Finalize

2009-01-23 Thread Bernard Secher - SFME/LGLS
Hello Jeff, I don't understand what you mean by "A _detailed_ description of what is failing". The problem is a deadlock in the MPI_Finalize() function. All processes are blocked in this MPI_Finalize() function. Bernard Jeff Squyres wrote: Per this note on the "getting help" page, we still

Re: [OMPI users] 1.3 hangs running 2 exes with different names (Ralph Castain)

2009-01-23 Thread Geoffroy Pignot
Hello, I redid a few tests with my hello world; here are my results. First of all, my config:
    configure --prefix=/tmp/openmpi-1.3 --libdir=/tmp/openmpi-1.3/lib64 --enable-heterogeneous
You will find attached my ompi_info -param all all. compil02 and compil03 are identical RH43 64-bit nodes. *Tes