Re: [OMPI users] 'orte_ess_base_select failed'

2009-03-27 Thread Russell McQueeney
Jeff Squyres wrote: Hmm -- puzzling -- the error file you sent shows the following: bash: /opt/openmpi/orted: No such file or directory But that shouldn't happen; according to your config.log, you installed with a prefix of /opt/openmpi, so Open MPI should be looking for orted in /opt/openmpi

Re: [OMPI users] Does OpenMPI's MPI_Barrier automatically call the tuned version?

2009-03-27 Thread Jeff Squyres
On Mar 23, 2009, at 6:02 AM, Shanyuan Gao wrote: Here I am again with questions about MPI_Barrier. I ran some benchmarks on MPI_Barrier and wondered if OpenMPI's implementation automatically calls the tuned version of MPI_Barrier, e.g. tree algorithm, when the number of nodes exceeds 4? Yes,

Re: [OMPI users] Bug report [?] on spawn processes - blocking when more than one Send/Recv

2009-03-27 Thread Jeff Squyres
It does not hang for me... But I do notice one odd thing in your extended program: you send 3 characters of the string "hi2" -- that will not include the trailing \0. You might want to send 4 characters to ensure that the trailing \0 is included. On Mar 25, 2009, at 9:52 AM, Lionel Gamet wrot

Re: [OMPI users] 'orte_ess_base_select failed'

2009-03-27 Thread Jeff Squyres
Hmm -- puzzling -- the error file you sent shows the following: bash: /opt/openmpi/orted: No such file or directory But that shouldn't happen; according to your config.log, you installed with a prefix of /opt/openmpi, so Open MPI should be looking for orted in /opt/openmpi/bin/orted. You s

Re: [OMPI users] Cannot build OpenMPI 1.3 with PGI pgf90 and Gnu gcc/g++.

2009-03-27 Thread Jeff Squyres
Sorry for the delay in replying. Can you send your exact configure command line? Also, do you need the F90 MPI bindings? If not, you can disable them with the following: --disable-mpi-f90 On Mar 27, 2009, at 9:50 AM, Gus Correa wrote: Dear OpenMPI pros. I've got no answer, so let me

Re: [OMPI users] 'orte_ess_base_select failed'

2009-03-27 Thread Russell McQueeney
command = mpirun --hostfile hostfile -np 2 echo `uname -a` PATH = ...:/opt/openmpi/bin LD_LIBRARY_PATH = /opt/openmpi/lib no MCA parameters used I set up the default shell to bash, and put some echoes in .bash_profile and .bashrc, and when I run the mpirun command, I see those echoes, but then

Re: [OMPI users] [Open MPI Announce] Critical bug notice

2009-03-27 Thread Åke Sandgren
On Fri, 2009-03-27 at 11:34 -0700, Jeff Squyres wrote: > The Open MPI team has uncovered a serious bug in Open MPI v1.3.0 and > v1.3.1: when running on OpenFabrics-based networks, silent data > corruption is possible in some cases. There are two workarounds to > avoid the issue -- please see

[OMPI users] Critical bug notice

2009-03-27 Thread Jeff Squyres
The Open MPI team has uncovered a serious bug in Open MPI v1.3.0 and v1.3.1: when running on OpenFabrics-based networks, silent data corruption is possible in some cases. There are two workarounds to avoid the issue -- please see the bug ticket that has been opened about this issue for fur

Re: [OMPI users] [Fwd: Re: Configure OpenMPI and SLURM on Debian (Lenny)]

2009-03-27 Thread Ralph Castain
I don't believe that code has moved to the 1.3 release series; it shouldn't have. On Mar 27, 2009, at 11:27 AM, George Bosilca wrote: The range of ports for the OOB TCP has been removed by commit r20390. Apparently it was replaced by the static port functionality. Only the TCP BTL uses t

Re: [OMPI users] [Fwd: Re: Configure OpenMPI and SLURM on Debian (Lenny)]

2009-03-27 Thread George Bosilca
The range of ports for the OOB TCP has been removed by commit r20390. Apparently it was replaced by the static port functionality. Only the TCP BTL uses the range mechanism. george. On Mar 27, 2009, at 08:56, Jeff Squyres wrote: George -- Did we change anything about the fixed ports stu

Re: [OMPI users] error polling LP CQ with status RETRY EXCEEDED ERROR

2009-03-27 Thread Gary Draving
Thanks for the advice, we tried "-mca btl_openib_ib_min_rnr_timer 25 -mca btl_openib_ib_timeout 20" but we are still getting errors as we increase the Ns of HPL.dat value into the thousands. Is it ok to just add these values to .openmpi/mca-params.conf for the user running the test or should w

Re: [OMPI users] Cannot build OpenMPI 1.3 with PGI pgf90 and Gnu gcc/g++.

2009-03-27 Thread Gus Correa
Dear OpenMPI pros. I've got no answer, so let me try again. I can't build OpenMPI 1.3 with a hybrid pgf90+gcc/g++ compiler set. However, OpenMPI 1.2.8 builds correctly with the same compilers, on the same computer (Linux x86_64 cluster), and same environment. See details in my original message b

Re: [OMPI users] [Fwd: Re: Configure OpenMPI and SLURM on Debian (Lenny)]

2009-03-27 Thread Jerome BENOIT
Hi ! I have just sent the material to your private email. hth, Jerome Jeff Squyres wrote: George -- Did we change anything about the fixed ports stuff between 1.3.0 and 1.3.1? Jerome -- can you send the full mpirun command / environment variables / MCA file settings that you tried to run to

Re: [OMPI users] [Fwd: Re: Configure OpenMPI and SLURM on Debian (Lenny)]

2009-03-27 Thread Jeff Squyres
George -- Did we change anything about the fixed ports stuff between 1.3.0 and 1.3.1? Jerome -- can you send the full mpirun command / environment variables / MCA file settings that you tried to run to generate the error message that you showed? On Mar 27, 2009, at 5:52 AM, Manuel Prin

Re: [OMPI users] [Fwd: Re: Configure OpenMPI and SLURM on Debian (Lenny)]

2009-03-27 Thread Manuel Prinz
Am Freitag, den 27.03.2009, 20:34 +0800 schrieb Jerome BENOIT: > I have just tried the Sid package (1.3-2), but it does not work properly > (when the firewall is off) Though this should work, the version in Sid is broken in other respects. I do not recommend using it. > I have just read that the

Re: [OMPI users] [Fwd: Re: Configure OpenMPI and SLURM on Debian (Lenny)]

2009-03-27 Thread Jerome BENOIT
Hello List, Manuel Prinz wrote: Am Freitag, den 27.03.2009, 11:01 +0800 schrieb Jerome BENOIT: Finally I succeeded with the sbatch approach ... when my firewall is stopped ! So I guess that I have to configure my firewall (I use firehol): I have just tried but without success. I will try again

Re: [OMPI users] [Fwd: Re: Configure OpenMPI and SLURM on Debian (Lenny)]

2009-03-27 Thread Manuel Prinz
Am Freitag, den 27.03.2009, 11:01 +0800 schrieb Jerome BENOIT: > Finally I succeeded with the sbatch approach ... when my firewall is > stopped ! > So I guess that I have to configure my firewall (I use firehol): > I have just tried but without success. I will try again later. > Is there any other

Re: [OMPI users] 'orte_ess_base_select failed'

2009-03-27 Thread Ralph Castain
Could you please send the info shown here: http://www.open-mpi.org/community/help/ If the ess is failing, then we don't recognize the environment. Probably an issue with how it is configured vs being run. Thanks Ralph On Mar 26, 2009, at 3:42 PM, Russell McQueeney wrote: I installed OpenMP

Re: [OMPI users] PML add procs failed --> Returned "Unreachable" (-12) instead of "Success" (0)

2009-03-27 Thread Alessandro Surace
Hi Ralph, if what you say is true I don't understand why if I run a job in grid01 and grid03 it runs properly. They are on different network like grid03 and grid04. But if I run the same job in grid03 and grid04 it fails. If it is a network problem like you say I don't think that is about reachab