Re: [OMPI users] Problems with gridengine integration on RHEL 6

2012-02-15 Thread Brian McNally
Hi Dave, I looked through the INSTALL, VERSION, NEWS, and README files in the 1.5.4 openmpi tarball but didn't see what you were referring to. Are you suggesting that I launch mpirun similar to this? mpirun -mca plm ^rshd ...? What I meant by "the same parallel environment setup" was that

Re: [OMPI users] Problems with gridengine integration on RHEL 6

2012-02-15 Thread Reuti
On 16.02.2012 at 00:41, Dave Love wrote: > Brian McNally writes: > >> Hello Open MPI community, >> >> I'm running the openmpi 1.5.3 package as provided by Redhat Enterprise >> Linux 6, along with SGE 6.2u3. I've discovered that under RHEL 5 orted >> gets spawned via qrsh and under RHEL 6 orted

Re: [OMPI users] Problems with gridengine integration on RHEL 6

2012-02-15 Thread Dave Love
Brian McNally writes: > Hello Open MPI community, > > I'm running the openmpi 1.5.3 package as provided by Redhat Enterprise > Linux 6, along with SGE 6.2u3. I've discovered that under RHEL 5 orted > gets spawned via qrsh and under RHEL 6 orted gets spawned via > SSH. This is happening in the sam

Re: [OMPI users] Problems with gridengine integration on RHEL 6

2012-02-15 Thread Reuti
On 15.02.2012 at 22:59, Brian McNally wrote: > Thanks for responding so quickly, Reuti! > > To be clear, my RHEL 5 and RHEL 6 nodes are part of the same cluster. In the > RHEL 5 case qrsh -inherit gets called via mpirun. In the RHEL 6 case > /usr/bin/ssh gets called directly from mpirun. The cluste

Re: [OMPI users] compilation error with pgcc Unknown switch

2012-02-15 Thread Abhinav Sarje
Hi Gus, I am building using the cray wrappers over the PGI compilers, which gives the errors. I tried building without the cray wrappers, but then it does not run in parallel on the XE6 system I am using. I am going to try the latest nightly build. Abhinav. On Wed, Feb 15, 2012 at 12:22 PM, Gust

Re: [OMPI users] Problems with gridengine integration on RHEL 6

2012-02-15 Thread Brian McNally
Thanks for responding so quickly, Reuti! To be clear, my RHEL 5 and RHEL 6 nodes are part of the same cluster. In the RHEL 5 case qrsh -inherit gets called via mpirun. In the RHEL 6 case /usr/bin/ssh gets called directly from mpirun. The cluster setup looks like: qlogin_command /usr/

Re: [OMPI users] Problems with gridengine integration on RHEL 6

2012-02-15 Thread Reuti
Hi, On 15.02.2012 at 22:21, Brian McNally wrote: > Hello Open MPI community, > > I'm running the openmpi 1.5.3 package as provided by Redhat Enterprise Linux > 6, along with SGE 6.2u3. I've discovered that under RHEL 5 orted gets spawned > via qrsh and under RHEL 6 orted gets spawned via SSH.

[OMPI users] Problems with gridengine integration on RHEL 6

2012-02-15 Thread Brian McNally
Hello Open MPI community, I'm running the openmpi 1.5.3 package as provided by Redhat Enterprise Linux 6, along with SGE 6.2u3. I've discovered that under RHEL 5 orted gets spawned via qrsh and under RHEL 6 orted gets spawned via SSH. This is happening in the same cluster environment with the

Re: [OMPI users] compilation error with pgcc Unknown switch

2012-02-15 Thread Gustavo Correa
On Feb 15, 2012, at 1:58 PM, Abhinav Sarje wrote: > Hi Gus, > I have not added any flags that include static, but when I do a > verbose compilation output for the point where error occurs, I see > that there are some -Bstatic flags. I tried to manually remove the > -Bstatic included just before t

Re: [OMPI users] compilation error with pgcc Unknown switch

2012-02-15 Thread Abhinav Sarje
Hi Gus, I have not added any flags that include static, but when I do a verbose compilation output for the point where error occurs, I see that there are some -Bstatic flags. I tried to manually remove the -Bstatic included just before the libopen-pal.so library, and that particular line then compi

Re: [OMPI users] help: sm btl does not work when I specify the same host twice or more in the node list

2012-02-15 Thread yanyg
> So the real issue is: the sm BTL is not working for you. > Yes. > What version of Open MPI are you using? > It is 1.4.3 I am using. > Can you rm -rf any Open MPI directories that may be left over in /tmp? Yes, I have tried that. The clean up does not help to make sm btl work.
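The /tmp cleanup discussed in this message can be checked with a short Python sketch before anything is deleted. It assumes that leftover session directories are named `openmpi-sessions-*` directly under the temporary directory; that naming pattern and the helper name `find_stale_sessions` are assumptions for illustration and are not stated in the thread.

```python
import glob
import os


def find_stale_sessions(tmp_root="/tmp", pattern="openmpi-sessions-*"):
    """Return sorted paths of directories under tmp_root that match pattern.

    The default pattern is an assumption about how Open MPI names its
    session directories; adjust it to whatever is actually found in /tmp.
    """
    candidates = glob.glob(os.path.join(tmp_root, pattern))
    # Keep directories only; plain files with a matching name are ignored.
    return sorted(p for p in candidates if os.path.isdir(p))


if __name__ == "__main__":
    # Only list the candidates; removing them is left to the administrator.
    for path in find_stale_sessions():
        print(path)
```

The sketch only lists candidates, so it can be run on each node without side effects while no MPI job is active.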

Re: [OMPI users] Different Prefix for different nodes

2012-02-15 Thread Jeff Squyres
Thanks! I'll see that this gets into 1.6. On Feb 15, 2012, at 1:19 PM, Tohiko Looka wrote: > On Wed, Feb 15, 2012 at 9:03 PM, Jeff Squyres wrote: >> Can do. Can you point me to exactly where you saw that? > > In mpirun man pages, like here > http://www.open-mpi.org/doc/v1.4/man1/mpirun.1.php

Re: [OMPI users] Different Prefix for different nodes

2012-02-15 Thread Tohiko Looka
On Wed, Feb 15, 2012 at 9:03 PM, Jeff Squyres wrote: > Can do. Can you point me to exactly where you saw that? In mpirun man pages, like here http://www.open-mpi.org/doc/v1.4/man1/mpirun.1.php it says: "Note that --prefix can be set on a per-context basis, allowing for different values for diff

Re: [OMPI users] help: sm btl does not work when I specify the same host twice or more in the node list

2012-02-15 Thread Jeff Squyres
On Feb 15, 2012, at 11:54 AM, ya...@adina.com wrote: > When I use sm btl layer, my program just hang at the MPI_Init() at > the very beginning. Ok, I think I was thrown off by the other things in this conversation. So the real issue is: the sm BTL is not working for you. What version of Ope

Re: [OMPI users] Strange OpenMPI messages

2012-02-15 Thread Jeff Squyres
On Feb 15, 2012, at 12:50 PM, Tohiko Looka wrote: > My computer doesn't have such a service, and I think that's the correct name > for Fedora. > Also, what bugs me is that it used to work with no warnings before restarting > my computer. This seems to imply that *some* OpenFabrics service starte

Re: [OMPI users] Different Prefix for different nodes

2012-02-15 Thread Jeff Squyres
On Feb 15, 2012, at 12:47 PM, Tohiko Looka wrote: > Yes, I tried that and it worked. Thanks. > But I hope the people behind OpenMPI will correct this in the documentation. Can do. Can you point me to exactly where you saw that? -- Jeff Squyres jsquy...@cisco.com For corporate legal information

Re: [OMPI users] Strange OpenMPI messages

2012-02-15 Thread Tohiko Looka
Gustavo, I will definitely try to compile OpenMPI myself and see if the problem persists. Regarding your note on homogeneous nodes: I tried to do that as much as possible, but I had no control over two nodes and each of them had a different setup. As Jeff suggested, using .bashrc seems to solve the is

Re: [OMPI users] Strange OpenMPI messages

2012-02-15 Thread Tohiko Looka
Jeff, My computer doesn't have such a service, and I think that's the correct name for Fedora. Also, what bugs me is that it used to work with no warnings before restarting my computer. I will try to recompile OpenMPI myself (as opposed to installing it using yum) and see what happens. On Wed, Feb 1

Re: [OMPI users] Different Prefix for different nodes

2012-02-15 Thread Tohiko Looka
Hello Jeff, Yes, I tried that and it worked. Thanks. But I hope the people behind OpenMPI will correct this in the documentation. On Wed, Feb 15, 2012 at 7:08 PM, Jeff Squyres wrote: > On Feb 14, 2012, at 4:06 PM, Tohiko Looka wrote: > > > I'm trying to run my application on different nodes; eac

Re: [OMPI users] help: sm btl does not work when I specify the same host twice or more in the node list

2012-02-15 Thread yanyg
> No, there are no others you need to set. Ralph's referring to the fact > that we set OMPI environment variables in the processes that are > started on the remote nodes. > > I was asking to ensure you hadn't set any MCA parameters in the > environment that could be creating a problem. Do you have

Re: [OMPI users] IB Memory Requirements, adjusting for reduced memory consumption

2012-02-15 Thread Shamis, Pavel
You may find additional information here : https://svn.open-mpi.org/trac/ompi/ticket/1900 Using the information there you may calculate actual memory consumption. Pavel (Pasha) Shamis --- Application Performance Tools Group Computer Science and Math Division Oak Ridge National Laboratory On

Re: [OMPI users] Different Prefix for different nodes

2012-02-15 Thread Jeff Squyres
On Feb 14, 2012, at 4:06 PM, Tohiko Looka wrote: > I'm trying to run my application on different nodes; each with a different > path to OpenMPI libraries and binaries. > According to the documentation I can set '-prefix' on a per-context basis, so > I can set '-prefix' differently > for each nod

Re: [OMPI users] help: sm btl does not work when I specify the same host twice or more in the node list

2012-02-15 Thread Jeff Squyres
On Feb 14, 2012, at 10:47 AM, ya...@adina.com wrote: > Yes, in short, I start a c-shell script from the bash command line, in > which I mpirun another c-shell script which starts the computing > process. The only OMPI related envars are PATH and > LD_LIBRARY_PATH. Any other OMPI envars I should set?

Re: [OMPI users] MPI_Barrier in Self-checkpointing call

2012-02-15 Thread Josh Hursey
When you receive that callback, the MPI has been put in a quiescent state. As such, it does not allow MPI communication until the checkpoint is completely finished. So you cannot call barrier in the checkpoint callback. Since Open MPI does a coordinated checkpoint, you can assume that all process

Re: [OMPI users] Strange OpenMPI messages

2012-02-15 Thread Gustavo Correa
Hi Tohiko, If you compiled Open MPI on a computer with IB hardware, then copied the installation tree to another machine, or if you installed from an RPM or other package generated on a machine with IB, your OpenMPI will have IB enabled, I think, even if the machine where it is running does not

Re: [OMPI users] MPI_Waitall strange behaviour on remote nodes

2012-02-15 Thread Jeff Squyres
Your code works fine for me. Have you disabled iptables / any other firewalling? On Feb 14, 2012, at 10:56 AM, Richard Bardwell wrote: > In trying to debug an MPI_Waitall hang on a remote > node, I created a simple code to test. > > If we run the simple code below on 2 nodes on a local > machi

Re: [OMPI users] compilation error with pgcc Unknown switch

2012-02-15 Thread Jeff Squyres
Can you try the latest 1.5 nightly tarball? We've upgraded the autotools in the nightlies, but have not yet rolled a new rc. http://www.open-mpi.org/nightly/v1.5/ On Feb 15, 2012, at 12:18 AM, Abhinav Sarje wrote: > Hi Jeff, > > With the latest 1.5.5rc also I am getting the same error: >

Re: [OMPI users] compilation error with pgcc Unknown switch

2012-02-15 Thread Gustavo Correa
Hi Abhinav Did you add to your compiler flags -Bstatic [or perhaps -static], or any optimization flags that may include -Bstatic/-static? Check also the compiler configuration, and any possible user customization [~/.mypggcc and friends] to see if -Bstatic is there, maybe inadvertently, and is s

Re: [OMPI users] Strange OpenMPI messages

2012-02-15 Thread Jeff Squyres
It is possible to have the OpenFabrics drivers loaded in your kernel, even if you have no OpenFabrics-based devices in your hardware. You probably just want to unload those drivers, and then Open MPI should not try to use OpenFabrics. Sometimes distros have init scripts that load the OpenFabri
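Whether OpenFabrics drivers are loaded on a Linux node can be checked by reading `/proc/modules`. The following Python sketch is one way to do that; the prefix list (`ib_`, `rdma_`, `iw_`, `mlx4_`) and the helper name `openfabrics_modules` are illustrative assumptions, and the actual module names depend on the distribution and hardware stack.

```python
# Module-name prefixes treated as OpenFabrics-related (assumed, not exhaustive).
OFED_PREFIXES = ("ib_", "rdma_", "iw_", "mlx4_")


def openfabrics_modules(proc_modules_text, prefixes=OFED_PREFIXES):
    """Return the loaded module names that start with one of the prefixes.

    proc_modules_text is the content of /proc/modules, where the first
    whitespace-separated field on each line is the module name.
    """
    names = []
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0].startswith(prefixes):
            names.append(fields[0])
    return names


if __name__ == "__main__":
    with open("/proc/modules") as handle:
        found = openfabrics_modules(handle.read())
    print("\n".join(found) if found else "no matching modules loaded")
```

An empty result suggests no such drivers are loaded, in which case the warnings are more likely to come from elsewhere.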

Re: [OMPI users] [Open MPI Announce] Open MPI v1.4.5 released

2012-02-15 Thread Reuti
Hi, On 15.02.2012 at 03:48, alexalex43210 wrote: > But I am a novice for the parallel computation, I often use Fortran to > compile my program, now I want to use the Parallel, can you give me some help > how to begin? > PS: I learned about OPEN MPI is the choice for my question solution. a

Re: [OMPI users] [Open MPI Announce] Open MPI v1.4.5 released

2012-02-15 Thread Jeff Squyres
There are many MPI tutorials available. A very good one that I have referred people to before is: http://www.citutor.org/browse.php (you'll have to sign up for a free account) There are 2 levels to the tutorial: introduction and intermediate. They should both get you started on MPI / paralle

Re: [OMPI users] Strange OpenMPI messages

2012-02-15 Thread TERRY DONTJE
Do you get any interfaces shown when you run "ibstat" on any of the nodes your job is spawned on? --td On 2/15/2012 1:27 AM, Tohiko Looka wrote: Mm... This is really strange. I don't have that service, and there is no ib* output in 'ifconfig -a' or 'InfiniBand' in 'lspci', which makes me believe

[OMPI users] MPI_Barrier in Self-checkpointing call

2012-02-15 Thread Faisal Shahzad
Dear Group, I wanted to do a synchronization check with 'MPI_Barrier(MPI_COMM_WORLD)' in the 'opal_crs_self_user_checkpoint(char **restart_cmd)' call. Although every process is present in this call, it fails to synchronize. Is there any reason why we can't use a barrier? Thanks in advance. Kind regards

Re: [OMPI users] MPI_Barrier, again

2012-02-15 Thread Evgeniy Shapiro
P.P.S. I ran the same test with OpenMPI 1.5.4, the behaviour is the same. Evgeniy Message: 10 List-Post: users@lists.open-mpi.org Date: Sat, 28 Jan 2012 08:24:39 -0500 From: Jeff Squyres Subject: Re: [OMPI users] MPI_Barrier, again To: Open MPI Users Message-ID: <1859c141-813d-46ba-97bc-4b029

Re: [OMPI users] Strange OpenMPI messages

2012-02-15 Thread Tohiko Looka
Mm... This is really strange. I don't have that service, and there is no ib* output in 'ifconfig -a' or 'InfiniBand' in 'lspci', which makes me believe that I don't have such a network. I also checked on an identical computer on the same network with the same results. What's strange is that these mess

Re: [OMPI users] compilation error with pgcc Unknown switch

2012-02-15 Thread Abhinav Sarje
Hi Gus, I had tried with the -noswitcherror flags, which removed the 'unknown-switch' error, but then I am still getting the "attempted static link of dynamic object" error as I reported earlier. Thanks. On Fri, Feb 10, 2012 at 5:57 AM, Gustavo Correa wrote: > Hi Abhinav > > Setting CC='pgcc --n

Re: [OMPI users] compilation error with pgcc Unknown switch

2012-02-15 Thread Abhinav Sarje
Hi Jeff, With the latest 1.5.5rc also I am getting the same error: - make[2]: Entering directory `/global/u1/a/asarje/hopper/openmpi-1.5.5rc2r25924-pgi/opal/tools/wrappers' CC opal_wrapper.o CCLD opal_wrapper /usr/bin/ld: attempted static link of dynamic object `../../../opal/.li