Re: [OMPI users] Fwd: Unable to run basic mpirun command (OpenMPI v5.0.3)

2024-05-05 Thread T Brouns via users
Hi all,

I solved the problem by doing:

```
INSTALL_DIR=/usr/local/openmpi-5.0.3
export PATH=$INSTALL_DIR/bin:$PATH
export LD_LIBRARY_PATH=$INSTALL_DIR/lib:$LD_LIBRARY_PATH
export OPAL_PREFIX=$INSTALL_DIR
```

That OPAL_PREFIX line was the tricky one.
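
A side note on the exports above, not part of the original fix: if LD_LIBRARY_PATH is empty or unset beforehand, `$INSTALL_DIR/lib:$LD_LIBRARY_PATH` ends in a bare colon, and the dynamic loader treats an empty entry as the current directory. Here is a minimal sketch that avoids this, assuming a POSIX-style shell. `prepend_path` is a hypothetical helper name, not something Open MPI provides:

```shell
# Hypothetical helper (not from the thread): prepend DIR to the
# colon-separated variable named VAR without leaving an empty entry.
prepend_path() {
    var_name=$1
    dir=$2
    # Read the current value of the named variable (empty if unset).
    eval "current=\${$var_name:-}"
    if [ -n "$current" ]; then
        eval "export $var_name=\"\$dir:\$current\""
    else
        eval "export $var_name=\"\$dir\""
    fi
}

INSTALL_DIR=/usr/local/openmpi-5.0.3   # install prefix used in this thread
prepend_path PATH "$INSTALL_DIR/bin"
prepend_path LD_LIBRARY_PATH "$INSTALL_DIR/lib"
export OPAL_PREFIX=$INSTALL_DIR
```

The resulting environment matches the four lines above whenever LD_LIBRARY_PATH already had a value.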

After doing that, these mpirun commands are now working correctly:

```
mpirun --version
mpirun uptime
```
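
To check whether more than one copy of mpirun is reachable, which was the suspicion earlier in the thread, something like the following can list every match on PATH in search order. This is only a sketch for a POSIX-style shell. `list_on_path` is a hypothetical helper name, not an Open MPI tool:

```shell
# Hypothetical helper: print every executable called NAME found on PATH,
# in search order, so that shadowed copies become visible.
# The body runs in a subshell so the IFS change does not leak out.
list_on_path() (
    name=$1
    IFS=:
    for dir in $PATH; do
        # An empty PATH entry means the current directory.
        [ -n "$dir" ] || dir=.
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
)

list_on_path mpirun
```

The first line printed is the copy the shell runs for a bare `mpirun`. More than one line means another installation is also on PATH. On Linux, `ldd` on a binary shows which shared libraries the loader would resolve for it.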

Thanks for pointing me in the right direction!


@John Hearns,
I'm not setting up a Modules environment, but this sounds like a great
solution to the problem. I might need to look into that! Thanks.


Best,
Terence

On Sat, 4 May 2024 at 17:22, Jeff Squyres (jsquyres) 
wrote:

> You might want to see if your OS has Open MPI installed into default
> binary / library search paths; you might be able to uninstall it easily.
>
> Otherwise, even if you explicitly run the `mpirun` you just
> built+installed, it might find the `libmpi.so` from some other copy of
> Open MPI.
>
> Alternatively, you could prefix your `LD_LIBRARY_PATH` environment
> variable with the libdir from the Open MPI installation you just created.
> --
> *From:* T Brouns 
> *Sent:* Saturday, May 4, 2024 10:56 AM
> *To:* Jeff Squyres (jsquyres) ;
> users@lists.open-mpi.org 
> *Subject:* Re: [OMPI users] Fwd: Unable to run basic mpirun command
> (OpenMPI v5.0.3)
>
> Hi Jeff,
>
> I think you're onto something with the multiple copies.
>
> For this reason, I also tried to run:
>
> ```
> /usr/local/openmpi-5.0.3/bin/mpirun --version
> ```
>
> To make sure I'm running the correct copy, but this one crashes with the
> same error.
>
> As a next step, I can try to install OpenMPI on a different system to
> narrow down the problem. Or run it in a Docker container.
>
> And thanks for the pointer on the
> `mpirun hello_c.c`. This command made no sense.
>
> Best,
> Terence
>
>
> On Sat, 4 May 2024, 14:30 Jeff Squyres (jsquyres), 
> wrote:
>
> My apologies – I must have somehow been looking at the wrong config.log
> file.
>
> I see there's an extra `-` in the script on the help page; I'll get that
> fixed.
>
> Thanks for the tarball; that's easier to get everything.  Looking in
> there, it looks like you built with a prefix of /usr/local/openmpi-5.0.3,
> but your original email referred to looking for a help file in
> /usr/share/openmpi/help-mpirun.txt -- this seems to be a disparity.
>
> You might want to check that you don't have multiple copies of Open MPI
> installed, and you're not running an unexpected copy somewhere – not the
> one you just built.
>
> Also, your first mail mentioned "mpirun hello_c.c" – you don't want to do
> that.  mpirun is used for launching applications.  hello_c.c is the source
> code – you need to compile it first.  In the examples directory, you can
> run `make`, or you can manually build it via `mpicc hello_c.c -o hello_c`.
>
> --
> *From:* T Brouns 
> *Sent:* Saturday, May 4, 2024 2:00 AM
> *To:* Jeff Squyres (jsquyres) 
> *Subject:* Re: [OMPI users] Fwd: Unable to run basic mpirun command
> (OpenMPI v5.0.3)
>
> Hi Jeff,
>
> Thanks for the response.
>
>
> *"Your config.log file shows that you are trying to build Open MPI 2.1.6
> and that configure failed."*
>
>
> Where are you seeing version 2.1.6 exactly? Version 5.0.3 is mentioned
> many times in the config.log file. Whereas if I do a recursive search for
> "2.1.6", it doesn't come up in any of the log files.
>
> Also, the configure didn't give any error message. It
> successfully completed with: configure: exit 0
>
> And I never installed version 2.1.6.
>
> Are you sure you are looking at the right file?
>
>
> *"Can you provide all the information from
> https://docs.open-mpi.org/en/v5.0.x/getting-help.html?  (e.g., tar all
> the files up in a single file – makes it easier to download and examine
> everything)"*
>
>
> Here's the TAR file:
>
>
> https://drive.google.com/file/d/19cr7Y4gyCEP0Aa2isTnASItOe9wmfTSK/view?usp=sharing
>
> When I used the first script provided on that webpage, I got the following
> error:
>
> ```
> + tar -x -C /home/jupyter/openmpi-5.0.3/ompi-output -
> ++ find . -name config.log
> + tar -cf ./3rd-party/libevent-2.1.12-stable/config.log
> ./3rd-party/openpmix/config.log ./3rd-party/romio341/mpl/config.log
> ./3rd-party/romio341/config.log ./3rd-party/prrte/config.log ./config.log
> tar: This does not look like a tar archive
> tar: -: Not found in archive
> tar: Exiting with failure status due to previous errors
> ```
>
> This is why I didn't generate the TAR file in the first place. I fixed the
> script now.
>
>
> Best,
> Terence
>
>
>
> On Fri, 3 May 2024 at 23:43, Jeff Squyres (jsquyres) 
> wrote:
>
> Your config.log file shows that you are trying to build Open MPI 2.1.6 and
> that configure failed.
>
> I'm not sure how to square this with the information that you provided in
> your message... did you upload the wrong config.log?
>
> Can you provide all the information from
> https://

Re: [OMPI users] Fwd: Unable to run basic mpirun command (OpenMPI v5.0.3)

2024-05-05 Thread Jeff Squyres (jsquyres) via users
Note that, depending on your environment, you might need to set these env 
variables on every node where you're running the Open MPI job.  For example, see 
https://docs.open-mpi.org/en/v5.0.x/launching-apps/quickstart.html#launching-in-a-non-scheduled-environments-via-ssh
and 
https://docs.open-mpi.org/en/v5.0.x/launching-apps/ssh.html#finding-open-mpi-executables-and-libraries
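
One way to make the settings persistent on a node is to add them to a shell startup file there. Below is a minimal sketch, assuming a POSIX-style shell. `append_once` and `write_ompi_env` are hypothetical helper names. Which startup file is actually read for non-interactive ssh logins depends on the shell and its configuration, so the file name in the usage line is an assumption as well:

```shell
# Hypothetical helper: append LINE to FILE only if it is not already there,
# so that running the setup twice does not duplicate entries.
append_once() {
    file=$1
    line=$2
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Hypothetical helper: write the three settings from this thread into FILE
# for an Open MPI installation under PREFIX.
write_ompi_env() {
    file=$1
    prefix=$2
    append_once "$file" "export PATH=$prefix/bin:\$PATH"
    append_once "$file" "export LD_LIBRARY_PATH=$prefix/lib:\$LD_LIBRARY_PATH"
    append_once "$file" "export OPAL_PREFIX=$prefix"
}
```

On each node this could be called as `write_ompi_env "$HOME/.bashrc" /usr/local/openmpi-5.0.3`, where the prefix is the one used in this thread.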

From: T Brouns 
Sent: Sunday, May 5, 2024 4:37 PM
To: users@lists.open-mpi.org 
Cc: Jeff Squyres (jsquyres) ; hear...@gmail.com 

Subject: Re: [OMPI users] Fwd: Unable to run basic mpirun command (OpenMPI 
v5.0.3)

Hi all,

I solved the problem by doing:

```
INSTALL_DIR=/usr/local/openmpi-5.0.3
export PATH=$INSTALL_DIR/bin:$PATH
export LD_LIBRARY_PATH=$INSTALL_DIR/lib:$LD_LIBRARY_PATH
export OPAL_PREFIX=$INSTALL_DIR
```

That OPAL_PREFIX line was the tricky one.

After doing that, these mpirun commands are now working correctly:

```
mpirun --version
mpirun uptime
```

Thanks for pointing me in the right direction!


@John Hearns,
I'm not setting up a Modules environment, but this sounds like a great solution 
to the problem. I might need to look into that! Thanks.


Best,
Terence

On Sat, 4 May 2024 at 17:22, Jeff Squyres (jsquyres) wrote:
You might want to see if your OS has Open MPI installed into default binary / 
library search paths; you might be able to uninstall it easily.

Otherwise, even if you explicitly run the `mpirun` you just built+installed, it 
might find the `libmpi.so` from some other copy of Open MPI.

Alternatively, you could prefix your `LD_LIBRARY_PATH` environment variable 
with the libdir from the Open MPI installation you just created.

From: T Brouns
Sent: Saturday, May 4, 2024 10:56 AM
To: Jeff Squyres (jsquyres); users@lists.open-mpi.org
Subject: Re: [OMPI users] Fwd: Unable to run basic mpirun command (OpenMPI 
v5.0.3)

Hi Jeff,

I think you're onto something with the multiple copies.

For this reason, I also tried to run:

```
/usr/local/openmpi-5.0.3/bin/mpirun --version
```

To make sure I'm running the correct copy, but this one crashes with the same 
error.

As a next step, I can try to install OpenMPI on a different system to narrow 
down the problem. Or run it in a Docker container.

And thanks for the pointer on the
`mpirun hello_c.c`. This command made no sense.

Best,
Terence


On Sat, 4 May 2024, 14:30 Jeff Squyres (jsquyres) wrote:
My apologies – I must have somehow been looking at the wrong config.log file.

I see there's an extra `-` in the script on the help page; I'll get that fixed.


Thanks for the tarball; that's easier to get everything.  Looking in there, it 
looks like you built with a prefix of /usr/local/openmpi-5.0.3, but your 
original email referred to looking for a help file in 
/usr/share/openmpi/help-mpirun.txt -- this seems to be a disparity.

You might want to check that you don't have multiple copies of Open MPI 
installed, and you're not running an unexpected copy somewhere – not the one 
you just built.

Also, your first mail mentioned "mpirun hello_c.c" – you don't want to do that.
mpirun is used for launching applications.  hello_c.c is the source code – you 
need to compile it first.  In the examples directory, you can run `make`, or you 
can manually build it via `mpicc hello_c.c -o hello_c`.


From: T Brouns
Sent: Saturday, May 4, 2024 2:00 AM
To: Jeff Squyres (jsquyres)
Subject: Re: [OMPI users] Fwd: Unable to run basic mpirun command (OpenMPI 
v5.0.3)

Hi Jeff,

Thanks for the response.


"Your config.log file shows that you are trying to build Open MPI 2.1.6 and 
that configure failed."

Where are you seeing version 2.1.6 exactly? Version 5.0.3 is mentioned many 
times in the config.log file. Whereas if I do a recursive search for "2.1.6", 
it doesn't come up in any of the log files.

Also, the configure didn't give any error message. It successfully completed 
with: configure: exit 0

And I never installed version 2.1.6.

Are you sure you are looking at the right file?


"Can you provide all the information from 
https://docs.open-mpi.org/en/v5.0.x/getting-help.html?  (e.g., tar all the 
files up in a single file – makes it easier to download and examine everything)"

Here's the TAR file:

https://drive.google.com/file/d/19cr7Y4gyCEP0Aa2isTnASItOe9wmfTSK/view?usp=sharing

When I used the first script provided on that webpage, I got the following 
error:

```
+ tar -x -C /home/jupyter/openmpi-5.0.3/ompi-output -
++ find . -name config.log
+ tar -cf ./3rd-party/libevent-2.1.12-stable/config.log 
./3rd-party/openpmix/config.log ./3rd-party/romio341/mpl/config.log 
./3rd-party/romio341/config.log ./3rd-party/prrte/config.log ./c